Hi there folks, I am creating a graph where I link the main entity (the title of the document / paragraph) to child nodes (all related entities in the doc / para). Simple example:
(Albert Einstein)-[:Discovered]->(space time)
(Albert Einstein)-[:Discovered]->(photoelectric effect)
I am storing some details on the relationship (a short summary of the relation between the main node and the child), and in order to be able to search the relationship vectors I am creating an index on the relationship. So far so good. The problem, however, is that there could be a whole lot of information about "space time" that I would like to store as a property of the child node.
That could mean multiple lines of text, and in order to store them as vectors I will have to either:
a) keep a fixed size (truncate / pad, since different child nodes will have different lengths of context), which will lead to loss of info in a lot of cases
b) have "array_vector" as a property where I store the array of embedded vectors of the chunked text.
The problem with (b) is that the index is given a fixed vector size when it is created, so theoretically it will error out when I hand it an array of, let's say, three vectors of size 384 instead of a single vector of size 384.
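To make the mismatch in (b) concrete, here is a rough sketch in plain Python. Nothing here talks to a real database: `toy_embed` is a made-up stand-in for whatever embedding model is used, and `fits_fixed_size_index` is a hypothetical check that only mimics "the index expects exactly one vector of size 384". How a real index reacts to an array of vectors is an assumption I have not verified.

```python
import hashlib
import re

DIM = 384  # vector size mentioned in the post


def toy_embed(text, dim=DIM):
    """Hypothetical stand-in for a real embedding model: hashed bag of words."""
    vec = [0.0] * dim
    for word in re.findall(r"[a-z]+", text.lower()):
        bucket = int(hashlib.md5(word.encode("utf-8")).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    return vec


def fits_fixed_size_index(value, dim=DIM):
    """Toy check (not a real database call): accept only one flat vector of length dim."""
    return (
        isinstance(value, list)
        and len(value) == dim
        and all(isinstance(x, float) for x in value)
    )


chunks = [
    "Space and time form a single continuum called spacetime.",
    "Mass and energy curve spacetime.",
    "Curved spacetime is what we experience as gravity.",
]

# One vector for the whole text: has the shape the index was created for.
single_vector = toy_embed(" ".join(chunks))

# Option (b): one vector per chunk, stored together as an array of vectors.
array_vector = [toy_embed(chunk) for chunk in chunks]

single_ok = fits_fixed_size_index(single_vector)
array_ok = fits_fixed_size_index(array_vector)
```

Each chunk vector on its own has the right size; it is only the outer array (3 x 384 instead of 384) that no longer matches what the index was declared with.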
c) a terrible hack would be to split the child node further, adding some additional info about each chunk. For example, if the child node has 3 sentences as its property, and assuming every sentence turns into a vector of size 384, I would need to create 3 separate child nodes with the same entity + some additional info and a separate 384-sized vector each. This way they will all be indexed.
The only advantage of the above would be that, since all 3 would be connected to the main and child entity, they should show up in a semantic search and I could then combine all the info.
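Here is roughly what I mean by (c), as an in-memory Python sketch rather than actual graph queries. Assumptions: `toy_embed` is a made-up hashed bag-of-words embedding standing in for a real model, the dict fields (`main`, `entity`, `order`, `text`, `vector`) are names I invented for illustration, and a brute-force cosine scan stands in for the vector index.

```python
import hashlib
import math
import re

DIM = 384  # vector size mentioned in the post


def toy_embed(text, dim=DIM):
    """Hypothetical stand-in for a real embedding model: hashed bag of words."""
    vec = [0.0] * dim
    for word in re.findall(r"[a-z]+", text.lower()):
        bucket = int(hashlib.md5(word.encode("utf-8")).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    return vec


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def make_chunk_nodes(main, entity, sentences):
    """One child node per sentence, each with the same entity and its own 384-sized vector."""
    return [
        {"main": main, "entity": entity, "order": i, "text": s, "vector": toy_embed(s)}
        for i, s in enumerate(sentences)
    ]


def search(query, nodes, top_k=3):
    """Brute-force stand-in for the vector index: rank chunk nodes by cosine similarity."""
    q = toy_embed(query)
    return sorted(nodes, key=lambda n: cosine(q, n["vector"]), reverse=True)[:top_k]


def combine_by_entity(hits, nodes):
    """For each entity that got a hit, stitch all of its chunks back together in order."""
    combined = {}
    for hit in hits:
        entity = hit["entity"]
        if entity not in combined:
            parts = sorted((n for n in nodes if n["entity"] == entity), key=lambda n: n["order"])
            combined[entity] = " ".join(p["text"] for p in parts)
    return combined


SPACE_TIME = [
    "Space and time form a single continuum called spacetime.",
    "Mass and energy curve spacetime.",
    "Curved spacetime is what we experience as gravity.",
]
PHOTOELECTRIC = [
    "Light hitting a metal surface can eject electrons.",
    "The ejected electrons show that light arrives in quanta.",
]

nodes = make_chunk_nodes("Albert Einstein", "space time", SPACE_TIME)
nodes += make_chunk_nodes("Albert Einstein", "photoelectric effect", PHOTOELECTRIC)

hits = search("how do mass and energy curve spacetime", nodes, top_k=1)
combined = combine_by_entity(hits, nodes)
```

The `order` field is the "additional info" per chunk: a single matching sentence is enough to find the entity, and the order lets me put the full context back together afterwards.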
Sadly I can't think of any other way to do this, unless the ninjas come to my rescue. Appreciate all the patience and opinions.