Gosforth
(Gosforth)
January 5, 2024, 1:07pm
1
To use this new feature of Neo4j, do I need vectors that are generated outside of Neo4j? Neo4j cannot generate vectors itself, right?
Are the vectors generated as a set? If, for example, I have 20 records with product titles, do all the vectors have to be calculated in one session, or can I calculate each new vector independently when I add another title?
Is there any open-source software of good quality that I can use to generate vectors?
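On the second question: with a pretrained sentence-embedding model, each text is normally encoded on its own, so the vector for a newly added title can be computed later without recalculating the existing ones. Below is a minimal sketch of that incremental pattern. The names `embed_missing` and `toy_encode` are hypothetical, and the stand-in encoder exists only so the example runs without a model installed; with the sentence-transformers library, the `encode` method of a loaded `SentenceTransformer` model could presumably be passed in instead.

```python
from typing import Callable, Dict, Iterable, List, Sequence


def embed_missing(
    titles: Iterable[str],
    store: Dict[str, List[float]],
    encode: Callable[[str], Sequence[float]],
) -> List[str]:
    """Embed only the titles that have no vector yet; return the newly embedded titles."""
    added = []
    for title in titles:
        if title not in store:
            # Each title is encoded on its own; existing vectors are left untouched.
            store[title] = [float(x) for x in encode(title)]
            added.append(title)
    return added


def toy_encode(text: str) -> List[float]:
    """Stand-in encoder (NOT a real model): three crude numeric features of the text."""
    return [float(len(text)), float(text.count(" ")), float(sum(map(ord, text)) % 97)]


store: Dict[str, List[float]] = {}
embed_missing(["red shoes", "blue hat"], store, toy_encode)
first_vector = list(store["red shoes"])

# Adding a third title later does not change the vectors already stored.
newly_added = embed_missing(["red shoes", "blue hat", "green scarf"], store, toy_encode)
print(newly_added)
print(store["red shoes"] == first_vector)
```

The only requirement is that every vector comes from the same model, so that all of them live in the same vector space and have the same length.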
Regards
Gosforth
(Gosforth)
January 5, 2024, 4:25pm
2
OK, I found https://sbert.net to generate embeddings. I was surprised by how long the embedding is for just one phrase:
Sentence: This framework generates embeddings for each input sentence
Embedding: [-1.37173459e-02 -4.28515263e-02 -1.56286247e-02 1.40537536e-02
3.95537727e-02 1.21796325e-01 2.94333640e-02 -3.17523852e-02
3.54959480e-02 -7.93139860e-02 1.75878443e-02 -4.04369384e-02
4.97259498e-02 2.54912544e-02 -7.18700439e-02 8.14968571e-02
1.47069583e-03 4.79627103e-02 -4.50336039e-02 -9.92175341e-02
-2.81769466e-02 6.45046085e-02 4.44670431e-02 -4.76216935e-02
-3.52952629e-02 4.38671745e-02 -5.28565943e-02 4.33011563e-04
1.01921491e-01 1.64072290e-02 3.26996557e-02 -3.45986746e-02
1.21339597e-02 7.94871002e-02 4.58340580e-03 1.57778189e-02
-9.68208816e-03 2.87625697e-02 -5.05805984e-02 -1.55793950e-02
-2.87906844e-02 -9.62282252e-03 3.15556303e-02 2.27348972e-02
8.71449485e-02 -3.85027491e-02 -8.84718448e-02 -8.75498727e-03
-2.12343019e-02 2.08923742e-02 -9.02077705e-02 -5.25732450e-02
-1.05638448e-02 2.88311020e-02 -1.61455069e-02 6.17842888e-03
-1.23234615e-02 -1.07337013e-02 2.83353925e-02 -5.28567508e-02
-3.58618200e-02 -5.97989149e-02 -1.09055387e-02 2.91566588e-02
7.97979087e-02 -3.27868591e-04 6.83501037e-03 1.32718273e-02
-4.24619652e-02 1.87656768e-02 -9.89234298e-02 2.09049843e-02
-8.69605467e-02 -1.50152165e-02 -4.86202240e-02 8.04414675e-02
-3.67700052e-03 -6.65044412e-02 1.14556760e-01 -3.04228794e-02
2.96631809e-02 -2.80695297e-02 4.64990251e-02 -2.25513801e-02
8.54222998e-02 3.15446891e-02 7.34541416e-02 -2.21862067e-02
-5.29678762e-02 1.27130104e-02 -5.27339615e-02 -1.06188789e-01
7.04731569e-02 2.76736729e-02 -8.05531591e-02 2.39649378e-02
-2.65124813e-02 -2.17330642e-02 4.35275547e-02 4.84711789e-02
-2.37067193e-02 2.85767801e-02 1.11846134e-01 -6.34936020e-02
-1.58318263e-02 -2.26169582e-02 -1.31027587e-02 -1.62071991e-03
-3.60928923e-02 -9.78297070e-02 -4.67729047e-02 1.76271871e-02
-3.97492461e-02 -1.76453308e-04 3.39627527e-02 -2.09633447e-02
6.33663731e-03 -2.59411316e-02 8.10410231e-02 6.14393353e-02
-5.44594601e-03 6.48276061e-02 -1.16844095e-01 2.36860868e-02
-1.32058822e-02 -1.12476468e-01 1.90049298e-02 -1.74659474e-34
5.58949970e-02 1.94244906e-02 4.65438738e-02 5.18645607e-02
3.89390588e-02 3.40541080e-02 -4.32114378e-02 7.90637061e-02
-9.79530066e-02 -1.27441362e-02 -2.91870628e-02 1.02051981e-02
1.88115537e-02 1.08942538e-01 6.63465112e-02 -5.35295196e-02
-3.29228677e-02 4.69827242e-02 2.28882935e-02 2.74114758e-02
-2.91982945e-02 3.12706493e-02 -2.22850777e-02 -1.02282204e-01
-2.79117078e-02 1.13793556e-02 9.06308442e-02 -4.75414395e-02
-1.00718938e-01 -1.23232026e-02 -7.96928555e-02 -1.44636175e-02
-7.76400790e-02 -7.66918808e-03 9.73954704e-03 2.24204548e-02
7.77268112e-02 -3.17159016e-03 2.11538207e-02 -3.30393985e-02
9.55245178e-03 -3.73011641e-02 2.61360556e-02 -9.79088247e-03
-6.31505251e-02 5.77435084e-03 -3.80030796e-02 1.29684685e-02
-1.82498973e-02 -1.56283118e-02 -1.23362627e-03 5.55579476e-02
1.13055787e-04 -5.61256669e-02 7.40165189e-02 1.84452031e-02
-2.66368371e-02 1.31951738e-02 7.50086978e-02 -2.46796981e-02
-3.24006155e-02 -1.57675017e-02 -8.03516526e-03 -5.61318453e-03
1.05687790e-02 3.26163857e-03 -3.91989797e-02 -9.38677490e-02
1.14227176e-01 6.57304600e-02 -4.72633652e-02 1.45087661e-02
-3.54490578e-02 -3.37761492e-02 -5.15506119e-02 -3.81006091e-03
-5.15036434e-02 -5.93429543e-02 -1.69415120e-03 7.42107704e-02
-4.20091674e-02 -7.19974935e-02 3.17249782e-02 -1.66303441e-02
3.96985374e-03 -6.52750582e-02 2.77391095e-02 -7.51649216e-02
2.27456074e-02 -3.91368195e-02 1.54316127e-02 -5.54908291e-02
1.23318667e-02 -2.59520877e-02 6.66423365e-02 -6.91260152e-34
3.31628956e-02 8.47929120e-02 -6.65584058e-02 3.33541594e-02
4.71610529e-03 1.35361804e-02 -5.38694225e-02 9.20694247e-02
-2.96876654e-02 3.16219665e-02 -2.37497278e-02 1.98771022e-02
1.03446223e-01 -9.06947330e-02 6.30625384e-03 1.42886452e-02
1.19293490e-02 6.43727183e-03 4.20104265e-02 1.25344219e-02
3.93019803e-02 5.35691492e-02 -4.30750065e-02 6.10432588e-02
-5.39524917e-05 6.91682398e-02 1.05520058e-02 1.22111803e-02
-7.23184645e-02 2.50469837e-02 -5.18371165e-02 -4.36561853e-02
-6.71818703e-02 1.34828296e-02 -7.25888535e-02 7.04166992e-03
6.58939630e-02 1.08994804e-02 -2.60010571e-03 5.49969189e-02
5.06966710e-02 3.27948183e-02 -6.68832958e-02 6.45557567e-02
-2.52076164e-02 -2.92572007e-02 -1.16696730e-01 3.24064493e-02
5.85858449e-02 -3.51756439e-02 -7.15240166e-02 2.24935859e-02
-1.00786723e-01 -4.74545024e-02 -7.61962906e-02 -5.87166362e-02
4.21138331e-02 -7.47213587e-02 1.98468063e-02 -3.36506357e-03
-5.29736467e-02 2.74729617e-02 3.45737040e-02 -6.11846782e-02
1.06364839e-01 -9.64119881e-02 -4.55944985e-02 1.51489694e-02
-5.13528613e-03 -6.64447322e-02 4.31721583e-02 -1.10405525e-02
-9.80246533e-03 7.53783211e-02 -1.49570815e-02 -4.80208471e-02
5.80726489e-02 -2.43896637e-02 -2.23137774e-02 -4.36992347e-02
5.12054078e-02 -3.28625850e-02 1.08763322e-01 6.08926788e-02
3.30792717e-03 5.53820059e-02 8.43201131e-02 1.27087291e-02
3.84465344e-02 6.52325973e-02 -2.94683687e-02 5.08005284e-02
-2.09348109e-02 1.46135688e-01 2.25561447e-02 -1.77227761e-08
-5.02672568e-02 -2.79259344e-04 -1.00328594e-01 2.42811255e-02
-7.54043609e-02 -3.79139818e-02 3.96050178e-02 3.10079809e-02
-9.05700121e-03 -6.50411770e-02 4.05452810e-02 4.83390130e-02
-4.56962399e-02 4.76004463e-03 2.64364388e-03 9.35613960e-02
-4.02599610e-02 3.27402353e-02 1.18298214e-02 5.54344654e-02
1.48052216e-01 7.21189082e-02 2.76941719e-04 1.68651324e-02
8.34884215e-03 -8.76157451e-03 -1.33649698e-02 6.14237338e-02
1.57167800e-02 6.94960579e-02 1.08621670e-02 6.08018115e-02
-5.33421412e-02 -3.47924270e-02 -3.36272120e-02 6.93906620e-02
1.22987805e-02 -1.45237371e-01 -2.06971867e-03 -4.61132303e-02
3.72748147e-03 -5.59358019e-03 -1.00659840e-01 -4.45953384e-02
5.40921427e-02 4.98893578e-03 1.49534550e-02 -8.26059580e-02
6.26630932e-02 -5.01908455e-03 -4.81857732e-02 -3.53991352e-02
9.03388113e-03 -2.42337845e-02 5.66267148e-02 2.51528900e-02
-1.70709025e-02 -1.24780005e-02 3.19518745e-02 1.38420770e-02
-1.55814895e-02 1.00178257e-01 1.23657234e-01 -4.22967039e-02]
What should the db schema look like, for example in this tutorial?
Something like this?
//title node
(n:Title {text:'some text'})
//paper node
(Paper1:Paper {name:'Paper1'})
//abstract
(Abstract1:Abstract {embedding: ' [-1.37173459e ... 4.22967039e-02]'})
What relationships should there be between them (the names and properties do not matter)?
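One possible shape, offered as a sketch rather than the tutorial's actual schema: a `Paper` node pointing to both its `Title` and its `Abstract`, with the embedding stored on the abstract as a list of numbers rather than as a quoted string. The relationship types `HAS_TITLE` and `HAS_ABSTRACT` and the helper names below are hypothetical. The Python code only parses the printed vector and builds the parameter map; sending the statement to the database is left to whichever driver is in use.

```python
from typing import Any, Dict, List

# Hypothetical relationship types; only the direction (Paper -> Title, Paper -> Abstract) matters here.
CREATE_PAPER = """
MERGE (p:Paper {name: $name})
MERGE (p)-[:HAS_TITLE]->(t:Title {text: $title})
MERGE (p)-[:HAS_ABSTRACT]->(a:Abstract)
SET a.embedding = $embedding
"""


def parse_printed_vector(text: str) -> List[float]:
    """Turn text like '[-1.37e-02 4.28e-02]' (NumPy print style) into a list of floats."""
    cleaned = text.replace("[", " ").replace("]", " ").replace(",", " ")
    return [float(token) for token in cleaned.split()]


def paper_params(name: str, title: str, embedding: List[float]) -> Dict[str, Any]:
    """Build the parameter map for CREATE_PAPER; the embedding stays a list of numbers."""
    return {"name": name, "title": title, "embedding": [float(x) for x in embedding]}


# Shortened sample: the first four values of the embedding printed above.
vector = parse_printed_vector("[-1.37173459e-02 -4.28515263e-02\n -1.56286247e-02 1.40537536e-02]")
params = paper_params("Paper1", "some text", vector)
print(len(params["embedding"]))
```

Passing the embedding as a query parameter keeps it a list of floats all the way into the property, which is what a vector index on `embedding` would need.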
Gosforth
(Gosforth)
January 6, 2024, 4:28pm
3
Some progress, but I got an error:
Failed to invoke procedure db.index.vector.queryNodes: Caused by: java.lang.IllegalArgumentException: Index query vector has 196 dimensions, but indexed vectors have 384.
I created the index:
CREATE VECTOR INDEX `abstract-embeddings`
FOR (n: Abstract) ON (n.embedding)
OPTIONS {indexConfig: {
`vector.dimensions`: 384,
`vector.similarity_function`: 'cosine'
}}
My embeddings were created with SBERT, using the all-MiniLM-L6-v2 model (384-dimensional dense vector space).
But when I run this query:
MATCH (title:Title)<--(:Paper)-->(abstract:Abstract)
WHERE toLower(title.text) = 'efficient and robust approximate nearest neighbor search using
hierarchical navigable small world graphs'
CALL db.index.vector.queryNodes('abstract-embeddings', 10, abstract.embedding)
YIELD node AS similarAbstract, score
MATCH (similarAbstract)<--(:Paper)-->(similarTitle:Title)
RETURN similarTitle.text AS title, score
I get the error above. What is the 'Index query vector'? Where can I set the 'dimensions' parameter for it?
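As far as the procedure call goes, the query vector is the third argument of `db.index.vector.queryNodes`, which in the query above is `abstract.embedding`. Its "dimensions" are simply the number of entries in that list, so the message suggests that the property stored on the matched node has 196 entries rather than 384. There is no separate parameter to set. A minimal sketch of a length check that could be run before writing or querying follows; `dimension_problems` and the sample data are hypothetical.

```python
from typing import Dict, List, Sequence

INDEX_DIMENSIONS = 384  # value of `vector.dimensions` in the index definition


def dimension_problems(
    vectors: Dict[str, Sequence[float]], expected: int = INDEX_DIMENSIONS
) -> List[str]:
    """Return one message per vector whose length differs from the index dimension."""
    problems = []
    for key, vector in vectors.items():
        if len(vector) != expected:
            problems.append(f"{key}: {len(vector)} dimensions, expected {expected}")
    return problems


# Made-up vectors for illustration: one of the right length and one that is too short.
sample = {"abstract-ok": [0.0] * 384, "abstract-short": [0.0] * 196}
for message in dimension_problems(sample):
    print(message)
```

On the database side, `size(abstract.embedding)` returns the length of a stored list, which makes it possible to find the nodes whose embedding does not have 384 entries.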
Gosforth
(Gosforth)
January 9, 2024, 10:40am
4
Dear Neo4j staff, will you help please?