FastRP Embedding does not change when I change the value of the property

Hello everyone,

I'm trying to embed a small test graph using the FastRP algorithm. I noticed that something was wrong with the algorithm's output, so I ran an experiment. Below are the queries I ran, step by step, and their outputs.

+++++++++++++++++++++++++++
match (n) return n
+++++++++++++++++++++++++++ Here I got a .csv export and printed the graph.
ID:0{"name":3,"age":18}
ID:1{"name":4,"age":18}
ID:2{"name":2,"age":45653434}
ID:3{"name":5,"price":3,"isbn":1234}
ID:4{"name":6,"price":25,"isbn":4242}
ID:5{"name":1592,"age":15}
ID:6{"name":112,"age":552}
ID:7{"name":992,"age":98}
ID:8{"name":265,"price":4225,"isbn":92222}
ID:9{"name":32,"price":45,"isbn":76123}
+++++++++++++++++++++++++++ Created Graph
CALL gds.graph.create(
  'Graph',
  {
    Book: {
      label: 'Book',
      properties: {
        name: {defaultValue: 0},
        age: {defaultValue: 0},
        isbn: {defaultValue: 0},
        price: {defaultValue: 0}
      }
    },
    Person: {
      label: 'Person',
      properties: {
        name: {defaultValue: 0},
        age: {defaultValue: 0},
        isbn: {defaultValue: 0},
        price: {defaultValue: 0}
      }
    }
  },
  {
    READ: {type: 'READ', orientation: 'UNDIRECTED'},
    KNOWS: {type: 'KNOWS', orientation: 'UNDIRECTED'}
  }
)
+++++++++++++++++++++++++++ Getting embeddings
CALL gds.fastRP.stream('Graph',
{
embeddingDimension: 8,
randomSeed: 42,
propertyRatio:0.5,
featureProperties: ['name', 'isbn', 'price', 'age']
}
)
YIELD nodeId, embedding
+++++++++++++++++++++++++++ Look at nodes 2 and 5. They have the same embedding even though they have different property values. I'm also curious about that, but it is not the main question.
0 [0.00454491563141346, -0.011780989356338978, -0.004483168013393879, -0.011822143569588661, 1.3656461238861084, -1.4588100910186768, 0.0, -0.07964159548282623]
1 [0.005971458740532398, -0.007555503398180008, -0.00596466101706028, -0.007839835248887539, 1.221935749053955, -1.548964500427246, 0.0, -0.2981380820274353]
2 [0.0017735038418322802, -0.00546606257557869, -0.0016580449882894754, -0.005467670038342476, 1.4013422727584839, -1.426560401916504, 0.0, -0.01895918697118759]
3 [0.004278191830962896, -0.00784966628998518, -0.004255092237144709, -0.00795010942965746, 1.3368470668792725, -1.474251627922058, 0.0, -0.11198979616165161]
4 [5.154036308852028e-9, -0.00011119296686956659, 0.0001670514466241002, -0.00011122392606921494, 1.4135295152664185, -1.4148913621902466, 0.0, 0.0028145061805844307]
5 [0.0017735038418322802, -0.00546606257557869, -0.0016580449882894754, -0.005467670038342476, 1.4013422727584839, -1.426560401916504, 0.0, -0.01895918697118759]
6 [0.0, -0.00011119296686956659, 0.00016706176393199712, -0.00011122394062113017, 1.413529396057129, -1.4148914813995361, 0.0, 0.0028138780035078526]
7 [0.0028819642029702663, -0.0024262291844934225, -0.0028979266062378883, -0.0029076484497636557, 1.08656644821167, -1.5791696310043335, 0.0, -0.45227059721946716]
8 [0.0026842784136533737, -0.002415937837213278, -0.002701698336750269, -0.0027089177165180445, 1.2149640321731567, -1.5205340385437012, 0.0, -0.2579750716686249]
9 [0.0035455492325127125, -0.010816436260938644, -0.0034817878622561693, -0.010819618590176105, 1.3885842561721802, -1.4376375675201416, 0.0, -0.04071667790412903]
+++++++++++++++++++++++++++ To bring nodes 2 and 5 closer, I set their name property to the same value. (I know they already have the same embedding, but the purpose of the experiment is to see the effect of the property on the embeddings.)
MATCH (n)
WHERE ID(n)=2
SET n.name = 1592
+++++++++++++++++++++++++++ Dropping graph
call gds.graph.drop('Graph')
+++++++++++++++++++++++++++
match (n) return n
+++++++++++++++++++++++++++ Making sure the change was applied
ID:0{"name":3,"age":18}
ID:1{"name":4,"age":18}
ID:2{"name":1592,"age":45653434}
ID:3{"name":5,"price":3,"isbn":1234}
ID:4{"name":6,"price":25,"isbn":4242}
ID:5{"name":1592,"age":15}
ID:6{"name":112,"age":552}
ID:7{"name":992,"age":98}
ID:8{"name":265,"price":4225,"isbn":92222}
ID:9{"name":32,"price":45,"isbn":76123}
+++++++++++++++++++++++++++ Creating again
CALL gds.graph.create(
  'Graph',
  {
    Book: {
      label: 'Book',
      properties: {
        name: {defaultValue: 0},
        age: {defaultValue: 0},
        isbn: {defaultValue: 0},
        price: {defaultValue: 0}
      }
    },
    Person: {
      label: 'Person',
      properties: {
        name: {defaultValue: 0},
        age: {defaultValue: 0},
        isbn: {defaultValue: 0},
        price: {defaultValue: 0}
      }
    }
  },
  {
    READ: {type: 'READ', orientation: 'UNDIRECTED'},
    KNOWS: {type: 'KNOWS', orientation: 'UNDIRECTED'}
  }
)
+++++++++++++++++++++++++++ Getting Embeddings again
CALL gds.fastRP.stream('Graph',
{
embeddingDimension: 8,
randomSeed: 42,
propertyRatio:0.5,
featureProperties: ['name', 'isbn', 'price', 'age']
}
)
YIELD nodeId, embedding
+++++++++++++++++++++++++++ They are still the same and there is no change?
0 [0.004544912371784449, -0.011780986562371254, -0.004483164753764868, -0.011822140775620937, 1.365635871887207, -1.4588186740875244, 0.0, -0.07966011762619019]
1 [0.005971447564661503, -0.007555491290986538, -0.0059646498411893845, -0.007839823141694069, 1.2219293117523193, -1.5489674806594849, 0.0, -0.29814738035202026]
2 [0.0017735038418322802, -0.00546606257557869, -0.0016580449882894754, -0.005467670038342476, 1.4013299942016602, -1.4265727996826172, 0.0, -0.018983790650963783]
3 [0.004278188105672598, -0.00784966442734003, -0.004255089443176985, -0.00795010756701231, 1.3368388414382935, -1.4742591381072998, 0.0, -0.11200525611639023]
4 [5.153945714653219e-9, -0.00011119296686956659, 0.0001670514466241002, -0.00011122392606921494, 1.4135171175003052, -1.4149036407470703, 0.0, 0.0027898948173969984]
5 [0.0017735038418322802, -0.00546606257557869, -0.0016580449882894754, -0.005467670038342476, 1.4013299942016602, -1.4265727996826172, 0.0, -0.018983790650963783]
6 [0.0, -0.00011119296686956659, 0.00016706176393199712, -0.00011122394062113017, 1.4135169982910156, -1.4149038791656494, 0.0, 0.0027892529033124447]
7 [0.002881960477679968, -0.002426225459203124, -0.00289792288094759, -0.002907644724473357, 1.086564540863037, -1.5791709423065186, 0.0, -0.45227378606796265]
8 [0.0026842716615647078, -0.002415931783616543, -0.0027016920503228903, -0.0027089109644293785, 1.2149615287780762, -1.520534634590149, 0.0, -0.25797808170318604]
9 [0.0035455492325127125, -0.010816436260938644, -0.0034817878622561693, -0.010819618590176105, 1.3885719776153564, -1.4376498460769653, 0.0, -0.04074126109480858]

In fact, the purpose of this experiment was to check whether the distance between the embeddings of two different nodes decreases as their property values get closer to each other. But even in the first run, I saw that nodes with different values (2 and 5) have the same embedding, and I already have a post about that.
Link: FastRP - Different value but same embedding
How come a node's embedding does not change when its property is changed?
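
For reference, the distance this experiment is about can be measured directly in Cypher rather than by comparing the vectors by eye. The query below is a minimal sketch, assuming the 'Graph' projection from the steps above still exists and that nodes 2 and 5 are the pair of interest. It uses only built-in Cypher functions for the Euclidean distance, so nothing here depends on a GDS similarity function.

```cypher
// Stream the embeddings and keep only the two nodes being compared.
CALL gds.fastRP.stream('Graph', {
  embeddingDimension: 8,
  randomSeed: 42,
  propertyRatio: 0.5,
  featureProperties: ['name', 'isbn', 'price', 'age']
})
YIELD nodeId, embedding
WHERE nodeId IN [2, 5]
WITH nodeId, embedding ORDER BY nodeId
WITH collect(embedding) AS e
WITH e[0] AS a, e[1] AS b
// Euclidean distance: square root of the summed squared differences.
RETURN sqrt(reduce(s = 0.0, i IN range(0, size(a) - 1) | s + (a[i] - b[i]) ^ 2)) AS euclideanDistance
```

Running this before and after the property change gives a single number to compare.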

Also, I would appreciate it if you could help me with the question in my other post, about nodes with different values but the same embedding.

Neo4j Version : neo4j-community-4.3.3
GDS Version : neo4j-graph-data-science-1.8.2.jar

1 REPLY

I see the properties you've encoded in your graph, but not the connectivity of your data.

One thing to remember is that, by default, the embedding of a node is based on the properties of its neighboring nodes, not on the properties of the node itself. So as long as the properties of the nodes connected to your node of interest remain the same, that node's embedding will remain the same.
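
Since the connectivity is what matters here, a quick way to check it is to list the neighbors of the two nodes in question. This is only a sketch, assuming nodes 2 and 5 (by internal ID) are the ones being compared, as in the original post. The column aliases are arbitrary.

```cypher
// List each neighbor of nodes 2 and 5 together with the properties
// that FastRP is configured to use as features.
MATCH (n)-[r]-(m)
WHERE id(n) IN [2, 5]
RETURN id(n) AS nodeId,
       type(r) AS relType,
       id(m) AS neighborId,
       m.name AS name, m.age AS age, m.isbn AS isbn, m.price AS price
ORDER BY nodeId, neighborId
```

If both nodes turn out to have the same set of neighbors, that would be consistent with them receiving identical embeddings.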

One easy fix is to use the nodeSelfInfluence setting, so that a node's own properties are incorporated into its embedding (more details here: FastRP - self influence docs).
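
As a rough sketch, this is the query from the question with nodeSelfInfluence added. The value 1.0 is just an illustrative choice, not a recommendation, and this assumes your GDS version accepts the parameter on gds.fastRP.stream.

```cypher
CALL gds.fastRP.stream('Graph', {
  embeddingDimension: 8,
  randomSeed: 42,
  propertyRatio: 0.5,
  featureProperties: ['name', 'isbn', 'price', 'age'],
  // Assumed value: gives each node's own initial vector some weight
  // in its final embedding.
  nodeSelfInfluence: 1.0
})
YIELD nodeId, embedding
RETURN nodeId, embedding
ORDER BY nodeId
```

As you already did in your experiment, drop and re-create the projection after changing a property in the database, since the in-memory graph does not pick up later changes.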

The iterationWeights parameter controls the depth (how many hops out from each node) of the embedding - it looks like you're using the default setting, but you can tweak that as well.
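
For example, something along these lines sets the weights explicitly. The three values are assumptions chosen only to show the shape of the parameter: each entry in the list weights one propagation iteration, so a longer list reaches further out from each node.

```cypher
CALL gds.fastRP.stream('Graph', {
  embeddingDimension: 8,
  randomSeed: 42,
  propertyRatio: 0.5,
  featureProperties: ['name', 'isbn', 'price', 'age'],
  // Illustrative weights, one per iteration; adjust to taste.
  iterationWeights: [1.0, 1.0, 1.0]
})
YIELD nodeId, embedding
RETURN nodeId, embedding
ORDER BY nodeId
```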