PageRank returns no difference in Recipe graph

gigauser
Graph Buddy

I tried PageRank while following this guide:
https://guides.neo4j.com/4.0-intro-graph-algos-exercises/PracticalApplication.html

I listed the PageRank scores for the largest community of Ingredient nodes, but every node got the same score of about 0.15:
CALL gds.pageRank.stream({
nodeQuery:'MATCH (i:Ingredient) WHERE i.ingredient_community = 377
RETURN id(i) as id',
relationshipQuery:'MATCH (s:Ingredient)-[r:COMMONLY_USED_TOGETHER]->(t:Ingredient)
WHERE s.ingredient_community = 377 AND t.ingredient_community = 377
RETURN id(s) as source, id(t) as target,r.score as weight',
relationshipWeightProperty:'weight'})
YIELD nodeId,score
return nodeId, score
ORDER BY score DESC

How can I get more useful PageRank scores from this graph?

1 ACCEPTED SOLUTION

I have checked the dataset and found the error:

It seems that the query in part 3 should be fixed to:

CALL gds.graph.writeRelationship('ingredient', 'COMMONLY_USED_TOGETHER', 'score')

In the current setup, all relationship weights are null, because we don't export relationship properties from mutated relationships. It seems that when relationship weights are null, the PageRank algorithm treats the relationships as non-existent. I will report this to the GDS team. With the current setup, you can also leave out the relationship weights, and the query will work:

CALL gds.pageRank.stream({
nodeQuery:'MATCH (i:Ingredient) WHERE i.ingredient_community = 377
RETURN id(i) as id',
relationshipQuery:'MATCH (s:Ingredient)-[r:COMMONLY_USED_TOGETHER]->(t:Ingredient)
WHERE s.ingredient_community = 377 AND t.ingredient_community = 377
RETURN id(s) as source, id(t) as target,r.score as weight'})
YIELD nodeId,score
return nodeId, score
ORDER BY score DESC

Thanks for letting us know!

7 REPLIES

Hey gigauser,

When you get the same score of 0.15 for all nodes, that means that no relationships have been projected. The PageRank value of 0.15 is the default value for nodes with no incoming relationships.

Do you get any results when you run the following query?

MATCH p=(s:Ingredient)-[r:COMMONLY_USED_TOGETHER]->(t:Ingredient)
RETURN p LIMIT 10

If not, you should repeat Part 1, Part 3, and Part 5 of the guide.
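To see where the 0.15 comes from, here is a rough sketch in plain Python. It is not GDS code; it is a simplified non-normalized PageRank, assuming a damping factor of 0.85 (the GDS default). The function name, node names, and edges are made up for illustration.

```python
def pagerank(nodes, edges, damping=0.85, iterations=20):
    """Simplified PageRank: score(v) = (1 - d) + d * sum(score(u) / out_degree(u))."""
    # Count outgoing relationships per node.
    out_degree = {n: 0 for n in nodes}
    for source, _target in edges:
        out_degree[source] += 1

    # Every node starts at the baseline (1 - d).
    scores = {n: 1.0 - damping for n in nodes}
    for _ in range(iterations):
        incoming = {n: 0.0 for n in nodes}
        for source, target in edges:
            incoming[target] += scores[source] / out_degree[source]
        scores = {n: (1.0 - damping) + damping * incoming[n] for n in nodes}
    return scores


# Hypothetical ingredient names, for illustration only.
nodes = ["flour", "sugar", "butter"]

# No relationships projected: every node stays at the baseline 1 - 0.85.
no_edges = pagerank(nodes, [])

# With relationships, scores start to differ.
with_edges = pagerank(
    nodes, [("flour", "sugar"), ("butter", "sugar"), ("sugar", "flour")]
)

print(no_edges)
print(with_edges)
```

With an empty relationship list, every node keeps the baseline score, which matches the symptom in the question. Once relationships exist, only nodes without incoming relationships (here "butter") stay at the baseline.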

Hi Bratanic,

Thank you for the help. Yes, I do get the relationships. I wonder whether it is correct to have the relationships in both directions.

You can test this by loading the dataset. On the surface the guide looks fine, and I get all the results it describes. But when I looked inside the results by listing the PageRank scores, I found that something is wrong, either in my graph or in the guide.
CREATE CONSTRAINT ON (r:Recipe) ASSERT r.name IS UNIQUE;
CREATE CONSTRAINT ON (i:Ingredient) ASSERT i.name IS UNIQUE;
USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM "https://github.com/neo4j-apps/neuler/raw/master/sample-data/recipes/recipes.csv" AS row
MERGE (r:Recipe{name:row.recipe})
WITH r,row.ingredients as ingredients
UNWIND split(ingredients,',') as ingredient
MERGE (i:Ingredient{name:ingredient})
MERGE (r)-[:CONTAINS_INGREDIENT]->(i)

How nice of you! Now I get different PageRank scores from the graph.

Thank you for helping me teach my customers better with this guide.

Dongho.

Hi @bratanic.tomaz , did you get a chance to have it fixed? Right now I am running a fraud detection graph without weights on the relationships, and the PageRank scores are still all 0.15. I checked and the connections between the nodes are there, so I wonder if it has something to do with the bug you mentioned.

I believe it's the same issue: if relationship weights are missing, the relationship is dismissed. However, you can always set a default relationship weight property using either a native or a Cypher projection.
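To illustrate the idea of a default weight, here is a toy model in plain Python. It is not the GDS implementation; it only mimics the behavior described in this thread, where a relationship with a null weight is dismissed unless a default weight is supplied. The function name, the `default_weight` parameter, and the sample graph are assumptions made for illustration.

```python
def weighted_pagerank(nodes, edges, default_weight=None, damping=0.85, iterations=20):
    """Toy weighted PageRank that drops relationships without a usable weight."""
    usable = []
    for source, target, weight in edges:
        if weight is None:
            weight = default_weight  # fall back to the default, if one was given
        if weight is not None and weight > 0:
            usable.append((source, target, weight))

    # Total outgoing weight per node, used to split a node's score.
    total_out = {n: 0.0 for n in nodes}
    for source, _target, weight in usable:
        total_out[source] += weight

    scores = {n: 1.0 - damping for n in nodes}
    for _ in range(iterations):
        incoming = {n: 0.0 for n in nodes}
        for source, target, weight in usable:
            incoming[target] += scores[source] * weight / total_out[source]
        scores = {n: (1.0 - damping) + damping * incoming[n] for n in nodes}
    return scores


# Hypothetical graph in which every relationship weight is missing (None).
nodes = ["a", "b", "c"]
edges = [("a", "b", None), ("c", "b", None), ("b", "a", None)]

dismissed = weighted_pagerank(nodes, edges)                      # all weights null
defaulted = weighted_pagerank(nodes, edges, default_weight=1.0)  # default applied

print(dismissed)
print(defaulted)
```

In this sketch, the run without a default leaves every node at the 0.15 baseline, while the run with a default weight of 1.0 produces distinct scores.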
