Page Rank Problem


(Hfero95) #1

Hello, I am trying to run the PageRank algorithm on a set of data. It is triplified content in RDF format. The data consists of a group of datasets registered with their metadata, including keywords that describe their topics. The idea is to find the most influential datasets in the graph based on the keywords.
I first imported the content with the neosemantics plugin so I could work with the triples inside Neo4j. However, the algorithm turns out to be ineffective, as it gives the same score to all the datasets. I even tried removing all the other data and leaving only the keywords and datasets, with no other metadata, but that did not work either. I would like to know whether this is a known problem or a common mistake I am making. Thanks in advance to everyone trying to help.


(Michael Hunger) #2

Can you share more details?

For a useful result, you'll have to project your graph to a mono-partite variant of dataset--dataset

e.g. via

CALL algo.pageRank.stream('MATCH (d:Dataset) RETURN id(d) AS id',
'MATCH (d1:Dataset)<-[:KEYWORD]-(kw)-[:KEYWORD]->(d2:Dataset)
 RETURN id(d1) AS source, id(d2) AS target, count(kw) AS weight',
{graph:"cypher"})
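
To make the idea of the projection concrete, here is a minimal sketch in plain Python, outside Neo4j. It assumes a hypothetical toy mapping of keywords to datasets (the names `keyword_to_datasets`, `project` and `pagerank` are made up for illustration), builds dataset--dataset edges weighted by the number of shared keywords, and runs a textbook weighted PageRank on them. It is not the library's implementation, so absolute scores will differ from what `algo.pageRank` returns; only the ranking idea carries over.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy data: keyword -> datasets tagged with that keyword.
keyword_to_datasets = {
    "climate": ["A", "B", "C"],
    "energy": ["A", "B"],
    "health": ["C", "D"],
}


def project(keyword_to_datasets):
    """Build dataset--dataset edges; weight = number of shared keywords."""
    weights = defaultdict(int)
    for datasets in keyword_to_datasets.values():
        for d1, d2 in combinations(sorted(set(datasets)), 2):
            # Store both directions so the projected graph is symmetric.
            weights[(d1, d2)] += 1
            weights[(d2, d1)] += 1
    return dict(weights)


def pagerank(weights, damping=0.85, iterations=50):
    """Textbook weighted PageRank by power iteration (scores sum to 1)."""
    nodes = sorted({node for edge in weights for node in edge})
    out_total = defaultdict(float)
    for (src, _dst), w in weights.items():
        out_total[src] += w
    scores = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iterations):
        nxt = {node: (1.0 - damping) / len(nodes) for node in nodes}
        for (src, dst), w in weights.items():
            # Each node passes its score along its edges, split by weight.
            nxt[dst] += damping * scores[src] * w / out_total[src]
        scores = nxt
    return scores


weights = project(keyword_to_datasets)
scores = pagerank(weights)
for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(name, round(score, 3))
```

In this toy graph, datasets that share more keywords with well-connected datasets end up with different scores, which is the differentiation the question is asking for.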

see the graph algorithms docs for details:

https://neo4j.com/docs/graph-algorithms/current/