I am trying to compute eigenvector centrality on a very large graph. However, since the graph is too large to fit in memory for the required projection (see this discussion for context), I am down-sampling it for the projection before running the algorithm.
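For context, the memory requirement of the full (non-sampled) projection can be estimated up front with something like the following; this is just a sketch, assuming GDS 2.x and a plain native projection over the same Person/Knows graph:
CALL gds.graph.project.estimate(
  'Person',
  'Knows'
) YIELD requiredMemory, nodeCount, relationshipCount;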
The following are the queries I'm using. The eigenvector query returns an identical value (2.761963394277773E-4) for all nodes, so I'm not sure whether there is something wrong with the queries or whether this is expected, given how small my down-sampling rate is.
- Creating the projection:
CALL gds.graph.project.cypher(
'downsampledGraph',
'MATCH (n:Person) WHERE rand() < 0.01 RETURN id(n) AS id',
'MATCH (n:Person)-[r:Knows]->(m:Person) WHERE rand() < 0.01 RETURN id(n) AS source, id(m) AS target'
) YIELD graphName;
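To see how small the sampled projection actually ends up, its node and relationship counts can be checked with something like this (a sketch, assuming GDS 2.x):
CALL gds.graph.list('downsampledGraph')
YIELD graphName, nodeCount, relationshipCount;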
- Running the algorithm:
CALL apoc.export.csv.query(
"CALL gds.eigenvector.stream('downsampledGraph', {maxIterations:20, tolerance:0.0001})
YIELD nodeId, score
RETURN id(gds.util.asNode(nodeId)) AS personId, score
ORDER BY score DESC",
"file:///eigenvector_results.csv",
{}
) YIELD file
RETURN file;
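As a quick sanity check on convergence, the same algorithm can also be run in stats mode (a sketch, assuming GDS 2.x), which reports whether it converged and how the scores are distributed:
CALL gds.eigenvector.stats('downsampledGraph', {maxIterations: 20, tolerance: 0.0001})
YIELD ranIterations, didConverge, centralityDistribution;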
Here is the head of the eigenvector_results.csv file:
"personId","score"
"169000","2.761963394277773E-4"
"169018","2.761963394277773E-4"
"170962","2.761963394277773E-4"
"172489","2.761963394277773E-4"
"172502","2.761963394277773E-4"
"173127","2.761963394277773E-4"
"173142","2.761963394277773E-4"
"173171","2.761963394277773E-4"
"176204","2.761963394277773E-4"