Hi all! We have a broad network, and when traversing the tree we are trying to use all cores on our machine by parallelising the traversal to speed up results. We have tried apoc.cypher.mapParallel with little success, and while we can get apoc.periodic.iterate to run across all cores, its performance is no better than single-threaded execution.
I would love to know whether this is simply due to query structure (e.g. whether the apoc.path calls belong in the outer statement or the inner one) or whether there is something else I am missing. The first of the two snippets below runs in parallel with little speed improvement; the second does not even produce any batch updates.
CALL apoc.periodic.iterate("
MATCH (p:Vertex:Root)
WITH collect(p) as endNodes
MATCH (n:Vertex) WHERE NOT EXISTS(n.depth)
CALL apoc.path.expandConfig(n, {relationshipFilter:'<CONNECTS',
limit:1, terminatorNodes:endNodes})
YIELD path
RETURN n, length(path) as depth
","
SET n.depth = depth
", {batchSize:1000, parallel:true, iterateList:true, concurrency:70})
CALL apoc.periodic.iterate("
MATCH (p:Vertex:Loc)
WITH collect(p) as endNodes
MATCH (n:Vertex) WHERE NOT EXISTS(n.locs)
CALL apoc.path.subgraphNodes(n, {endNodes:endNodes, relationshipFilter:'CONNECTS>', maxLevel:80}) YIELD node
RETURN n, count(node.id) as locs, count(distinct node.dslam) as dslams
","
SET n.locs = locs, n.dslams = dslams
", {batchSize:1000, parallel:true, iterateList:true, concurrency:70})
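To make the structure question concrete, here is an untested sketch of the first query with the path expansion moved into the inner statement, so that the outer statement only streams the nodes that still need a depth. It rests on two assumptions I have not confirmed: that apoc.periodic.iterate runs the outer statement on a single thread and parallelises only the inner one, and that the termination label filter '/Root' can stand in for the collected terminatorNodes list.

```cypher
CALL apoc.periodic.iterate("
// Outer statement: only select the nodes to process (assumed to run single-threaded)
MATCH (n:Vertex) WHERE NOT EXISTS(n.depth)
RETURN n
","
// Inner statement: the expensive expansion now runs once per node inside each batch
// Assumption: labelFilter '/Root' terminates paths at :Root nodes, replacing terminatorNodes
CALL apoc.path.expandConfig(n, {relationshipFilter:'<CONNECTS',
labelFilter:'/Root', limit:1})
YIELD path
SET n.depth = length(path)
", {batchSize:1000, parallel:true, iterateList:true, concurrency:70})
```

Each inner call only writes to its own n, so I would not expect the batches to contend for locks, but I have not measured whether this variant is actually faster.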
The nodes are indexed on their unique id. Hope someone more adept at APOC and parallel Cypher will be able to guide me in the right direction!
Sam