Is it possible to project nodes/relationships to an in-memory graph in batches?


I'm trying to reduce a multigraph to a single graph in order to run a community detection algorithm on it. See my discussion here for more context. I solved the crashing/out of memory issues by using apoc.periodic.iterate to create the reduced relationships in batches. However, I don't want these new relationships to be written to the database; I want them to be projected to an in-memory graph to be used for label propagation. My question is: is it possible to project nodes & relationships in batches using apoc.periodic.iterate? I've been playing around trying to use apoc.periodic.iterate within a cypher projection query, something like:

CALL gds.graph.create.cypher(
'MATCH (n:WebContent {networkId: 0}) RETURN id(n) as id',
'CALL apoc.periodic.iterate("MATCH (n1:WebContent {networkId: 0}) RETURN n1", 
"MATCH (n1)-[:Includes|Has_signature]->(m)<-[:Includes|Has_signature]-(n2) 
 WHERE id(n1) < id(n2) WITH n1, n2, count(m) AS weight 
 CREATE (n1)-[r:LINKED_TO]->(n2) SET r.weight=weight", 
 {batchSize:1000}) YIELD batches, total, errorMessages 
RETURN id(n1) AS source, id(n2) AS target, r.weight AS weight')

but no success yet (I'm not sure it's actually possible to access n1, n2, and r from the batches like I'm trying to do here). I'm just trying to get a sense of whether what I'm trying to do is possible at all... perhaps apoc.periodic.iterate is not the answer, but another method is?


You cannot use periodic iterate for reading data; it's only for writing/updating data. For your use case, the Cypher projection can do the aggregation itself:

CALL gds.graph.create.cypher(
'myGraph', // placeholder graph name; the procedure expects one as its first argument
'MATCH (n:WebContent {networkId: 0}) RETURN id(n) AS id',
'MATCH (n1:WebContent {networkId: 0})-[:Includes|Has_signature]->(m)<-[:Includes|Has_signature]-(n2) 
 WHERE id(n1) < id(n2) 
 RETURN id(n1) AS source, id(n2) AS target, count(m) AS weight')
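
Once the projection exists, label propagation can read the aggregated weight straight from the in-memory graph, so nothing has to be written back to the database. A minimal sketch, assuming the graph was registered under the hypothetical name 'myGraph' and that the relationship query exposes the count as weight, as above:

```cypher
// 'myGraph' is a placeholder: use the name the projection was registered under.
CALL gds.labelPropagation.stream('myGraph', {
    relationshipWeightProperty: 'weight'  // the count(m) column from the projection
})
YIELD nodeId, communityId
// Summarise community sizes instead of returning one row per node.
RETURN communityId, count(*) AS members
ORDER BY members DESC;
```

Streaming keeps the run read-only; the write and mutate modes of the same algorithm would store the community ids instead.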

If you assign a dedicated label to your nodes with networkId: 0 (e.g. with periodic iterate), then you should be able to use native projection for your in-memory graph, where you specify the label, the two rel-types and an aggregation type of COUNT for a property *.

But I have never tried it myself, so can't say if it works.
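
Assigning the label is a write, so this is where periodic iterate fits. A rough sketch, where NetworkZero is a made-up label name and the batch size is only a starting point:

```cypher
// NetworkZero is a hypothetical label; pick any name not already in use.
CALL apoc.periodic.iterate(
    'MATCH (n:WebContent {networkId: 0}) RETURN n',  // outer query: selects the nodes
    'SET n:NetworkZero',                             // inner statement: runs once per node, committed in batches
    {batchSize: 10000}
)
YIELD batches, total, errorMessages
RETURN batches, total, errorMessages;
```

Checking errorMessages afterwards is worthwhile, because failed batches are reported there rather than aborting the whole call.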

And then in a 2nd step you can use collapse-path. The native projection for the first step would look something like this:

CALL gds.graph.create(
    'myGraph',      // placeholder graph name
    'NetworkZero',  // placeholder node label; intermediate-node labels may be needed too
    {
        IncludesOut: {
            type: 'Includes',
            orientation: 'NATURAL'
        },
        HasSignatureOut: {
            type: 'Has_signature',
            orientation: 'NATURAL'
        },
        IncludesIn: {
            type: 'Includes',
            orientation: 'REVERSE'
        },
        HasSignatureIn: {
            type: 'Has_signature',
            orientation: 'REVERSE'
        }
    }
)
YIELD graphName, nodeCount, relationshipCount;
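
The collapse-path step could then look roughly like the sketch below. It assumes a GDS 1.x release, where the procedure lives in the alpha tier, a graph registered under the hypothetical name 'myGraph', and LINKED_TO as the name of the new in-memory relationship type. Like the projection above, it is untested.

```cypher
// Collapse n1 -[IncludesOut]-> m -[IncludesIn]-> n2 into a direct n1 -> n2 relationship.
// The result only exists in the in-memory graph; nothing is written to the database.
CALL gds.alpha.collapsePath.mutate('myGraph', {
    relationshipTypes: ['IncludesOut', 'IncludesIn'],  // traversed in this order
    mutateRelationshipType: 'LINKED_TO',
    allowSelfLoops: false                              // skip paths that end where they started
})
YIELD relationshipsWritten;
```

Only the Includes/Includes combination is shown; the other combinations (for example HasSignatureOut followed by HasSignatureIn) would need their own calls.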