Is it possible to project nodes/relationships to in-memory graph in batches?

catherinepatino · February 22, 2021, 8:30pm

Hello,

I'm trying to reduce a multigraph to a single graph in order to run a community detection algorithm on it. See my discussion here for more context. I solved the crashing/out of memory issues by using apoc.periodic.iterate to create the reduced relationships in batches. However, I don't want these new relationships to be written to the database; I want them to be projected to an in-memory graph to be used for label propagation. My question is: is it possible to project nodes & relationships in batches using apoc.periodic.iterate? I've been playing around trying to use apoc.periodic.iterate within a cypher projection query, something like:

CALL gds.graph.create.cypher(
'myGraph',
//nodequery
'MATCH (n:WebContent {networkId: 0}) RETURN id(n) as id',
//relationshipquery
'CALL apoc.periodic.iterate("MATCH (n1:WebContent {networkId: 0}) RETURN n1", 
"MATCH (n1)-[:Includes|Has_signature]->(m)<-[:Includes|Has_signature]-(n2) 
 WHERE id(n1) < id(n2) WITH n1, n2, count(m) AS weight 
 CREATE (n1)-[r:LINKED_TO]->(n2) SET r.weight=weight", 
 {batchSize:1000}) YIELD batches, total, errorMessages 
RETURN id(n1) AS source, id(n2) AS target, r.weight AS weight')

but no success yet (I'm not sure it's actually possible to access n1, n2, and r from the batches like I'm trying to do here). I'm just trying to get a sense of whether what I'm trying to do is possible at all. Perhaps apoc.periodic.iterate is not the answer, but another method is?

Thanks!

michael.hunger · February 27, 2021, 8:23pm

You cannot use periodic iterate for reading data; it's only for writing/updating data.

CALL gds.graph.create.cypher(
'myGraph',
//nodequery
'MATCH (n:WebContent {networkId: 0}) RETURN id(n) as id',
//relationshipquery
'MATCH (n1:WebContent {networkId: 0})-[:Includes|Has_signature]->(m)<-[:Includes|Has_signature]-(n2) 
 WHERE id(n1) < id(n2) 
  RETURN id(n1) as source, id(n2) as target, count(m) AS weight')

If you assign a label to your nodes with networkId: 0 (e.g. with periodic iterate), then you should be able to use a native projection for your in-memory graph, where you specify the label, the two rel-types, and an aggregation type of COUNT for the property *.
I have never tried it myself, though, so I can't say whether it works.
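The labelling step could be sketched roughly as follows. This assumes the label is called Network0 (the name used in the native projection in this thread) and that a batch size of 10000 suits your memory settings; both are placeholders to adjust.

```cypher
// Sketch: add a label to every WebContent node in network 0, in batches.
// 'Network0' and the batch size are assumptions, not tested values.
CALL apoc.periodic.iterate(
    // Outer query: stream the nodes to update
    'MATCH (n:WebContent {networkId: 0}) RETURN n',
    // Inner query: runs once per node, committed per batch
    'SET n:Network0',
    {batchSize: 10000}
)
YIELD batches, total, errorMessages
RETURN batches, total, errorMessages;
```

This only writes a label, not the LINKED_TO relationships, so the database itself stays close to its original shape.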

And then in a 2nd step you can use collapse-path:

https://neo4j.com/docs/graph-data-science/current/alpha-algorithms/collapse-path/

CALL gds.graph.create(
    'my-graph',
    'Network0',
    {
        IncludesOut: {
            type: 'Includes',
            orientation: 'NATURAL'
        },
        HasSignatureOut: {
            type: 'Has_signature',
            orientation: 'NATURAL'
        },
        IncludesIn: {
            type: 'Includes',
            orientation: 'REVERSE'
        },
        HasSignatureIn: {
            type: 'Has_signature',
            orientation: 'REVERSE'
        }
    }
)
YIELD graphName, nodeCount, relationshipCount;
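For the collapse-path step, a minimal sketch might look like the one below. It assumes the 'my-graph' projection above exists in the graph catalog, that the intermediate nodes (the targets of Includes) are also part of that projection, and that your GDS version still takes a relationshipTypes list for the alpha procedure (later releases changed this configuration, so check the docs for your version). LINKED_TO is reused from the original question as the name of the new in-memory relationship type.

```cypher
// Sketch only, untested: collapse two-hop paths n1 -> m <- n2 into one relationship.
CALL gds.alpha.collapsePath.mutate(
    'my-graph',
    {
        // Follow Includes forwards, then Includes backwards via the REVERSE projection
        relationshipTypes: ['IncludesOut', 'IncludesIn'],
        // New relationship type, added to the in-memory graph only
        mutateRelationshipType: 'LINKED_TO',
        allowSelfLoops: false
    }
)
YIELD relationshipsWritten;
```

This covers only the Includes/Includes combination; paths that go through Has_signature, or mix the two types, would need their own calls.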

https://neo4j.com/docs/graph-data-science/current/management-ops/native-projection/#native-projection-syntax-relationship-projections

https://neo4j.com/docs/graph-data-science/current/management-ops/native-projection/#_relationship_aggregations
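A relationship projection with the COUNT aggregation on * could be written roughly like this. The graph name 'my-graph-counted' and the property name weight are illustrative assumptions. Note that this aggregation merges parallel relationships between the same pair of nodes and stores how many there were.

```cypher
// Sketch: native projection with a COUNT aggregation (GDS 1.x syntax).
CALL gds.graph.create(
    'my-graph-counted',   // hypothetical graph name
    'Network0',
    {
        IncludesOut: {
            type: 'Includes',
            orientation: 'NATURAL',
            properties: {
                // 'weight' is an assumed name; '*' means no stored property is read
                weight: {property: '*', aggregation: 'COUNT'}
            }
        }
    }
)
YIELD graphName, nodeCount, relationshipCount;
```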
