Is it efficient to create multiple mirror nodes, create the corresponding edges in parallel, and then merge the nodes, in order to parallelize a very intensive process? (To avoid deadlocks)

Hello everyone,

I am working with a huge graph of over 200M nodes and 500M edges. I will add a new node, and approximately 100M existing nodes will be connected to it. I am using Neo4j Community Edition. The operation takes too long, and I cannot parallelize it because of deadlock exceptions. My idea is to create mirror nodes such as newnode1, newnode2, newnode3, newnode4, newnode5, ..., create the edges in parallel (batch1 -> newnode1, batch2 -> newnode2, batch3 -> newnode3, ...), and then use the apoc.refactor.mergeNodes procedure to merge the temporary nodes into the final new node. Is this logical? What are the pros and cons?
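To make the idea concrete, here is a minimal sketch of the mirror-node approach. The labels (`NewNodeMirror`, `Item`), the `batch` property, and the relationship type are illustrative assumptions, not from my actual schema:

```cypher
// Step 1: create one temporary mirror node per parallel worker.
UNWIND range(1, 5) AS i
CREATE (:NewNodeMirror {mirrorId: i});

// Step 2: each worker writes only to its own mirror node, so no two
// writers lock the same node and deadlocks are avoided.
// (Run one such statement per worker, with $i = 1..5.)
MATCH (m:NewNodeMirror {mirrorId: $i})
MATCH (n:Item {batch: $i})
CREATE (n)-[:CONNECTED_TO]->(m);

// Step 3: collapse all mirrors into a single final node,
// carrying the relationships over.
MATCH (m:NewNodeMirror)
WITH collect(m) AS mirrors
CALL apoc.refactor.mergeNodes(mirrors, {mergeRels: true})
YIELD node
RETURN node;
```

The merge in step 3 is itself a single-threaded operation over 100M relationships, so it may become the new bottleneck.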

Thanks.

Sorry to disappoint, but I would recommend a different approach. Yes, it is going to be hard to write a single node with 100M nodes connected to it. And if you succeed, you will face a different problem afterwards, because you will have created a supernode.

I would recommend against what you are trying to do. You likely need a different data model, one that doesn't require a single node attached to 100M other things. In other words, I think the import problem you're running into, and the query problems you would have afterwards, are both symptoms of a needed model change.
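One common remodeling, assuming the 100M relationships would only mark membership in some set, is to replace the hub node with a label (or a property) on the members. Label and property names here are illustrative:

```cypher
// Instead of (n)-[:IN_SET]->(hub), mark membership directly on each node.
// Process in batches; with no shared hub node there is nothing to deadlock on.
MATCH (n:Item {batch: $i})
SET n:InNewSet;

// Membership queries then scan the label index instead of
// traversing a supernode:
MATCH (n:InNewSet)
RETURN count(n);
```

This trades the relationship traversal for an index-backed label scan, which Neo4j handles well even at this scale.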

For much more information, see this article:


How about situations where the model cannot be altered? For example, when loading map data (OpenStreetMap data), one cannot remodel the nodes and relationships.