Efficiently delete all the nodes and edges of a very large graph

I would like to delete all the nodes and edges of a very large graph (consisting of millions of nodes and tens of millions of edges). After many searches, I learned the following is the most widely recommended query.

match ()-[r]->() delete r
match (n) delete n

However, when I run this query, I get the following error.

Neo.TransientError.General.MemoryPoolOutOfMemoryError: The allocation of an extra 2.0 MiB would use more than the limit 2.0 GiB. Currently using 2.0 GiB. dbms.memory.transaction.total.max threshold reached

  • I can try increasing the memory limit, but my graph can grow so large that even all the available memory on my runner would not be sufficient. So, further increasing memory limits is not an option.

  • Running queries in batches is not an option either.

  • I can delete the whole database as a last resort, but that also deletes all the setup (e.g., constraints) and would require reconfiguring my system to access Neo4j under a new UUID (the one that corresponds to the new database). So, this is not a good option either.

Hi there!

You can delete all the nodes and relationships in your graph using the apoc.periodic.iterate procedure from the APOC library, which runs the query in batches automatically. For example:

CALL apoc.periodic.iterate(
"MATCH (n) RETURN n",
"DETACH DELETE n",
{batchSize:10000})

(If this isn't possible because, as you said, batches aren't an option, see the next solution.)

This will delete all nodes in your graph (and consequently all relationships attached to those nodes) in batches of 10,000.

If you do not have access to APOC, you can do something like this (the old-fashioned way):

MATCH (n)
WITH n
LIMIT 500000
DETACH DELETE n

Here you cap the number of nodes deleted per run with a limit (500,000 in this example) so that you don't hit the dbms.memory.transaction.total.max threshold. Note that you have to run the query repeatedly until all nodes have been deleted from the graph. Also note that DETACH DELETE removes each node's relationships along with the node, since a relationship cannot exist without the nodes it connects. So this query is entirely sufficient to delete an entire graph (excluding constraints and indexes, as you wished).
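If you'd rather not re-run the limited query by hand, the repetition can be scripted. Below is a minimal Python sketch, not a definitive implementation. It assumes a session object that behaves like one from the official neo4j Python driver (run() returning a result whose consume() exposes counters.nodes_deleted). The names DELETE_BATCH_QUERY, make_batch_runner, delete_all_nodes and the max_rounds safety cap are made up for illustration.

```python
from typing import Callable

# Same statement as the LIMIT-based query, with the limit passed as a parameter.
DELETE_BATCH_QUERY = "MATCH (n) WITH n LIMIT $limit DETACH DELETE n"


def make_batch_runner(session, batch_size: int = 500_000) -> Callable[[], int]:
    """Return a function that deletes one batch and reports how many nodes it removed.

    Assumption: `session` behaves like a neo4j driver session, i.e.
    session.run(query, **params).consume().counters.nodes_deleted is available.
    """
    def run_batch() -> int:
        summary = session.run(DELETE_BATCH_QUERY, limit=batch_size).consume()
        return summary.counters.nodes_deleted

    return run_batch


def delete_all_nodes(run_batch: Callable[[], int], max_rounds: int = 10_000) -> int:
    """Call run_batch until a round deletes nothing; return the total deleted.

    max_rounds is a hypothetical safety cap so the loop cannot spin forever.
    """
    total = 0
    for _ in range(max_rounds):
        deleted = run_batch()
        if deleted == 0:
            return total
        total += deleted
    raise RuntimeError("gave up: max_rounds reached before the graph was empty")
```

With the real driver you would pass in a session obtained from your own driver instance, e.g. delete_all_nodes(make_batch_runner(session, batch_size=100_000)). Each run() call made this way should be its own auto-commit transaction, so the memory needed per round is tied to the batch size rather than to the size of the whole graph. If a round still hits the memory limit, a smaller batch_size is the first thing to try.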

I hope this helps! =)

Best Regards,
Rob

Thanks, I prefer the APOC-based solution. I wish this were possible without first loading the data into memory.