Correct way to ingest millions of results?

I'm having trouble debugging what's going on in my workflow:

from neo4j import GraphDatabase

def get_results(uri):
    q = " ... my query ..."
    driver = GraphDatabase.driver(uri, auth=("neo4j", "pass"))
    db = driver.session()
    with db.begin_transaction() as tx:
        res = tx.run(q)
        tx.success = True
        for r in res:
            process_res(r)

The for loop seems to randomly hang after processing a few hundred thousand results. My process_res() function is simple enough that I don't think it's causing any problems.

Is this the correct way to ingest millions of results, or is there a better way?

You should take care regarding transaction sizes. Typically 10k to 100k atomic operations (like creating a node or setting a property) make a good transaction size. If you're way above that, you might exhaust transaction state memory.

Either use client-side transaction batching, or take a look at apoc.periodic.iterate, which does the batching on the Neo4j server itself.
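As a rough sketch of the server-side option, you can call apoc.periodic.iterate from the Python driver so that Neo4j commits in batches itself (this assumes APOC is installed; the URI, credentials, labels and the two inner queries below are just placeholders for your own):

from neo4j import GraphDatabase

URI = "bolt://localhost:7687"    # placeholder
AUTH = ("neo4j", "pass")         # placeholder credentials

# Outer statement selects the rows to iterate, inner statement is applied to
# each row, committed server-side in batches of 10k.
BATCHED_QUERY = """
CALL apoc.periodic.iterate(
  'MATCH (n:Item) RETURN n',
  'SET n.processed = true',
  {batchSize: 10000, parallel: false}
)
YIELD batches, total, errorMessages
RETURN batches, total, errorMessages
"""

def run_batched():
    with GraphDatabase.driver(URI, auth=AUTH) as driver:
        with driver.session() as session:
            record = session.run(BATCHED_QUERY).single()
            print(record["batches"], record["total"], record["errorMessages"])

Because the iteration and commits happen inside the database, the client only sees one small summary record instead of streaming millions of rows.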

Hello @Rogie :slight_smile:

I wrote a little example to load data into your database; you can adapt it a bit for your use case :slight_smile:
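A minimal sketch of what such a client-side batched loader might look like, assuming you push rows with UNWIND in fixed-size batches (the URI, credentials, BATCH_SIZE, rows iterable, and Cypher are all placeholders to adapt):

from neo4j import GraphDatabase

URI = "bolt://localhost:7687"   # placeholder
AUTH = ("neo4j", "pass")        # placeholder credentials
BATCH_SIZE = 10_000             # keep each transaction in the 10k-100k operation range

LOAD_QUERY = """
UNWIND $rows AS row
MERGE (p:Person {id: row.id})
SET p.name = row.name
"""

def load_rows(rows):
    with GraphDatabase.driver(URI, auth=AUTH) as driver:
        with driver.session() as session:
            batch = []
            for row in rows:
                batch.append(row)
                if len(batch) >= BATCH_SIZE:
                    # each auto-commit run is its own transaction, one per batch
                    session.run(LOAD_QUERY, rows=batch).consume()
                    batch = []
            if batch:
                # flush the final partial batch
                session.run(LOAD_QUERY, rows=batch).consume()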

Regards,
Cobra