How to import a CSV file with 100 million nodes without exploding RAM

Greetings to all graph lovers (like me)

I want to load a CSV file with 100 million rows, which will be turned into 100 million nodes. I could split the file into several smaller parts and import them one by one, but I would like to know whether there is any way in Cypher to import the whole file without having to divide it and without exhausting my RAM.

I used the following script, which I ran from Python, but it gave me an error:

conn.query("""
         LOAD CSV WITH HEADERS FROM 'file:///csv/tags.csv' AS row
         CALL {
             WITH row
             MERGE (:Tag {id: row.id, url: row.url, name: row.name})
         } IN TRANSACTIONS OF 500 ROWS
     """)

In the Neo4j Browser I run it with the :auto prefix and it gives me the same error for exceeding RAM.
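For reference, CALL { ... } IN TRANSACTIONS only works in an implicit (auto-commit) transaction, which is why the Browser needs the :auto prefix. Below is a minimal sketch of the same load using the official neo4j Python driver, whose Session.run executes in an auto-commit transaction; the driver, URI and credentials are assumptions, since the conn.query wrapper above is not shown.

    # Minimal sketch (not the original conn.query wrapper): official neo4j Python driver.
    # CALL { ... } IN TRANSACTIONS needs an implicit (auto-commit) transaction,
    # which Session.run provides, so the 500-row batches can commit as they go.
    from neo4j import GraphDatabase

    URI = "neo4j://localhost:7687"   # placeholder connection details
    AUTH = ("neo4j", "password")     # placeholder credentials

    query = """
    LOAD CSV WITH HEADERS FROM 'file:///csv/tags.csv' AS row
    CALL {
        WITH row
        MERGE (:Tag {id: row.id, url: row.url, name: row.name})
    } IN TRANSACTIONS OF 500 ROWS
    """

    with GraphDatabase.driver(URI, auth=AUTH) as driver:
        with driver.session(database="neo4j") as session:
            # Consume the result so the whole statement finishes before the session closes.
            session.run(query).consume()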

Kind regards

Hi @dairon,

What version are you using? Try the latest 5.x version. I remember there was a memory tracker bug in early 5.x versions.

Hi @bennu_neo, I am using version 5.5.0.

Hi @dairon,

It was fixed in 5.7. Can you do an upgrade? Otherwise, try with a slotted runtime inside your inner transaction.

Thanks for the clarification. I'm going to upgrade; right now I'm only creating large collections of fake data to see how the project works.

Thank you

If you want to confirm usability before upgrading (even though the upgrade should be pretty simple), my second comment was along the lines of Cypher query options - Cypher Manual.

Basically, it means adding CYPHER runtime=slotted at the beginning of the subquery.
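As a rough sketch, with the query option placed at the start of the statement (where the Cypher manual documents query options) and the :auto Browser prefix for an implicit transaction, it would look something like:

    :auto CYPHER runtime=slotted
    LOAD CSV WITH HEADERS FROM 'file:///csv/tags.csv' AS row
    CALL {
        WITH row
        MERGE (:Tag {id: row.id, url: row.url, name: row.name})
    } IN TRANSACTIONS OF 500 ROWS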