How to import a CSV file with 100 million nodes without exploding RAM

Greetings to all graph lovers (like me)

I want to load a CSV file with 100 million rows, which will be turned into 100 million nodes. I could split the file into several smaller parts and import them one by one, but I would like to know whether there is any way in Cypher to import the whole file without having to divide it and without exhausting my RAM.

I used the following script, which I ran from Python, but it gave me an error:

conn.query("""
         LOAD CSV WITH HEADERS FROM 'file:///csv/tags.csv' AS row
         CALL {
             WITH row
             MERGE (:Tag {id: row.id, url: row.url, name: row.name})
         } IN TRANSACTIONS OF 500 ROWS
     """)

In the Neo4j Browser I run it with the :auto prefix and it gives me the same error for exceeding RAM.
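For reference, CALL { ... } IN TRANSACTIONS only works in an implicit (auto-commit) transaction, which is why the Browser needs the :auto prefix. Below is a minimal sketch of the same load using the official neo4j Python driver, whose Session.run executes in an auto-commit transaction; the driver, URI and credentials are assumptions, since the conn.query wrapper above is not shown.

    # Minimal sketch (not the original conn.query wrapper): official neo4j Python driver.
    # CALL { ... } IN TRANSACTIONS needs an implicit (auto-commit) transaction,
    # which Session.run provides, so the 500-row batches can commit as they go.
    from neo4j import GraphDatabase

    URI = "neo4j://localhost:7687"   # placeholder connection details
    AUTH = ("neo4j", "password")     # placeholder credentials

    query = """
    LOAD CSV WITH HEADERS FROM 'file:///csv/tags.csv' AS row
    CALL {
        WITH row
        MERGE (:Tag {id: row.id, url: row.url, name: row.name})
    } IN TRANSACTIONS OF 500 ROWS
    """

    with GraphDatabase.driver(URI, auth=AUTH) as driver:
        with driver.session(database="neo4j") as session:
            # Consume the result so the whole statement finishes before the session closes.
            session.run(query).consume()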

Kind regards

Hi @dairon,

What version are you using? Try the latest 5.x version. I remember there was a memory tracker bug in early 5.x versions.

Hi @bennu_neo, I am using version 5.5.0.

Hi @dairon,

It was fixed in 5.7. Can you do an upgrade? Otherwise, try with a slotted runtime inside your inner transaction.

Thanks for the clarification. I'm going to upgrade; right now I'm only creating large collections of fake data to see how the project works.

Thank you

If you want to confirm usability before upgrading (even though the upgrade should be pretty simple), my second comment was along the lines of Cypher query options - Cypher Manual.

Basically, it means adding CYPHER runtime=slotted at the beginning of the subquery.
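As a rough sketch, with the query option placed at the start of the statement (where the Cypher manual documents query options) and the :auto Browser prefix for an implicit transaction, it would look something like:

    :auto CYPHER runtime=slotted
    LOAD CSV WITH HEADERS FROM 'file:///csv/tags.csv' AS row
    CALL {
        WITH row
        MERGE (:Tag {id: row.id, url: row.url, name: row.name})
    } IN TRANSACTIONS OF 500 ROWS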