Loading in millions of nodes

Hi everyone,

I'm trying to load a large dataset with about 6 million rows. For each row, I want to create roughly 3,000 relationships to other nodes.

At first, I ran each connection one by one, but that took a long time, so I then tried merging all 3,000 relationships in a single statement. This is still taking too long. I have indexes on all the properties I'm merging on. Is there any way to speed up the queries?

An example of the query for a single row:

MATCH (snp:SNP) WHERE id(snp) = 2222222
MERGE (NA21112:Sample {id: 21112})-[r21112:has_SNP {format: 'gt', genotype: '0|0'}]->(snp)
MERGE (NA21113:Sample {id: 21113})-[r21113:has_SNP {format: 'gt', genotype: '0|0'}]->(snp)
... thousands more ...
RETURN snp
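For context, my Python loader builds that big statement roughly like this (a simplified sketch, not my exact code; the helper name `build_batch_query` and the variable naming are just illustrative):

```python
from typing import List, Tuple

def build_batch_query(snp_internal_id: int, samples: List[Tuple[int, str]]) -> str:
    """Concatenate one MERGE clause per sample into a single Cypher statement.

    `samples` is a list of (sample_id, genotype) pairs; the NA.../r...
    variable names mirror the example query above.
    """
    parts = [f"MATCH (snp:SNP) WHERE id(snp) = {snp_internal_id}"]
    for sample_id, genotype in samples:
        parts.append(
            f"MERGE (NA{sample_id}:Sample {{id: {sample_id}}})"
            f"-[r{sample_id}:has_SNP {{format: 'gt', genotype: '{genotype}'}}]->(snp)"
        )
    parts.append("RETURN snp")
    return "\n".join(parts)

# Then I send it with the driver, roughly:
# with driver.session() as session:
#     session.run(build_batch_query(2222222, row_samples))
```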

Please let me know if there's anything I can do to optimize this. I'm using Neo4j 3.5 with the Python neo4j driver.