Dear Neo4j community,
I have two imports: the first ran very smoothly, while the second started off slow and kept getting slower until I stopped it.
As I'm still fairly new to Neo4j and the two Cypher queries look very similar to me, I was wondering whether anyone has a good guess as to what is causing the difference.
The first import is for the relationship between patent applications and their associated families (around 100,000,000 split across 3 files), which I imported using this query:
:auto USING PERIODIC COMMIT 25000
LOAD CSV WITH HEADERS FROM 'file:///path/to/file01.txt' AS row
FIELDTERMINATOR ','
MATCH (app:Application {appln_id: toInteger(row.appln_id)})
MATCH (fam:Family {family_id: toInteger(row.docdb_family_id)})
MERGE (app)-[:BELONGS_TO]->(fam);
The second import is for citation relationships (over 100,000,000 in 1 file) between families:
:auto USING PERIODIC COMMIT 25000
LOAD CSV WITH HEADERS FROM 'file:///path/to/file02.txt' AS row
FIELDTERMINATOR ','
MATCH (fam_citing:Family {family_id: toInteger(row.docdb_family_id)})
MATCH (fam_cited:Family {family_id: toInteger(row.cited_docdb_family_id)})
MERGE (fam_citing)-[:CITES]->(fam_cited);
Both Applications and Families have constraints on them.
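As a rough sketch, assuming they are uniqueness constraints on the two ID properties matched in the queries above, they would look something like this in Neo4j 4.x syntax (the same version family that still supports USING PERIODIC COMMIT):

```cypher
// Assumption: uniqueness constraints on the properties used in the MATCH clauses.
// A uniqueness constraint also creates an index on the property, which the lookups rely on.
CREATE CONSTRAINT ON (app:Application) ASSERT app.appln_id IS UNIQUE;
CREATE CONSTRAINT ON (fam:Family) ASSERT fam.family_id IS UNIQUE;
```

Running SHOW CONSTRAINTS (Neo4j 4.2 and later) or CALL db.constraints() (earlier 4.x) lists what is actually defined, which makes it easy to confirm that the property names match the ones used in the queries.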
Any tips for getting the second one to work as smoothly as the first one would be greatly appreciated!