Hi Neo4J'ers,
I'm new to this platform and I'm struggling with the various ways of running commands. I've read many previous questions and the Neo4j docs, but nothing seems to work.
The data I'm using is quite large: a people CSV of ~8 million rows, each with 14 properties, and a company CSV of ~4.5 million records with 25 properties each.
I've finally managed to create the 12.5 million nodes, after a lot of trial, error, and even more waiting, by using PERIODIC COMMIT with LOAD CSV and increasing the memory settings. However, nothing I've found lets me create relationships between those nodes. Once they're in the database, no number of apoc.periodic.iterate calls seems able to handle the volume of data that has to be checked to create the relationships.
The relationships I'm trying to create are:
Person is a member of a company (CompanyID is the same on both)
Person is the same as another Person (Name + DOB are the same). A person can be declared more than once, with different metadata depending on their ties to a company.
How can relationships be created reliably in bulk?
The query for one of the relationships I'm after works when I only have a few records, so the logic seems okay.
MATCH (a:PSC),(b:PSC)
WHERE a.name = b.name AND a.dateOfBirth = b.dateOfBirth
CREATE (a)-[r:SAME_PERSON]->(b)
RETURN type(r), r.companyNumber
I can't seem to run this with periodic commit, so it fails quite quickly when the memory runs out.
I've tried apoc.periodic.iterate too:
CALL apoc.periodic.iterate("MATCH (a:PERSON),(b:Company) WHERE a.companyNumber = b.companyNumber CREATE (a)-[r:IS_PSC_OF { companyNumber: a.companyNumber + '<->' + b.companyNumber }]->(b)",
"RETURN type(r), r.companyNumber",
{batchSize:10000, parallel: true, iterateList:true})
This doesn't work, no matter how much I try to tweak the memory settings. At best it runs for about 16 hours before falling over.
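From re-reading the APOC docs, I suspect part of the problem is how I've split the two statements. As far as I understand it, the first argument to apoc.periodic.iterate should be a read query that returns rows, and the second is the update that gets run per batch. In my call the first statement does the whole cross match and all the CREATEs itself and returns nothing, so nothing is ever batched. Below is a sketch of what I think the intended shape is, not something I've verified at full scale. It assumes APOC is installed, that indexes on the lookup properties exist (the index names are my own invention), and it uses the labels as they appear in my snippets above. I've set parallel to false because my understanding is that concurrent relationship creation on the same nodes can deadlock.

```cypher
// Assumption: index syntax for recent Neo4j 4.x/5.x; on 3.5 the equivalent
// would be CREATE INDEX ON :Company(companyNumber). Index names are hypothetical.
CREATE INDEX company_number IF NOT EXISTS
FOR (c:Company) ON (c.companyNumber);

CREATE INDEX psc_name_dob IF NOT EXISTS
FOR (p:PSC) ON (p.name, p.dateOfBirth);

// Person -> Company: the first statement only streams the people,
// the second does an index lookup and one MERGE per person, committed per batch.
CALL apoc.periodic.iterate(
  "MATCH (a:PERSON) WHERE a.companyNumber IS NOT NULL RETURN a",
  "MATCH (b:Company {companyNumber: a.companyNumber}) MERGE (a)-[:IS_PSC_OF]->(b)",
  {batchSize: 10000, parallel: false}
)
YIELD batches, total, errorMessages
RETURN batches, total, errorMessages;

// Same person: id(a) < id(b) skips self-matches and creates each pair once.
// Assumption: id() is still available (it is deprecated in Neo4j 5).
CALL apoc.periodic.iterate(
  "MATCH (a:PSC) WHERE a.name IS NOT NULL AND a.dateOfBirth IS NOT NULL RETURN a",
  "MATCH (b:PSC {name: a.name, dateOfBirth: a.dateOfBirth}) WHERE id(a) < id(b) MERGE (a)-[:SAME_PERSON]->(b)",
  {batchSize: 1000, parallel: false}
)
YIELD batches, total, errorMessages
RETURN batches, total, errorMessages;
```

If this is the wrong way to think about it, I'd appreciate a pointer to what the correct structure is.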
Should I be creating the relationships at the same time as the nodes, during the import?
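To make that question concrete, this is roughly what I have in mind: load the companies first, then create each person and its relationship in the same LOAD CSV pass. It is only a sketch. The file name and CSV column names are placeholders rather than my real ones, it assumes an index on :Company(companyNumber) already exists, and it relies on USING PERIODIC COMMIT, which I believe is only available before Neo4j 5.

```cypher
// Assumptions: 'people.csv' and the column names are hypothetical;
// companies are already loaded and :Company(companyNumber) is indexed.
// In Neo4j Browser 4.x this may need the :auto prefix.
USING PERIODIC COMMIT 10000
LOAD CSV WITH HEADERS FROM 'file:///people.csv' AS row
// Rows whose company is not found are skipped by this MATCH.
MATCH (c:Company {companyNumber: row.companyNumber})
CREATE (p:PERSON {
  name: row.name,
  dateOfBirth: row.dateOfBirth,
  companyNumber: row.companyNumber
})
CREATE (p)-[:IS_PSC_OF]->(c);
```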
Is a job of this size simply not feasible on a MacBook with 16 GB of RAM, or should I be running it remotely on a box with a bit more punch?
Sorry for the long-winded post. I've been at this for a while and making terribly slow progress, so any help would be appreciated.
Cheers