Hi, I am new to Neo4j.
I have a file that contains more than 10k rows. I was trying to load the data into Neo4j in batches using apoc.periodic.iterate. The issue is that my file has a column named "region" which contains 16 distinct values. When I create the Region nodes without apoc.periodic.iterate, I get 16 nodes, which is correct, but as soon as I create them with apoc.periodic.iterate, I get 90 nodes.
Here is the code snippet I am using:
CALL apoc.periodic.iterate('
LOAD CSV WITH HEADERS FROM "file:///test.csv" AS row RETURN row',
'WITH row
WHERE row.regionName IS NOT null
MERGE(r:Region {region_name:row.regionName}) ',
{batchSize:1000, parallel:true, iterateList:true})
Can anyone please help me out with this?
You have "parallel" set to true. I believe this is causing a race condition in which MERGE does not always detect an existing node with a specific region_name. With parallel set to true, multiple MERGE operations for the same region_name can execute concurrently, and these merges do not recognize each other until the node already exists. There is no lock on creating nodes unless you have a uniqueness constraint on the merge property.
The solution is to set parallel to false or to add a uniqueness constraint on region_name. A uniqueness constraint seems appropriate, since these nodes look like reference nodes.
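Here is a rough sketch of both fixes together, in case it helps. It assumes Neo4j 5.x constraint syntax, and the constraint name region_name_unique is just a name I made up for illustration. If the duplicate Region nodes from the earlier run are still in the database, they would need to be deleted first, because the constraint cannot be created while duplicates exist.

```cypher
// Assumption: Neo4j 5.x syntax. The constraint name is hypothetical.
// Ensures at most one Region node per region_name.
CREATE CONSTRAINT region_name_unique IF NOT EXISTS
FOR (r:Region) REQUIRE r.region_name IS UNIQUE;

// Same load as in the question, but with parallel set to false so the
// batches run one after another and MERGE can see nodes from earlier batches.
CALL apoc.periodic.iterate(
  'LOAD CSV WITH HEADERS FROM "file:///test.csv" AS row RETURN row',
  'WITH row
   WHERE row.regionName IS NOT NULL
   MERGE (r:Region {region_name: row.regionName})',
  {batchSize: 1000, parallel: false}
);
```

Either change on its own should be enough to stop the duplicates. Doing both just means the constraint catches the problem if parallel is ever switched back on.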
Thank you for your suggestion. Yes, it solved the problem of duplicate data.
I tried implementing CALL { ... } IN TRANSACTIONS, but I got an error. This is what it says:
[A query with 'CALL { ... } IN TRANSACTIONS' can only be executed in an implicit transaction, but tried to execute in an explicit transaction.]
I am currently using Neo4j Desktop, version 5.12.0.
Start the query with :auto
I tried with :auto and got a new error:
[Invalid input ':': expected
"WITH" (line 2, column 1 (offset: 8))
":auto LOAD CSV WITH HEADERS FROM "file:///test.csv" AS row"
Can you post the entire query?
:auto LOAD CSV WITH HEADERS FROM "file:///test.csv" AS row
WITH row WHERE row.id IS NOT NULL
WITH row
MATCH (d:Date {date:"2024-02-26"})-[:HAS_VALUE]-(id:identity {s_id:row.id})
MERGE(id)-[:ID_HAS_COUNTRY]-(c:country {country_name:row.countryName})
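You need to start the query with :auto, so move PROFILE to after :auto. The error points at line 2, column 1, which suggests :auto was not the first thing in the query text (for example, PROFILE was on the line above it).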
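As a rough sketch, the reordered query could look something like the one below. The posted query does not show where CALL { ... } IN TRANSACTIONS goes, so the placement of the subquery and the batch size of 1000 rows are my assumptions. The labels and properties are copied from the query as posted.

```cypher
:auto PROFILE
LOAD CSV WITH HEADERS FROM "file:///test.csv" AS row
WITH row WHERE row.id IS NOT NULL
// Assumption: only the write part is wrapped, committing every 1000 rows.
CALL {
  WITH row
  MATCH (d:Date {date: "2024-02-26"})-[:HAS_VALUE]-(id:identity {s_id: row.id})
  MERGE (id)-[:ID_HAS_COUNTRY]-(c:country {country_name: row.countryName})
} IN TRANSACTIONS OF 1000 ROWS
```

The :auto prefix is a Neo4j Browser command, not Cypher, which is why it has to come before everything else, including PROFILE.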