I have a CSV file with these data:
latitude,longitude,duplicate_ids
-84.93620649907857,-166.584936,"163,150"
-84.92621120161785,-176.6079961,"94,64,93"
-78.4748578001507,163.73635759999996,"64270335,64270336"
-78.45997140147799,163.7564968,"64270468,64272133,64272834"
The duplicate_ids column contains 2-4 IDs of nodes that already exist in Neo4j, identified by the property global_id. I am trying to run a Cypher query that parses the CSV file and creates EQUAL_TO relationships between the nodes whose global_id values are listed in each CSV row. This is my query:
LOAD CSV WITH HEADERS FROM 'file:///duplicate-nodes-planet.csv' AS row
CALL {
  WITH row
  WITH split(row.duplicate_ids, ',') AS ids
  WITH [id IN ids | toInteger(id)] AS integerIds
  MATCH (n:WaterNode) WHERE n.global_id IN integerIds
  WITH collect(n) AS nodes
  WITH nodes AS n1, nodes AS n2
  UNWIND n1 AS node1
  UNWIND n2 AS node2
  WITH node1, node2 WHERE node1 <> node2
    AND NOT (node1)-[:EQUAL_TO]-(node2)
    AND node1.latitude = node2.latitude
    AND node1.longitude = node2.longitude
  MERGE (node1)-[r:EQUAL_TO]-(node2)
  SET r.distance = 0
} IN TRANSACTIONS OF 1000 ROWS;
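For reference, the pairing that this query is meant to perform on each row can be sketched outside Neo4j. The following is a minimal Python sketch, not something taken from the database; the function name pairs_for_row is hypothetical, and it assumes every ID in the row resolves to an existing node with matching coordinates. The two UNWINDs in the query produce ordered pairs, whereas the undirected MERGE should leave one relationship per unordered pair, which is what the sketch enumerates.

```python
from itertools import combinations


def pairs_for_row(duplicate_ids: str) -> list[tuple[int, int]]:
    """Return the unordered ID pairs that one CSV row is expected to link.

    Hypothetical helper: it only mirrors the intent of the Cypher query
    and does not check that the nodes exist or share coordinates.
    """
    # Parse the comma-separated field and drop repeated IDs within the row.
    ids = sorted({int(part) for part in duplicate_ids.split(",")})
    # Every unordered pair corresponds to one EQUAL_TO relationship.
    return list(combinations(ids, 2))


print(pairs_for_row("94,64,93"))
# prints [(64, 93), (64, 94), (93, 94)]
```

Under these assumptions a row with k distinct IDs yields k * (k - 1) / 2 relationships, so rows with 2, 3 and 4 IDs yield 1, 3 and 6 respectively.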
On a small dataset the query runs without any issues. On a large dataset it fails after creating 9,897,806 relationships. It always stops at the same point, but I have not been able to identify the CSV row on which it fails. This is the error:
NOT PART OF CHAIN!
RelationshipTraversalCursor
[id=4293918719, open state with: denseNode=false, next=4293918719, , underlying record=Relationship[4293918719,used=false,source=-1,target=-1,type=-1,sCount=1,sNext=-1,tCount=1,tNext=-1,prop=-1, sFirst, tFirst]]
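One possible way to narrow down the failing CSV row is to estimate, offline, which row the import had reached when the relationship count hit 9,897,806. The following is a minimal Python sketch under loud assumptions: rows are processed in file order, every ID resolves to a node with matching coordinates, and no pair appears in more than one row. If any of these does not hold, the result is only a rough lower bound on the row position. The function name find_row_at_count is hypothetical.

```python
import csv
import io


def find_row_at_count(lines, target):
    """Return (line_number, row) for the first data row at which the
    cumulative number of expected pairs reaches `target`, else None.

    `lines` is any iterable of CSV lines, e.g. an open file object.
    """
    total = 0
    reader = csv.DictReader(lines)
    for row in reader:
        # Number of distinct IDs listed in this row.
        k = len(set(row["duplicate_ids"].split(",")))
        # One relationship per unordered pair of IDs (assumption).
        total += k * (k - 1) // 2
        if total >= target:
            # line_num counts the header line as line 1.
            return reader.line_num, row
    return None


SAMPLE = '''latitude,longitude,duplicate_ids
-84.93620649907857,-166.584936,"163,150"
-84.92621120161785,-176.6079961,"94,64,93"
-78.4748578001507,163.73635759999996,"64270335,64270336"
-78.45997140147799,163.7564968,"64270468,64272133,64272834"
'''

line, row = find_row_at_count(io.StringIO(SAMPLE), 5)
print(line, row["duplicate_ids"])
# prints 4 64270335,64270336
```

For the real file, the same function could be called with open('duplicate-nodes-planet.csv', newline='') and target 9897806. Because the query commits in batches of 1000 rows, the failing row may lie anywhere in the batch that follows the reported position.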
Could you please advise how to debug this issue, or how to work around it so that the query runs to completion? Thank you.