Neo4j Server Community Edition 4.3.3, Ubuntu 20.04
Hi all,
For a long time I have used a query to import a complex CSV file, and it ran nicely.
Last night a horrible bug in the query came to me in a 'dream', but I have no idea how to solve it.
The problem lies in the structure of the data, and in the fact that I must preserve the existing DB content.
Each line describes the following:
- a primary node, with:
  - three different labels;
  - one or more relationships with other nodes (regions);
  - one relationship with a complex object.
So, in my import query, I do the following:
LOAD CSV WITH HEADERS FROM 'xxx' as row FIELDTERMINATOR ';'
// Select a primary node and change its name
WITH row, split(row.Appellation,",") as appellationNames
MERGE (appellation:Appellation { uuid: row.`Appellation UUID` })
SET appellation.name = appellationNames
// Replace the primary node labels
WITH row, appellation
CALL apoc.create.setLabels(appellation,[row.`EU Classification`,row.`Classification`, 'Appellation']) YIELD node
// delete and replace the relationships with the regions
WITH row, appellation, split(row.Region,",") as regions
UNWIND regions as aRegion
MATCH (region:Region)
WHERE ToLower(region.name) = trim(ToLower(aRegion))
WITH row, appellation, region
MATCH (appellation)-[oldRelationship:IS_PRODUCED_IN]->(region)
DETACH DELETE oldRelationship
WITH row, appellation, region
MERGE (appellation)-[r:IS_PRODUCED_IN]->(region)
// then finally start working on the core content, which differs for each line
WITH row, appellation
<DO A LOT OF OTHER THINGS WITH REMAINING INFO>
Where is the problem?
The input file is ordered, so it contains runs of consecutive rows with the same appellation, labels and regions (this is common data), while all the remaining information differs.
So, if I have a sequence of, say, 100 rows with the same initial info, I run the part of the query shown above 100 times, deleting and recreating the same data for each row, when I would only need to execute the last part, <DO A LOT OF OTHER THINGS WITH REMAINING INFO>, for all the rows except the first.
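To make the goal concrete, here is a rough, untested sketch of the shape I imagine: aggregate the rows on the shared columns first, run the common part once per group, then UNWIND the collected rows for the row-specific part. The aliases (uuid, appellationField, regionField, rows, processedRows) are names I made up for the sketch. It uses OPTIONAL MATCH for the old relationship, which is my own assumption so that a region with no existing relationship is not filtered out, and the final RETURN is only there to close the query.

```cypher
LOAD CSV WITH HEADERS FROM 'xxx' AS row FIELDTERMINATOR ';'
// Aggregate: one result row per distinct combination of the shared columns,
// with all CSV lines of that group kept in `rows` (hypothetical alias).
WITH row.`Appellation UUID` AS uuid,
     row.Appellation AS appellationField,
     row.`EU Classification` AS euClassification,
     row.Classification AS classification,
     row.Region AS regionField,
     collect(row) AS rows

// Shared part: executed once per group instead of once per CSV line.
MERGE (appellation:Appellation { uuid: uuid })
SET appellation.name = split(appellationField, ",")
WITH appellation, euClassification, classification, regionField, rows
CALL apoc.create.setLabels(appellation, [euClassification, classification, 'Appellation']) YIELD node
WITH appellation, regionField, rows
UNWIND split(regionField, ",") AS aRegion
MATCH (region:Region)
WHERE toLower(region.name) = trim(toLower(aRegion))
// Assumption: OPTIONAL MATCH, so a missing old relationship does not drop the group.
OPTIONAL MATCH (appellation)-[oldRelationship:IS_PRODUCED_IN]->(region)
DELETE oldRelationship
WITH appellation, region, rows
MERGE (appellation)-[:IS_PRODUCED_IN]->(region)

// Collapse the per-region rows back to one row per group,
// then expand the collected CSV lines again.
WITH DISTINCT appellation, rows
UNWIND rows AS row
WITH row, appellation
// <DO A LOT OF OTHER THINGS WITH REMAINING INFO> would go here
RETURN count(*) AS processedRows
```

I am not sure this is a good idea: collect() is an aggregation, so as far as I understand the whole file has to be read before any write happens, and it groups identical values across the entire file rather than only consecutive rows.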
If I have managed to describe the problem clearly, does anyone see an approach to solve it?
A nice workaround, for me, would be the ability to access the previous row...
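By "access the previous row" I mean something like the following untested sketch, which collects the file into a list and pairs each line with the one before it. The names rows, i, previousRow and isFirstOfGroup are made up for illustration, and I am assuming that collect() keeps the lines in file order.

```cypher
LOAD CSV WITH HEADERS FROM 'xxx' AS row FIELDTERMINATOR ';'
// Keep the whole file in a list so each line can look at the one before it.
WITH collect(row) AS rows
UNWIND range(0, size(rows) - 1) AS i
// rows[-1] would return the last element, so the first line needs an explicit null.
WITH rows[i] AS row,
     CASE WHEN i = 0 THEN null ELSE rows[i - 1] END AS previousRow
// True on the first line of each run of rows sharing the same appellation.
WITH row,
     (previousRow IS NULL
      OR previousRow.`Appellation UUID` <> row.`Appellation UUID`) AS isFirstOfGroup
RETURN row.`Appellation UUID` AS uuid, isFirstOfGroup
```

With such a flag, the common part would only need to run when isFirstOfGroup is true, but I do not know of a clean way to skip it conditionally inside a single query.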
Thanks