Creating multiple unique nodes based on CSV entries that have multiple values

marciano · October 7, 2024, 3:50pm

I have a test .csv file and want to create a graph that has a unique node for each year and also unique nodes for the Numbers values.

My cypher is:
LOAD CSV WITH HEADERS FROM 'file:///test.csv' AS row
MERGE (y:Year {name: row.Year})
WITH y, row
UNWIND split (row.Numbers, ',') AS X
MERGE (n: Numbers {name: X})
MERGE (y)-[r:LINK]->(n)
RETURN y, r, n
which produces duplicate Numbers nodes ("3" in this case).

Is there a way to create uniqueness with the Numbers node values as well (only have 1, 2, 3)? In other words I only want 2016 to be associated with three unique nodes (no duplicates in the Numbers nodes).

glilienfield · October 7, 2024, 6:47pm

I ran the query and got unique Numbers nodes. Can you provide what your results are?

Keep in mind if you add another year with one of these numbers, the new year will use these existing Numbers nodes as well.
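
As a rough sketch of that reuse, the query below links a second year to a number that already exists. The year '2017' is a made-up value for illustration and is not from the original file. Because MERGE matches the existing Numbers node instead of creating a new one, the final RETURN should list every year linked to that single node.

```cypher
// Hypothetical follow-up: '2017' is an assumed value, not from test.csv.
MERGE (y:Year {name: '2017'})
// MERGE finds the existing Numbers node named '3' if there is one.
MERGE (n:Numbers {name: '3'})
MERGE (y)-[:LINK]->(n)
WITH n
// List every year that points at this one shared node.
MATCH (linked:Year)-[:LINK]->(n)
RETURN n.name AS number, collect(linked.name) AS years
```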

marciano · October 7, 2024, 7:46pm

Thank you for looking at this. Interesting! I'm guessing that it could be the space between the ',' and the digit 3. When I get rid of that space in the .csv file, it works the way you saw. I wonder if in one case it sees '3' and in another ' 3'. Thank you!

glilienfield · October 7, 2024, 8:03pm

That is correct. I created a file without spaces. The split is parsing on “,” and is not designed to compensate for a space after the comma. You can use the “trim” function if you want to avoid this in the future.
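
To see the difference in isolation, a quick check like the one below can help. The string '1, 2, 3' is only an assumed stand-in for a Numbers cell with spaces after the commas, not the actual contents of the file.

```cypher
// '1, 2, 3' is an assumed sample value, not taken from test.csv.
RETURN split('1, 2, 3', ',') AS raw,
       [i IN split('1, 2, 3', ',') | trim(i)] AS trimmed
// raw     -> ["1", " 2", " 3"]  (leading spaces are kept)
// trimmed -> ["1", "2", "3"]
```

Since ' 3' and '3' are different strings, MERGE treats them as different name values and creates two nodes.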

Change MERGE (n: Numbers {name: X}) to MERGE (n: Numbers {name: trim(X)})

In those cases where you would use the value of 'X' in multiple places and don't want to wrap each usage with 'trim', you can eliminate the extra spaces upfront with list comprehension.

LOAD CSV WITH HEADERS FROM 'file:///test.csv' AS row 
MERGE (y:Year {name: row.Year}) 
WITH y, row 
UNWIND [i in split (row.Numbers, ',') | trim(i)] AS X 
MERGE (n: Numbers {name: X}) 
MERGE (y)-[r:LINK]->(n) 
RETURN y, r, n
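
Note that MERGE never deletes anything, so nodes created by an earlier run with the untrimmed values stay in the graph. A minimal check for leftovers, assuming the same Numbers label and name property as above, could look like this:

```cypher
// Find Numbers nodes whose name still carries leading or trailing whitespace.
MATCH (n:Numbers)
WHERE n.name <> trim(n.name)
RETURN n.name AS untrimmed
```

If this returns rows, those nodes would need to be removed or fixed before re-running the import.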

marciano · October 7, 2024, 9:10pm

Gary,

Brilliant! This solved it, and I learned a few things.

Most appreciative and thank you for your time.

-Richard
