Struggling with csv parent child relationship

Greetings. I'm new to neo4j, cypher and databases, so thanks in advance for considering my remedial question. I have a csv data set, with headers, that reflect a list of parts, part names and parent of each part. Each part has a relationship to an assembly where it exists in a system. The CSV data format is as follows
id , name, parent
1111, foo, 1000
1112, bar, 1000
1000, xxx, 10000

I'm able to load the data from csv, but I'm struggling with the syntax to create the relationships (id<->parent). My end goal is to have a visual representation of the graph in which I can navigate/traverse relationships from any point. I've read the cypher docs / examples where MERGE is used, but I can't seem to get my head around "what gets merged", considering I am importing a table where each row already has an id and parent defined in the row.

Thank you for your help

This is a common question; it's part of learning how to "think in graphs". Here's a quick solution first, then I'll try to explain how it works.

LOAD CSV WITH HEADERS FROM 'file:///whatever.csv' AS line

MERGE (thisThingHere:Element { id: line.id })
   ON MATCH SET thisThingHere.name = line.name
   ON CREATE SET thisThingHere.name = line.name
MERGE (parent:Element { id: line.parent })
MERGE (thisThingHere)-[:PARENT]->(parent)

What MERGE does is to create something only if it doesn't already exist. For every line in the CSV, we need to guarantee that the thing itself exists, but also that the parent exists, so we can link them together. That's why we have the first 2 merges. Basically it guarantees that you'll have a thing, and its parent. The third merge is then what connects them.
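To make "create only if it doesn't already exist" concrete, here is a minimal sketch, assuming an empty database and using the sample id 1000 as a string (the way LOAD CSV would deliver it). The variable names a and b are only for illustration.

```cypher
// First MERGE: no Element with id '1000' exists yet, so one is created.
MERGE (a:Element {id: '1000'})
// Second MERGE: the node exists now, so it is matched instead of duplicated.
MERGE (b:Element {id: '1000'})
// Both variables point at the same node.
RETURN a = b AS sameNode
```

This is the behavior the import query relies on: no matter how many rows mention id 1000, only one node for it ends up in the graph.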

But in CSV, you don't have a guaranteed ordering of rows. So what if you encounter a mention of a parent (in the parent field) before you know its name, because the name comes later in the file? That's what the ON MATCH/ON CREATE is for. On line 2 of your input (counting the header as line 1), you see a mention of parent 1000, but it's not until line 4 that you get the parent's name. To handle this, your Cypher has to merge the parent by its ID (which is all the information you have on line 2), and then re-match that same parent on line 4 and set its name once you find it in the file.

In the example query provided, parent 1000 is bound to the "parent" variable on lines 2 and 3, and it becomes "thisThingHere" on line 4.
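The walk-through above can be tried without a file. This is a sketch under two assumptions: the database starts empty, and the three sample rows are supplied inline through UNWIND, in file order and as strings, instead of through LOAD CSV.

```cypher
// The three sample rows, in the same order as the file.
UNWIND [
  {id: '1111', name: 'foo', parent: '1000'},
  {id: '1112', name: 'bar', parent: '1000'},
  {id: '1000', name: 'xxx', parent: '10000'}
] AS line
// Row 3 matches the node that rows 1 and 2 already created as a parent,
// and ON MATCH fills in its name.
MERGE (thisThingHere:Element {id: line.id})
  ON MATCH SET thisThingHere.name = line.name
  ON CREATE SET thisThingHere.name = line.name
MERGE (parent:Element {id: line.parent})
MERGE (thisThingHere)-[:PARENT]->(parent);

// Inspect the result: one row per child/parent link.
MATCH (child:Element)-[:PARENT]->(parent:Element)
RETURN child.id AS child, child.name AS childName,
       parent.id AS parent, parent.name AS parentName
ORDER BY child;
```

Run the two statements one after the other. Node 1000 ends up named xxx even though it was first created without a name, while node 10000 stays nameless because no row in the sample defines it.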

Thank you very much David. Your explanation is super helpful and makes sense to me. I will need to go back to study Cypher to figure out a syntax error "thiThingHere" not defined. I really like the platform, but I've got a lot to learn. Again, thank you for your help.

Glenn,

"thisThingHere" is simply a variable name that you choose arbitrarily so you can refer to a node later within the same query.

You likely got that error if you had a typo and the variable name didn't match between your MERGE and your SET statements. If you try to use a variable in a SET command that was never created in a MATCH/MERGE (or a WITH statement) you would see that error.
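As a small illustration of that point, using sample values from the first post:

```cypher
// thisThingHere is bound by MERGE, so SET can refer to it.
MERGE (thisThingHere:Element {id: '1111'})
SET thisThingHere.name = 'foo'
RETURN thisThingHere

// By contrast, a misspelled name in SET refers to a variable that was never
// bound, and Neo4j rejects the query before running it:
//   MERGE (thisThingHere:Element {id: '1111'})
//   SET thiThingHere.name = 'foo'   // Variable `thiThingHere` not defined
```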

If you are still stuck, paste the cypher code you are trying to run, and we can help troubleshoot with you.

I too am new to Neo4j. I have a use case where I need to load very similar data with parent child relationships from a .csv file, so I was delighted to find this example.

However, I cannot make the solution given here work with the 3 record data set given in the first post without getting a "Cannot merge node using null property value for id" error. I wonder if it is related to the parent 10000 not having an id, but I'm not sure, since the solution provided included a comment about unordered data.

Can others load the sample data given above (3 records) with the solution provided without error?

Thank you!

Hello Tom, and welcome to Neo4j,

There are several ways to solve that problem, and the "solution" above was intended to highlight that a single command may not be sufficient. Open a new topic, include a small sample of your csv and the Cypher you've tried, and we'll be happy to help.
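One check worth making first: that error is raised when the value handed to MERGE is null. A possible cause, assuming the header row of the file really contains the spaces shown in the first post (id , name, parent), is that the column is stored under the key "id " with a trailing space, so line.id finds nothing. The sketch below imitates one such row with an inline map instead of a file. The map is hypothetical and only stands in for what LOAD CSV might return.

```cypher
// Hypothetical row as LOAD CSV might return it if header whitespace is kept:
// the keys are 'id ', ' name' and ' parent', not 'id', 'name' and 'parent'.
WITH {`id `: '1111', ` name`: ' foo', ` parent`: ' 1000'} AS line
RETURN keys(line) AS columns,     // the exact column names
       line.id    AS plainLookup, // null: there is no key named 'id'
       line.`id ` AS exactLookup  // the value stored under 'id '
```

The same keys(line) check can be run against the real file by placing the RETURN after the LOAD CSV line. If the column names do carry spaces, removing them from the header row is the simplest fix, and trim() can clean up the values themselves.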

Try this:

MERGE (p:Parent {id: 1000})
MERGE (gp:Parent {id: 10000})
MERGE (ch1:Child {id: 1111})
MERGE (ch2:Child {id: 1112})

MERGE (gp)-[:CHILD]->(p)
MERGE (p)-[:CHILD]->(ch1)
MERGE (p)-[:CHILD]->(ch2)
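
Once those statements have run, the hierarchy can be walked with a variable-length pattern. A minimal sketch, assuming the labels, ids, and CHILD relationship type created above (the names top and descendant are only illustrative):

```cypher
// Follow one or more CHILD relationships downward from the top-level node.
MATCH path = (top:Parent {id: 10000})-[:CHILD*]->(descendant)
RETURN path
```

Note that the ids here are numbers, whereas LOAD CSV reads every field as a string. Nodes created this way will not match nodes merged from the file unless the values are converted, for example with toInteger().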
