Build a graph from a CSV

Hi all,

This is my sample data:

Name,L1,L2,L3
A,x,,
B,,x,
C,,x,
D,,,x
E,,,x
F,,x,

The graph should look like:

A->B, A->C, A->F

C->D, C->E

I tried to load the data with:

LOAD CSV WITH HEADERS
FROM 'file:///x.csv' AS row
with row where row.`L1` <> ''
merge (m:Product:Level1{name: row.Name, linenumber : linenumber()});

LOAD CSV WITH HEADERS
FROM 'file:///x.csv' AS row
with row where row.`L2` <> ''
merge (m:Product:Level2{name: row.Name, linenumber : linenumber()});

LOAD CSV WITH HEADERS
FROM 'file:///x.csv' AS row
with row where row.`L3` <> ''
merge (m:Product:Level3{name: row.Name, linenumber : linenumber()});

This works, but I am sure there are easier solutions.

What I don't know is how to build the references (the relationships between the levels).

Any help would be highly appreciated!

Thanks

This seems to work. I didn't try to make it as compact as possible; it is just one approach. Are you going to have a large file with many more nodes and levels? If not, hard-coding a set of merge statements to build this specific structure might be easier. I left the line numbers out to reduce complexity.

LOAD CSV WITH HEADERS FROM 'file:///x.csv' AS row
with collect(row) as rows
with [x in rows where x.L1 is not null] as level1Data,
 [x in rows where x.L2 is not null] as level2Data,
 [x in rows where x.L3 is not null] as level3Data
foreach(x in level3Data |
    merge (m:Product:Level3{name: x.Name})
)
with level1Data, level2Data, level3Data
call {
    with level2Data, level3Data
    unwind level2Data as lvl2
    merge (m:Product:Level2{name: lvl2.Name})
    with m, level3Data
    unwind level3Data as lvl3
    match (n:Product:Level3{name: lvl3.Name})
    merge(m)-[:PARENT_OF]->(n)
}
call {
    with level1Data, level2Data
    unwind level1Data as lvl1
    merge (m:Product:Level1{name: lvl1.Name})
    with m, level2Data
    unwind level2Data as lvl2
    match (n:Product:Level2{name: lvl2.Name})
    merge(m)-[:PARENT_OF]->(n)
}
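For comparison, the hard-coded alternative mentioned above could look roughly like this for the six sample rows. This is only a sketch: the node names and the A->B, A->C, A->F, C->D, C->E edges are taken from the question, and the PARENT_OF relationship type is the one used in the query above.

```cypher
// Nodes, one MERGE per sample row.
MERGE (a:Product:Level1 {name: 'A'})
MERGE (b:Product:Level2 {name: 'B'})
MERGE (c:Product:Level2 {name: 'C'})
MERGE (f:Product:Level2 {name: 'F'})
MERGE (d:Product:Level3 {name: 'D'})
MERGE (e:Product:Level3 {name: 'E'})
// Relationships exactly as listed in the question.
MERGE (a)-[:PARENT_OF]->(b)
MERGE (a)-[:PARENT_OF]->(c)
MERGE (a)-[:PARENT_OF]->(f)
MERGE (c)-[:PARENT_OF]->(d)
MERGE (c)-[:PARENT_OF]->(e)
```

This only makes sense while the structure stays this small; for a larger file the relationships have to be derived from the data.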

Hello glilienfield, thanks for the suggestion. I will try it, and thanks for the links.

About your question, let me explain it.

The graph should go from L1 to all L2 and from L2 to all L3.

So Name A is L1, and it should reference B, C and F, because they are all L2.

A->B, A->C, A->F

D and E are L3 and sit directly under C, which is L2.

C->D, C->E

That's what I meant by the references.

You can implement the same thing in a single pass using call subqueries, as follows:

LOAD CSV WITH HEADERS 
FROM 'file:///x.csv' AS row
call {
    with row
    with row where row.`L1` <> ''
    merge (m:Product:Level1{name: row.Name, linenumber : linenumber()})
}
call {
    with row
    with row where row.`L2` <> ''
    merge (m:Product:Level2{name: row.Name, linenumber : linenumber()})
}
call {
    with row
    with row where row.`L3` <> ''
    merge (m:Product:Level3{name: row.Name, linenumber : linenumber()})
}

You can also use some APOC procedures to set the label dynamically, so the merge for all labels can be done in one line.

Look at these for conditional logic:

https://neo4j.com/labs/apoc/4.1/overview/apoc.do/

creating nodes with dynamic labels:

https://neo4j.com/labs/apoc/4.1/overview/apoc.create/

Executing a cypher statement as a string:

https://neo4j.com/labs/apoc/4.1/overview/apoc.cypher/apoc.cypher.doIt/
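As a rough illustration of the dynamic-label idea, here is a minimal sketch using apoc.merge.node. It assumes the APOC plugin is installed and that each row has exactly one of L1, L2, L3 filled; the level and merged names are only for illustration. The coalesce calls are there so the check works whether empty fields arrive as null or as empty strings.

```cypher
LOAD CSV WITH HEADERS FROM 'file:///x.csv' AS row
// Work out the level from whichever L column is filled.
WITH row, linenumber() AS ln,
     CASE
       WHEN coalesce(row.L1, '') <> '' THEN 1
       WHEN coalesce(row.L2, '') <> '' THEN 2
       WHEN coalesce(row.L3, '') <> '' THEN 3
     END AS level
WHERE level IS NOT NULL
// Merge on a label list built at runtime, e.g. ['Product', 'Level2'].
CALL apoc.merge.node(
  ['Product', 'Level' + toString(level)],
  {name: row.Name, linenumber: ln}
) YIELD node
RETURN count(node) AS merged
```

This replaces the three per-level merge statements with one, at the cost of depending on APOC.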

I will try to help with adding the relationships, but I don't understand the encoding logic in your CSV file.

Hi glilienfield,

thanks for the suggestion, but this is not exactly what I want. It creates relationships from B, C and F to D and E, because the statement only looks at the level. It should only contain relationships from C to D and E.

A->B, A->C, A->F: okay

C->D, C->E: okay

F->D, F->E, B->D, B->E: that's wrong, because only the level3 objects directly under a level2 object should have a reference.

That means the row position is also important, not only the level.

Thanks for your help so far!

Hi glilienfield,

I found a solution; maybe it helps others as well.

// Start from an empty database.
match(n) detach delete n;

// Create one helper Line node per CSV row, plus a Product node tagged with its level.
LOAD CSV WITH HEADERS
FROM 'file:///home/walden/Dokumente/circularTree/simple_neo4Community.csv' AS row
call {
    with row
    merge (m:Line{linenumber : linenumber()})
}
call {
    with row
    with row where row.`L1` <> ''
    merge (m:Product:Level1{name: row.Name, linenumber : linenumber(), level: 1})
}
call {
    with row
    with row where row.`L2` <> ''
    merge (m:Product:Level2{name: row.Name, linenumber : linenumber(), level: 2})
}
call {
    with row
    with row where row.`L3` <> ''
    merge (m:Product:Level3{name: row.Name, linenumber : linenumber(), level: 3})
};

// For each product, link the closest preceding product one level up as its parent.
match(ln:Line) with ln
MATCH (n:Product {linenumber : ln.linenumber})
call {
    with n, ln
    MATCH(x:Product) WHERE n.linenumber > x.linenumber and x.level = n.level-1
    with n, x, ln order by x.linenumber desc limit 1
    merge (x)-[:PARENT_OF]->(n)
};

// Remove the helper Line nodes.
match(ln:Line) detach delete ln;
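To check the result, a query along these lines should do. It is only a sketch that assumes the Product label and PARENT_OF relationship type used above; the parent and children column names are just for illustration.

```cypher
// List every parent together with the names of its direct children.
MATCH (parent:Product)-[:PARENT_OF]->(child:Product)
RETURN parent.name AS parent, collect(child.name) AS children
ORDER BY parent
```

For the sample data from the first post, A should come back with B, C and F, and C with D and E. B and F should not appear as parents at all.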

Anyway, thanks for your support; the "call" subquery and the "limit" were important for building the references.

Regards,

Andreas