Creating the simplest possible graph

I'm scoping out Neo4j on the possibility that we might eventually deploy graph databases at work. Though I've not had much trouble following the tutorials, somehow I can't seem to make a simple graph with three barebones csv files.

Here's my region data.

[Image: reg]

My division data looks almost exactly the same.

Here's my relationship data.

The first two .csv files I loaded with:

LOAD CSV WITH HEADERS FROM 'file:///location_dim.csv' AS row
WITH row.REGION_NM AS REGION_NM
CREATE (n {reg: REGION_NM})
return n

and:

LOAD CSV WITH HEADERS FROM 'file:///divs.csv' AS node
WITH node.DIVISION_NM AS DIVISION_NM
CREATE (n {div: DIVISION_NM})
return n

The relationship I loaded in with:

LOAD CSV WITH HEADERS FROM 'file:///reg_div_rel.csv' AS rels
match (from {id: rels.REG}), (to {id: rels.DIV})
create (from)-[:REL {type: rels.`RELATIONSHIP`}]->(to)
return from, to

For the most part this is all taken from tutorials, but isn't working for me. There are two problems.

First, the data loads just fine but the nodes aren't labeled.

And second, when I load in the relationships nothing actually happens.

What I want is a graph with region nodes, and region nodes with division nodes hanging off of them where indicated.

Unfortunately I haven't kept a record of every experiment I've tried, but I've altered the syntax in various small ways (not returning anything, using create (from)-[:hasDiv]->(to) instead of create (from)-[:REL {type: rels.RELATIONSHIP}]->(to), etc.).

I've also read a non-trivial amount of the documentation and searched the forums for threads.

Any advice?

First, the data loads just fine but the nodes aren't labeled.

Typically labels are applied via a

create (n:<label> {<properties>})

and thus

LOAD CSV WITH HEADERS FROM 'file:///divs.csv' AS node
WITH node.DIVISION_NM AS DIVISION_NM
CREATE (n:Division {div: DIVISION_NM})
return n

to load nodes with a label named Division.

And second, when I load in the relationships nothing actually happens.

you might change

LOAD CSV WITH HEADERS FROM 'file:///reg_div_rel.csv' AS rels
match (from {id: rels.REG}), (to {id: rels.DIV})
create (from)-[:REL {type: rels.`RELATIONSHIP`}]->(to)
return from, to

to

LOAD CSV WITH HEADERS FROM 'file:///reg_div_rel.csv' AS rels
match (from:Region {id: rels.REG}), (to:Division {id: rels.DIV})
create (from)-[:hasDiv]->(to)
return from, to

This presumes that when you run LOAD CSV against location_dim.csv, the

CREATE (n {reg: REGION_NM})

is changed to

CREATE (n:Region {reg: REGION_NM})

Dana,

Thanks so much for taking the time to answer. Your amended code did create labeled nodes, but unfortunately the relationship script didn't accomplish anything. I got a (no changes, no records) message.

I didn't forget to use CREATE (n:Region {reg: REGION_NM}) for the region script.

ok.. I think I see the problem.
To create the relationships your code is

LOAD CSV WITH HEADERS FROM 'file:///reg_div_rel.csv' AS rels
match (from:Region {id: rels.REG}), (to:Division {id: rels.DIV})
create (from)-[:hasDiv]->(to)
return from, to

but :Region nodes do not have an id property, and :Division nodes do not have one either. When you created the nodes, you gave the :Region nodes a property named reg and the :Division nodes a property named div. Since no node has an id property, the match finds nothing and no relationships are created.

This can be confirmed for example by running

match (n:Region) return n limit 3;
match (n:Division) return n limit 3;

which will return up to 3 :Region nodes and up to 3 :Division nodes, so you can see which properties they actually carry.
And if I am correct then

LOAD CSV WITH HEADERS FROM 'file:///reg_div_rel.csv' AS rels
match (from:Region {id: rels.REG}), (to:Division {id: rels.DIV})
create (from)-[:hasDiv]->(to)
return from, to

should be rewritten as

LOAD CSV WITH HEADERS FROM 'file:///reg_div_rel.csv' AS rels
match (from:Region {reg: rels.REG}), (to:Division {div: rels.DIV})
create (from)-[:hasDiv]->(to)
return from, to
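
If the relationship load still reports no changes, a read-only diagnostic along these lines may help. It is only a sketch: it assumes the labels and property names used above (reg on :Region, div on :Division) and the REG and DIV columns of reg_div_rel.csv. The aliases and the variable names r and d are made up for illustration. Run each statement separately.

```cypher
// List the property keys actually stored on a few nodes of each label.
MATCH (n:Region) RETURN labels(n) AS labels, keys(n) AS props LIMIT 3;
MATCH (n:Division) RETURN labels(n) AS labels, keys(n) AS props LIMIT 3;

// For the first few CSV rows, report whether each end of the relationship is found.
// Nothing is created or changed here.
LOAD CSV WITH HEADERS FROM 'file:///reg_div_rel.csv' AS rels
OPTIONAL MATCH (r:Region {reg: rels.REG})
OPTIONAL MATCH (d:Division {div: rels.DIV})
RETURN rels.REG AS reg, r IS NOT NULL AS regionFound,
       rels.DIV AS div, d IS NOT NULL AS divisionFound
LIMIT 10;
```

A row with regionFound or divisionFound set to false points at a CSV value that does not match any node property, for example because of a different property name, stray whitespace, or different capitalization.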

Hey, that worked!

Now I just have to figure out what to do with all these duplicate divisions.

But it's progress :slight_smile:

[Image: div_dupes]

One last thing: if you continue with this, and your datasets get larger, then to speed up the

match (from:Region {reg: rels.REG}), (to:Division {div: rels.DIV})

you should ideally create an index on the reg property of :Region and on the div property of :Division. See https://neo4j.com/docs/cypher-manual/current/administration/indexes-for-search-performance/#administration-indexes-create-a-single-property-index for more details/syntax.
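
As a rough sketch of what those two indexes could look like, assuming Neo4j 4.x or later and the labels and properties used in this thread (the index names region_reg and division_div are arbitrary and only for illustration):

```cypher
// Single-property index for lookups of :Region nodes by reg.
CREATE INDEX region_reg FOR (r:Region) ON (r.reg);

// Single-property index for lookups of :Division nodes by div.
CREATE INDEX division_div FOR (d:Division) ON (d.div);
```

On Neo4j 3.x the older form CREATE INDEX ON :Region(reg) would be used instead; check the linked manual page for the syntax that matches your version.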

It should be noted that the relationship creation is in fact a creation, so if you run the Cypher to create the relationships 5 times, you will end up with 5 identical :hasDiv relationships between the same 2 nodes. However if you change

create (from)-[:hasDiv]->(to)

to

merge (from)-[:hasDiv]->(to)

then merge acts as a match or create: it only creates the relationship if a matching one does not already exist. So if you run the script to create the relationships 5 times, you would expect to see no more than 1 :hasDiv relationship between any 2 nodes.
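
To see whether earlier create runs already left repeated relationships behind, a read-only check like the following might help. It is a sketch that assumes the :Region, :Division and :hasDiv names from this thread; the aliases are made up for illustration.

```cypher
// Count :hasDiv relationships per (region, division) pair and keep
// only the pairs that are connected more than once.
MATCH (r:Region)-[rel:hasDiv]->(d:Division)
WITH r, d, count(rel) AS copies
WHERE copies > 1
RETURN r.reg AS region, d.div AS division, copies
ORDER BY copies DESC;
```

If this returns no rows, no pair of nodes is connected by more than one :hasDiv relationship.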

Good point! Replacing the CREATEs with MERGEs cleared that right up.

Here's everything, producing a simple, labeled graph with no ridiculous duplicates:

LOAD CSV WITH HEADERS FROM 'file:///location_dim.csv' AS row
WITH row.REGION_NM AS REGION_NM
MERGE (n:Region {reg: REGION_NM})
RETURN n

LOAD CSV WITH HEADERS FROM 'file:///divs.csv' AS node
WITH node.DIVISION_NM AS DIVISION_NM
MERGE (n:Division {div: DIVISION_NM})
RETURN n

LOAD CSV WITH HEADERS FROM 'file:///reg_div_rel.csv' AS rels
MATCH (from:Region {reg: rels.REG}), (to:Division {div: rels.DIV})
MERGE (from)-[:hasDiv]->(to)
RETURN from, to
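
As a quick sanity check after the three loads, a query along these lines lists each region with the divisions hanging off it. This is only a sketch under the same assumptions (labels Region and Division, properties reg and div); the aliases are made up for illustration.

```cypher
// One row per :Region node, with the names of its divisions collected into a list.
// Regions without any division still appear, with an empty list.
MATCH (r:Region)
OPTIONAL MATCH (r)-[:hasDiv]->(d:Division)
RETURN r.reg AS region, collect(d.div) AS divisions
ORDER BY region;
```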

Regarding your image/screenshot: maybe start all over by running

match (n:Division) detach delete n;
match (n:Region) detach delete n;

The first statement will find all :Division nodes, remove any relationships attached to them, and then delete the nodes themselves. The 2nd statement will work on the :Region nodes in the same manner.

Upon running the 2 lines above all :Region and :Division nodes should be removed.

Then rerun the LOAD CSV statements and, when creating the :Division and :Region nodes, change the create to a merge.

Sorry, our posts crossed.. but yes @trent_fowler, your last post with all the merges is what should be used :+1:

You're awesome, thank you for the help!