Creating the simplest possible graph

trent_fowler · January 27, 2021, 9:12pm

I'm scoping out Neo4j on the possibility that we might eventually deploy graph databases at work. Though I've not had much trouble following the tutorials, somehow I can't seem to make a simple graph with three barebones csv files.

Here's my region data.

[image: reg]

My division data looks almost exactly the same.

Here's my relationship data.

The first two .csv files I loaded with:

LOAD CSV WITH HEADERS FROM 'file:///location_dim.csv' AS row
WITH row.REGION_NM AS REGION_NM
CREATE (n {reg: REGION_NM})
return n

and:

LOAD CSV WITH HEADERS FROM 'file:///divs.csv' AS node
WITH node.DIVISION_NM AS DIVISION_NM
CREATE (n {div: DIVISION_NM})
return n

The relationship I loaded in with:

LOAD CSV WITH HEADERS FROM 'file:///reg_div_rel.csv' AS rels
match (from {id: rels.REG}), (to {id: rels.DIV})
create (from)-[:REL {type: rels.`RELATIONSHIP`}]->(to)
return from, to

For the most part this is all taken from tutorials, but it isn't working for me. There are two problems.

First, the data loads just fine but the nodes aren't labeled.

And second, when I load in the relationships nothing actually happens.

What I want is a graph with region nodes, and region nodes with division nodes hanging off of them where indicated.

Unfortunately I haven't kept a record of every experiment I've tried, but I've altered the syntax in various small ways (not returning anything, using create (from)-[:hasDiv]->(to) instead of create (from)-[:REL {type: rels.RELATIONSHIP}]->(to), etc.)

I've also read a non-trivial amount of the documentation and searched the forums for threads.

Any advice?

dana_canzano · January 27, 2021, 10:20pm

First, the data loads just fine but the nodes aren't labeled.

Typically labels are applied via a

create (n:<label> {<properties>})

and thus

LOAD CSV WITH HEADERS FROM 'file:///divs.csv' AS node
WITH node.DIVISION_NM AS DIVISION_NM
CREATE (n:Division {div: DIVISION_NM})
return n

to load nodes with a label named Division.

And second, when I load in the relationships nothing actually happens.

you might change

LOAD CSV WITH HEADERS FROM 'file:///reg_div_rel.csv' AS rels
match (from {id: rels.REG}), (to {id: rels.DIV})
create (from)-[:REL {type: rels.`RELATIONSHIP`}]->(to)
return from, to

to

LOAD CSV WITH HEADERS FROM 'file:///reg_div_rel.csv' AS rels
match (from:Region {id: rels.REG}), (to:Division {id: rels.DIV})
create (from)-[:hasDiv]->(to)
return from, to

This presumes that, when you run LOAD CSV against location_dim.csv, the

CREATE (n {reg: REGION_NM})

is changed to

CREATE (n:Region {reg: REGION_NM})

trent_fowler · January 27, 2021, 10:47pm

Dana,

Thanks so much for taking the time to answer. Your amended code did create labeled nodes, but unfortunately the relationship script didn't accomplish anything. I got a (no changes, no records) message.

I didn't forget to use CREATE (n:Region {reg: REGION_NM}) for the region script.

dana_canzano · January 27, 2021, 10:53pm

ok.. I think I see the problem.
To create the relationships your code is

LOAD CSV WITH HEADERS FROM 'file:///reg_div_rel.csv' AS rels
match (from:Region {id: rels.REG}), (to:Division {id: rels.DIV})
create (from)-[:hasDiv]->(to)
return from, to

but :Region nodes do not have an id property, and neither do :Division nodes. When you created the nodes, you gave the :Region nodes a property named reg and the :Division nodes a property named div, so a match on id finds nothing.

This can be confirmed for example by running

match (n:Region) return n limit 3;
match (n:Division) return n limit 3;

which will return 3 :Region nodes and 3 :Division nodes.
And if I am correct then

LOAD CSV WITH HEADERS FROM 'file:///reg_div_rel.csv' AS rels
match (from:Region {id: rels.REG}), (to:Division {id: rels.DIV})
create (from)-[:hasDiv]->(to)
return from, to

should be rewritten as

LOAD CSV WITH HEADERS FROM 'file:///reg_div_rel.csv' AS rels
match (from:Region {reg: rels.REG}), (to:Division {div: rels.DIV})
create (from)-[:hasDiv]->(to)
return from, to
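
If the relationship load still reports no changes, one way to narrow it down is to look at which property keys the nodes actually carry and which CSV rows fail to find a node. This is only a sketch: it assumes the labels, properties, and file name used in this thread, and the aliases (props, region_found, division_found) are made up for illustration.

```cypher
// List the property keys actually stored on a few nodes of each label.
MATCH (n:Region) RETURN keys(n) AS props LIMIT 3;
MATCH (n:Division) RETURN keys(n) AS props LIMIT 3;

// For each CSV row, report whether a matching node exists on either side.
LOAD CSV WITH HEADERS FROM 'file:///reg_div_rel.csv' AS rels
OPTIONAL MATCH (from:Region {reg: rels.REG})
OPTIONAL MATCH (to:Division {div: rels.DIV})
RETURN rels.REG AS reg, from IS NOT NULL AS region_found,
       rels.DIV AS div, to IS NOT NULL AS division_found
LIMIT 10;
```

A row where either flag is false points at a property name or value that does not line up between the CSV and the stored nodes.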

trent_fowler · January 27, 2021, 10:57pm

Hey, that worked!

Now I just have to figure out what to do with all these duplicate divisions.

But it's progress

[image: div_dupes]
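
For anyone hitting the same thing, duplicate nodes like these can be listed with a grouping query. This is a sketch that assumes the :Division label and div property from this thread; the aliases div and copies are hypothetical.

```cypher
// Group Division nodes by their div value and keep values stored more than once.
MATCH (d:Division)
WITH d.div AS div, count(*) AS copies
WHERE copies > 1
RETURN div, copies
ORDER BY copies DESC;
```

Each returned row is a division name that exists as more than one node, which is what a CREATE per CSV row produces when the file repeats a value.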

dana_canzano · January 27, 2021, 10:58pm

One last thing: if you continue, and if you have larger datasets, then to speed up the

match (from:Region {reg: rels.REG}), (to:Division {div: rels.DIV})

ideally you should create an index on :Region and :Division and on the properties reg and div respectively. See https://neo4j.com/docs/cypher-manual/current/administration/indexes-for-search-performance/#administration-indexes-create-a-single-property-index for more details/syntax.
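
As a rough sketch of what that could look like, assuming a Neo4j 4.x server and the labels and properties from this thread (the index names region_reg and division_div are hypothetical):

```cypher
// One single-property index per label, covering the lookups in the MATCH above.
CREATE INDEX region_reg FOR (r:Region) ON (r.reg);
CREATE INDEX division_div FOR (d:Division) ON (d.div);
```

Older 3.x servers use the form CREATE INDEX ON :Region(reg) instead, so check the manual page linked above for the syntax your version accepts.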

Note that CREATE always creates a new relationship, so if you run the Cypher that creates the relationships 5 times, you end up with 5 identical relationships between the same 2 nodes. However, if you change

create (from)-[:hasDiv]->(to)

to

merge (from)-[:hasDiv]->(to)

then merge acts as a match or create: it reuses the relationship if it already exists and creates it only if it does not. So if you run the script to create the relationships 5 times, you would expect to see no more than 1 :hasDiv relationship between any 2 nodes.
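
To see whether earlier CREATE runs already left repeated relationships behind, something along these lines may help. It is a sketch that assumes the :hasDiv relationship type and the reg and div properties from this thread; the alias n is hypothetical.

```cypher
// Count hasDiv relationships per (Region, Division) pair and keep pairs with more than one.
MATCH (r:Region)-[rel:hasDiv]->(d:Division)
WITH r, d, count(rel) AS n
WHERE n > 1
RETURN r.reg AS region, d.div AS division, n;
```

An empty result means every connected pair has exactly one :hasDiv relationship.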

trent_fowler · January 27, 2021, 11:10pm

Good point! Replacing the CREATEs with MERGEs cleared that right up.

Here's everything, producing a simple, labeled graph with no ridiculous duplicates:

LOAD CSV WITH HEADERS FROM 'file:///location_dim.csv' AS row
WITH row.REGION_NM AS REGION_NM
MERGE (n:Region {reg: REGION_NM})
RETURN n

LOAD CSV WITH HEADERS FROM 'file:///divs.csv' AS node
WITH node.DIVISION_NM AS DIVISION_NM
MERGE (n:Division {div: DIVISION_NM})
RETURN n

LOAD CSV WITH HEADERS FROM 'file:///reg_div_rel.csv' AS rels
MATCH (from:Region {reg: rels.REG}), (to:Division {div: rels.DIV})
MERGE (from)-[:hasDiv]->(to)
RETURN from, to
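
A quick way to check the result of the three statements above is to list each region with the divisions hanging off it. This is a sketch that assumes the same labels, properties, and relationship type; the aliases region and divisions are hypothetical.

```cypher
// One row per region, with the names of its attached divisions collected into a list.
MATCH (r:Region)-[:hasDiv]->(d:Division)
RETURN r.reg AS region, collect(d.div) AS divisions
ORDER BY region;
```

Regions with no divisions do not appear in this output, since the pattern requires at least one :hasDiv relationship.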

dana_canzano · January 27, 2021, 11:12pm

Regarding your image/screenshot: maybe start all over by running

match (n:Division) detach delete n;
match (n:Region) detach delete n;

The first statement finds every :Division node, removes any relationships attached to it, and then deletes the node itself. The 2nd statement does the same for the :Region nodes.

Upon running the 2 lines above all :Region and :Division nodes should be removed.

Then rerun the LOAD CSV statements, and when creating the :Division and :Region nodes change the create to a merge.

dana_canzano · January 27, 2021, 11:13pm

Sorry, our posts crossed, but yes @trent_fowler, your last post with all the merges is what should be used.

trent_fowler · January 27, 2021, 11:16pm

You're awesome, thank you for the help!
