Relating nodes in csv with column names in Neo4J

aditya_digala · January 2, 2022, 5:29am

I have a csv file in below format,

There are n location columns of various locations.
There's Token column at the end where each location row is associated with a different token.

TASK

Associate each token to each location column such that the relationship is named after the column name.

Example

My cypher code

load csv with headers from "file:///locations.csv" as row with row where row is not null
merge (l1:locations {name:row.LOC_1})
merge (t:tokens {name:row.Token})
merge (l1) -[:LOC_1]->(t)

As you can see, This code loads each column and relates them to tokens manually which is a tedious task. In reality, there are 67 columns and doing this manually takes a lot of time.

filantrop · January 2, 2022, 10:22am

Hi,
This one works, but could somehow be optimized.

load csv with headers from "file:///locations.csv" as row with row
with row,keys(row) as columns
merge (token:tokens {name:row.Token})
with row,columns,token
unwind range(0,size(columns)) as columnIndex
with row,columns[columnIndex] as columnName,token
where not columnName='Token'
merge (l1:locations {name:row[columnName]})
with token,columnName,l1
call apoc.merge.relationship(token,columnName,null,null,l1,null) 
yield rel
return count(rel)

aditya_digala · January 2, 2022, 10:41am

Hey @filantrop,
Thanks for the code.
There are few things to consider,
1.There are null values in some columns. (Avoid them and load remaining values)
2. The dataset is too huge with 150K records and 67 columns. (How can we load faster)
(Also in original dataset the columns are named, c1,c2....cn)
Can you please modify your code keeping in mind the above points?

filantrop · January 2, 2022, 11:14am

Test this one. You also could change the batch size of the transactions by uncommenting the //OF 500 rows

otherwise the defaultsize is 1000

:auto load csv with headers from "file:///locations.csv" as row with row
with row
call{
    with row
    with row,keys(row) as columns

    merge (token:tokens {name:row.Token})
    with row,columns,token
    unwind range(0,size(columns)) as columnIndex
        with row,columns[columnIndex] as columnName,token
        where not columnName='Token' and not row[columnName] is null
        merge (l1:locations {name:row[columnName]})
        with token,columnName,l1
        call apoc.merge.relationship(token,columnName,null,null,l1,null)
        yield rel
        return rel
} IN TRANSACTIONS //OF 500 ROWS
return count(rel)

aditya_digala · January 2, 2022, 11:48am

5 minutes since I started running this code. It's still running, will keep you updated here about the progress.
Also, it ran out of memory when I ran the first code in Neo4J Aura. Now I'm running it locally, hope this works faster. Any other tweaks I should do if this also runs out of memory?

filantrop · January 2, 2022, 12:06pm

You could change the memory.

Have a look here:

aditya_digala · January 2, 2022, 12:41pm

After 45 minutes, it finally completed running and works well.
Also, I am a beginner and still learning about Neo4J. I appreciate it if you suggest any important concepts for me to learn this software quickly.
Thank you so much!

filantrop · January 2, 2022, 12:47pm

Great!
I think myself it was a good starting point to use the free online courses

Topic		Replies	Views
Relating nodes in csv with column names in Neo4J Neo4j Graph Platform apoc , neo4j-desktop , migrated , cypher-tagged , operations-tagged	3	263	November 14, 2022
Relate multiple columns to single column also naming relationship against associated column name Procedures & APOC	1	297	January 3, 2022
Load CSV - Loading relationships in one column, but same type of node Cypher	7	823	November 24, 2020
Load CSV into Neo4j and create nodes out of each row Cypher apoc , cypher , neo4j-desktop	2	998	January 1, 2022
Load matrix (node x node) into neo4j Import / Export import	2	317	December 24, 2021

August Summer Fun!

Relating nodes in csv with column names in Neo4J

TASK

Example

My cypher code

Related topics