Create a relationship based on the property value of the generated node

hilumen94
Node

Hello.
Currently, I have created about 1,000,000 nodes from the "citations file.csv" file, and each node has the properties {lensid, title, year, journal_name}. I created the nodes as follows.

LOAD CSV FROM 'file:///citations%20file.csv' AS ci
CREATE (jj:Journal{lensid:ci[0], title:ci[1], year:ci[2], journal_name:ci[3]})

Next, to connect the nodes, I am trying to create relationships using a unique identifier called lensid.
The matching identifiers are in a separate "reference file.csv" file, which has one (lensid, reference) pair per line. Each lensid has about 30 references on average, so the file contains approximately 30,000,000 pairs. Below is what I wrote to create the relationships. ref[0] is the lensid, and ref[1] is the identifier of a reference belonging to that lensid.

LOAD CSV FROM 'file:///reference%20file.csv' AS ref
WITH ref
MATCH (j:Journal{lensid:ref[1]})
WITH j, ref
MATCH(j2:Journal{lensid:ref[0]})
CREATE (j)-[:referenced]->(j2)

My question is this.
I ran the query above, but it has been running for 3 days and is still in progress.
It seems that the matching takes most of the time.
Could somebody give me some advice on how to make creating the relationships above simpler and faster?

I don't know if it's necessary, but the hardware information and RAM allocation are as follows.
CPU : intel i7-9750h
RAM : 32GB
dbms.memory.heap.initial_size=16G
dbms.memory.heap.max_size=16G
dbms.memory.pagecache.size=10G

Thank you for reading.

2 REPLIES

glilienfield
Ninja

Do you have an index on the 'lensid' property for the Journal label? If not, each MATCH will perform a full scan of the Journal nodes. You can prefix the query with EXPLAIN to see the query plan.
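
A minimal sketch of that check, assuming Neo4j 4.x or later. The index name journal_lensid and the sample value 'some-lensid' are made up for illustration. Values read by LOAD CSV are strings, so the lookup value is written as a string.

```cypher
// Create an index on Journal.lensid if one does not exist yet.
// The name journal_lensid is hypothetical; any unused name works.
CREATE INDEX journal_lensid IF NOT EXISTS
FOR (j:Journal) ON (j.lensid);

// Inspect the plan for a single lookup without running it.
// 'some-lensid' is a placeholder, not a value from the data.
EXPLAIN
MATCH (j:Journal {lensid: 'some-lensid'})
RETURN j;
```

If the plan shows NodeIndexSeek, the index is being used. NodeByLabelScan followed by a Filter means every Journal node is scanned for each lookup.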

Also, you shouldn't need the WITH clauses in your query. The following should work too.

LOAD CSV FROM 'file:///reference%20file.csv' AS ref
MATCH (j:Journal{lensid:ref[1]})
MATCH (j2:Journal{lensid:ref[0]})
CREATE (j)-[:referenced]->(j2)
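
Separately from the index, LOAD CSV without batching runs as a single transaction, so all 30,000,000 rows would be held in one transaction's state. A possible variant that commits in batches, assuming Neo4j 4.4 or later (the batch size of 10000 is an illustrative guess, not a tuned value):

```cypher
// Assumes Neo4j 4.4+. In Neo4j Browser, prefix the query with :auto
// so that it runs as an auto-commit transaction.
LOAD CSV FROM 'file:///reference%20file.csv' AS ref
CALL {
  WITH ref
  MATCH (j:Journal {lensid: ref[1]})
  MATCH (j2:Journal {lensid: ref[0]})
  CREATE (j)-[:referenced]->(j2)
} IN TRANSACTIONS OF 10000 ROWS
```

Each batch is committed before the next one starts, so an interrupted import keeps the relationships created so far.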

The 'lensid' property is set on all Journal nodes.
I confirmed that the query you suggested also runs normally.
I will check the query plan you mentioned.
Thank you for your reply.
