Relationship Load Failure

dariusaudryc1 · May 24, 2019, 8:58am

Hi,

I want to ask whether cypher may miss merging relationship during csv load.

So I want to load csv data into neo4j using the following query:

Using Periodic Commit 10000
    Load CSV with headers from 'file:///{filename}' as row
    FIELDTERMINATOR '|'

MATCH (c:user{user_id: row.user_id} )
MATCH (d:service_provider{sp_id: row.sp_id} )
MATCH (b:trx{order_id: row.order_id} )
MERGE (c)-[:creates]->(b)
MERGE (d)-[:completes]->(b)

//user device information
WITH d, c, row
OPTIONAL MATCH (c_deviceid:device_id {device_id: row.user_device_id})
FOREACH (i IN CASE WHEN c_deviceid IS NOT NULL THEN [c_deviceid] ELSE [] END |
	MERGE (c)-[:associates]-(c_deviceid)
)
WITH d, c, row
OPTIONAL MATCH (c_d1:d1{{d1: row.user_d1}})
FOREACH (i IN CASE WHEN c_d1 IS NOT NULL THEN [c_d1] ELSE [] END |
	MERGE (c)-[:associates]-(c_d1)
)
WITH d, c, row
OPTIONAL MATCH (c_devtoken:devtoken{{devtoken: row.user_devtoken}})
FOREACH (i IN CASE WHEN c_devtoken IS NOT NULL THEN [c_devtoken] ELSE [] END |
	MERGE (c)-[:associates]-(c_devtoken)
)

//sp device information
WITH d, row
OPTIONAL MATCH (d_deviceid:device_id {device_id: row.sp_device_id})
FOREACH (i IN CASE WHEN d_deviceid IS NOT NULL THEN [d_deviceid] ELSE [] END |
	MERGE (d)-[:associates]-(d_deviceid)
)
WITH d, row
OPTIONAL MATCH (d_d1:d1{{d1: row.sp_d1}})
FOREACH (i IN CASE WHEN d_d1 IS NOT NULL THEN [d_d1] ELSE [] END |
	MERGE (d)-[:associates]-(d_d1)
)
WITH d, row
OPTIONAL MATCH (d_devtoken:devtoken{{devtoken: row.sp_devtoken}})
FOREACH (i IN CASE WHEN d_devtoken IS NOT NULL THEN [d_devtoken] ELSE [] END |
	MERGE (d)-[:associates]-(d_devtoken)
)
;

This CQL only merges relationship since the node creation is done on separate CQL script. So the issue that I have is that after running this script, relationships with devtoken and d1 are not created while device_id is created. There is no error in the creation process.

Maybe what I want to understand more is how the load csv works. My current assumption is that the load csv will take the data per row and apply the graph data schema on the SQL. My guess is that the relationship is not created due to the optional match.

Can someone give some light on this problem?

jsmccrumb · May 24, 2019, 11:18am

Can you confirm that it is finding some of each of those nodes that are not being related?

LOAD CSV WITH HEADERS FROM 'file:///{filename}' as row
FIELD TERMINATOR '|'
MATCH (c:user{user_id: row.user_id} )
MATCH (d:service_provider{sp_id: row.sp_id} )
MATCH (b:trx{order_id: row.order_id} )
MATCH (c_devtoken:devtoken{{devtoken: row.user_devtoken}})
RETURN * LIMIT 3

This uses match to require the :devtoken node can be found, and returns first 3 that it can find from the file.
Then do the same thing replacing the c_devtoken line with
MATCH (d_d1:d1{{d1: row.sp_d1}})

and then
MATCH (d_devtoken:devtoken{{devtoken: row.sp_devtoken}})

My first thought is there is a disconnect between the CSV file and the properties on the node (extra space or type difference) so it cannot find the nodes you want to build the relationships to, whereas for :device_id it does find the nodes which is why that segment works.

dariusaudryc1 · May 25, 2019, 1:37am

Hey Jacob, thanks for the reply.

Yes I have confirmed it that it can be query. The csv file that I used to defined the node is also the same csv file to define the relationship. So I have this huge csv file for both.

What I did is that I just separate them. I tried casting all of them as a string. but the result remain the same.

One thing though, the devtoken and d1 have null value in it. I assume that the issue can arise if the load csv function in cypher reads the whole column as one rather than going row per row.

jsmccrumb · May 25, 2019, 1:53am

Could you post the query you used to create the nodes?

The null values are because it cannot match a node for that row, it could be a simple thing as a typo between the query to create the nodes and the query to create the relationships. Since the one relationship is being created but not the others, I would lean towards a minor typo between the two queries (this happens very often to me when I am writing queries...).

ameyasoft · May 25, 2019, 6:39am

Try adding WITH row before the first MATCH statement

Load CSV with headers from 'file:///{filename}' as row
FIELDTERMINATOR '|'

WITH row

MATCH (c:user{user_id: row.user_id} )
..........

Topic		Replies	Views
Import CSV relationship error Cypher	10	982	June 21, 2019
Problems importing relationships Cypher cypher , csv	9	3020	February 3, 2019
Non Matching Records from CSV / Neo: Using Load CSV Cypher cypher , load-csv , import	3	302	February 8, 2022
Load data from CSV Cypher cypher , load-csv , import , csv	2	592	March 4, 2020
I'm new to neo4j Introduce-Yourself	5	362	February 10, 2023

July Summer Fun!

Relationship Load Failure

Related topics