Self_directed relationship

Reuben · February 9, 2023, 3:43am

I am in a tricky situation and would like some guidance. I have a CSV file in which two columns share the same name because of the relationship between them. However, when I build the graph, I get relationships that loop back to the same node, which I think should not happen since I merged on the node label. What am I doing wrong? Why do some nodes get a relationship pointing back to themselves, as illustrated below:


def similar(tx,path):
    
    tx.run (
    
    "LOAD CSV WITH HEADERS FROM $path AS row "
    "MERGE (sgd:Solar_grade {name: row.Solar_grade}) "
    "MERGE (sgd1:Solar_grade {name : row.Solar_grade}) "
    "MERGE (sgd)-[:isSimilar_Same {relationship:row.Relationship}]->(sgd1) "
 
    , path=path
    )

Brief Data from Table

Solar Grade    Relationship    Solar Grade
AC03           isSimilarTo     BC01D
BC04           isSimilarTo     BC01D
DC05           isSimilarTo     BC01D
EC06           isSimilarTo     BC01D
F08            isSameAs        BC01D
G08F           isSameAs        BC01D
HR22           isSameAs        BC01D
I7PH           isSameAs        l17-7XC
SJ             isSameAs        l17-7XC
KSUS           isSameAs        l17-7XC
AISMI          isSimilarTo     l17-7XC
AISIN          isSimilarTo     l17-7XC
PAISI          isSimilarTo     l17-7XC
QI7PH          isSimilarTo     l17-7XC
R0OH           isSimilarTo     l17-7XC

#Cypher #CSV #python #Neo4j

Reuben · February 9, 2023, 5:45am

@glilienfield thanks for the education on the map keys. So I tried the approach you illustrated. However, instead of creating a relationship from A-[rel]->B, it rather creates nodes from the relationship column row[1] in the table and connects them to each other, while row[0] and row[2] are not connected.

glilienfield · February 9, 2023, 4:43am

That is expected behavior, as the sgd and sgd1 aliases reference the same node, so the relationship that gets created points back to that node. The cause of your confusion is that you think you are using both "Solar Grade" columns. What is actually happening is that your query only uses the value from the second "Solar Grade" column. This occurs because "load csv with headers" creates a map for each row, with the column names as the map keys. Since you have a duplicate column name, the first value gets replaced when the second value is added to the row's map.

If you don't want to change the column names, you can avoid this by not using 'with headers'. This will then create an array of values for each row instead of a map. You then access the columns using an index, such as row[0], row[1], and row[2]. You will also need to skip the first row in the file since your file has header data in the first row.

You can see the behavior with the example below:

[Three screenshots: Screen Shot 2023-02-08 at 11.40.19 PM.png, 11.38.40 PM.png, 11.39.00 PM.png]
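The same overwrite can be reproduced outside Neo4j. The sketch below uses Python's csv module as an analogy only (it is not what LOAD CSV runs internally): reading rows as header-keyed maps loses the first "Solar Grade" value, while reading them as lists keeps all three columns. The sample text is a two-row excerpt of the table in the question, and the helper names are made up for illustration.

```python
import csv
import io

# Two rows from the question's table, with the duplicated "Solar Grade" header.
CSV_TEXT = """Solar Grade,Relationship,Solar Grade
AC03,isSimilarTo,BC01D
F08,isSameAs,BC01D
"""


def rows_as_maps(text):
    """Read rows keyed by header name; a repeated header keeps only the last value."""
    return [dict(r) for r in csv.DictReader(io.StringIO(text))]


def rows_as_lists(text):
    """Read rows as plain lists, skipping the header line, so every column survives."""
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the header row
    return [row for row in reader]


maps = rows_as_maps(CSV_TEXT)
lists = rows_as_lists(CSV_TEXT)
print(maps[0])   # only one "Solar Grade" key is left, holding the second column
print(lists[0])  # all three values, addressable as [0], [1], [2]
```

With the map form, both MERGE clauses in the original query receive the same name, which is why the relationship loops back to one node.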

Try this instead:


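def similar(tx, path):
    # Without WITH HEADERS each row is a list, so both "Solar Grade"
    # columns stay addressable by index.
    tx.run(
        "LOAD CSV FROM $path AS row "
        "WITH row SKIP 1 "  # skip the header line of the file
        "MERGE (sgd:Solar_grade {name: row[0]}) "
        "MERGE (sgd1:Solar_grade {name: row[2]}) "
        "MERGE (sgd)-[:isSimilar_Same {relationship: row[1]}]->(sgd1) ",
        path=path,
    )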

glilienfield · February 9, 2023, 7:06pm

I ran a small test and it seems to be working. Can you share your test data?

Reuben · February 11, 2023, 6:13am

You are right! It worked as you suggested. Thank you @glilienfield

glilienfield · February 11, 2023, 9:37am

What changed from two days ago when it wasn’t working?

Reuben · February 11, 2023, 10:05am

The issue I had was with the indexing, which was an oversight on my end. So instead of row[0] in the solution you provided, I should have used row[1]-[: row[2]]-> row[3] in my case. Yep! so you're right.

Besides, I was trying to keep the relationship names as given in the .csv file, since they vary from row to row (i.e. avoid one common relationship name). Unfortunately, my queries gave me syntax errors. However, I was able to resolve it with this approach:

LOAD CSV WITH HEADERS FROM "file:///similar_grades.csv" AS row
MERGE (sgd:Solar_grade {name: row.Solar_grade})
MERGE (sgd1:Solar_grade {name: row.Solar_grade_1})
WITH sgd, sgd1, row
WHERE row.Relationship = "isSimilarTo"
CREATE (sgd)-[:SIMILAR_TO]->(sgd1)
RETURN sgd, sgd1

LOAD CSV WITH HEADERS FROM "file:///similar_grades.csv" AS row
MERGE (sgd:Solar_grade {name: row.Solar_grade})
MERGE (sgd1:Solar_grade {name: row.Solar_grade_1})
WITH sgd, sgd1, row
WHERE row.Relationship = "isSameAs"
CREATE (sgd)-[:SAME_AS]->(sgd1)
RETURN sgd, sgd1

Thanks for always helping out

glilienfield · February 11, 2023, 7:32pm

Nice that you caught the issue. For future use, there is an apoc procedure for creating relationships that lets you define the relationship type dynamically.
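The post does not name the procedure; one candidate is apoc.create.relationship, which takes the start node, a relationship type given as a string, a property map, and the end node. The following is a minimal sketch, assuming the APOC plugin is installed and that the column indexes match the earlier index-based solution (Reuben's actual file needed different indexes). The constant and function names are hypothetical.

```python
# Cypher text for the sketch. Assumes APOC is available on the server.
# row[1] (e.g. "isSimilarTo" or "isSameAs") becomes the relationship type.
DYNAMIC_REL_QUERY = (
    "LOAD CSV FROM $path AS row "
    "WITH row SKIP 1 "  # skip the header line
    "MERGE (sgd:Solar_grade {name: row[0]}) "
    "MERGE (sgd1:Solar_grade {name: row[2]}) "
    "WITH sgd, sgd1, row "
    "CALL apoc.create.relationship(sgd, row[1], {}, sgd1) YIELD rel "
    "RETURN count(rel) AS created"
)


def similar_dynamic(tx, path):
    """Run the dynamic-type load inside a transaction function (hypothetical helper)."""
    return tx.run(DYNAMIC_REL_QUERY, path=path)
```

Note that this procedure creates a new relationship on every call, so loading the same file twice would duplicate relationships; apoc.merge.relationship may be worth checking if the load has to be repeatable.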

Another approach that would have worked for you is the apoc.do.when procedure, which lets you implement an if/else statement by conditionally executing one of two blocks of cypher.
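As a rough sketch of the apoc.do.when idea, the two separate queries from the earlier post could be folded into one. It assumes APOC is installed, that the headers are Solar_grade, Relationship and Solar_grade_1 as in those queries, and that entries of the params map ({a: ..., b: ...}) are usable as variables inside the inner Cypher strings; that last point should be checked against the APOC version in use. The constant and function names are hypothetical.

```python
# Cypher text for the sketch. The first inner query runs when the condition
# is true, the second one otherwise.
DO_WHEN_QUERY = (
    "LOAD CSV WITH HEADERS FROM $path AS row "
    "MERGE (sgd:Solar_grade {name: row.Solar_grade}) "
    "MERGE (sgd1:Solar_grade {name: row.Solar_grade_1}) "
    "WITH sgd, sgd1, row "
    "CALL apoc.do.when("
    "row.Relationship = 'isSimilarTo', "
    "'CREATE (a)-[r:SIMILAR_TO]->(b) RETURN r', "
    "'CREATE (a)-[r:SAME_AS]->(b) RETURN r', "
    "{a: sgd, b: sgd1}"
    ") YIELD value "
    "RETURN count(*) AS created"
)


def similar_do_when(tx, path):
    """Run the if/else load inside a transaction function (hypothetical helper)."""
    return tx.run(DO_WHEN_QUERY, path=path)
```

One caveat with this shape: the else branch catches every value other than "isSimilarTo", so a misspelled entry in the Relationship column would silently become SAME_AS.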

Reuben · February 11, 2023, 11:41pm

Great! I will look into that. thanks
