Hello everyone,
I am trying to find a way to use the start and end information that is stored inside a relationship.
What I'm trying to solve is the following:
My data has duplicates, and I'm trying to create a MASTER_OF relationship between the duplicate nodes.
MATCH (p1:PERSON)-[r1:HAS_INTID]->(i1:INTID)
MATCH (p2:PERSON)-[r2:HAS_INTID]->(i2:INTID)
WHERE i1.name = i2.name AND p1.overall > p2.overall
CREATE (p1)-[:MASTER_OF]->(p2)
"Overall" here is a data-completeness score. If a row has 5 columns and all 5 are filled, the score is 1. If 4 out of 5 are filled, the score is 0.80.
So I find the nodes that have matching names, and the (Person) node with the higher overall score becomes the Master record.
My issue is with nodes that have matching names and an equal overall score. For those I will get MASTER_OF relationships in every direction (p1 to p2 | p2 to p1 | p1 to p1 | p2 to p2). I want a single relationship in one direction only, from p1 to p2.
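To show which records are affected, here is a rough read-only sketch that lists the tied groups. It uses only the labels and properties from my query above; the aliases `name`, `overall`, and `tiedPairs` are names I made up for the output.

```cypher
// Sketch: list names where two different PERSON nodes share the same overall score.
// Each tied pair is counted twice here, once per direction.
MATCH (p1:PERSON)-[:HAS_INTID]->(i1:INTID)
MATCH (p2:PERSON)-[:HAS_INTID]->(i2:INTID)
WHERE i1.name = i2.name
  AND p1.overall = p2.overall
  AND p1 <> p2
RETURN i1.name AS name, p1.overall AS overall, count(*) AS tiedPairs
ORDER BY tiedPairs DESC
```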
When I investigated the r1 relationship I noticed that there is data about the start and end as you can see in the picture below.
What I had in mind is writing a Cypher query with the following logic:
MATCH (p1:PERSON)-[r1:HAS_INTID]->(i1:INTID)
MATCH (p2:PERSON)-[r2:HAS_INTID]->(i2:INTID)
WHERE i1.name = i2.name AND p1.overall = p2.overall AND r1.start > r2.start
CREATE (p1)-[:MASTER_OF]->(p2)
In this case I'd be able to create a relationship that connects p1 to p2 only.
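A minimal sketch of how that logic might be written in valid Cypher, assuming the "start" value shown for the relationship is the internal id of its start node rather than a stored property (in which case `r1.start` would just return null). It relies on the built-in `startNode()` and `id()` functions; I have not confirmed this is the recommended approach.

```cypher
// Sketch only: assumes "start" is the internal id of the relationship's start node.
// The strict > comparison also rules out self-pairs (p1 to p1).
MATCH (p1:PERSON)-[r1:HAS_INTID]->(i1:INTID)
MATCH (p2:PERSON)-[r2:HAS_INTID]->(i2:INTID)
WHERE i1.name = i2.name
  AND p1.overall = p2.overall
  AND id(startNode(r1)) > id(startNode(r2))
CREATE (p1)-[:MASTER_OF]->(p2)
```

Since every HAS_INTID relationship starts at its PERSON node, `id(startNode(r1))` is the same value as `id(p1)`, so the tie-break could presumably be written as `id(p1) > id(p2)` as well.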
My example runs in a Neo4j Sandbox, version 4.3.9.
