Dynamically create nodes and relationships

I'm trying to model my data in Cypher and Neo4j desktop and dynamically create new nodes and relationships based on existing nodes and relationships to improve the data quality.

My MWE looks like this:

// Create nodes
CREATE (a:House {name: "House A", location: 1})
CREATE (b:House {name: "House B", location: 2})
CREATE (c:House {name: "House C", location: 3})
CREATE (d:House {name: "House D", location: 4})
CREATE (e:House {name: "House E", location: 5})
CREATE (f:House {name: "House F", location: 6})
CREATE (g:House {name: "House G", location: 7})
CREATE (h:House {name: "House H", location: 8})


// Create relationships
CREATE (a)-[:PIPE {start: 1, end: 2, id: randomUUID()}]->(b)
CREATE (c)-[:PIPE {start: 1, end: 2, id: randomUUID()}]->(d)
CREATE (c)-[:PIPE {start: 2, end: 3, id: randomUUID()}]->(d)
CREATE (e)-[:PIPE {start: 1, end: 2, id: randomUUID()}]->(f)
CREATE (e)-[:PIPE {start: 2, end: 3, id: randomUUID()}]->(f)
CREATE (e)-[:PIPE {start: 3, end: 4, id: randomUUID()}]->(f)
CREATE (g)-[:PIPE {start: 1, end: 2, id: randomUUID()}]->(h)
CREATE (g)-[:PIPE {start: 2, end: 3, id: randomUUID()}]->(h)
CREATE (g)-[:PIPE {start: 3, end: 4, id: randomUUID()}]->(h)
CREATE (g)-[:PIPE {start: 99, end: 100, id: randomUUID()}]->(h)


RETURN *;

My dataset contains:

  • Values for houses and their location
  • Which pipes are connecting two houses
  • The start and end value of the pipes, which indicate the sequence of the pipes in the ground (If the starting value of pipeY is equal to the end value of pipeX than the direction of is pipeX->pipeY)

What I try to achieve is to replace all the relationships between two houses if there are more than 1 pipe between them and when the pipes are in sequence and insert a junction-node that is missing in the data.

The relationships should look like this in the end:

// Create relationships
CREATE (a)-[:PIPE {start: 1, end: 2, id: randomUUID()}]->(b)

CREATE (c)-[:PIPE {start: 1, end: 2, id: randomUUID()}]->(:Junction {id: randomUUID{})-[:PIPE {start: 2, end: 3, id: randomUUID()}]->(d)

CREATE (e)-[:PIPE {start: 1, end: 2, id: randomUUID()}]->(:Junction {id: randomUUID{})-[:PIPE {start: 2, end: 3, id: randomUUID()}]->(:Junction {id: randomUUID{})- [:PIPE {start: 3, end: 4 id: randomUUID()}]->(f)

CREATE (g)-[:PIPE {start: 1, end: 2, id: randomUUID()}]->(:Junction {id: randomUUID{})-[:PIPE {start: 2, end: 3, id: randomUUID()}]->(:Junction {id: randomUUID{})- [:PIPE {start: 3, end: 4 id: randomUUID()}]->(h)

CREATE (g)-[:PIPE {start: 99, end: 100, id: randomUUID()}]->(h)

 
RETURN *;

What is import is, that I do not know, how many sequential pipes are in the ground and how many junction-nodes I might need. Furthermore, I need to connect houses, junctions with the pipes in the correct order.

Try this. It seems to work. I don't know how big your db is that you want to remediate. We might have to work on the efficiency if the performance is off.

match (a)-[r:PIPE]->(b)
order by r.start 
with a, b, collect(r) as rels
where size(rels) > 1
call (rels) {
    unwind range(0, size(rels) - 2) as index
    create (n:Junction{id: randomUUID()})
    return collect(n) as junctions
}
with rels, [a]+junctions+[b] as nodes
unwind range(0, size(rels) - 1) as index
with nodes[index] as startNode, nodes[index+1] as endNode, rels[index] as rel
merge (startNode)-[r:PIPE{start: rel.start, end: rel.end, id: randomUUID()}]->(endNode)
delete rel

Test data:

Result:

Result shown as text so you can see the ordering of the PIPEs are correct:

1 Like

Thank you for your swift response @glilienfield. I had to fiddle a little with your code to make it work but it served as a perfect starting point to get to the correct solution.

This is a slightly altered version of your code to make it work. I had to replace rels before CALL {...} with WITH rels within the CALL {...}-procedure.

MATCH (a)-[r:PIPE]->(b)
WITH a, b, r
ORDER BY r.start 
WITH a, b, COLLECT(r) AS rels
WHERE SIZE(rels) > 1
\\ Changed call procedure
CALL { 
    WITH rels
    UNWIND RANGE(0, SIZE(rels) -2) AS index
    CREATE (n:Junction{name: "Junction", id: randomUUID()})
    RETURN COLLECT(n) AS junctions
}
WITH rels, [a] + junctions + [b] AS nodes
UNWIND RANGE(0, SIZE(rels) -1) AS index
WITH nodes[index] AS startNode, nodes[index+1] AS endNode, rels[index] AS rel
CREATE (startNode)-[:PIPE {start: rel.start, end: rel.end, id: randomUUID()}]->(endNode)
DELETE rel;

Nevertheless, your solution ignored the edge case, when two houses are not only connected by pipes in sequence. For example Houses G and H are connected by four pipes, three of them in sequence.

Additionally, I'd like to cover the case when houses are connected by multiple pipes, some of which are in sequence, some of them are not.

To make it more clear, here is an update of the MWE:

// Create nodes
CREATE (a:House {name: "House A", location: 1})
CREATE (b:House {name: "House B", location: 2})
CREATE (c:House {name: "House C", location: 3})
CREATE (d:House {name: "House D", location: 4})
CREATE (e:House {name: "House E", location: 5})
CREATE (f:House {name: "House F", location: 6})
CREATE (g:House {name: "House G", location: 7})
CREATE (h:House {name: "House H", location: 8})
CREATE (i:House {name: "House I", location: 9})

// Create relationships
CREATE (a)-[:PIPE {start: 1, end: 2, id: randomUUID()}]->(b)
CREATE (c)-[:PIPE {start: 1, end: 2, id: randomUUID()}]->(d)
CREATE (c)-[:PIPE {start: 2, end: 3, id: randomUUID()}]->(d)
CREATE (e)-[:PIPE {start: 1, end: 2, id: randomUUID()}]->(f)
CREATE (e)-[:PIPE {start: 2, end: 3, id: randomUUID()}]->(f)
CREATE (e)-[:PIPE {start: 3, end: 4, id: randomUUID()}]->(f)
CREATE (g)-[:PIPE {start: 1, end: 2, id: randomUUID()}]->(h)
CREATE (g)-[:PIPE {start: 2, end: 3, id: randomUUID()}]->(h)
CREATE (g)-[:PIPE {start: 3, end: 4, id: randomUUID()}]->(h)
CREATE (g)-[:PIPE {start: 99, end: 100, id: randomUUID()}]->(h)
CREATE (h)-[:PIPE {start: 1, end: 2, id: randomUUID()}]->(i)
CREATE (h)-[:PIPE {start: 100, end: 101, id: randomUUID()}]->(i)
CREATE (h)-[:PIPE {start: 101, end: 102, id: randomUUID()}]->(i)
CREATE (h)-[:PIPE {start: 200, end: 201, id: randomUUID()}]->(i)
CREATE (h)-[:PIPE {start: 202, end: 203, id: randomUUID()}]->(i)
CREATE (h)-[:PIPE {start: 203, end: 204, id: randomUUID()}]->(i)
RETURN *;
MATCH (a)-[r:PIPE]->(b)
WITH a, b, r
ORDER BY r.start
WITH a, b, COLLECT(r) AS rels
WHERE SIZE(rels) > 1

// Group pipes into separate sequences where end = start
WITH a, b, 
     REDUCE(s = [], x IN rels | 
        CASE 
            WHEN SIZE(s) = 0 OR s[-1][-1].end = x.start 
            THEN s[0..-1] + [COALESCE(s[-1], []) + [x]] 
            ELSE s + [[x]] 
        END
     ) AS sequences

// Only keep sequences longer than 1
WITH a, b, [seq IN sequences WHERE SIZE(seq) > 1] AS validSequences
UNWIND validSequences AS sequence

// Create junctions for each sequence
CALL {
    WITH sequence
    UNWIND RANGE(0, SIZE(sequence) - 2) AS index
    CREATE (n:Junction {name: "Junction", id: randomUUID()})
    RETURN COLLECT(n) AS junctions
}

WITH a, b, sequence, [a] + junctions + [b] AS nodes
UNWIND RANGE(0, SIZE(sequence) - 1) AS index
WITH nodes[index] AS startNode, nodes[index+1] AS endNode, sequence[index] AS rel
CREATE (startNode)-[:PIPE {start: rel.start, end: rel.end, id: randomUUID()}]->(endNode)
DELETE rel;