I have a very large dataset with network relationships like these (this is a simplified view):
The requirement is to show the high-level nodes and their connectivity as pairs.
If I want to show only nodes with the label types A, D and E, the query needs to return pairs like this:
[A, D1]
[D1, D]
[D, D1]
[D, D2]
[D, D3]
[D, D4]
[D, D5]
[D, D6]
[D, D7]
[D, D8]
[D8, E]
[D8, D]
[D7, D]
[D6, D]
[D5, D]
[D4, D]
[D3, D]
[D2, D]
Based on my search criteria, A and D will be connected, but D and E may not be, so the idea is to get the connectivity between A and D first and then between D and E.
My approach is to get all the paths. Since the nodes of a path are ordered by connectivity, I check whether each node is one of A, D or E and return it, ignoring the other nodes of the path:
// First collect all the matching nodes via apoc.cypher.run; otherwise it iterates and I can't get the expected results
CALL apoc.cypher.run("OPTIONAL MATCH (a:A) RETURN a", {}) YIELD value AS aNode
CALL apoc.cypher.run("OPTIONAL MATCH (a:D) RETURN a", {}) YIELD value AS dNode
CALL apoc.cypher.run("OPTIONAL MATCH (a:E) RETURN a", {}) YIELD value AS eNode
WITH collect(aNode.a) AS aNodes, collect(dNode.a) AS dNodes, collect(eNode.a) AS eNodes
// Get all the paths
// The 1..10 bound needs to change: A and D may be any number of hops apart
MATCH p1 = (:A)-[*1..10]-(:D)
MATCH p2 = (:D)-[*1..10]-(:E)
CALL {
  // Process p1 against aNodes and dNodes
  WITH aNodes, dNodes, p1
  WITH nodes(p1) AS p1nodes, aNodes, dNodes
  // nodes(p1) is ordered by connectivity
  // Keep only the needed nodes from this path by comparing with aNodes and dNodes
  UNWIND p1nodes AS n1
  WITH n1, p1nodes WHERE n1 IN aNodes OR n1 IN dNodes
  WITH collect(n1.name) AS eachList, p1nodes ORDER BY p1nodes
  RETURN eachList
  UNION
  // Process p2 against dNodes and eNodes
  WITH dNodes, eNodes, p2
  WITH nodes(p2) AS p2nodes, dNodes, eNodes
  // nodes(p2) is ordered by connectivity
  // Keep only the needed nodes from this path by comparing with dNodes and eNodes
  UNWIND p2nodes AS n2
  WITH n2, p2nodes WHERE n2 IN dNodes OR n2 IN eNodes
  WITH collect(n2.name) AS eachList, p2nodes ORDER BY p2nodes
  RETURN eachList
}
// Split each list into consecutive pairs and return them
WITH DISTINCT eachList
WITH apoc.coll.pairsMin(eachList) AS pairedList
UNWIND pairedList AS x
RETURN collect(DISTINCT {source: x[0], target: x[1]})
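To make the intent of the filtering and pairing steps concrete, here is a minimal sketch on a literal list instead of real paths. It assumes APOC is installed; the names x1 and x2 are hypothetical stand-ins for intermediate nodes that do not have label A, D or E.

```cypher
// Hypothetical node names along one path; x1 and x2 are made-up
// intermediate nodes that should be dropped from the result.
WITH ['A', 'x1', 'D1', 'x2', 'D'] AS pathNames,
     ['A', 'D1', 'D'] AS wantedNames
// Keep only the wanted names; the list comprehension preserves path order.
WITH [name IN pathNames WHERE name IN wantedNames] AS eachList
// pairsMin pairs adjacent elements and omits the trailing [last, null] pair.
UNWIND apoc.coll.pairsMin(eachList) AS x
RETURN x[0] AS source, x[1] AS target
```

For this input I would expect two rows, (A, D1) and (D1, D), which is the shape of output I am after for every path.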
But this has performance issues on a larger database: either it takes a very long time or it runs out of heap memory. Also, searching only up to 10 hops is not ideal; I may have to search further.
Can you please suggest a better way to do this?