cancel
Showing results for 
Search instead for 
Did you mean: 

Join the community at Nodes 2022, our free virtual event on November 16 - 17.

Finding closest node(s) with label, through specific relationships & directions

kurt_slater
Node Link

I have a use case where I have a specific node as a starting point, and I want to find the closest nodes (with a specific label), when only traversing specific relationships in specified direction.  I've made some progress but would appreciate any suggestions on a more elegant approach.

 

The diagram below shows an example data set. This is just a subset of the overall model and there are many other relationships and nodes/labels.

The data model is basically document meta data that consists of a (:DocVersion), which [;CONTAINS] a hierarchy of (:Clause). The (:Applicability) is meta data that can be associated with (:DocVersion) or (:Clause)

 

kurt_slater_0-1654620686617.png

Starting at any Clause, I want to return the closest Applicability node.  The closest can be directly related to the starting Clause, or it could be found by traversing up the hierarchy like (c0:Clause)<-[:CONTAINS]-(c1:Clause)<-[:CONTAINS]-(:DocVersion) hierarchy.

Some examples of the starting point and expected output:

Starting ClauseExpected Output
D1V1 C1D1V1 A1a, D1V1 A2b
D1V1 C2

D1V1 A2

D1V1 C1.1

D1V1 A1a, D1V1 A2b

D1V1 C1.2.1.1

D1V1 A5a, D1V1A5b

 

Sample Data
This script will create the sample data above.

// Create Sample Doc1
MERGE (dv:DocVersion {name:'D1V1'})
MERGE (dv)-[:CONTAINS]->(c1:Clause {name:'D1V1 C1'})
MERGE (c1)-[:CONTAINS]->(c1_1:Clause {name:'D1V1 C1.1'})
MERGE (c1_1)-[:CONTAINS]->(c1_1_1:Clause {name:'D1V1 C1.1.1'})
MERGE (c1)-[:CONTAINS]->(c1_2:Clause {name:'D1V1 C1.2'})
MERGE (c1_2)-[:CONTAINS]->(c1_2_1:Clause {name:'D1V1 C1.2.1'})
MERGE (c1_2_1)-[:CONTAINS]->(c1_2_1_1:Clause {name:'D1V1 C1.2.1.1'})
MERGE (c1_2_1)-[:CONTAINS]->(c1_2_1_2:Clause {name:'D1V1 C1.2.1.2'})
MERGE (dv)-[:CONTAINS]->(c2:Clause {name:'D1V1 C2'})
MERGE (c2)-[:CONTAINS]->(c2_1:Clause {name:'D1V1 C2.1'})
MERGE (dv)-[:CONTAINS]->(c3:Clause {name:'D1V1 C3'})
MERGE (c3)-[:CONTAINS]->(c3_1:Clause {name:'D1V1 C3.1'})

 

MERGE (dv)-[:APPLIES_TO]->(a1a:Applicability {name:'D1V1 A1a'})
MERGE (dv)-[:APPLIES_TO]->(a1b:Applicability {name:'D1V1 A1b'})
MERGE (c2)-[:APPLIES_TO]->(a2:Applicability {name:'D1V1 A2'})
MERGE (c3)-[:APPLIES_TO]->(a3:Applicability {name:'D1V1 A3'})
MERGE (c1_1_1)-[:APPLIES_TO]->(a4:Applicability {name:'D1V1 A4'})
MERGE (c1_2_1)-[:APPLIES_TO]->(a5a:Applicability {name:'D1V1 A5a'})
MERGE (c1_2_1)-[:APPLIES_TO]->(a5b:Applicability {name:'D1V1 A5b'})

 

Currently I have 2 approaches that get close to what I'm looking for.  They both return a superset of what I'm wanting.

Method 1: plain cypher
This returns ALL the possible (:Applicability) and the number of relationships.   What I want is just the ones that have the shortest distance. 

WITH "D1V1 C1.2.1" as Cname
MATCH (c:Clause {name:Cname})
// Get all possibly applicability nodes
MATCH p=(a:Applicability)<-[:APPLIES_TO]-(n)-[:CONTAINS*0..10]->(c)
// For each path, get the steps to applicability node
UNWIND p as pn
// return applicability nodes and steps
WITH a, length(p) as steps
RETURN a.name, steps
ORDER BY steps ASC

Method 2: using apoc
This returns a set of paths and lengths which would then need to be parsed to see which path includes (:Applicability), and which are the closest.


// Get clause applicability (apoc)
MATCH (c:Clause {name: "D1V1 C1.2.1"})
CALL apoc.path.expand(c, "<CONTAINS|APPLIES_TO>", "Clause|DocVersion|Applicability", 1, 5)
YIELD path
RETURN path, length(path) AS hops
ORDER BY hops;
 
 
 
 
1 ACCEPTED SOLUTION

glilienfield
Ninja
Ninja

Try this modification to your cypher approach:

 

WITH "D1V1 C1.2.1" as Cname
MATCH (c:Clause {name:Cname})
MATCH p=(a:Applicability)<-[:APPLIES_TO]-(n)-[:CONTAINS*0..10]->(c)
WITH collect({a:a, length: length(p)}) as paths, min(length(p)) as minLength
unwind [i in paths where i.length = minLength] as result
return result.a as a, result.length as length

 

Screen Shot 2022-06-07 at 3.13.48 PM.png

View solution in original post

7 REPLIES 7

glilienfield
Ninja
Ninja

Try this modification to your cypher approach:

 

WITH "D1V1 C1.2.1" as Cname
MATCH (c:Clause {name:Cname})
MATCH p=(a:Applicability)<-[:APPLIES_TO]-(n)-[:CONTAINS*0..10]->(c)
WITH collect({a:a, length: length(p)}) as paths, min(length(p)) as minLength
unwind [i in paths where i.length = minLength] as result
return result.a as a, result.length as length

 

Screen Shot 2022-06-07 at 3.13.48 PM.png

Thanks! This does seem to provide the expected output, at least for a single starting point.  I'll be curious to see how performant it is with a larger data set.  My concern is the step that gets all the paths (which could be hundreds), then finds the minLength.

Give it a try to see.  I am not sure it will be too bad. You have a defined anchor node to start with; make sure there is an index on the search property. Also, your graph looks like a tree that does not have crossovers and any looping, so the expansion should not be too bad. Let us know how it works. 

ameyasoft
Graph Maven

Try this:

match (a:Clause) where a.name = "D1V1 C1"
CALL apoc.path.spanningTree(a,{labelFilter:'DocVersion|Applicability', relationshipFilter:'CONTAINS|APPLIES_TO',maxLevel:2}) YIELD path
return path
Screen Shot 2022-06-07 at 1.18.43 PM.png

kurt_slater
Node Link

Thanks.  I think the issue with this approach is the maxLevel will be different depending on the starting point. 

For example in this slightly modified example, the maxLevel needs to be 3.

match (a:Clause) where a.name = "D1V1 C1.1"
CALL apoc.path.spanningTree(a,{labelFilter:'Clause|DocVersion|Applicability', relationshipFilter:'<CONTAINS|APPLIES_TO>',maxLevel:3}) YIELD path
return path

 

 


But using 3 for a different start node results in more Applicability nodes than expected.

 
match (a:Clause) where a.name = "D1V1 C2.1"
CALL apoc.path.spanningTree(a,{labelFilter:'Clause|DocVersion|Applicability', relationshipFilter:'<CONTAINS|APPLIES_TO>',maxLevel:3}) YIELD path
return path

 

 

kurt_slater_0-1654634849534.png

 






ameyasoft
Graph Maven
match (a:Clause) where a.name = "D1V1 C2.1"
CALL apoc.path.spanningTree(a,{labelFilter:'Clause|DocVersion|Applicability',relationshipFilter:'<CONTAINS|APPLIES_TO>',maxLevel:2}) YIELD path
return path
Screen Shot 2022-06-07 at 2.22.17 PM.png

 

ameyasoft
Graph Maven

Try without having any maxLevel. This should give your expected results.

match (a:Clause) where a.name = "D1V1 C2.1"
CALL apoc.path.spanningTree(a,{labelFilter:'Clause|DocVersion|Applicability',relationshipFilter:'<CONTAINS|APPLIES_TO>'}) YIELD path
return path