Dynamic multi pattern query

Hi,

I am working on building a multi-hop recursive pattern / path across several different node and relationship combinations (see below):

(:Node1)<-[:REL_A_1]-(:NodeA)-[:REL_A_2]->(:Node2)
(:Node1)<-[:REL_B_1]-(:NodeB)-[:REL_B_2]->(:Node2)
(:Node1)<-[:REL_C_1]-(:NodeC)-[:REL_C_2]->(:Node2)

such that going from (:Node1) to (:Node2) can traverse any of those 3 patterns. I also want the pattern to continue recursively once it hits (:Node2).
Note: (:Node1) and (:Node2) are the same type, but are labeled differently in this example to show that they aren't the same node.

I have been working with apoc.path.expandConfig which is promising:

CALL apoc.path.expandConfig(node, {
  relationshipFilter: "<REL_A_1|REL_A_2>|<REL_B_1|REL_B_2>|<REL_C_1|REL_C_2>",
  labelFilter: ">Node2"
}) YIELD path

However, due to the graph model and its recursive nature, (:NodeA), (:NodeB), and (:NodeC) all share a similar property, year, meaning multiple paths can exist, one for each year value. As I understand it, apoc.path.expandConfig returns every matching path, so when several years are present at each hop, a single returned path can chain together intermediate nodes whose year values differ from hop to hop.

Is it possible to incorporate a property filter with this method? It doesn't appear so based on what I am seeing in the documentation.

Is there another way to accomplish this using cypher? Or would it be best served as a custom procedure?

NOTE: I am currently working in 4.4 CE

expandConfig will return a collection of paths. The two issues to note are 1) the paths will contain subpaths (paths that are a subset of another longer path), and 2) the paths don't have the property constraint you desire.

You can address each of these with cypher filtering of the results in a 'where' clause with a predicate to test 1) that the path's ending node is a leaf node (it only has one relationship), and 2) all the nodes along the path have the same year with a specific value.

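These predicates should work (note that the count { } subquery form needs Neo4j 5; on 4.4 a degree check such as apoc.node.degree(last(nodes(path))) = 1 is one alternative for the first):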

count {with last(nodes(path)) as lastNode match (lastNode)--(x) return x} = 1

and

all(i in nodes(path) where i.year = '2023')

Yes I have been doing some filtering once all the paths are collected and it works well when the network is relatively small. However, I am looking at networks that can easily go 10+ levels deep with year ranges of around 7 years. So it quickly becomes "bloated". Hence why I was hoping to improve the initial query process to avoid having to do too much post processing.
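
To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It assumes (purely for illustration, not measured on my graph) that every hop offers exactly one intermediate node per year, so each hop multiplies the candidate paths by 7:

```python
# Assumption (illustrative only): each hop offers exactly one
# intermediate node per year, so every hop multiplies candidates by 7.
years = 7
depth = 10

# Every possible combination of years across all 10 hops.
full_depth_paths = years ** depth

# Paths where the year stays the same along the whole path.
same_year_paths = years

# The expansion also emits the shorter subpaths at every depth.
all_returned = sum(years ** d for d in range(1, depth + 1))

print(full_depth_paths, same_year_paths, all_returned)
```

Under that assumption, roughly 282 million full-depth paths would be generated just to keep 7 of them.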

Here is more of what I am currently working on:

CALL apoc.path.expandConfig(node, {
  relationshipFilter: "<REL_A_1|REL_A_2>|<REL_B_1|REL_B_2>|<REL_C_1|REL_C_2>",
  labelFilter: ">Node2"
}) YIELD path
WITH node, year, path
WITH node, year, path,
           [x IN nodes(path) WHERE x:NodeA OR x:NodeB OR x:NodeC | x] AS nodes,
           [x IN nodes(path) WHERE x:Node1 | x.prop] AS props
WHERE all(x IN nodes WHERE x.year = year)
RETURN node, year, path, nodes, props

So I am doing some post filtering, but due to the different nodes that are along the path I have to filter them apart and then use the all() function. Lots of eagerness going on.

One thing I just realized is that I can adjust my WHERE clause placement to the following:

WITH node, year, path
WHERE none(x IN nodes(path) WHERE (x:NodeA OR x:NodeB OR x:NodeC) AND x.year <> year)
WITH node, year, path,
           [x IN nodes(path) WHERE x:NodeA OR x:NodeB OR x:NodeC | x] AS nodes,
           [x IN nodes(path) WHERE x:Node1 | x.prop] AS props

Using none() before the other list comprehensions does help, since paths are filtered out before the lists are built.

That makes sense....

Your concerns about performance degrading as the graph grows more complex are valid. The best solution is a custom procedure that traverses the graph while enforcing your constraints. It will be more efficient because 1) you will not create every path and then post-filter on your property constraint, 2) you will not return the subpaths, and 3) less memory is needed, since you only accumulate the nodes/relationships as you traverse them (not all the individual paths that make up the graph).
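
To illustrate the idea only, here is a minimal in-memory sketch of that kind of traversal, written in Python rather than as a real Neo4j procedure (which would normally be written in Java against the Neo4j API). Everything in it is hypothetical: the `hops` tuples stand in for the (:Node1)<-[..]-(:NodeX)-[..]->(:Node2) pattern, `expand_same_year` is an invented name, and cycles are avoided by never revisiting a hub node, which may not match the uniqueness rule you need.

```python
from collections import defaultdict


def expand_same_year(hops, start, year):
    """Yield only maximal paths from `start` whose intermediate nodes all have `year`.

    `hops` is a hypothetical stand-in for the graph: each tuple
    (source_hub, mid_node, mid_year, target_hub) represents one
    (source)<-[..]-(mid {year})-[..]->(target) pattern.
    """
    by_source = defaultdict(list)
    for source, mid, mid_year, target in hops:
        by_source[source].append((mid, mid_year, target))

    def walk(hub, path, visited):
        extended = False
        for mid, mid_year, target in by_source.get(hub, []):
            # Prune during traversal: wrong year, or hub already on this path.
            if mid_year != year or target in visited:
                continue
            extended = True
            yield from walk(target, path + [mid, target], visited | {target})
        # Emit only when the path cannot grow further, so no subpaths are returned.
        if not extended and len(path) > 1:
            yield path

    yield from walk(start, [start], {start})


# Hypothetical sample data: two year "lanes" between the same hubs.
hops = [
    ("h1", "a1", 2023, "h2"),
    ("h1", "a2", 2022, "h2"),
    ("h2", "b1", 2023, "h3"),
    ("h2", "b2", 2022, "h3"),
    ("h3", "c1", 2022, "h4"),
]

paths_2023 = list(expand_same_year(hops, "h1", 2023))
paths_2022 = list(expand_same_year(hops, "h1", 2022))
print(paths_2023)
print(paths_2022)
```

The point of the sketch is that mixed-year branches are dropped at the hop where they diverge, so mixed-year paths are never built and then thrown away.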

BTW, you shouldn't need the first 'with' statement.