How to stop traversing a subgraph when a specific relationship type is present

Hello,

I'm looking for a way to stop traversing a subgraph when there is at least one relationship of a certain type.

I already use a similar approach for node labels: I add an Excluded label to the nodes I want to exclude, call apoc.path.subgraphNodes, and pass labelFilter: "-Excluded".

I'd like to make this approach more granular and exclude relationships instead of nodes. Unfortunately, relationshipFilter does not support the same syntax as labelFilter, so I can't specify relationshipFilter: "-Excluded" in a query like this:

    MATCH (n:A {id: $node_id})
    CALL apoc.path.subgraphNodes(
        n,
        {
            filterStartNode: false,
            maxLevel: $max_level,
            minLevel: $min_level,
            labelFilter: "-Excluded|SomeOtherFilters",
            // Desired, but NOT supported by relationshipFilter:
            relationshipFilter: "-Excluded|SomeOtherFilters",
            limit: $limit
        }
    )
    YIELD node
    RETURN node
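
For reference, a minimal sketch of what relationshipFilter does accept, as far as the APOC path expander documentation describes it: a pipe-separated allow-list of relationship types, each with an optional direction marker (`TYPE>` for outgoing, `<TYPE` for incoming, `TYPE` for either direction). There is no `-` prefix for denying a type. The parameters are the same as in the query above. An allow-list alone does not solve the problem: a node that is linked by both an allowed relationship and an Excluded relationship is still returned.

```cypher
// Allow-list only: expand along outgoing UsedRelA relationships.
// Excluded relationships are never followed, but a node that is also
// reachable through UsedRelA is still part of the result.
MATCH (n:A {id: $node_id})
CALL apoc.path.subgraphNodes(n, {
    maxLevel: $max_level,
    relationshipFilter: "UsedRelA>",
    labelFilter: "-Excluded",
    limit: $limit
})
YIELD node
RETURN node
```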

The data model looks like this:

    MERGE (a:A {id: 1})
    MERGE (b1:B {id: 1})
    MERGE (b2:B {id: 2})
    MERGE (a)-[:UsedRelA]->(b1)
    MERGE (a)-[:Excluded]->(b1)
    MERGE (a)-[:UsedRelA]->(b2)

I want a query that returns all nodes connected to node a through a UsedRelA relationship, except those that are also connected through an Excluded relationship. In this case it should return nodes a and b2, but not b1, because b1 has an Excluded connection.

I tried to achieve this with the new Cypher syntax, but I don't get the desired result either.

    MATCH (a:A)-[r where r.relType <> "Excluded"]-{,}(b:B) return *

Is this kind of filtering possible? I'm especially interested in a solution using apoc.path.subgraphNodes, as it performs much better than pure Cypher.

UPD:
I prefer to use APOC because the real data has many nodes and multi-hop relationships.
The results can span 5+ hops with 10k+ nodes, and so far I haven't found a pure Cypher approach that comes close to APOC's performance.

Thanks for your help!

What about:

    MATCH (a:A) ((c)-[r WHERE r:UsedRelA AND NOT exists((c)-[:Excluded]-(b))]->(b:B))+ RETURN *;

does it do what you want?

I would not use node labels or relationship types to indicate exclusion criteria for a query. Those are for categorizing domain objects. I would instead put the exclusion criteria in the query itself.

There are scenarios where you would do such a thing, such as data remediation or refactoring your data model, but those labels would be temporary.

Anyway, if you are querying a node and its direct relationships, the APOC path procedures are not necessary. You can try this query to get the result you requested:

    MATCH (a:A)-[r:UsedRelA]->(b)
    WHERE NOT exists( (a)-[:Excluded]-(b) )
    RETURN a, r, b

Hello @glilienfield, thanks for your reply.

Could you elaborate on this point, please?
We decided to use labels because it was the only good option we found to properly exclude nodes using apoc.path.subgraphNodes. If we post-filter the output by properties, we get "floating islands": nodes that are no longer connected to the anchor node but are still in the output.

The real database holds a lot of data, and I need to get all related nodes across multiple hops (possibly more than 10K nodes over 5+ hops), which made pure Cypher unusable. apoc.path.subgraphNodes, on the other hand, resolved the performance problem.

Thanks for the example.
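
The "floating islands" effect can be sketched by comparing post-filtering with pruning during expansion. This is only an illustration: the boolean property `excluded` is a hypothetical stand-in for whatever property the post-filter would test, and the parameters are the ones from the original query.

```cypher
// (1) Post-filtering: the expansion walks THROUGH excluded nodes and only
// hides them afterwards, so nodes reachable solely via an excluded node
// stay in the result as "floating islands".
MATCH (n:A {id: $node_id})
CALL apoc.path.subgraphNodes(n, {maxLevel: $max_level})
YIELD node
WHERE coalesce(node.excluded, false) = false  // `excluded` is hypothetical
RETURN node;

// (2) Pruning during expansion: nodes labelled Excluded are never expanded,
// so everything behind them is reached only if another allowed path exists.
MATCH (n:A {id: $node_id})
CALL apoc.path.subgraphNodes(n, {
    maxLevel: $max_level,
    labelFilter: "-Excluded"
})
YIELD node
RETURN node;
```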

Hello @valerio.malenchino ,

Thanks for your reply. Yes, it gives the desired output. I'll try to adapt it to multiple hops now.

The Cypher was for solving the example you presented, which was a simple one-hop scenario.

For traversing large subgraphs, you should use the APOC path procedures, or create a custom procedure using either a custom traversal or the traversal framework.

Yes, post-filtering is not efficient for traversing subgraphs, so exclusion labels make sense if you are using APOC. With a custom procedure, you would incorporate the exclusion criteria into the procedure itself and so avoid temporary labels. As you stated, the label approach does not carry over to relationships, since a relationship can have only one type.
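
One possible way to stay within APOC without temporary labels, sketched under assumptions rather than offered as a confirmed solution: collect the nodes that sit at the receiving end of an Excluded relationship and pass them to the expander as a deny-list. The config key is `denylistNodes` in APOC 5 and `blacklistNodes` in APOC 4.x. Note that this is coarser than a per-pair check: a node is pruned if any Excluded relationship points to it, regardless of where that relationship starts, and collecting the deny-list scans every Excluded relationship in the graph.

```cypher
MATCH (n:A {id: $node_id})
// Assumption: any node that is the target of an Excluded relationship
// should be pruned, no matter which node the relationship starts from.
OPTIONAL MATCH ()-[:Excluded]->(x)
WITH n, collect(DISTINCT x) AS denied
CALL apoc.path.subgraphNodes(n, {
    maxLevel: $max_level,
    relationshipFilter: "UsedRelA>",
    denylistNodes: denied,  // blacklistNodes in APOC 4.x
    limit: $limit
})
YIELD node
RETURN node
```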

Sorry, I should have given all the details. I tried to simplify the example and went too far, removing important details as well. I'll update the description.

So I guess that if APOC does not support this kind of filtering, the way to go is a "custom procedure using custom traversal or the traversal framework".
Can those be used on AuraDB instances, or are they restricted to self-hosted instances?

The query I posted should work for one or more hops. I am not sure the logic it implements is what you need, but it will stop traversing the moment it finds a node that is connected to the next via the :Excluded relationship (in addition to the allowed one).
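
A bounded variant of the same idea, as a sketch only: the upper bound of 5 hops and the `$node_id` parameter are assumptions, and the `exists(...)` pattern predicate is swapped for an `EXISTS { ... }` subquery placed in the WHERE clause of the quantified path pattern. It returns only the distinct nodes reached, not the start node, and it still enumerates paths, so it may use far more memory than a node-unique expansion on large graphs.

```cypher
MATCH (a:A {id: $node_id})
      // Each hop must be a UsedRelA relationship between two nodes that
      // are NOT also connected by an Excluded relationship.
      ((c)-[:UsedRelA]->(b:B)
        WHERE NOT EXISTS { (c)-[:Excluded]-(b) }){1,5}
      (reached)
RETURN DISTINCT reached
```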

For complex traversals, use Neo4j patterns, like the one I used in my query. See https://neo4j.com/docs/cypher-manual/current/patterns/

That is usually more efficient and better integrated with other Cypher features than APOC (e.g. it works well with the parallel runtime). Leave the Traversal API as an absolute last resort. Remember that somebody will have to maintain what you build. As somebody once said, “Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live.”

The Traversal API and custom code will not work in Aura.

@valerio.malenchino, thanks for the advice. I usually follow the "psychopath" principle.
However, when I run Neo4j patterns (even without the filtering asked about in this topic), I run into out-of-memory problems from 4+ hops onward, whereas the APOC query returns a response in less than 100 ms. It's probably a skill issue on my side, but still...

Anyway, thanks for your help