How to deny traversing subgraph if there is a specific relationship type

oleksandr.dashkov · September 5, 2024, 9:28am

Hello,

I'm looking for a way to stop traversing a subgraph when there is at least one relationship of a certain type.

I already use a similar approach for node labels: I add an Excluded label to the nodes I want to exclude, call apoc.path.subgraphNodes, and set labelFilter: "-Excluded".

I'd like to make this approach more granular and exclude relationships instead of nodes. Unfortunately, relationshipFilter does not support the same syntax as labelFilter, so I can't specify relationshipFilter: "-Excluded" and write a query like this:

    MATCH (n: A {id: $node_id})
    CALL apoc.path.subgraphNodes(
        n,
        {
            filterStartNode: false,
            maxLevel: $max_level,
            minLevel: $min_level,
            labelFilter: "-Excluded|SomeOtherFilters",
            relationshipFilter: "-Excluded|SomeOtherFilters",
            limit: $limit
        }
    )
    YIELD node
    RETURN node

The data model looks like this:

    MERGE (a:A {id: 1})
    MERGE (b1:B {id: 1})
    MERGE (b2:B {id: 2})
    MERGE (a)-[:UsedRelA]->(b1)
    MERGE (a)-[:Excluded]->(b1)
    MERGE (a)-[:UsedRelA]->(b2)

I want a query that returns all nodes connected to node a by a UsedRelA relationship, except those that are also connected by an Excluded relationship. In this case it should return nodes a and b2, but not b1, because b1 has an Excluded relationship.
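
To pin down the expected result independently of any Neo4j API, here is a minimal in-memory sketch in Python of the traversal described above. The `reachable` function and the tuple-based edge list are illustrative assumptions, not part of APOC or Cypher. The sketch follows UsedRelA relationships outward from the start node and refuses to step to a neighbour that is also linked to the current node by an Excluded relationship.

```python
from collections import deque

# Sample data from the MERGE statements above: (start, type, end)
EDGES = [
    ("a", "UsedRelA", "b1"),
    ("a", "Excluded", "b1"),
    ("a", "UsedRelA", "b2"),
]


def reachable(edges, start, allowed="UsedRelA", blocked="Excluded"):
    """Breadth-first expansion that prunes blocked pairs during traversal."""
    # Pairs of nodes joined by a blocked relationship, in either direction.
    blocked_pairs = {frozenset((s, e)) for s, t, e in edges if t == blocked}
    seen = [start]
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for s, t, e in edges:
            if s != current or t != allowed or e in seen:
                continue
            if frozenset((current, e)) in blocked_pairs:
                continue  # stop here: do not expand through this pair
            seen.append(e)
            queue.append(e)
    return seen


print(reachable(EDGES, "a"))  # ['a', 'b2']
```

Because the check happens while expanding, nothing behind a blocked pair is visited through that pair, which is the behaviour a "-Excluded" relationship filter would need to provide.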

I tried to achieve this using the new Cypher syntax, but I don't get the desired result either.

    MATCH (a:A)-[r where r.relType <> "Excluded"]-{,}(b:B) return *

Is this kind of filtering possible? I'm especially interested in a solution using apoc.path.subgraphNodes, as it performs much better than pure Cypher.

UPD:
I prefer to use APOC because the real data has a lot of nodes and multi-hop relationships.
The results can span 5+ hops with 10k+ nodes, and so far I haven't found a pure Cypher approach that comes close to APOC's performance.

Thanks for your help!

valerio.malenchino · September 5, 2024, 11:14am

What about:

    MATCH (a:A) ((c)-[r WHERE r:UsedRelA AND NOT exists((c)-[:Excluded]-(b))]->(b:B))+ RETURN *;

does it do what you want?

glilienfield · September 5, 2024, 11:23am

I would not use node labels or relationship types to indicate exclusion criteria for a query. Those are for categorizing domain objects. I would instead put the exclusion criteria in the query.

There can be scenarios where you would do such a thing, like for data remediation or refactoring your data model. But these labels would be temporary.

Anyway, if you are querying a node and its relationships, APOC path procedures are not necessary. You can try this query to get the result you requested:

    MATCH (a:A)-[r:UsedRelA]->(b)
    WHERE NOT exists((a)-[:Excluded]-(b))
    RETURN a, r, b

oleksandr.dashkov · September 5, 2024, 12:37pm

Hello @glilienfield, thanks for your reply.

Could you elaborate on this point, please?
We decided to use labels because it was the only good option we found for properly excluding nodes with apoc.path.subgraphNodes. If we post-filter the output by properties, we get "floating islands": nodes that are no longer connected to the anchor node but still appear in the output.
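
The "floating islands" effect can be reproduced with a small in-memory sketch in Python. Everything here (the `expand` helper, the node names, and the `EXCLUDED` set standing in for the Excluded label) is a hypothetical illustration, not APOC's implementation. It contrasts filtering the output after a full traversal with pruning excluded nodes while traversing.

```python
# Hypothetical chain: a -> x -> y, where x carries the exclusion marker.
GRAPH = {"a": ["x"], "x": ["y"], "y": []}
EXCLUDED = {"x"}


def expand(graph, start, skip=frozenset()):
    """Depth-first expansion from start that never enters nodes in skip."""
    seen, stack = [], [start]
    while stack:
        node = stack.pop()
        if node in seen or node in skip:
            continue
        seen.append(node)
        stack.extend(graph[node])
    return seen


# Post-filtering: traverse everything, then drop excluded nodes from the output.
post_filtered = [n for n in expand(GRAPH, "a") if n not in EXCLUDED]

# Pruning: excluded nodes are never entered, so nothing behind them is reached.
pruned = expand(GRAPH, "a", skip=EXCLUDED)

print(post_filtered)  # ['a', 'y']  (y is an island: its only link to a ran through x)
print(pruned)         # ['a']
```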

The real database holds a lot of data, and I need to get all related nodes across multiple hops (possibly more than 10K nodes over 5+ hops), which made using pure Cypher impossible. apoc.path.subgraphNodes, on the other hand, solved the performance problem.

Thanks for the example.

oleksandr.dashkov · September 5, 2024, 12:39pm

Hello @valerio.malenchino ,

Thanks for your reply. Yes, it gives the desired output. I'll try to adapt it to multiple hops now.

glilienfield · September 5, 2024, 2:38pm

The Cypher query was meant to solve the example you presented, which was a simple one-hop scenario.

For traversing large subgraphs, you should use the APOC path procedures, or write a custom procedure with a custom traversal, or use the traversal framework.

Yes, post-filtering is not efficient for traversing subgraphs, so exclusion labels make sense if you are using APOC. With a custom procedure, you would incorporate the exclusion criteria into the procedure itself, so you could avoid temporary labels. As you stated, the label approach does have a limitation with relationships, since they can have just one type.

oleksandr.dashkov · September 5, 2024, 3:16pm

Sorry, I should have given all the details. I tried to simplify the example and went too far, removing important details as well. I'll update the description.

So I guess that if APOC does not properly support this kind of filtering, the way to go is "custom procedure using custom traversal or use the traversal framework".
Can these be used on AuraDB instances, or are they restricted to self-hosted instances only?

valerio.malenchino · September 5, 2024, 3:18pm

The query I posted should work for one or more hops. I am not sure the logic it implements is what you need, but it will stop traversing the moment it finds a node that is connected to the next via the :Excluded relationship (in addition to the allowed one).

valerio.malenchino · September 5, 2024, 3:26pm

For complex traversals, use Neo4j patterns, like the one I used in my query. See https://neo4j.com/docs/cypher-manual/current/patterns/

Patterns are usually more efficient and better integrated with other Cypher features than APOC (e.g., they work well with the parallel runtime). Leave the Traversal API as an absolute last resort. Remember that somebody will have to maintain what you build. As somebody once said, “Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live.”

The Traversal API and custom code will not work in Aura.

oleksandr.dashkov · September 6, 2024, 7:35am

@valerio.malenchino, thanks for the advice. I usually follow the "psychopath" principle.
However, when I run Neo4j patterns (even without the filtering asked about in this topic), I get out-of-memory problems starting from 4+ hops, whereas the APOC query returns a response in less than 100 ms. It's probably a skill issue on my side, but still...

Anyway, thanks for your help.
