Removing sub-chains when aggregating relationships

I have a parent-child structure that I maintain in a graph database. I am trying to aggregate all of the child entities under a parent where the relationship type belongs to a particular category. The query is below:

MATCH (e), (e1), p=(e)-[r*]->(e1)
WHERE all(a IN relationships(p) WHERE type(a) IN ["C1","C2","C3"])
UNWIND relationships(p) AS rel
RETURN e1.name, collect(DISTINCT type(rel)) AS types, collect(DISTINCT e.name) AS related_entity

This query works; however, I want to remove sub-chains within a chain and display only the top-level aggregation. Basically, I want only the first row of the result set below.

| e1.name | types | related_entity |
|---|---|---|
| Entity A | C1, C2 | Entity 1, Entity 2, Entity 3, Entity 4 |
| Entity 2 | C1 | Entity 3 |

I am using Neo4j Desktop version 1.3.3.

Hello @adriel.frederick and welcome to the Neo4j community :slight_smile:

MATCH (e), (e1), p=(e)-[r*]->(e1)
WHERE all (a IN relationships(p) WHERE type(a) IN ["C1","C2","C3"])
UNWIND relationships(p) AS rel
WITH e1.name AS name, collect(DISTINCT type(rel)) AS types, collect(DISTINCT e.name) AS related_entity
WHERE size(related_entity) > 1
RETURN name, types, related_entity

Regards,
Cobra

Thanks Cobra, this worked for me. However, it also filters out some results I was looking to keep, such as chains where there is only one parent and one child. See the comments below.

| e1.name | types | related_entity | Comments |
|---|---|---|---|
| Entity A | C1, C2 | Entity 1, Entity 2, Entity 3, Entity 4 | Multi-level aggregation |
| Entity 2 | C1 | Entity 3 | Sub-chain of the above aggregation; needs to be removed |
| Entity B | C1 | Entity 5 | Unrelated single parent/child; needs to be retained |

Oh I see. You need to keep the first related_entity list somewhere as a global list and, for each subsequent row, check whether its related_entity is already contained in that global related_entity list.
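A rough sketch of that idea in a single query might look like the following. It is untested against your data and assumes that entity names are unique and that the graph has no cycles; the `rows`, `row`, and `other` names are only illustrative. It collects every aggregated row first, then drops any row whose parent already appears in another row's related_entity list.

```cypher
MATCH p = (e)-[*]->(e1)
WHERE all(a IN relationships(p) WHERE type(a) IN ["C1","C2","C3"])
UNWIND relationships(p) AS rel
// One row per parent, as in the original query
WITH e1.name AS name,
     collect(DISTINCT type(rel)) AS types,
     collect(DISTINCT e.name) AS related_entity
// Gather all rows so each one can be compared against the others
WITH collect({name: name, types: types, related_entity: related_entity}) AS rows
UNWIND rows AS row
WITH row, rows
// Assumption: a row is a sub-chain if its parent is listed as a child elsewhere
WHERE none(other IN rows WHERE row.name IN other.related_entity)
RETURN row.name AS name, row.types AS types, row.related_entity AS related_entity
```

With the sample data above, this should keep Entity A and Entity B and drop Entity 2, because Entity 2 appears in Entity A's related_entity list.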

Yes Cobra, that should work for my use case.


Hi Cobra,
I was working on this and found a partial solution. I can collect the entities that I want using the query below:

MATCH (ent:Entity)
WHERE NOT (ent)-[]->()
AND ()-[]->(ent)
RETURN collect(ent.name) AS entity_list

and then plug in the entities one by one in the query below:

MATCH (e),(e1),p=(e)-[r*]->(e1)
WHERE all (a in relationships(p) where type(a) in ["C1","C2","C3"])
AND e1.name = "Value from entity list "
UNWIND relationships(p) as rel
return e1.name,collect(distinct type(rel)) as types,collect(distinct e.name) AS related_entity

Would you know how to connect these two queries?

If you are on a recent version of Neo4j (>= 4.1), you can use subqueries:

MATCH (ent:Entity)
WHERE NOT(ent)-[]->()
AND ()-[]->(ent)
WITH DISTINCT ent.name AS entity
CALL {
    // Import the outer variable; a correlated subquery must start with WITH
    WITH entity
    MATCH (e),(e1),p=(e)-[r*]->(e1)
    WHERE all (a in relationships(p) where type(a) in ["C1","C2","C3"])
    AND e1.name = entity
    UNWIND relationships(p) as rel
    RETURN e1.name AS name,
           collect(distinct type(rel)) AS types,
           collect(distinct e.name) AS related_entity
}
RETURN name, types, related_entity

Otherwise, you can use apoc.cypher.run() from the APOC plugin:

MATCH (ent:Entity)
WHERE NOT(ent)-[]->()
AND ()-[]->(ent)
WITH DISTINCT ent.name AS entity
CALL apoc.cypher.run('
    MATCH (e), (e1), p=(e)-[r*]->(e1)
    WHERE all (a in relationships(p) where type(a) in ["C1","C2","C3"])
    AND e1.name = entity
    UNWIND relationships(p) as rel
    RETURN e1.name AS name,
           collect(distinct type(rel)) AS types, 
           collect(distinct e.name) AS related_entity
', {entity:entity}) YIELD value
RETURN value.name AS name,
       value.types AS types,
       value.related_entity AS related_entity
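
As a possible alternative that needs neither a subquery nor APOC, the two queries could perhaps be chained with a second MATCH. This is only a sketch under the same assumptions as your queries (top-level parents carry the `Entity` label and have incoming but no outgoing relationships). It replaces the `all(...)` filter with a typed variable-length pattern, which should be equivalent for these three relationship types.

```cypher
// Step 1: find top-level parents (incoming relationships only)
MATCH (e1:Entity)
WHERE NOT (e1)-->() AND ()-->(e1)
// Step 2: walk every chain of C1/C2/C3 relationships that ends at that parent
MATCH p = (e)-[:C1|C2|C3*]->(e1)
UNWIND relationships(p) AS rel
RETURN e1.name AS name,
       collect(DISTINCT type(rel)) AS types,
       collect(DISTINCT e.name) AS related_entity
```

Because only top-level parents are matched as e1, sub-chains such as the Entity 2 row never become rows of their own, while single parent/child pairs such as Entity B are still returned.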

Regards,
Cobra