Getting to a certain label

dariusaudryc1 · May 3, 2019, 6:36am

So I have an identifier for a node, let's say a user id. Starting from this user id, I want to get the subgraph showing how that user is connected to other entities, such as other users or institutions like banks or companies. Let's say I want to go out to eight degrees of separation:

(u {id: '32dsf51'})-[*1..8]-()

The depth is this large because a transaction between two entities is represented as a node.

What I want to know is whether we can count the number of unique node labels. For example, how many bank accounts are within the subgraph above, and which ones are they?

I assume we use the function Node(), but I don't know what to put inside it. Is this where we use FOREACH?

david_allen · May 3, 2019, 4:50pm

When you do something like this, you're matching a path, not a node. On that path (that could be between 1 and 8 hops long) there could be a lot of nodes!

So you want the nodes() function, which returns the list of nodes on a path.

And you want to use it together with binding the path, like this:

MATCH p=(u {id: '32dsf51'})-[*1..8]-()
RETURN length(p), nodes(p)

If you want to count the unique node labels of everything in nodes(p), this is left as an exercise to the reader. :) But what you want to look into is that nodes(p) returns a list. You'll have lots of paths, so you'll have lots of lists. You'll need to work through those lists and build the unique labels of all of the nodes.
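One way that exercise might be approached in plain Cypher is sketched here. It flattens the node lists of all matched paths, de-duplicates the nodes, and then groups them by label. Treat it as a starting point rather than a tuned query: an unlabeled start node and a variable-length pattern of up to 8 hops can be very expensive on a densely connected graph.

```cypher
// Sketch: count distinct nodes per label across all matched paths.
MATCH p = (u {id: '32dsf51'})-[*1..8]-()
UNWIND nodes(p) AS n          // one row per node on each path
WITH DISTINCT n               // the same node appears on many paths; keep it once
UNWIND labels(n) AS label     // a node can carry more than one label
RETURN label, count(*) AS nodeCount
ORDER BY nodeCount DESC
```

Note that the starting node u is part of every path, so it is included in the counts.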

andrew_bowman · May 4, 2019, 2:12am

I'll add on a bit.

For getting distinct nodes of a subgraph, APOC Procedures should help you out, notably the path expander proc apoc.path.subgraphNodes(). This uses a special type of expansion behavior that is optimized for finding distinct nodes and otherwise pruning potential paths if we've visited a node previously.

Once we have these nodes we can UNWIND the labels of those nodes and get the count of distinct labels.

Oh, and you should definitely be using labels yourself in your match pattern, as otherwise this will do an all nodes scan to find u, which will hurt the performance of your query (you'll also want an index on the label+id for quick lookup).
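As a rough sketch of that setup, the statements here create an index and then look up the start node through a parameter. The label :Node and the parameter name $userId are placeholders, and the index syntax depends on your Neo4j version, so check which form your installation accepts.

```cypher
// Neo4j 3.x style index creation (the syntax current when this thread was written):
CREATE INDEX ON :Node(id);

// Neo4j 4.x and later use this form instead (the index name node_id is arbitrary):
// CREATE INDEX node_id FOR (n:Node) ON (n.id);

// Parameterized lookup of the start node; $userId is supplied by the client.
MATCH (u:Node {id: $userId})
RETURN u;
```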

So let's assume that we're using the label :Node in your graph (replace it with whatever you're actually using).

The query would be:

MATCH (u:Node {id:'32dsf51'}) // though you'll want to parameterize this
CALL apoc.path.subgraphNodes(u, {maxLevel:8}) YIELD node
WITH node
SKIP 1 // ignore the starting node
UNWIND labels(node) as label
RETURN count(DISTINCT label) as uniqueNodeLabels
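To see which labels occur and how many nodes carry each one, a variant along these lines might help. It excludes the start node by comparison instead of SKIP, and the second statement uses the labelFilter config option with an end-node filter to return only the nodes of one label. The label :BankAccount and the parameter $userId are assumptions for illustration; substitute whatever your graph actually uses.

```cypher
// Per-label breakdown of the distinct nodes within 8 hops of u.
MATCH (u:Node {id: $userId})
CALL apoc.path.subgraphNodes(u, {maxLevel: 8}) YIELD node
WITH u, node
WHERE node <> u                  // leave out the starting node
UNWIND labels(node) AS label
RETURN label, count(*) AS nodeCount
ORDER BY nodeCount DESC;

// Only the bank accounts: '>' marks an end-node filter, so expansion still
// passes through other nodes but only :BankAccount nodes are returned.
MATCH (u:Node {id: $userId})
CALL apoc.path.subgraphNodes(u, {maxLevel: 8, labelFilter: '>BankAccount'}) YIELD node
RETURN count(node) AS numBankAccounts, collect(node.id) AS bankAccountIds;
```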

dariusaudryc1 · May 7, 2019, 2:22am

Thanks, David and Andrew,

I haven't explored enough on APOC, but I will definitely download it and give it a try.

Something that I tried:

match (b:node)
    where b.id = '32dsf51'//total_amount > 5000000
match (b)-[*1..3]-(u)
return 
    collect(CASE WHEN ANY(x IN labels(u) WHERE x='LABEL1') THEN u.id ELSE null END) AS list_of_label1_id, 
    count(*) as num_label1

This gives me the result that I want, but is there a way of filtering by label that is faster than CASE WHEN?

andrew_bowman · May 7, 2019, 8:29am

If you're filtering for only specific, known labels that are hardcoded (in this case just 'LABEL1'), then you can use a list comprehension to do both filtering and extraction (of the node id) at once:

match (b:node)-[*1..3]-(u)  // just do the entire pattern here
    where b.id = '32dsf51'//total_amount > 5000000
with 
    [node in collect(u) WHERE node:LABEL1 | node.id] AS list_of_label1_id
return list_of_label1_id, size(list_of_label1_id) as num_label1
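Another option, offered as a sketch and not a benchmarked recommendation, is to put the label directly on the end node of the pattern so that only matching nodes are collected at all. Unlike the list comprehension, this version uses DISTINCT, so a node reachable through several paths is listed only once, which may change the count.

```cypher
// Filter by label in the pattern itself, then de-duplicate the ids.
MATCH (b:node {id: '32dsf51'})-[*1..3]-(u:LABEL1)
WITH collect(DISTINCT u.id) AS list_of_label1_id
RETURN list_of_label1_id, size(list_of_label1_id) AS num_label1
```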
