Sub-graph analytics

Goal: run analytic metrics for the nearest neighbors (nn) and nn + next nearest neighbors.

I created a bimodal graph from Kaggle's Marvel Universe Social Network:

CALL apoc.schema.assert( {},
{Comic:['name'],Hero:['name']})

CALL apoc.load.csv('https://raw.githubusercontent.com/tomasonjo/neo4j-marvel/master/data/edges.csv') yield map as row WITH row
MERGE (h:Hero {name:row.hero})
MERGE (c:Comic {name:row.comic})
MERGE (h)-[:APPEARS_IN]->(c)

I want to get the density within a local ego net: the density around a specific node with its nearest neighbors, and with its nearest neighbors plus next-nearest neighbors. I want to do this for all Heroes in the graph (6,439).

The best solution appears to be to use the apoc.path.subgraphAll() procedure.

Using the following code I can get the subgraph for a specific Hero:

MATCH (h:Hero {name:"4-D MAN/MERCURIO"})
CALL apoc.path.subgraphAll(h, {maxLevel:2})
YIELD nodes, relationships
RETURN nodes, relationships

Which returns the following:

Replacing "RETURN" with "WITH" should allow me to use just the nodes and relationships in the subgraph, but I can't figure out the syntax to reference just the results.

If I do:

MATCH (h:Hero {name:"4-D MAN/MERCURIO"})
CALL apoc.path.subgraphAll(h, {maxLevel:2})
YIELD nodes, relationships
RETURN count(nodes)

I get a result of 1, when the actual number of nodes in the subgraph is 82.

If I insert a match statement after "WITH" it reverts to the full graph and I get a count of all nodes in the full graph:

MATCH (h:Hero {name:"4-D MAN/MERCURIO"})
CALL apoc.path.subgraphAll(h, {maxLevel:2})
YIELD nodes, relationships
WITH nodes, relationships
MATCH (n)
RETURN count(n)

Result: 19,090

Is there an easy way to reference the nodes and relationships in the subgraph so I can count them or do other analytics on the actual subgraph that was returned?

I'll also want this capability when I add the ~900 villains to the graph and want to count how many are in the subgraph.

count() counts rows (records), not the elements of a list. The procedure yields a single row whose nodes column is a list, so 1 is the correct result. What you want is the size() function, which returns the length of the list.

MATCH (h:Hero {name:"4-D MAN/MERCURIO"})
CALL apoc.path.subgraphAll(h, {maxLevel:2})
YIELD nodes, relationships
RETURN size(nodes)

This returns 82
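
The same idea should extend to the per-Hero analytics in the question. Here is a minimal sketch, not a tested solution: it assumes the usual undirected simple-graph density formula 2m / (n(n-1)), which may not be the definition you want for a Hero/Comic graph, and it uses list comprehensions with a label predicate to count nodes by type.

```cypher
// Per-hero statistics for the 2-hop subgraph around each Hero.
MATCH (h:Hero)
CALL apoc.path.subgraphAll(h, {maxLevel: 2})
YIELD nodes, relationships
WITH h,
     size(nodes) AS n,                                   // nodes in the subgraph
     size(relationships) AS m,                           // relationships in the subgraph
     size([x IN nodes WHERE x:Hero]) AS heroes,          // count by label
     size([x IN nodes WHERE x:Comic]) AS comics
RETURN h.name AS hero, n, m, heroes, comics,
       // Assumed formula: density = 2m / (n(n-1)); guarded against n <= 1
       CASE WHEN n > 1 THEN 2.0 * m / (n * (n - 1)) ELSE 0.0 END AS density
ORDER BY density DESC
```

If the villains are later added with their own label (a hypothetical Villain label, say), the same comprehension pattern would count them. If you prefer row-based aggregation such as count(), UNWIND nodes AS n turns the list back into one row per node; a bare MATCH (n) introduces a new, unbound variable, which is why it matched the whole graph.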
