Best way to create a metagraph of the results from a Cypher query that returns paths

Joel
Ninja

I often work with medium-sized (millions of nodes/edges) heterogeneous graphs with more than a dozen node labels and dozens of relationship types. Queries often return too many results to display, so I'm in the habit of always returning a count() first. It occurred to me that, in the same way the metagraph provides an overview of the entire database, a metagraph of a Cypher query result could be very useful. I have a draft that seems to work in trivial cases, but it does not feel like a clean approach (sub-optimal and messy to use), and it might not work in all cases. I believe there must be a better way; possibly there is a simple trick that I didn't find or think of.

I will give a trivial example using the Northwind database (the built-in Guide demo). For this graph, with counts shown on nodes instead of names, the apoc.meta.graph() output is:
[Image: apoc.meta.graph() output for the Northwind graph, with counts shown on nodes]

A query may go through multiple node and relationship types. A trivial example in this case could be:

MATCH p= (p1:Supplier {supplierID:'20'})-[*..2]->(p2) RETURN p

Which returns:
[Image: graph view of the query result]

The Cypher query "metagraph" would look like this:
[Image: metagraph of the query result]

I used the Cypher below to build this. The query has morphed over time as I tried different angles on the task (there are many suboptimal solutions), but I'm looking for a step change. I'm convinced there must be a better way: perhaps a novel approach, function(s) that could help, or maybe even some kind of next-level Cypher. Having to run the query twice, for example, bugs me, and this approach also assumes there is only one label per node (which is not true of some of my graphs).

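// Pass 1: one virtual node per distinct label list in the result
MATCH p = (p1:Supplier {supplierID:'20'})-[*..2]->(p2)    // PUT QUERY HERE - nodes
UNWIND nodes(p) AS n
WITH DISTINCT n AS dn
// Aggregate in its own step so the grouping key (l) is explicit
WITH labels(dn) AS l, count(*) AS nodeCount
WITH apoc.create.vNode(l, {name: head(l), count: nodeCount}) AS metaNode
// Index the virtual nodes by name (the first label) for the lookups below
WITH apoc.map.groupBy(collect(metaNode), 'name') AS ml
// Pass 2: one virtual relationship per (start label, type, end label)
MATCH p = (p1:Supplier {supplierID:'20'})-[*..2]->(p2)  // PUT QUERY HERE TOO - relationships
UNWIND relationships(p) AS r
// Note: a relationship that appears in several paths is counted once per path
WITH head(labels(startNode(r))) AS cFrom, head(labels(endNode(r))) AS cTo, type(r) AS tr, count(*) AS relCount, ml
RETURN ml[cFrom] AS from, ml[cTo] AS to, apoc.create.vRelationship(ml[cFrom], tr, {count: relCount}, ml[cTo]) AS rel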
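
One possible way around both limitations (running the query twice, and assuming a single label per node) is sketched below. It is an untested sketch, not a verified solution: it collects the paths once, de-duplicates nodes and relationships with apoc.coll.toSet, and keys each virtual node on its full sorted label list joined with ':' instead of head(labels(n)). The ':' separator and the variable names (paths, ns, rs, metaNodes, fromKey, toKey) are my own choices, and it assumes the whole result fits in memory as a single collected list.

```cypher
MATCH p = (:Supplier {supplierID:'20'})-[*..2]->()   // PUT QUERY HERE (once)
WITH collect(p) AS paths
// Distinct nodes and relationships across all returned paths
WITH apoc.coll.toSet(apoc.coll.flatten([x IN paths | nodes(x)])) AS ns,
     apoc.coll.toSet(apoc.coll.flatten([x IN paths | relationships(x)])) AS rs
// Count nodes per distinct (sorted) label combination
UNWIND ns AS n
WITH rs, apoc.coll.sort(labels(n)) AS ls, count(*) AS nodeCount
WITH rs, ls, nodeCount, apoc.text.join(ls, ':') AS key
// One virtual node per label combination, stored in a map keyed by that combination
WITH rs, collect([key, apoc.create.vNode(ls, {name: key, count: nodeCount})]) AS pairs
WITH rs, apoc.map.fromPairs(pairs) AS metaNodes
// Count relationships per (start labels, type, end labels)
UNWIND rs AS r
WITH metaNodes,
     apoc.text.join(apoc.coll.sort(labels(startNode(r))), ':') AS fromKey,
     type(r) AS relType,
     apoc.text.join(apoc.coll.sort(labels(endNode(r))), ':') AS toKey,
     count(*) AS relCount
RETURN metaNodes[fromKey] AS from,
       metaNodes[toKey] AS to,
       apoc.create.vRelationship(metaNodes[fromKey], relType, {count: relCount}, metaNodes[toKey]) AS rel
```

Because the relationships are de-duplicated before counting, the counts here are per distinct relationship, not per path occurrence. If the query returns no relationships at all (only zero-length paths), the final UNWIND produces no rows, so the meta nodes would need to be returned separately in that case.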

I may be overthinking this. Is there a less complicated approach?

Or maybe this is best done in a plugin? I like the idea enough that I might just do it, if that is the best way.

2 REPLIES

Sounds like a great thing to add to APOC. Would you mind creating a GH issue?

It could be an aggregation function, a regular function, or a procedure,

like apoc.meta.result(list-of-things)

GH feature request issue filed.