Create Graph Data Science Graph from APOC Graph

fschlatt · November 13, 2020, 1:00pm

I would like to run some graph data science algorithms on a spanning-tree subgraph (nodes and relationships) of a larger graph. To get the spanning tree, I use the APOC library.

MATCH (e:Entity)
WHERE e.property IN ["foo", "bar"] // some property condition
CALL apoc.path.spanningTree(e, {minLevel: 0, maxLevel: 3})
YIELD path
WITH collect(path) as paths
CALL apoc.graph.fromPaths(paths, "tree", null)
YIELD graph
RETURN *;

I'm now at a loss as to how to get that subgraph into a graph data science graph, though. I don't think the native projection can capture the complex substructure of a spanning tree grown from specific starting nodes.

The only way I can currently see is to use the Cypher projection. I would need to extract the node and relationship ids and match on them in the projection queries, but that approach seems very inelegant.

The easiest way would probably be if I could directly pass node and relationship ids to gds.graph.create, but that doesn't seem to be a possibility. Maybe I am missing something very obvious, but any and all suggestions would be appreciated.

Joel · November 13, 2020, 8:53pm

I tinkered with this a bit. As far as I can tell, the GDS procedures only operate on the main database store; I couldn't get them to recognize virtual nodes/relationships (or a virtual graph). That is what I expected, but I gave it a try anyway. At the moment, I can see two ways to do this.

Use Cypher projection and just run the query twice. You'll need to adjust it for your dataset and use case, but my test query is below and seems to work OK. Yes, this runs the query twice, but that is how GDS Cypher projection is designed: one query for the nodes and another for the relationships.

CALL gds.graph.create.cypher(
    'my-cypher-graph',
    'MATCH (e:Gene {name:"SNCA"})
CALL apoc.path.spanningTree(e, {minLevel: 0, maxLevel: 3, limit: 25}) YIELD path
UNWIND nodes(path) AS n RETURN id(n) AS id',
    'MATCH (e:Gene {name:"SNCA"})
CALL apoc.path.spanningTree(e, {minLevel: 0, maxLevel: 3, limit: 25}) YIELD path
UNWIND relationships(path) AS r
RETURN id(startNode(r)) AS source, id(endNode(r)) AS target, type(r) AS type'
)

Alternatively, one could use tagging to mark the subgraph by adding a new label to all of its nodes (caveat: this modifies the graph!). I believe one could then use native projection (or Cypher projection) to extract the subgraph easily. If identifying the subgraph required a complicated or compute-intensive process, this approach might be worth exploring, but probably not for a simple spanning tree...
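
A minimal sketch of the tagging idea, reusing the test query above. The label InTree and the graph name 'tagged-tree' are hypothetical names chosen for illustration, and I have not run this against a real dataset. One thing to keep in mind: a native projection on a label picks up every relationship between the tagged nodes, not only the relationships that belong to the spanning tree, so the result can contain more edges than the tree itself.

```cypher
// 1. Tag every node in the spanning tree (this writes to the database).
MATCH (e:Gene {name: "SNCA"})
CALL apoc.path.spanningTree(e, {minLevel: 0, maxLevel: 3, limit: 25}) YIELD path
UNWIND nodes(path) AS n
WITH DISTINCT n
SET n:InTree;  // InTree is a hypothetical temporary label

// 2. Native projection restricted to the tagged nodes.
//    '*' projects all relationship types between those nodes.
CALL gds.graph.create('tagged-tree', 'InTree', '*')
YIELD graphName, nodeCount, relationshipCount
RETURN graphName, nodeCount, relationshipCount;

// 3. Remove the temporary label once the in-memory graph exists.
MATCH (n:InTree)
REMOVE n:InTree;
```

The three statements are meant to be run one after another. If only the tree edges are wanted, the Cypher projection is probably the safer choice.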

fschlatt · November 13, 2020, 9:25pm

I hadn't thought of adding labels. That might be a good option. Thanks for the help though!

I'll also try asking in the GDS GitHub repo to see if the devs have any other ideas. If anything comes up, I'll post it here as well.

stuart.laurie1 · May 19, 2021, 5:55pm

I have just been looking into a similar question. Using the parameters option of the Cypher projection, you can pass in the node and relationship ids directly (see the GDS docs on projection parameters).

So, a query using the base example from apoc.graph.fromPaths docs would be the following:

MATCH path = (:Person)-[:ACTED_IN]->(:Movie)
WITH collect(path) AS paths
CALL apoc.graph.fromPaths(paths,'test', {})
YIELD graph AS g
WITH [node in g.nodes | ID(node)] AS nodeIds, 
     [rel in g.relationships | [ID(startNode(rel)),ID(endNode(rel))]] AS relIds
CALL gds.graph.create.cypher(
    'test-param-input',
    'UNWIND $nodes AS id RETURN id',
    'UNWIND $relationships AS rel RETURN rel[0] as source, rel[1] as target',
    {
       parameters: { nodes: nodeIds, relationships: relIds }
    }
) YIELD graphName, nodeCount, relationshipCount
RETURN graphName, nodeCount, relationshipCount
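
As a sketch, the same parameter pattern applied to the spanning tree from the original question might look like the following. The Entity label and property filter come from the first post; the graph name 'tree-graph' is a hypothetical choice, and this assumes the ids collected from the APOC virtual graph contain no duplicates.

```cypher
// Build the spanning tree once, then hand its ids to the Cypher projection.
MATCH (e:Entity)
WHERE e.property IN ["foo", "bar"]
CALL apoc.path.spanningTree(e, {minLevel: 0, maxLevel: 3})
YIELD path
WITH collect(path) AS paths
CALL apoc.graph.fromPaths(paths, 'tree', {})
YIELD graph AS g
// Reduce the virtual graph to plain id lists that can be passed as parameters.
WITH [n IN g.nodes | id(n)] AS nodeIds,
     [r IN g.relationships | [id(startNode(r)), id(endNode(r))]] AS relIds
CALL gds.graph.create.cypher(
    'tree-graph',  // hypothetical graph name
    'UNWIND $nodes AS id RETURN id',
    'UNWIND $relationships AS rel RETURN rel[0] AS source, rel[1] AS target',
    { parameters: { nodes: nodeIds, relationships: relIds } }
) YIELD graphName, nodeCount, relationshipCount
RETURN graphName, nodeCount, relationshipCount;
```

Unlike running the APOC query inside both projection strings, this computes the spanning tree only once.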
