Processing after creating a 'virtual' graph/sub-graph

mike · February 28, 2019, 4:21pm

This is a general question about what processing is possible after returning a virtual graph/sub-graph.

Our data model includes nodes/rels needed for data lineage and other 'admin' purposes. To run analysis on the 'real' nodes/rels we are creating virtual sub-graphs. I'm trying to understand the implications of taking this approach...

Is it feasible to 'virtualise' an entire graph? In our case that would mean lifting out 18M nodes from 100M and maybe 50M rels from 230M.
Is any cypher functionality NOT able to run on virtual graphs?
Can the algorithms run on virtual graphs without restriction?

Another approach could be to dynamically build a second db from the master db limiting the data model to only the real nodes/rels.

Thanks in advance.
Mike

michael.hunger · March 1, 2019, 1:06am

What kind of functionality would you want to run?

The existing virtual nodes and rels in APOC are mainly meant for visualization purposes.
There are a bunch of functions in apoc to allow you to access their properties, labels, type, id etc.
Cypher itself uses the lower level APIs so those virtual nodes with negative ids don't exist for it.
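As a rough illustration of the accessor functions mentioned above, here is a minimal sketch. The `Person` label and `name` property are made up for the example; `apoc.create.vNode`, `apoc.node.labels`, `apoc.any.properties` and `apoc.node.id` are APOC functions, but check that your APOC version includes them.

```cypher
// Build a virtual node: it exists only in this query's result, not in the store.
// 'Person' and 'name' are hypothetical, used for illustration only.
WITH apoc.create.vNode(['Person'], {name: 'Alice'}) AS v
// Read it back through the APOC accessor functions.
RETURN apoc.node.labels(v)    AS labels,
       apoc.any.properties(v) AS props,
       apoc.node.id(v)        AS id
```

A later `MATCH` in the same query will not find this node, since it was never written to the database.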

While virtualization would probably work at this scale, I'd recommend just using aggregation as needed and running regular Cypher queries on that aggregated data.

For graph algorithms it should work fine to e.g. use algo.graph.load to load your projection into a named graph and then run multiple algorithms on it, and either consume the results in a client or write the computations back to the graph.
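A sketch of that workflow, assuming the 'real' nodes carry a label `Real` and are connected by a relationship type `REAL_REL` (both names are hypothetical), and that the Neo4j Graph Algorithms library (`algo.*`) is installed. The available config options depend on the library version.

```cypher
// Load only the 'real' nodes/rels into a named in-memory graph.
// 'real-graph', 'Real' and 'REAL_REL' are assumed names.
CALL algo.graph.load('real-graph', 'Real', 'REAL_REL', {graph: 'heavy'})
YIELD name, nodes, loadMillis;

// Run an algorithm on the named graph and stream the results to the client.
CALL algo.pageRank.stream(null, null, {graph: 'real-graph'})
YIELD nodeId, score
RETURN algo.asNode(nodeId) AS node, score
ORDER BY score DESC
LIMIT 10;

// Release the memory once all algorithms have run.
CALL algo.graph.remove('real-graph');
```

Loading once and reusing the named graph avoids re-reading the projection from the store for every algorithm call.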

mike · March 7, 2019, 7:00pm

Thanks Michael,

> There are a bunch of functions in apoc to allow you to access their properties, labels, type, id etc

Is this the case for output from all apocs? For example I am using apoc.path.expand and its derivatives and want to access properties from the returned path's node/rels.

> What kind of functionality would you want to run?

Yes, we want to run algos, thanks for the algo.graph.load tip

michael.hunger · March 9, 2019, 1:49pm

Only the virtual nodes that APOC creates on the fly are inaccessible to Cypher/Kernel-SPI.

For expand etc. it's real nodes from the DB that are returned.
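Because the paths contain real nodes and relationships, plain Cypher functions work on them. A minimal sketch, where the `Real` label, `REAL_REL` type and `id` property are hypothetical stand-ins for your own model:

```cypher
// Pick a start node; the label and the id property are assumed names.
MATCH (start:Real {id: 42})
// Expand 1 to 3 hops along outgoing REAL_REL relationships, visiting only Real nodes.
CALL apoc.path.expand(start, 'REAL_REL>', '+Real', 1, 3) YIELD path
// The path holds real nodes/rels, so regular Cypher accessors apply.
RETURN [n IN nodes(path) | properties(n)]   AS nodeProps,
       [r IN relationships(path) | type(r)] AS relTypes,
       length(path)                         AS hops
```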
