How to create a subgraph and run graph algorithms only on that?

apremgeorge · November 26, 2019, 8:00am

Hi,
Using py2neo, I have a graph

from py2neo import Graph

mygraph = Graph("bolt://localhost:7687", auth=("neo4j", "***"))

I run all my queries on mygraph, including the graph algorithms:

betweenness_query = """CALL algo.betweenness('', '',{
  concurrency: 8,
  direction: 'Both',
  writeProperty: 'betweenness'
})"""
mygraph.run(betweenness_query).data()
betweenness_table = """ MATCH (n) RETURN n as Nodes, id(n) as Node_id, n.betweenness as BetweenNess """
BetweenNess = mygraph.run(betweenness_table).to_data_frame()
BetweenNess

I am looking to filter out a set of nodes:

mygraph.run("MATCH (t:Trans {TrxID: 'T1'})-[*1..5]-(x) RETURN x").to_ndarray()

From the above (or from any traversal over a few nodes) I get a set of nodes.

I would like to save that set in a new variable as a subgraph, then run the graph algorithm only on those nodes.

Is that possible?

Thanks in advance

Thomas_Silkjaer · November 26, 2019, 10:57am

You can store it as a projected graph model (Graph Catalog - Neo4j Graph Data Science). If you are working with several node labels, look at "Loading multiple node properties" to see how to project multiple node properties.

Then you can use this projection in your algorithm by adding a graph:'nameForYourGraph' option.
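A rough sketch of that flow from py2neo, assuming the Graph Algorithms library (the `algo.*` procedures used elsewhere in this thread) and its `algo.graph.load` procedure for named graphs. The graph name, label, and relationship type below are hypothetical placeholders, and the `{graph: 'huge'}` setting is an assumption; check the documentation for your library version before relying on it.

```python
def named_graph_statements(name, label, rel_type):
    """Build (cypher, parameters) pairs: load a named graph, then run on it.

    Assumes algo.graph.load(name, label, relationshipType, config) exists in
    the installed Graph Algorithms library; all argument values are
    placeholders chosen for illustration.
    """
    load = (
        "CALL algo.graph.load($name, $label, $relType, {graph: 'huge'})",
        {"name": name, "label": label, "relType": rel_type},
    )
    # Reference the stored projection through the graph option, as described above.
    run = (
        "CALL algo.betweenness.stream(null, null, {graph: $name}) "
        "YIELD nodeId, centrality "
        "RETURN nodeId, centrality ORDER BY centrality DESC",
        {"name": name},
    )
    return [load, run]


statements = named_graph_statements("nameForYourGraph", "Trans", "SENT")
for cypher, params in statements:
    print(cypher, params)
    # With a live database: mygraph.run(cypher, params).data()
```

The statements are only built and printed here, so the sketch runs without a database; uncomment the `mygraph.run` line to execute them.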

apremgeorge · November 27, 2019, 12:02am

Hi @Thomas_Silkjaer, thanks for that.

I have this small group within a large set of nodes.

Is it possible to create the projection from a traversal with specific property values, like the two queries below, instead of from labels and relationship types?

MATCH (t:PayTransactions {TrxID: 'T17'})-[*1..5]-(x) RETURN x

MATCH path = allShortestPaths((p:PayTransactions {TrxID:'T17'})--(pp:PayTransactions {TrxID:'T18'})) RETURN path

Thanks

Thomas_Silkjaer · November 27, 2019, 7:56am

With a Cypher projection, the first Cypher query must output the IDs of all nodes in the subgraph, and the second must output the source and target IDs of all relationships between them.
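As a hedged sketch of what those two queries could look like for the traversal above: the node query returns one `id` per node reached from the seed, and the relationship query returns every relationship whose two ends are both in that set. The `[*0..5]` range (so that the seed itself is included), the `members` alias, and the `trx` parameter name are my assumptions; the `graph: 'cypher'` and `params` options are the ones used with the `algo.*` procedures in this thread.

```python
# Node query: IDs of every node within 5 hops of the seed transaction.
NODE_QUERY = (
    "MATCH (t:PayTransactions {TrxID: $trx})-[*0..5]-(x) "
    "RETURN DISTINCT id(x) AS id"
)

# Relationship query: all relationships with both ends inside that node set.
REL_QUERY = (
    "MATCH (t:PayTransactions {TrxID: $trx})-[*0..5]-(x) "
    "WITH collect(DISTINCT x) AS members "
    "UNWIND members AS a "
    "MATCH (a)-->(b) WHERE b IN members "
    "RETURN id(a) AS source, id(b) AS target"
)

# The two projection queries are passed to the procedure as parameters.
CALL = (
    "CALL algo.betweenness.stream($nodeQuery, $relQuery, "
    "{graph: 'cypher', params: {trx: $trx}}) "
    "YIELD nodeId, centrality "
    "RETURN nodeId, centrality ORDER BY centrality DESC"
)


def subgraph_betweenness_call(trx_id):
    """Return the statement and parameters for a py2neo Graph.run call."""
    params = {"nodeQuery": NODE_QUERY, "relQuery": REL_QUERY, "trx": trx_id}
    return CALL, params


cypher, params = subgraph_betweenness_call("T17")
print(cypher)
# With a live database: mygraph.run(cypher, params).to_data_frame()
```

Note that the relationship query returns the actual relationships inside the subgraph, not one row per (seed, reachable node) pair.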

apremgeorge · November 29, 2019, 4:15am

Hi @Thomas_Silkjaer

For the Cypher projection, I ran:

CALL algo.betweenness.sampled.stream(
  'MATCH (n) RETURN id(n) AS id',
  'MATCH (n)-[*0..2]-(m) WHERE n.CustomerNo = $cno RETURN id(n) as source, id(m) as target',
  {graph:'cypher', params: {cno : 'C13'} }
);

I got all centralities as zero.

When I run only

MATCH (n)-[*0..2]-(m) WHERE n.CustomerNo ='C13' RETURN id(n) as source, id(m) as target

I get the nodes as shown below. May I know what the issue is here, please?

and my final call is

CALL algo.betweenness.sampled.stream(
  'MATCH (n) RETURN id(n) AS id',
  'MATCH (n)-[*0..2]-(m) WHERE n.CustomerNo = $cno RETURN id(n) as source, id(m) as target',
  {graph:'cypher', params: {cno : 'C13'} }
) YIELD nodeId, centrality
MATCH (customer:Customer) WHERE id(customer) = nodeId
RETURN customer.CustomerNo AS Customer, centrality AS BetweennessCentrality
ORDER BY centrality DESC limit 5;

which gives me

Thanks

apremgeorge · November 29, 2019, 4:59am

Instead of running the algorithm on the Cypher projection above,

what if I run betweenness on the whole graph and then return the values for the subgraph?

// Betweenness
CALL algo.betweenness.sampled('', '',{
  concurrency: 8,
  direction: 'Both',
  maxDepth: null,
  probability: null,
  strategy: 'random',
  writeProperty: 'approxBetweenness'
})

// Return max betweenness
MATCH (n)-[*0..2]-(m) WHERE n.CustomerNo = 'C13' RETURN id(m) AS ID, m.CustomerNo as Customer, m.betweenness as BetweennessCentrality
ORDER BY m.betweenness DESC
LIMIT 5

and I get

Can you please confirm whether both approaches will return the same betweenness values?

Thanks

Thomas_Silkjaer · November 29, 2019, 7:45am

Sorry, my knowledge of the algos ends here. I hope someone else pitches in.

apremgeorge · December 1, 2019, 11:20pm

Thanks @Thomas_Silkjaer, no worries.

alicia.frame · December 3, 2019, 10:58pm

The betweenness centrality of a node is calculated as a sum over every pair of other nodes in the graph: for each pair, take the number of shortest paths between them that pass through that node, divided by the total number of shortest paths between that pair.
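Written as a formula, this is the standard definition (the symbols are my own notation, not taken from the procedure documentation):

```latex
C_B(v) = \sum_{s \neq v \neq t} \frac{\sigma_{st}(v)}{\sigma_{st}}
```

Here $\sigma_{st}$ is the number of shortest paths between nodes $s$ and $t$, and $\sigma_{st}(v)$ is the number of those paths that pass through $v$.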

If you compute betweenness centrality on a subgraph, you have a different set of nodes that you're calculating shortest paths between, so it's likely you'll get different metrics. My intuition is you'll probably get higher betweenness centrality metrics for the nodes in your subgraph vs when they're in the full graph (you're cutting out a whole lot of paths that don't go through them).
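A minimal sketch of this effect in plain Python, assuming an undirected, unweighted toy graph (a five-node path A-B-C-D-E, which is hypothetical and not from the data in this thread). The `betweenness` helper is an illustrative Brandes-style implementation, not the Neo4j procedure.

```python
from collections import deque


def betweenness(nodes, edges):
    """Unnormalised betweenness for an undirected, unweighted graph."""
    adj = {n: set() for n in nodes}
    for a, b in edges:
        if a in adj and b in adj:  # keep only edges inside the node set
            adj[a].add(b)
            adj[b].add(a)
    score = {n: 0.0 for n in nodes}
    for s in nodes:
        # BFS from s: count shortest paths (sigma) and record predecessors.
        sigma = {n: 0 for n in nodes}
        dist = {n: -1 for n in nodes}
        preds = {n: [] for n in nodes}
        sigma[s], dist[s] = 1, 0
        order, queue = [], deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating path fractions.
        delta = {n: 0.0 for n in nodes}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was visited from both ends, so halve the totals.
    return {n: c / 2 for n, c in score.items()}


edges = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]
full = betweenness(["A", "B", "C", "D", "E"], edges)
sub = betweenness(["B", "C", "D"], edges)  # subgraph induced by B, C, D
print(full["C"], sub["C"])
```

On this toy graph the raw score of C drops from 4 to 1 on the subgraph, because the pairs involving A and E disappear, while its share of all possible pairs rises from 4/6 to 1/1. Whether the numbers go up or down therefore depends on whether the scores are normalised, but either way they are not the same as on the full graph.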

Does that make sense?
