Hi,
Using py2neo, I have a graph
mygraph = Graph("bolt://localhost:7687", auth=("neo4j", "***"))
I do all my queries in mygraph, also the graph algorithms
betweenness_query = """CALL algo.betweenness('', '',{
concurrency: 8,
direction: 'Both',
writeProperty: 'betweenness'
})"""
mygraph.run(betweenness_query).data()
betweenness_table = """ MATCH (n) RETURN n as Nodes, id(n) as Node_id, n.betweenness as BetweenNess """
BetweenNess = mygraph.run(betweenness_table).to_data_frame()
BetweenNess
I am looking to filter out a set of nodes:
mygraph.run("MATCH (t:Trans {TrxID: 'T1'})-[*1..5]-(x) RETURN x").to_ndarray()
From the above I get a set of nodes (it could equally be any traversal over a few nodes).
I would like to save that set in a new variable as a subgraph, then run the graph algorithm only on those nodes.
Is that possible?
Thanks in advance
You can store it as a projected graph model (Graph Catalog - Neo4j Graph Data Science). If you are working with several node labels, look at "Loading multiple node properties" to see how you project multiple node properties.
Then you can use this projection in your algorithm by adding a graph:'nameForYourGraph' option.
Hi @Thomas_Silkjaer , Thanks for that,
I have this small group in a large set of nodes
Is it possible to create the projection from a traversal with specific property values, like the two queries below, instead of from labels and relationship types?
MATCH (t:PayTransactions {TrxID: 'T17'})-[*1..5]-(x) RETURN x
MATCH path = allShortestPaths((p:PayTransactions {TrxID:'T17'})--(pp:PayTransactions {TrxID:'T18'})) RETURN path
Thanks
With a Cypher projection, the first Cypher query must output the IDs of all nodes in the subgraph, and the second must output the source and target IDs of all relationships.
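Applied to the traversal from the question, the two statements could look roughly like the sketch below. The label `PayTransactions`, the property `TrxID` and the `graph:'cypher'` / `params` options are taken from this thread; the helper name `cypher_projection`, the parameter name `value`, and the way the relationship statement collects the traversal's nodes and then matches real relationships between them are assumptions for illustration, not something confirmed here.

```python
def cypher_projection(label, key, max_hops):
    """Hypothetical helper: build the node and relationship statements
    for a Cypher projection around one anchor node."""
    # Traversal shared by both statements; $value is filled in via `params`.
    anchor = f"MATCH (t:{label} {{{key}: $value}})-[*0..{max_hops}]-(x)"
    # Statement 1: the ids of every node in the subgraph.
    node_query = f"{anchor} RETURN DISTINCT id(x) AS id"
    # Statement 2: source/target ids of relationships that actually
    # connect two nodes of that subgraph (assumed formulation).
    rel_query = (
        f"{anchor} WITH collect(DISTINCT x) AS nodes "
        "MATCH (a)-[]->(b) WHERE a IN nodes AND b IN nodes "
        "RETURN id(a) AS source, id(b) AS target"
    )
    return node_query, rel_query


node_query, rel_query = cypher_projection("PayTransactions", "TrxID", 5)
print(node_query)
print(rel_query)

# Assumed usage with the py2neo graph from the first post (needs a live database):
# call = ("CALL algo.betweenness.sampled.stream($nodeQuery, $relQuery, "
#         "{graph: 'cypher', params: {value: $value}}) YIELD nodeId, centrality "
#         "RETURN nodeId, centrality ORDER BY centrality DESC")
# mygraph.run(call, nodeQuery=node_query, relQuery=rel_query, value="T17").to_data_frame()
```

The point of the second statement is that each returned row should be a pair of nodes joined by an actual relationship, since those rows become the edges the algorithm walks.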
Hi @Thomas_Silkjaer
For Cypher Projection
CALL algo.betweenness.sampled.stream(
'MATCH (n) RETURN id(n) AS id',
'MATCH (n)-[*0..2]-(m) WHERE n.CustomerNo = $cno RETURN id(n) as source, id(m) as target',
{graph:'cypher', params: {cno : 'C13'} }
);
All the centralities come back as zero.
But when I run only the relationship query,
MATCH (n)-[*0..2]-(m) WHERE n.CustomerNo ='C13' RETURN id(n) as source, id(m) as target
I do get nodes back. May I know what the issue is here, please?
and my final call is
CALL algo.betweenness.sampled.stream(
'MATCH (n) RETURN id(n) AS id',
'MATCH (n)-[*0..2]-(m) WHERE n.CustomerNo = $cno RETURN id(n) as source, id(m) as target',
{graph:'cypher', params: {cno : 'C13'} }
) YIELD nodeId, centrality
MATCH (customer:Customer) WHERE id(customer) = nodeId
RETURN customer.CustomerNo AS Customer, centrality AS BetweennessCentrality
ORDER BY centrality DESC limit 5;
which gives me
Thanks
Instead of running the Cypher projection above,
what if I run betweenness on the whole graph and then return the values for the traversal?
#Betweenness
CALL algo.betweenness.sampled('', '',{
concurrency: 8,
direction: 'Both',
maxDepth: null,
probability: null,
strategy: 'random',
writeProperty: 'approxBetweenness'
})
#Return Max Betweenness
MATCH (n)-[*0..2]-(m) WHERE n.CustomerNo = 'C13' RETURN id(m) AS ID, m.CustomerNo as Customer, m.betweenness as BetweennessCentrality
ORDER BY m.betweenness DESC
LIMIT 5
and I get
Can you please confirm whether both approaches will return the same betweenness values?
Thanks
Sorry, my knowledge of the algos ends here. Hope someone else pitches in.
Thanks @Thomas_Silkjaer , no worries
The betweenness centrality of a node is the sum, over every pair of other nodes, of the fraction of shortest paths between that pair that pass through the node: the number of shortest paths through the node divided by the total number of shortest paths between the pair.
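Written as a formula, in the usual notation (the symbols are mine, not from this thread), where \sigma_{st} is the number of shortest paths between s and t and \sigma_{st}(v) is the number of those that pass through v:

$$
g(v) = \sum_{s \neq v \neq t} \frac{\sigma_{st}(v)}{\sigma_{st}}
$$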
If you compute betweenness centrality on a subgraph, you have a different set of nodes that you're calculating shortest paths between, so it's likely you'll get different metrics. My intuition is you'll probably get higher betweenness centrality metrics for the nodes in your subgraph vs when they're in the full graph (you're cutting out a whole lot of paths that don't go through them).
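To see the effect on a toy case, here is a minimal pure-Python sketch using Brandes' algorithm on an unweighted, undirected graph. The five-node path A-B-C-D-E and the helper names `betweenness` and `induced` are made up for illustration, and this is not the Neo4j implementation, so the library's numbers (especially from the sampled variant) may be scaled differently.

```python
from collections import deque


def betweenness(adj):
    """Raw betweenness for an unweighted, undirected graph.
    adj maps each node to the set of its neighbours."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        # BFS from s: count shortest paths (sigma) and record predecessors.
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        preds = {v: [] for v in adj}
        sigma[s], dist[s] = 1, 0
        order, queue = [], deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating dependencies.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Every undirected pair was visited from both ends.
    return {v: c / 2 for v, c in score.items()}


def induced(adj, keep):
    """Subgraph containing only the nodes in `keep` and the edges between them."""
    keep = set(keep)
    return {v: adj[v] & keep for v in keep}


# Path graph A - B - C - D - E
graph = {
    "A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"},
    "D": {"C", "E"}, "E": {"D"},
}
full = betweenness(graph)
sub = betweenness(induced(graph, ["B", "C", "D"]))
print(full["C"], sub["C"])  # 4.0 on the full graph, 1.0 on the subgraph
```

In this example the raw score of C drops from 4.0 to 1.0 because most node pairs disappear with the removed nodes. If you divide by the number of possible pairs, (n-1)(n-2)/2, which is 6 for the full graph and 1 for the subgraph, C goes from about 0.67 to 1.0 instead. So the two approaches give different numbers, and whether they go up or down depends on normalization and on the shape of the graph.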
Does that make sense?