Help re-writing query

performance
cypher

(Leandro) #1

I currently have a query in this style:

MATCH (a:NodeA { `value`: 9 })
MATCH (b:NodeB { `othervalue`: 45 })
WITH a,b 
MATCH p = ()-[*1..5]-() WHERE (a in nodes(p) OR b in nodes(p))
RETURN DISTINCT(p)

The idea is to return all the different graphs in the database that contain any of those nodes.

The query seems to work, but it is killing the server.

Is there a more performant way to write this query?
Maybe using APOC?

I'm using Neo4j 3.5.0

Thanks!
Leandro


(Andrew Bowman) #2

Yes, APOC has a solution that should work here.

apoc.path.subgraphAll() can be used to get the subgraph from a starting node, so in this case we can get all reachable nodes and relationships:

MATCH (a:NodeA { `value`: 9 }), (b:NodeB { `othervalue`: 45 })
CALL apoc.path.subgraphAll([a, b], {maxLevel:5}) YIELD nodes, relationships
RETURN nodes, relationships

Note that this will give you back all relationships between all reachable nodes (up to depth 5), so it's possible that if there are two reachable nodes at distance 5 that are connected, you'll get the relationship between them too.
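
If APOC is not an option, a plain-Cypher sketch is to anchor the variable-length pattern on the start nodes instead of matching every path in the database and filtering afterwards. This assumes that paths starting at a or b are enough: a path of up to 5 hops that merely passes through a or b splits into two shorter paths starting there, so the same nodes and relationships should be covered, but the rows returned differ from the original query. It still enumerates every path, so it can remain expensive on dense graphs.

```cypher
// Sketch: expand outwards from each start node rather than filtering all paths.
// UNION (without ALL) removes duplicate paths across the two branches.
MATCH p = (a:NodeA {value: 9})-[*1..5]-()
RETURN p
UNION
MATCH p = (b:NodeB {othervalue: 45})-[*1..5]-()
RETURN p
```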

Does this work for you, or are you looking for something else?


(Leandro) #3

Thanks for the quick reply.

Yes, that is exactly what I was trying to do, but for some reason I can't get the same results.

MATCH (a:MyNode { `value`: 9 })
WITH a
MATCH p = ()-[*1..1]-()
WHERE a in nodes(p)
RETURN DISTINCT(p)

This returns 4 graphs, with 3 nodes each.

MATCH (a:MyNode { `value`: 9 })
CALL apoc.path.subgraphAll(a, {maxLevel:1}) YIELD nodes, relationships
RETURN nodes, relationships;

This returns only 1 graph, with 3 nodes.

So instead of the 4 graphs, it is only returning one.

Any idea why?

Just in case:

MATCH (a:MyNode {value: 9 }) return a

Returns 4 nodes.

Thanks!


(Andrew Bowman) #4

You may need to explain what you mean by "returning graphs". Your query is returning paths, not graphs. The apoc call is returning a single row with a collection of reachable nodes and the collection of relationships between those nodes, and that combination represents the subgraph reachable at the given distance from the starting node.
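
As a rough illustration of the difference, the paths from a Cypher match can be collapsed into the same shape the APOC call returns, which makes the two results easier to compare. The aliases pathNodes and pathRels are just names picked for this sketch, and the pattern is anchored on the start node for simplicity.

```cypher
// Sketch: collapse the 1-hop paths around the start nodes into two distinct lists,
// mirroring the nodes/relationships collections yielded by apoc.path.subgraphAll.
MATCH p = (a:MyNode {value: 9})-[*1..1]-()
UNWIND nodes(p) AS n
UNWIND relationships(p) AS r
RETURN collect(DISTINCT n) AS pathNodes, collect(DISTINCT r) AS pathRels
```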


(Leandro) #5

Ok, my apologies if I'm using the wrong terminology. Please see attached images.

To put it another way, the first query is returning 12 nodes, while the second returns only 3.

Thanks again for your time!


(Andrew Bowman) #6

I can reproduce what you're seeing, but looking at the code view of the results (and running the query with an UNWIND on the lists) I can see that the proper results are being returned. The query itself is correct and returns the correct values: 1 row per starting node a, and for each row a separate list of nodes and relationships from the expansion.
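
For anyone who wants to check this without relying on the visualization, something along these lines should show what each row holds. The aliases startId, nodeCount and relCount are made up for this sketch.

```cypher
// Sketch: one row per start node, with the size of each list the procedure yielded.
MATCH (a:MyNode {value: 9})
CALL apoc.path.subgraphAll(a, {maxLevel: 1}) YIELD nodes, relationships
RETURN id(a) AS startId, size(nodes) AS nodeCount, size(relationships) AS relCount
```

Replacing the RETURN with UNWIND nodes AS n RETURN DISTINCT n should instead give one reachable node per row.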

This looks like a bug in the browser when displaying nodes and relationships that are in a list, when we have lists across multiple rows, though I can't quite isolate which logical case it's getting tripped up on.

In the meantime, as a workaround, when you know you have multiple starting nodes and not just one, you can collect the starting nodes first and use that when making the APOC call. That will result in just a single row with a single list of nodes reachable by either node, and relationships reachable by either node. The browser/visualizer isn't tripping on that.

MATCH (a:MyNode { `value`: 9 })
WITH collect(a) as startNodes
CALL apoc.path.subgraphAll(startNodes, {maxLevel:1}) YIELD nodes, relationships
RETURN nodes, relationships;

(Leandro) #7

Thank you very much. That seems to do the trick.


(Andrew Bowman) #8

I've created a GitHub issue for the browser team to investigate.