Neo4j Efficiently Query Multiple Nodes Each With Different WHERE Values

marcuschiu · September 30, 2022, 2:23am

Given a database with thousands of nodes:

MERGE (n1:Node {id:1});
MERGE (n2:Node {id:2, first:"John", last:"Doe"});
etc

I want to construct a SCALABLE query that returns a node if any one of the following conditions applies:

(node.id = 12) OR (node.first = "Turkey" AND node.last = "Legs")
(node.id = 12) OR (node.first = "John" AND node.last = "Doe")
(node.id = 1) OR (node.first = "Jiggly" AND node.last = "Puff")
thousands more

I have come up with the following query, but it threw an out-of-memory error when the number of conditions reached 10,000:

WITH [
   {id: 12, first: "Turkey", last: "Legs"},
   {id: 12, first: "John", last: "Doe"},
   {id: 1, first: "Jiggly", last: "Puff"}
   // thousands more
] AS conditions
UNWIND conditions AS condition
MATCH (n:Node) WHERE (n.id = condition.id) OR (n.first = condition.first AND n.last = condition.last)
RETURN collect(DISTINCT n)

Just in case you are wondering, I have added indexes and composite indexes where needed.
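
One variation that may be worth trying, offered only as a sketch and not as a confirmed fix for the memory error: pass the conditions as a query parameter instead of a 10,000-element literal, and split the OR into two branches so that each branch can be matched against a single index. The parameter name $conditions is an assumption, as are the indexes on :Node(id) and :Node(first, last) and a Neo4j version that supports correlated CALL subqueries with UNION (4.1 or later).

```cypher
// Assumption: $conditions is a list of maps such as
//   {id: 12, first: "Turkey", last: "Legs"}
// supplied by the client as a query parameter.
UNWIND $conditions AS condition
CALL {
  // Branch 1: lookup by id (assumes an index on :Node(id))
  WITH condition
  MATCH (n:Node {id: condition.id})
  RETURN n
  UNION
  // Branch 2: lookup by name (assumes a composite index on :Node(first, last))
  WITH condition
  MATCH (n:Node {first: condition.first, last: condition.last})
  RETURN n
}
RETURN DISTINCT n
```

If a single call with all conditions still exhausts memory, the same query can be run several times from the client with smaller slices of the condition list.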

glilienfield · September 30, 2022, 3:24am

Do you run out of memory if you return 'n' directly, without the collect and DISTINCT?

micro · September 30, 2022, 2:51am

IMHO, this is not a graph traversal; you are using the typical SQL approach of scanning the entire table through an index filter. When you say there are 1000 more such conditions, we would need labels or relationships based on those conditions.
It would be great to understand your problem statement better. Is it that you want to check a friend-of-friend network at the hop level?
Can't we replace the generic label 'Node' with more specific labels such as 'SpecialNode' | 'General' | 'Churn'? Using a subquery, you can then consolidate the results from the different labels.

Check this: apoc.path.expand in the APOC documentation (neo4j.com)

marcuschiu · September 30, 2022, 3:03am

Yes, you are correct: it's more of a typical SQL way of scanning the entire table based on an index filter. There are no graph traversals in the query that I am trying to construct, so the conditions involve neither labels nor relationships. All nodes are assumed to have the same label (i.e. Node in this case).

You may be wondering why I use Neo4j instead of an SQL database. This is a subproblem extracted from a much larger graph database with different node labels and relationships, but I left that out since it wasn't needed as part of the question.

marcuschiu · October 2, 2022, 4:44pm

Yes, I ran it without collect(DISTINCT n) and it still ran out of memory.

glilienfield · October 2, 2022, 5:22pm

Can you post the query plan? Insert 'explain' at the beginning of the query; there will be a new 'Plan' icon in the list of output formats.
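
For illustration, this is roughly what that looks like for the query in question, shortened here to the three sample conditions from the original post. EXPLAIN only produces the plan and does not execute the query, so it should be safe to run even when the full query runs out of memory.

```cypher
// EXPLAIN returns the planned operators without running the query.
EXPLAIN
WITH [
   {id: 12, first: "Turkey", last: "Legs"},
   {id: 12, first: "John", last: "Doe"},
   {id: 1, first: "Jiggly", last: "Puff"}
] AS conditions
UNWIND conditions AS condition
MATCH (n:Node)
WHERE (n.id = condition.id) OR (n.first = condition.first AND n.last = condition.last)
RETURN n
```

In the resulting plan, the thing to look for is whether the Node lookup uses an index seek or falls back to a label scan followed by a filter.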
