UNION vs MATCH/WITH/COLLECT/UNWIND performance

awu · September 25, 2018, 2:07pm

How does the performance of UNION queries compare with the MATCH/WITH/COLLECT/UNWIND strategy?
The latter allows post-union processing such as sorting, but I fear a performance hit from looping through the result sets.
If the combined Cypher query does not perform well, would it be better to run multiple separate queries instead? Any tips or hints?

Thanks in advance for the insights.

andrew_bowman · September 25, 2018, 3:33pm

I think we'd need to take a look at the query in particular to give any kind of accurate advice.

awu · September 25, 2018, 3:56pm

Thanks for the response @andrew_bowman.

Here's something more specific:

MATCH (me:Person {id:"myID"})-[:EMPLOYEE_OF]->(c:Company)<-[rel:EMPLOYEE_OF]-(them:Person)
RETURN them, rel, c
UNION
MATCH (them:Person {id:"myID"})-[rel:EMPLOYEE_OF]->(c:Company)
RETURN them, rel, c

versus

MATCH (them:Person {id:"myID"})-[rel:EMPLOYEE_OF]->(c:Company)
WITH collect({them:them, rel:rel, c:c}) AS MyRow
MATCH (me:Person {id:"myID"})-[:EMPLOYEE_OF]->(c:Company)<-[rel:EMPLOYEE_OF]-(them:Person)
WITH MyRow + collect({them:them, rel:rel, c:c}) AS AllRows
UNWIND AllRows AS row
WITH row.them AS them, row.rel AS rel, row.c AS c
RETURN them, rel, c
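
For readers on newer releases: Neo4j 4.0 and later (which postdate this thread) allow a UNION inside a CALL subquery, which permits post-union processing without the collect/UNWIND step. A minimal sketch, assuming the same Person/Company model and a hypothetical name property on Person to sort by; the exact subquery syntax may differ between versions:

```cypher
// Sketch only: requires Neo4j 4.0+ subquery support.
// `name` is a hypothetical property used purely to illustrate sorting.
CALL {
  MATCH (:Person {id: "myID"})-[:EMPLOYEE_OF]->(c:Company)<-[rel:EMPLOYEE_OF]-(them:Person)
  RETURN them, rel, c
  UNION
  MATCH (them:Person {id: "myID"})-[rel:EMPLOYEE_OF]->(c:Company)
  RETURN them, rel, c
}
// Post-union processing happens on the combined rows.
RETURN them, rel, c
ORDER BY them.name
```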

andrew_bowman · September 25, 2018, 3:59pm

In this case wouldn't it be enough to start from the initial id, get the connected company, then match out to all of its employees (which will include the person with "myID")?

MATCH (:Person {id:"myID"})-[:EMPLOYEE_OF]->(c:Company)
MATCH (c)<-[rel:EMPLOYEE_OF]-(them:Person) 
RETURN them, rel, c
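
The reason the two-clause form includes the starting person is Cypher's relationship uniqueness rule: within a single MATCH pattern a relationship is traversed at most once, whereas separate MATCH clauses are free to reuse it. A sketch for comparison, using the same labels as above (the count aliases are illustrative names, and the two statements are meant to be run separately):

```cypher
// Single pattern: the starting person's EMPLOYEE_OF relationship is already
// used by the first hop, so `them` cannot be reached back through it.
MATCH (:Person {id: "myID"})-[:EMPLOYEE_OF]->(c:Company)<-[rel:EMPLOYEE_OF]-(them:Person)
RETURN count(them) AS coworkersOnly;

// Two MATCH clauses: uniqueness applies per MATCH, so the second clause may
// reuse that relationship and `them` includes the starting person.
MATCH (:Person {id: "myID"})-[:EMPLOYEE_OF]->(c:Company)
MATCH (c)<-[rel:EMPLOYEE_OF]-(them:Person)
RETURN count(them) AS coworkersPlusMe;
```

Assuming the person has exactly one EMPLOYEE_OF relationship to each company, the second count should be larger by one per company.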

awu · September 25, 2018, 4:36pm

@andrew_bowman - That's brilliant! Thank you!

How would I extend this if I wanted to include a company's parent company's employees?

i.e.

MATCH (:Person {id:"myID"})-[:EMPLOYEE_OF]->(:Company)
<-[:SUBSIDIARY_OF]-(c:Company)
<-[rel:EMPLOYEE_OF]-(them:Person)
RETURN them, rel, c

Previously I "union-ed" this match with the ones above, but it seems there may be smarter ways to handle this in Cypher.

Thanks!

andrew_bowman · September 25, 2018, 4:40pm

Ah, for this one you can use a variable-length pattern, <-[:SUBSIDIARY_OF*0..1]-(c:Company), to capture both cases in one pattern. That is, c will include the origin company (zero hops) as well as the companies connected to it by SUBSIDIARY_OF (one hop), so we'll end up getting the employees of both.

MATCH (:Person {id:"myID"})-[:EMPLOYEE_OF]->(c1:Company)
MATCH (c1)<-[:SUBSIDIARY_OF*0..1]-(c:Company)<-[rel:EMPLOYEE_OF]-(them:Person) 
RETURN them, rel, c
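
As for the original performance question, one way to check it against a specific dataset is to prefix each variant with PROFILE and compare the rows and db hits reported in the plan. A sketch using the query above:

```cypher
// PROFILE executes the query and reports the plan that was run,
// including rows and db hits per operator.
PROFILE
MATCH (:Person {id: "myID"})-[:EMPLOYEE_OF]->(c1:Company)
MATCH (c1)<-[:SUBSIDIARY_OF*0..1]-(c:Company)<-[rel:EMPLOYEE_OF]-(them:Person)
RETURN them, rel, c
```

Running the UNION and collect/UNWIND variants the same way should give a like-for-like comparison on your own data.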

awu · September 25, 2018, 4:49pm

@andrew_bowman - Thank you for the help, this is beautiful. Appreciate your timely responses!

signed,

cypher n00b

andrew_bowman · September 25, 2018, 4:50pm

Glad to help! I'll probably create a knowledge base article from this, may I use some of these as examples?

awu · September 26, 2018, 4:55am

@andrew_bowman - Of course, glad those were good examples to follow.
