Query Performance for Label Matching

rookuu · November 24, 2021, 1:41pm

Hello!

I feel like I'm encountering some unintuitive behavior that I'm trying to explain.

My database has about 50k nodes, and 200k relationships. I have two queries with wildly different performance characteristics.

Query 1:

MATCH p=(x:Element {name: "Target"})<-[:Has|Belongs*]-(y) RETURN y

This completes with 5,200 total db hits in 65 ms.

Query 2:

MATCH p=(x:Element {name: "Target"})<-[:Has|Belongs*]-(y:Node) RETURN y

This completes with 8,813,850 total db hits in 32,396 ms.

I would have expected the second query to take less time, since the set of source nodes is restricted to a specific label. Am I missing something?

Bennu · November 24, 2021, 2:11pm

Hi @rookuu !

Can you share the Explain of each query?

The second one surely adds a Filter on the y nodes.
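For anyone following along, a minimal sketch of how to get the plan: prefix the query with EXPLAIN to see the plan without running it, or with PROFILE to run it and see db hits per operator. The queries are the ones from the first post; nothing else is assumed.

```cypher
// EXPLAIN only plans the query; it does not execute it.
EXPLAIN
MATCH p=(x:Element {name: "Target"})<-[:Has|Belongs*]-(y:Node)
RETURN y;

// PROFILE executes the query and reports rows and db hits per operator.
PROFILE
MATCH p=(x:Element {name: "Target"})<-[:Has|Belongs*]-(y:Node)
RETURN y;
```

Running both Query 1 and Query 2 this way makes it easy to compare the operators side by side.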

Bennu

rookuu · November 24, 2021, 2:54pm

Query 1:

NodeIndexSeek x:Element(name) WHERE name = $
VarLengthExpand (x)<-[anon_0:Has|Belongs*]-(y)
ProduceResults y

Query 2:

NodeIndexSeek x:Element(name) WHERE name = $ && NodeByLabelScan y:Node
CartesianProduct x, y
VarLengthExpand (x)<-[anon_0:Has|Belongs*]-(y)
ProduceResults y

Apologies for the notation. The difference is that in Query 2, NodeByLabelScan y:Node runs alongside NodeIndexSeek x:Element(name), and both feed into CartesianProduct.

Profiling both queries tells me that it's the VarLengthExpand that differs wildly in db hits: about 4.5k versus roughly 9 million.

Bennu · November 25, 2021, 10:07am

Hi @rookuu !

Clearly the problem is that the query planner is using a NodeByLabelScan plus a CartesianProduct instead of expanding from x and filtering on y afterwards. Which version of Neo4j are you using?

Can you try:

MATCH (x:Element {name: "Target"})
WITH x
MATCH p=(x)<-[:Has|Belongs*]-(y:Node) 
RETURN y
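Another variant along the same lines, as a sketch only: expand from x without a label on y, then check the label in a WHERE attached to a later WITH. This assumes the planner keeps that predicate after the expansion instead of turning it into a label scan, which may depend on the Neo4j version, so it is worth confirming with PROFILE.

```cypher
// Expand from the indexed start node first, with no label on y.
PROFILE
MATCH (x:Element {name: "Target"})<-[:Has|Belongs*]-(y)
// Apply the label check afterwards, as a filter on the expanded rows.
WITH y
WHERE y:Node
RETURN y;
```

If the plan shows VarLengthExpand followed by a Filter, and no CartesianProduct, the label check is being applied after the expansion as intended.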

Bennu

PS: Next time, a screenshot of the planner could be easier for both of us.
