Performance issue when matching a node without specific label

yanivts1 · March 21, 2024, 6:07am

Hi,

I have nodes with different labels and I want to let my customer search by an identifier.
I have the labels Person, Company, and Factory; all of them have the 'id' property.
If I specify the label, as in "MATCH (n:Person {id: '1234'}) ...", it works great: 2 ms (I have indexes on 'id' for all labels).
But I want to search the whole database and not only a specific label, so I wrote "MATCH (n {id: '1234'}) ...", and that takes a few seconds.
Any advice on what I can do to improve it?
Thanks,

dana_canzano · March 21, 2024, 11:21am

@yanivts1
There is not much you can do, as "MATCH (n {id: '1234'})" says: find me a node, any node with any label, that has a property named id whose value is '1234'. If your graph has 100 million nodes, then the query needs to read all 100 million nodes and then apply the filter id = '1234'.
And as you saw, indexes are based on the two-part combination of LABEL and PROPERTY(S).
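
To illustrate the label-plus-property point, here is a minimal sketch of what the three indexes mentioned in the question might look like in Neo4j 5 syntax. The index names (person_id, company_id, factory_id) are made up for the example; the original post does not say how the indexes were created.

```cypher
// Hypothetical index names; each index covers exactly one label + property pair.
CREATE INDEX person_id  IF NOT EXISTS FOR (n:Person)  ON (n.id);
CREATE INDEX company_id IF NOT EXISTS FOR (n:Company) ON (n.id);
CREATE INDEX factory_id IF NOT EXISTS FOR (n:Factory) ON (n.id);
```

None of these can serve a pattern like (n {id: '1234'}), because that pattern names no label for the planner to pick an index by.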

glilienfield · March 21, 2024, 5:06pm

Try:

MATCH (n:Person|Company|Factory {id: '1234'})
RETURN n

dana_canzano · March 21, 2024, 8:47pm

@glilienfield @yanivts1

If you run

MATCH (n:Person|Company|Factory {id: '1234'})
RETURN n

and have an index on the id property for each of :Person, :Company, and :Factory (three indexes in total),

the resulting query plan under Neo4j 5.18 is:

@neo4j> profile Match(n:Person|Company|Factory{id:'1234'})
        Return n;
+---+
| n |
+---+
+---+

+---------------------------------------------------------------------------------------------------+
| Plan      | Statement   | Version | Planner | Runtime     | Time | DbHits | Rows | Memory (Bytes) |
+---------------------------------------------------------------------------------------------------+
| "PROFILE" | "READ_ONLY" | ""      | "COST"  | "PIPELINED" | 898  | 3      | 0    | 1592           |
+---------------------------------------------------------------------------------------------------+


Planner COST

Runtime PIPELINED

Runtime version 5.18

Batch size 128

+------------------+----+----------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+
| Operator         | Id | Details                                            | Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses | Time (ms) | Pipeline            |
+------------------+----+----------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+
| +ProduceResults  |  0 | n                                                  |              1 |    0 |       0 |                |                        |           |                     |
| |                +----+----------------------------------------------------+----------------+------+---------+----------------+                        |           |                     |
| +Distinct        |  1 | n                                                  |              1 |    0 |       0 |              0 |                        |           |                     |
| |                +----+----------------------------------------------------+----------------+------+---------+----------------+                        |           |                     |
| +Union           |  2 |                                                    |              1 |    0 |       0 |            128 |                    0/0 |     0.000 | Fused in Pipeline 4 |
| |\               +----+----------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+
| | +NodeIndexSeek |  3 | RANGE INDEX n:Factory(id) WHERE id = $autostring_0 |              0 |    0 |       1 |            376 |                    0/1 |     0.485 | In Pipeline 3       |
| |                +----+----------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+
| +Union           |  4 |                                                    |              1 |    0 |       0 |            256 |                    0/0 |     1.119 | In Pipeline 2       |
| |\               +----+----------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+
| | +NodeIndexSeek |  5 | RANGE INDEX n:Company(id) WHERE id = $autostring_0 |              0 |    0 |       1 |            376 |                    0/1 |     0.525 | In Pipeline 1       |
| |                +----+----------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+
| +NodeIndexSeek   |  6 | RANGE INDEX n:Person(id) WHERE id = $autostring_0  |              1 |    0 |       1 |            376 |                    0/1 |    41.615 | In Pipeline 0       |
+------------------+----+----------------------------------------------------+----------------+------+---------+----------------+------------------------+-----------+---------------------+

Total database accesses: 3, total allocated memory: 1592

0 rows
ready to start consuming query after 817 ms, results consumed after another 81 ms

which is rather nice, since it uses each of the 3 indexes and unions the results together.

So yeah, excellent suggestion @glilienfield. Excellent indeed.

@yanivts1 are you running Neo4j 5.18, or some other version? It was never mentioned in the initial post.
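
For anyone on a version where the Person|Company|Factory label expression is not accepted on node patterns (this is an assumption about older releases; check the docs for your version), the plan above can be approximated by writing the union by hand. This is only a sketch and should return the same set of nodes, with one indexed lookup per label:

```cypher
// One indexed lookup per label; UNION (without ALL) removes duplicate rows,
// which matters if a node carries more than one of these labels.
MATCH (n:Person {id: '1234'})  RETURN n
UNION
MATCH (n:Company {id: '1234'}) RETURN n
UNION
MATCH (n:Factory {id: '1234'}) RETURN n
```

Running it with PROFILE should show whether each branch uses a NodeIndexSeek, as in the plan above.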

yanivts1 · April 2, 2024, 7:23am

Thanks, it's working great!

joyeenicholess · April 18, 2024, 6:55am

I believe labels help here. Suppose the Neo4j database contains 10,000 nodes of the same type: if those nodes are given a label, queries against them will run more efficiently.

I hope this information will be useful to some people and I look forward to their feedback. I have learned the Salesforce CPQ Course. Therefore, your feedback is very valuable to further my career development whether good or bad.
