Neo4j version and hardware configuration:
- Neo4j version: 4.3.x Community
- Driver: Java Bolt driver 4.1.0
- Server: Single node - 48 vCPUs - 64 GB RAM - CentOS 7
- Heap: 24 GB
- Pagecache: 28 GB
My graph model looks like this:
(:App {name,partition}) -[:CALL{code,partition}]->(:App{name,partition})
- App.name: the unique name of the App node.
- CALL.code: the unique code of the relationship. Relationships with the same code belong to the same HTTP call.
- There are indexes on App(name,partition) and CALL(code,partition).
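For reference, index definitions of this kind can be written in Neo4j 4.3 roughly as shown here. This is only a sketch: the index names (app_name_partition, call_code_partition) are made up for illustration, and the property order follows the description above, so the actual definitions on the server may differ.

```cypher
// Composite node index on App(name, partition); the index name is hypothetical
CREATE INDEX app_name_partition IF NOT EXISTS
FOR (a:App) ON (a.name, a.partition);

// Composite relationship index on CALL(code, partition); the index name is hypothetical
CREATE INDEX call_code_partition IF NOT EXISTS
FOR ()-[r:CALL]-() ON (r.code, r.partition);
```

Running SHOW INDEXES afterwards lists the indexes that actually exist, including their property order.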
There are more than 100k distinct relationships (each with a different code) between node A and node B, and node B has relationships to other App nodes X (for example C, D, E, F), whose codes may be the same as those on A -> B or different. I want to find all nodes X with the following Cypher query:
match (a:App{name:"A",partition:"p"})-[r:CALL{partition:"p"}]->(b:App {name:"B",partition:"p"})
with b,collect(distinct r.source) as sources
match (b)-[c:CALL{partition:"p"}]->(x:App {partition:"p"})
return distinct x.name
But this query is very slow. How can I optimize it, or is my graph model not suitable for this situation?
The query time would be even worse if I queried all intermediate nodes B and their downstream nodes X.
The query execution plan is as follows:
+----------------------------------------+------------------------------------------------------------------------------------------------------+----------------+---------+---------+----------------+------------------------+
| Operator | Details | Estimated Rows | Rows | DB Hits | Memory (Bytes) | Page Cache Hits/Misses |
+----------------------------------------+------------------------------------------------------------------------------------------------------+----------------+---------+---------+----------------+------------------------+
| +ProduceResults@neo4j | `count(c)` | 1 | 1 | 0 | | 0/0 |
| | +------------------------------------------------------------------------------------------------------+----------------+---------+---------+----------------+------------------------+
| +EagerAggregation@neo4j | count(c) AS `count(c)` | 1 | 1 | 0 | 24 | 0/0 |
| | +------------------------------------------------------------------------------------------------------+----------------+---------+---------+----------------+------------------------+
| +Filter@neo4j | cache[nd.partition] = $autostring_8 AND cache[nd.name] IS NOT NULL AND next:App AND next.name = $aut | 24 | 60730 | 2100627 | | 0/0 |
| | | ostring_5 AND next.partition = $autostring_6 | | | | | |
| | +------------------------------------------------------------------------------------------------------+----------------+---------+---------+----------------+------------------------+
| +Apply@neo4j | | 381814 | 1059722 | 0 | | 0/0 |
| |\ +------------------------------------------------------------------------------------------------------+----------------+---------+---------+----------------+------------------------+
| | +NodeHashJoin@neo4j | nd | 381814 | 1059722 | 0 | 12167290921960 | 0/0 |
| | |\ +------------------------------------------------------------------------------------------------------+----------------+---------+---------+----------------+------------------------+
| | | +CacheProperties@neo4j | cache[nd.name], cache[nd.partition] | 6360 | 6361 | 12722 | | 0/0 |
| | | | +------------------------------------------------------------------------------------------------------+----------------+---------+---------+----------------+------------------------+
| | | +NodeByLabelScan@neo4j | nd:App | 6360 | 6361 | 6362 | | 0/0 |
| | | +------------------------------------------------------------------------------------------------------+----------------+---------+---------+----------------+------------------------+
| | +DirectedRelationshipIndexSeek@neo4j | (next)-[c:CALL(partition, source)]->(nd) WHERE partition = $autostring_7 AND source IN sources | 813705 | 2115572 | 2183511 | | 0/0 |
| | +------------------------------------------------------------------------------------------------------+----------------+---------+---------+----------------+------------------------+
| +EagerAggregation@neo4j | collect(DISTINCT r.source) AS sources | 1 | 1 | 150350 | 11710416 | 0/0 |
| | +------------------------------------------------------------------------------------------------------+----------------+---------+---------+----------------+------------------------+
| +Filter@neo4j | r.partition = $autostring_2 | 5 | 82411 | 82411 | | 0/0 |
| | +------------------------------------------------------------------------------------------------------+----------------+---------+---------+----------------+------------------------+
| +Expand(Into)@neo4j | (n)-[r:CALL]->(next) | 103 | 82411 | 195424 | 8649688 | 0/0 |
| | +------------------------------------------------------------------------------------------------------+----------------+---------+---------+----------------+------------------------+
| +CartesianProduct@neo4j | | 253 | 1 | 0 | | 0/0 |
| |\ +------------------------------------------------------------------------------------------------------+----------------+---------+---------+----------------+------------------------+
| | +NodeIndexSeek@neo4j | next:App(partition, name) WHERE partition = $autostring_4 AND name = $autostring_3 | 253 | 1 | 2 | | 0/0 |
| | +------------------------------------------------------------------------------------------------------+----------------+---------+---------+----------------+------------------------+
| +NodeIndexSeek@neo4j | n:App(partition, name) WHERE partition = $autostring_1 AND name = $autostring_0 | 16 | 1 | 2 | | 0/0 |
+----------------------------------------+------------------------------------------------------------------------------------------------------+----------------+---------+---------+----------------+------------------------+