Upgrade Kills Query


(Jeremy) #1

I'm testing upgrading neo4j from 3.0.4 to 3.4.7 and a query that previously returned in 321 ms now hangs and never returns.

I don't use any indexes or constraints.

The explain looks a bit different, but pretty similar

I can't post a profile because I've been waiting for about 3 hours for it to return....

It's a small install with about 50k records.

The data is genealogy data, so relationship heavy. The variable length relationship part of the query is what slows it down in general and in this instance can have a length of up to 18.

I did a pretty minimal vanilla upgrade, I only had to change the upgrade and shell parameters in neo4j.conf to make it work.

Any ideas?

3.0.4 Explain

neo4j-sh (?)$ explain match (c:Person)-[:IS_WHAKAPAPA_FOR]->(tenant:Tenant)
> where tenant.id = 'TRONR' and not((:Person)-[:HAS_PARENT]->(c))
> match (t:Person)-[:IS_WHAKAPAPA_FOR]->(tenant)
> where t.tupunaId is not null
> with c, tenant, t
> match (c)-[:HAS_PARENT*]->(t)
> return count(c);
+--------------------------------------------+
| No data returned, and nothing was changed. |
+--------------------------------------------+
582 ms

Compiler CYPHER 3.0

Planner COST

Runtime INTERPRETED

+-----------------------+----------------+------------------------------------------------+-----------------------------------+
| Operator              | Estimated Rows | Variables                                      | Other                             |
+-----------------------+----------------+------------------------------------------------+-----------------------------------+
| +ProduceResults       |             13 | count(c)                                       | count(c)                          |
| |                     +----------------+------------------------------------------------+-----------------------------------+
| +EagerAggregation     |             13 | count(c)                                       |                                   |
| |                     +----------------+------------------------------------------------+-----------------------------------+
| +Expand(Into)         |            168 | anon[17] -- anon[135], anon[222], c, t, tenant | (c)-[:IS_WHAKAPAPA_FOR]->(tenant) |
| |                     +----------------+------------------------------------------------+-----------------------------------+
| +AntiSemiApply        |            534 | anon[135], anon[222], t, tenant -- c           |                                   |
| |\                    +----------------+------------------------------------------------+-----------------------------------+
| | +Filter             |           2835 | anon[89], anon[98], c                          | anon[89]:Person                   |
| | |                   +----------------+------------------------------------------------+-----------------------------------+
| | +Expand(All)        |           2835 | anon[89], anon[98] -- c                        | (c)<-[:HAS_PARENT]-()             |
| | |                   +----------------+------------------------------------------------+-----------------------------------+
| | +Argument           |           2134 | c                                              |                                   |
| |                     +----------------+------------------------------------------------+-----------------------------------+
| +Filter               |           2134 | anon[135], anon[222], c, t, tenant             | c:Person                          |
| |                     +----------------+------------------------------------------------+-----------------------------------+
| +VarLengthExpand(All) |           2134 | anon[222], c -- anon[135], t, tenant           | (t)-[:HAS_PARENT*]->(c)           |
| |                     +----------------+------------------------------------------------+-----------------------------------+
| +Filter               |           1607 | anon[135], t, tenant                           | t:Person AND hasProp(t.tupunaId)  |
| |                     +----------------+------------------------------------------------+-----------------------------------+
| +Expand(All)          |           3213 | anon[135], t -- tenant                         | (tenant)<-[:IS_WHAKAPAPA_FOR]-(t) |
| |                     +----------------+------------------------------------------------+-----------------------------------+
| +Filter               |              0 | tenant                                         | tenant.id == {  AUTOSTRING0}      |
| |                     +----------------+------------------------------------------------+-----------------------------------+
| +NodeByLabelScan      |              3 | tenant                                         | :Tenant                           |
+-----------------------+----------------+------------------------------------------------+-----------------------------------+

Total database accesses: ?

3.4.7 Explain

neo4j-sh (?)$ explain match (c:Person)-[:IS_WHAKAPAPA_FOR]->(tenant:Tenant)
> where tenant.id = 'TRONR' and not((:Person)-[:HAS_PARENT]->(c))
> match (t:Person)-[:IS_WHAKAPAPA_FOR]->(tenant)
> where t.tupunaId is not null
> with c, tenant, t
> match (c)-[:HAS_PARENT*]->(t)
> return count(c);
+--------------------------------------------+
| No data returned, and nothing was changed. |
+--------------------------------------------+
2 ms

Compiler CYPHER 3.4

Planner COST

Runtime INTERPRETED

Runtime version 3.4

+-----------------------+----------------+------------------------------------------------+--------------------------------------+
| Operator              | Estimated Rows | Variables                                      | Other                                |
+-----------------------+----------------+------------------------------------------------+--------------------------------------+
| +ProduceResults       |             13 | count(c)                                       |                                      |
| |                     +----------------+------------------------------------------------+--------------------------------------+
| +EagerAggregation     |             13 | count(c)                                       |                                      |
| |                     +----------------+------------------------------------------------+--------------------------------------+
| +Expand(Into)         |            168 | anon[135] -- anon[17], anon[222], c, t, tenant | (t)-[:IS_WHAKAPAPA_FOR]->(tenant)    |
| |                     +----------------+------------------------------------------------+--------------------------------------+
| +Filter               |            534 | anon[17], anon[222], c, t, tenant              | EXISTS(t.tupunaId); t:Person         |
| |                     +----------------+------------------------------------------------+--------------------------------------+
| +VarLengthExpand(All) |           1067 | anon[222], t -- anon[17], c, tenant            | (c)-[:HAS_PARENT*]->(t)              |
| |                     +----------------+------------------------------------------------+--------------------------------------+
| +AntiSemiApply        |            803 | anon[17], tenant -- c                          |                                      |
| |\                    +----------------+------------------------------------------------+--------------------------------------+
| | +Filter             |           4268 |   NODE89,   REL98, c                           | `  NODE89`:Person                    |
| | |                   +----------------+------------------------------------------------+--------------------------------------+
| | +Expand(All)        |           4268 |   NODE89,   REL98 -- c                         | (c)<-[  REL98:HAS_PARENT]-(  NODE89) |
| | |                   +----------------+------------------------------------------------+--------------------------------------+
| | +Argument           |           3213 | c                                              |                                      |
| |                     +----------------+------------------------------------------------+--------------------------------------+
| +Filter               |           3213 | anon[17], c, tenant                            | c:Person                             |
| |                     +----------------+------------------------------------------------+--------------------------------------+
| +Expand(All)          |           3213 | anon[17], c -- tenant                          | (tenant)<-[:IS_WHAKAPAPA_FOR]-(c)    |
| |                     +----------------+------------------------------------------------+--------------------------------------+
| +Filter               |              0 | tenant                                         | tenant.id = $`  AUTOSTRING0`         |
| |                     +----------------+------------------------------------------------+--------------------------------------+
| +NodeByLabelScan      |              3 | tenant                                         | :Tenant                              |
+-----------------------+----------------+------------------------------------------------+--------------------------------------+

Total database accesses: ?

(Andrew Bowman) #2

Have you tried with cypher-shell? neo4j-sh is deprecated at this time and we'd like to rule out anything related to that.

Could you try adding an upper bound to the var-length expansion, or cutting off your query earlier and seeing if you can reach that point? At the moment without a PROFILE it's hard to know what's going on that could be responsible.


(Jeremy) #3

Thanks for the reply Andrew.

Same thing happens in cypher-shell.

I put 1..10 on var-length relationship and I do get a profile:

neo4j> profile match (c:Person)-[:IS_WHAKAPAPA_FOR]->(tenant:Tenant)
       where tenant.id = 'TRONR' and not((:Person)-[:HAS_PARENT]->(c))
       match (t:Person)-[:IS_WHAKAPAPA_FOR]->(tenant)
       where t.tupunaId is not null
       with c, tenant, t
       match (c)-[:HAS_PARENT*1..10]->(t)
       return count(c);
+-----------------------------------------------------------------------------------------+
| Plan      | Statement   | Version      | Planner | Runtime       | Time | DbHits | Rows |
+-----------------------------------------------------------------------------------------+
| "PROFILE" | "READ_ONLY" | "CYPHER 3.4" | "COST"  | "INTERPRETED" | 4490 | 0      | 1    |
+-----------------------------------------------------------------------------------------+

+-----------------------+----------------+---------+---------+-----------+----------------------------------------------+-------------------------------------------------+
| Operator              | Estimated Rows | Rows    | DB Hits | Cache H/M | Identifiers                                  | Other                                           |
+-----------------------+----------------+---------+---------+-----------+----------------------------------------------+-------------------------------------------------+
| +ProduceResults       |             13 |       1 |       0 |       0/0 | count(c)                                     | 0.0; 3.4; 3.4                                   |
| |                     +----------------+---------+---------+-----------+----------------------------------------------+-------------------------------------------------+
| +EagerAggregation     |             13 |       1 |       0 |       0/0 | count(c)                                     | 0.0                                             |
| |                     +----------------+---------+---------+-----------+----------------------------------------------+-------------------------------------------------+
| +Expand(Into)         |            168 |    3449 |       4 |       0/0 | tenant, t, anon[135], anon[17], c, anon[222] | 0.0; (t)-[anon[135]:IS_WHAKAPAPA_FOR]->(tenant) |
| |                     +----------------+---------+---------+-----------+----------------------------------------------+-------------------------------------------------+
| +Filter               |            534 |    3449 | 1347753 |       0/0 | tenant, t, anon[17], c, anon[222]            | 0.0; t:Person                                   |
| |                     +----------------+---------+---------+-----------+----------------------------------------------+-------------------------------------------------+
| +VarLengthExpand(All) |           1067 | 1344304 | 2354092 |       0/0 | tenant, t, anon[17], c, anon[222]            | 0.0; (c)-[anon[222]:HAS_PARENT*..10]->(t)       |
| |                     +----------------+---------+---------+-----------+----------------------------------------------+-------------------------------------------------+
| +AntiSemiApply        |            803 |   14602 |       0 |       0/0 | anon[17], c, tenant                          | 0.0                                             |
| |\                    +----------------+---------+---------+-----------+----------------------------------------------+-------------------------------------------------+
| | +Filter             |           4268 |       0 |   13614 |       0/0 |   NODE89,   REL98, c                         | 0.0; `  NODE89`:Person                          |
| | |                   +----------------+---------+---------+-----------+----------------------------------------------+-------------------------------------------------+
| | +Expand(All)        |           4268 |   13614 |   41830 |       0/0 |   NODE89,   REL98, c                         | 0.0; (c)<-[  REL98:HAS_PARENT]-(  NODE89)       |
| | |                   +----------------+---------+---------+-----------+----------------------------------------------+-------------------------------------------------+
| | +Argument           |           3213 |   28216 |       0 |       0/0 | c                                            | 0.0                                             |
| |                     +----------------+---------+---------+-----------+----------------------------------------------+-------------------------------------------------+
| +Filter               |           3213 |   28216 |   28216 |       0/0 | anon[17], c, tenant                          | 0.0; c:Person                                   |
| |                     +----------------+---------+---------+-----------+----------------------------------------------+-------------------------------------------------+
| +Expand(All)          |           3213 |   28216 |   28217 |       0/0 | anon[17], c, tenant                          | 0.0; (tenant)<-[anon[17]:IS_WHAKAPAPA_FOR]-(c)  |
| |                     +----------------+---------+---------+-----------+----------------------------------------------+-------------------------------------------------+
| +Filter               |              0 |       1 |       3 |       0/0 | tenant                                       | 0.0; tenant.id = $`  AUTOSTRING0`               |
| |                     +----------------+---------+---------+-----------+----------------------------------------------+-------------------------------------------------+
| +NodeByLabelScan      |              3 |       3 |       4 |       0/0 | tenant                                       | 0.0; :Tenant                                    |
+-----------------------+----------------+---------+---------+-----------+----------------------------------------------+-------------------------------------------------+

+----------+
| count(c) |
+----------+
| 3449     |
+----------+

1 row available after 4490 ms, consumed after another 0 ms

Compared to the profile on the non-restricted query in 3.0.4 you can see the VarLengthExpand is being done at a different place in the plan and has very different numbers.

neo4j-sh (?)$ profile match (c:Person)-[:IS_WHAKAPAPA_FOR]->(tenant:Tenant)
> where tenant.id = 'TRONR' and not((:Person)-[:HAS_PARENT]->(c))
> match (t:Person)-[:IS_WHAKAPAPA_FOR]->(tenant)
> where t.tupunaId is not null
> with c, tenant, t
> match (c)-[:HAS_PARENT*]->(t)
> return count(c);
+----------+
| count(c) |
+----------+
| 40607    |
+----------+
1 row
630 ms

Compiler CYPHER 3.0

Planner COST

Runtime INTERPRETED

+-----------------------+----------------+-------+---------+------------------------------------------------+-----------------------------------+
| Operator              | Estimated Rows | Rows  | DB Hits | Variables                                      | Other                             |
+-----------------------+----------------+-------+---------+------------------------------------------------+-----------------------------------+
| +ProduceResults       |             13 |     1 |       0 | count(c)                                       | count(c)                          |
| |                     +----------------+-------+---------+------------------------------------------------+-----------------------------------+
| +EagerAggregation     |             13 |     1 |       0 | count(c)                                       |                                   |
| |                     +----------------+-------+---------+------------------------------------------------+-----------------------------------+
| +Expand(Into)         |            168 | 40607 |   23138 | anon[17] -- anon[135], anon[222], c, t, tenant | (c)-[:IS_WHAKAPAPA_FOR]->(tenant) |
| |                     +----------------+-------+---------+------------------------------------------------+-----------------------------------+
| +AntiSemiApply        |            534 | 40205 |       0 | anon[135], anon[222], t, tenant -- c           |                                   |
| |\                    +----------------+-------+---------+------------------------------------------------+-----------------------------------+
| | +Filter             |           2835 |     0 |   17514 | anon[89], anon[98], c                          | anon[89]:Person                   |
| | |                   +----------------+-------+---------+------------------------------------------------+-----------------------------------+
| | +Expand(All)        |           2835 | 17514 |   75233 | anon[89], anon[98] -- c                        | (c)<-[:HAS_PARENT]-()             |
| | |                   +----------------+-------+---------+------------------------------------------------+-----------------------------------+
| | +Argument           |           2134 | 57719 |       0 | c                                              |                                   |
| |                     +----------------+-------+---------+------------------------------------------------+-----------------------------------+
| +Filter               |           2134 | 57719 |   57719 | anon[135], anon[222], c, t, tenant             | c:Person                          |
| |                     +----------------+-------+---------+------------------------------------------------+-----------------------------------+
| +VarLengthExpand(All) |           2134 | 57719 |  115439 | anon[222], c -- anon[135], t, tenant           | (t)-[:HAS_PARENT*]->(c)           |
| |                     +----------------+-------+---------+------------------------------------------------+-----------------------------------+
| +Filter               |           1607 |     1 |   56432 | anon[135], t, tenant                           | t:Person AND hasProp(t.tupunaId)  |
| |                     +----------------+-------+---------+------------------------------------------------+-----------------------------------+
| +Expand(All)          |           3213 | 28216 |   28217 | anon[135], t -- tenant                         | (tenant)<-[:IS_WHAKAPAPA_FOR]-(t) |
| |                     +----------------+-------+---------+------------------------------------------------+-----------------------------------+
| +Filter               |              0 |     1 |       3 | tenant                                         | tenant.id == {  AUTOSTRING0}      |
| |                     +----------------+-------+---------+------------------------------------------------+-----------------------------------+
| +NodeByLabelScan      |              3 |     3 |       4 | tenant                                         | :Tenant                           |
+-----------------------+----------------+-------+---------+------------------------------------------------+-----------------------------------+

Total database accesses: 373699

(Andrew Bowman) #4

Okay, looks like the planner in 3.4.7 is deciding to expand from tenant to c then evaluate the var-length expand to t then finish the filtering.

The key point is the Filter operation that covers the t:Person filter and where t.tupunaId is not null.

We'll need to see what we can do to nudge the planner to make this evaluation before performing the variable-length expand.

One thing you could try is reordering your MATCH/WHERE clauses so:

explain
match (t:Person)-[:IS_WHAKAPAPA_FOR]->(tenant:Tenant)
where tenant.id = 'TRONR' and t.tupunaId is not null
match (c:Person)-[:IS_WHAKAPAPA_FOR]->(tenant:Tenant)
where not((:Person)-[:HAS_PARENT]->(c))
with c, tenant, t
match (c)-[:HAS_PARENT*]->(t)
return count(c);

While that should work, you may want to instead use a means to better enforce order. If we introduce an aggregation before the var-length expansion, that should force the planner to always execute the prior matches before continuing:

explain
match (c:Person)-[:IS_WHAKAPAPA_FOR]->(tenant:Tenant)
where tenant.id = 'TRONR' and not((:Person)-[:HAS_PARENT]->(c))
match (t:Person)-[:IS_WHAKAPAPA_FOR]->(tenant)
where t.tupunaId is not null
with c, tenant, t, count(*) as cnt
match (c)-[:HAS_PARENT*]->(t)
return count(c);

Give that a try.


(Andrew Bowman) #5

Looking at the query again, there may be a better approach to get what you want, and it will naturally use collection, which should work to enforce the ordering. Does this one work for you?

explain
match (t:Person)-[:IS_WHAKAPAPA_FOR]->(tenant:Tenant)
where tenant.id = 'TRONR' and t.tupunaId is not null
with tenant, collect(t) as parents
match (c:Person)-[:IS_WHAKAPAPA_FOR]->(tenant)
where not ()-[:HAS_PARENT]->(c)
match (c)-[:HAS_PARENT*]->(t)
where t in parents
return count(DISTINCT c);

We get the count of distinct c nodes at the end, which was missing in previous queries. In your previous queries, the same c node could have been counted multiple times if there were multiple matches to t nodes.


(Jeremy) #6

Thanks so much for your efforts Andrew.

The first query of the three that you gave works like a charm. I haven't restricted the other two to get explain plans out of them as yet to compare, but at first try the first one looks like it will do the trick.

The count(c) is actually a bit of a red herring sorry. The return clause is actually much longer with all sorts in it, but I simplified it to a count for the purposes of testing.

Thanks again.


(Craig Taverner) #7

Hi @jeremy1, we're interesting in seeing if we can improve the 3.4 planner to deal better with this issue, but are unable to reproduce it. Is there any chance you can provide the database in some form (eg. CSV) for us to use to test?


(Jeremy) #8

Hi @craig.taverner, I would have to do some significant sanitising because at the moment it's full of identifiable client information.
I'll try and find some bandwidth to clean it up a bit, retest and send through.


(Craig Taverner) #9

In the meantime, we have created a fake dataset that appears to mimic the structure somewhat. We have 12 generations each about 600 people wide, with only a few people having the tupunaId. Our experiments show that the problem seems to be the planner does not know the fact that only a few people have this property, and plans accordingly. If you add an index on that property, the planner does a much better job. have you tried:

CREATE INDEX ON :Person(tupunaId);

This should help a lot.


(Jeremy) #10

Thanks Craig,

I'll give that a crack, but I won't be able to until at least tomorrow.

The tupunaId attribute is on all of the data, but it is low cardinality with probably less than a dozen different possible values.