Index is not always used


(Flavio) #1

The following query should use the index from my point of view. I have almost identical queries that do this. I haven't found the cause yet why it works on one node and not on another. Any help/input is welcome!

**This Query DOES NOT use the index on field uuidKey for NodeA **, but for another Node/Label, e.g. NodeB, which has the field uuidKey too, it does.

WITH [
{p: '48064:3879:64867961-78a9-11e8-8277-7fb03fe0dd72'},
{p: '41320::6484f12a-78a9-11e8-8277-7fb03fe0dd72'}
] as props
UNWIND props AS row
WITH row
MATCH (sa:NodeA)
USING INDEX sa:NodeA(uuidKey)
WHERE sa.uuidKey = row.p
RETURN *

**This Query DOES use the index on field uuidKey for NodeA **, which shows that the index exists and can be used.

WITH [
'48064:3879:64867961-78a9-11e8-8277-7fb03fe0dd72',
'41320::6484f12a-78a9-11e8-8277-7fb03fe0dd72'
] as props
UNWIND props AS row
WITH row
MATCH (sa:NodeA)
USING INDEX sa:NodeA(uuidKey)
WHERE sa.uuidKey = row
RETURN *

Context:

  • The Index is created by the query CREATE CONSTRAINT ON (e:NodeA) ASSERT e.uuidKey IS UNIQUE

  • The data and the indexes are recreated before the query. Therefore, the db will be deleted (MATCH n DETACH DETETE n) and the indexes will be recreated (deleted as APOC and recreated). Afterwards all data will be imported. After the Import a sleep of 40sec. ensures that all indexes are up to date (or whatever is going on in the backgroud). Then the queries gets executed. Event 30 min after the import the above query does not use the index.

  • There are other Labels which have also an index on the same fieldname. For each Label + field an own index exists.

  • NodeA has other additional labels set, but not all have the same additional labels set.


(Christophe Willemsen) #2

The Cypher planner does its best to use the index, depending on the number of nodes on one label or one side of the relationship it might choose or not to make use of the index.

That said, you can always hint to use the index :

WITH [
{p: '48064:3879:64867961-78a9-11e8-8277-7fb03fe0dd72'},
{p: '41320::6484f12a-78a9-11e8-8277-7fb03fe0dd72'}
] as props
UNWIND props AS row
WITH row
MATCH (sa:NodeA) 
USING INDEX sa:NodeA(uuidKey) // < hint is here :)
WHERE sa.uuidKey = row.p 
RETURN *

(Flavio) #3

Thanks for the answer. But even with the index hint neo4j does not use the index. I have update the question and added the index hint to the query.


(Christophe Willemsen) #4

Can you show the result of

:schema

in the browser


(Flavio) #5

Thanks for the quick reply - I'm impressed!

Yes, :schema schows that the index exists...:

Indexes
   ON :NodeA(uuidKey) ONLINE  (for uniqueness constraint)
   ON :NodeB(uuidKey) ONLINE  (for uniqueness constraint)
   ...
Constraints
   ON ( nodeA:NodeA) ASSERT nodeA.uuidKey IS UNIQUE
   ON ( nodeB:NodeA) ASSERT nodeB.uuidKey IS UNIQUE
   ...

Anyway, I was able to solve the problemby using the toString(...) function as shown below in the where clause:

WITH [
  {p: '48064:3879:64867961-78a9-11e8-8277-7fb03fe0dd72'},
  {p: '41320::6484f12a-78a9-11e8-8277-7fb03fe0dd72'}
] as props
UNWIND props AS row
  WITH row
  MATCH (sa:NodeA) 
    USING INDEX sa:NodeA(uuidKey) // < hint is here :)
    WHERE sa.uuidKey = toString(row.p) 
  RETURN *

Btw: is there a case where an index doesn't have the status ONLINE? Background of this question is the question Performance problems during Neo4J import / conversion with multiple steps from me on stackoverflow. Currently we still have several sleeps of 10, 20 and 40 seconds in our script. Or might be the new/refactored index implementation of Neo4j help here?


(Jacob McCrumb) #6

I am not positive on ONLINE status for index, but its possible for a VERY large dataset you would see something else.

There is a procedure to await indexes:

call db.awaitIndexes()

https://neo4j.com/docs/operations-manual/current/reference/procedures/

As far as using indices goes, I find that sometimes it helps to alias the value to be used. I've not run across a case like yours where using a map has it confused, but I've had problems using:

WHERE node.prop = SUBSTRING(val, 0, 3) + "something"

With your original query, try:
WITH [ {p: '48064:3879:64867961-78a9-11e8-8277-7fb03fe0dd72'}, {p: '41320::6484f12a-78a9-11e8-8277-7fb03fe0dd72'} ] as props UNWIND props AS row WITH row, row.p AS uuid MATCH (sa:NodeA) WHERE sa.uuidKey = uuid RETURN *

As to WHY this would work, that is beyond me. But if its like a case I have seen, it will. Note, I dropped the using hint because Neo4j should pick that up automatically and generally I think its best to just let it do its thing.


(Flavio) #7

Thanks for the input. It works, when the uuidKey is 'extracted' as a separate variable, the index gets picked up. It looks like the type of the field can't be recognized with p.uuidKey and therefore the index is not used. By extracting it to a separate variable or by converting it with toString(p.uuidKey) the type can be recognized, which allows neo to find the index. However, sincerest thanks for the quick help here!


(Michael Hunger) #8

Which version are you using? This was an issue in an older version, but should have been fixed.
Yes in general the workaround is to put your value into an identifier, not an expression. Then it's picked up more likely for the index lookup.


(Jacob McCrumb) #9

Glad to hear it worked!


(Flavio) #10

Hello @michael.hunger. We are using Neo4j 3.4.6 Community Edition.