Cypher returns strange inconsistent results

Reiner
Graph Buddy

I encountered some very strange behaviour with this Cypher query. It is part of a larger query, but looking at the part from the WITH onward should be enough:

WITH sub1, sub2
MATCH (w1:Word {premium:true, searchterm:sub1})
WITH sub1, sub2, w1
MATCH (w2:Word) 
WHERE w2.premium = true and w2.searchterm = sub2
RETURN sub1, w1, w1.searchterm as ws1, sub2, w2, w2.searchterm as ws2

I get the following result:

╒═════════╤════════════════════════════╤═════════╤═════════╤════════════════════════════╤═════════╕
│"sub1"   │"w1"                        │"ws1"    │"sub2"   │"w2"                        │"ws2"    │
╞═════════╪════════════════════════════╪═════════╪═════════╪════════════════════════════╪═════════╡
│"ab"     │{"count":2,"name":"ab","sear│"ab"     │"laderin"│{"name":"Ablader","count":1,│"laderin"│
│         │chterm":"ab","language":"de"│         │         │"searchterm":"ablader","lang│         │
│         │,"premium":true}            │         │         │uage":"de","premium":true}  │         │
├─────────┼────────────────────────────┼─────────┼─────────┼────────────────────────────┼─────────┤
│"ablader"│{"name":"Ablader","count":1,│"ablader"│"in"     │{"count":2,"name":"ab","sear│"in"     │
│         │"searchterm":"ablader","lang│         │         │chterm":"ab","language":"de"│         │
│         │uage":"de","premium":true}  │         │         │,"premium":true}            │         │
└─────────┴────────────────────────────┴─────────┴─────────┴────────────────────────────┴─────────┘

Problem 1: w2 should not have been matched, because it does not satisfy the condition w2.searchterm = sub2 (in both rows).

Problem 2: ws2 is defined as w2.searchterm, but it does not equal the searchterm property shown in w2 (last two columns, both rows).

Problem 3: nodes with searchterm "laderin" and "in" do exist in the database but are not returned. (Their ids are just one away from those of the wrongly returned nodes.)

I can't explain this other than as a bug.

For completeness: when I output sub1 and sub2 at the beginning, I get the following result. Only the first and last rows should find matching w1 and w2 nodes.

╒═════════╤═════════╕
│"sub1"   │"sub2"   │
╞═════════╪═════════╡
│"ab"     │"laderin"│
├─────────┼─────────┤
│"abl"    │"aderin" │
├─────────┼─────────┤
│"abla"   │"derin"  │
├─────────┼─────────┤
│"ablad"  │"erin"   │
├─────────┼─────────┤
│"ablade" │"rin"    │
├─────────┼─────────┤
│"ablader"│"in"     │
└─────────┴─────────┘

Initially I tried MATCH (w1:Word), (w2:Word) WHERE ... and thought the cartesian product might be the problem, but it is not. Then I tried putting the conditions into {} instead of the WHERE clause, which made no difference.

And when I set sub1 and sub2 manually, the output is correct, which is really strange:

WITH "ab" as sub1, "laderin" as sub2
MATCH (w1:Word {premium:true, searchterm:sub1})
WITH sub1, sub2, w1
MATCH (w2:Word) 
WHERE w2.premium = true and w2.searchterm = sub2
RETURN sub1, w1, w1.searchterm as ws1, sub2, w2, w2.searchterm as ws2
╒══════╤══════════════════════════════╤═════╤═════════╤══════════════════════════════╤═════════╕
│"sub1"│"w1"                          │"ws1"│"sub2"   │"w2"                          │"ws2"    │
╞══════╪══════════════════════════════╪═════╪═════════╪══════════════════════════════╪═════════╡
│"ab"  │{"count":2,"name":"ab","search│"ab" │"laderin"│{"name":"Laderin","count":1,"s│"laderin"│
│      │term":"ab","language":"de","pr│     │         │earchterm":"laderin","language│         │
│      │emium":true}                  │     │         │":"de","premium":true}        │         │
└──────┴──────────────────────────────┴─────┴─────────┴──────────────────────────────┴─────────┘

This is the result I also expected from the first query.

I'm quite annoyed that w2.searchterm differs from the searchterm property of w2 in the same row of the result.

In the PROFILE output I often see the expression "cached[w2.searchterm]"; I don't know what this means.
Hope someone can help!

Best,
Reiner

4 REPLIES

Reiner
Graph Buddy

Addendum: when I change the WHERE clause to the following, the result is correct and as expected:

WHERE w2.premium = true and tolower(w2.searchterm) = sub2
╒═════════╤══════════════════════════════╤═════════╤═════════╤══════════════════════════════╤═════════╕
│"sub1"   │"w1"                          │"ws1"    │"sub2"   │"w2"                          │"ws2"    │
╞═════════╪══════════════════════════════╪═════════╪═════════╪══════════════════════════════╪═════════╡
│"ab"     │{"count":2,"name":"ab","search│"ab"     │"laderin"│{"name":"Laderin","count":1,"s│"laderin"│
│         │term":"ab","language":"de","pr│         │         │earchterm":"laderin","language│         │
│         │emium":true}                  │         │         │":"de","premium":true}        │         │
├─────────┼──────────────────────────────┼─────────┼─────────┼──────────────────────────────┼─────────┤
│"ablader"│{"name":"Ablader","count":1,"s│"ablader"│"in"     │{"searchterm":"in","premium":t│"in"     │
│         │earchterm":"ablader","language│         │         │rue,"gender":"unknown","fillwo│         │
│         │":"de","premium":true}        │         │         │rd":true,"count":10915,"name":│         │
│         │                              │         │         │"in","language":"en;,de","word│         │
│         │                              │         │         │type":"LocalPreposition"}     │         │
└─────────┴──────────────────────────────┴─────────┴─────────┴──────────────────────────────┴─────────┘

Now a NodeByLabelScan is performed instead of a NodeIndexSeek, so I guess there is some trouble with the index? How can this happen, and how can it be fixed?
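
One way to check whether the index is the culprit is to run the same lookup twice, once so that the planner can use the index and once so that it cannot, and compare what comes back. The following is only a minimal sketch: it reuses the tolower() trick from above to force the label scan, and the literal 'laderin' is simply one of the values from the result tables. The whole node is returned next to w.searchterm because, as far as I understand it, "cached[w2.searchterm]" in the plan marks a value taken from the index rather than read from the node itself, which would fit Problem 2; that reading is an assumption and should be checked against the documentation for the Neo4j version in use.

```cypher
// Variant 1: plain equality, so the planner is free to use the index
// on :Word(searchterm) (expect NodeIndexSeek in the plan).
PROFILE
MATCH (w:Word)
WHERE w.premium = true AND w.searchterm = 'laderin'
RETURN id(w) AS nodeId, w, w.searchterm AS searchtermFromQuery;

// Variant 2: wrapping the property in toLower() keeps the planner from
// using the index (expect NodeByLabelScan in the plan).
PROFILE
MATCH (w:Word)
WHERE w.premium = true AND toLower(w.searchterm) = 'laderin'
RETURN id(w) AS nodeId, w, w.searchterm AS searchtermFromQuery;
```

If the two variants return different node ids, or if searchtermFromQuery differs from the searchterm shown inside w, the index and the stored property values disagree.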

Neo4j is case-sensitive: in w1, the name "Ablader" starts with an upper-case 'A', so searching for lower-case 'a' didn't fetch the results in the first instance.
You got the result after converting the name to all lower case.

@ameyasoft: I don't use the "name" property. I'm always comparing against the "searchterm" property, which is always stored in lowercase. The tolower(...) experiment was only intended to force Neo4j to ignore the index and perform a node scan. Normally, and also when sub1 and sub2 are set manually, it should work without tolower(), since all conditions use lowercase values and compare them against the lowercase searchterm property.

Reiner
Graph Buddy

After dropping the index on :Word(searchterm) and creating a new index, the results are correct.
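
For reference, rebuilding the index could look roughly like the sketch below. The exact syntax depends on the Neo4j version, which the thread does not state. The sketch assumes the 4.x form, and the index name word_searchterm is made up for illustration; the real name has to be looked up first.

```cypher
// List the existing indexes to find the real index name
// (SHOW INDEXES needs Neo4j 4.2+; older versions use CALL db.indexes()).
SHOW INDEXES;

// Drop the suspect index. "word_searchterm" is a hypothetical name.
DROP INDEX word_searchterm;

// Recreate it on the same label and property.
CREATE INDEX word_searchterm FOR (w:Word) ON (w.searchterm);

// The new index is populated in the background; wait (up to 300 seconds)
// until it is online before re-running the original query.
CALL db.awaitIndexes(300);
```

On Neo4j 3.x the older forms DROP INDEX ON :Word(searchterm) and CREATE INDEX ON :Word(searchterm) would apply instead.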

But it scares me that Neo4j delivered such unreliable results.