Really weird: query not returning expected data after upgrade from 3.5.16 community to 3.5.19 enterprise

Hi there, encountered a strange situation after my Neo4j upgrade.

This query works as expected:

match (n:AWSTag) where n.key contains "aws:auto" return n.key, n.value order by n.key limit 300

as it returns a list of nodes with n.key = aws:autoscaling:groupName

However, when I just add an "s" to the key filter by running

match (n:AWSTag) where n.key contains "aws:autos" return n.key, n.value order by n.key limit 300

I get 0 results.

I have a second Neo4j server running similar data, and it shows exactly the same problem.

Things I've tried:

  • I have tried deleting all indexes on this :AWSTag node but that didn't work.
  • I then tried turning off the server, deleting /var/lib/neo4j/data/databases/graph.db/schema/*, and turning it on again.
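For anyone retracing these steps: in Neo4j 3.5 the schema indexes can be inspected, dropped, and rebuilt from Cypher instead of deleting files under the database directory (assuming, as later in the thread, that the index is on :AWSTag(id)):

```cypher
// List all schema indexes and their population state (Neo4j 3.5 syntax)
CALL db.indexes;

// Drop and recreate the index on :AWSTag(id) to force a full rebuild
DROP INDEX ON :AWSTag(id);
CREATE INDEX ON :AWSTag(id);
```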

How I installed Neo4j Enterprise:

  1. service neo4j stop
  2. Uninstall Neo4j Community 3.5.16, keeping the old data
  3. apt-get install neo4j-enterprise=3.5.19
  4. service neo4j start

All other queries except those involving :AWSTag nodes work as expected.

which OS do you want to install on, use tags for that too

Ubuntu; whatever AWS is using

neo4j version, desktop version, browser version

3.5.19 Enterprise

what kind of API / driver do you use

The script that loads the data to the graph uses Python driver 1.7.6

screenshot of PROFILE or EXPLAIN with boxes expanded (lower right corner)

PROFILE of the successful query:

PROFILE of the failing query:

which plugins / extensions / procedures do you use

None

Is behaviour different when you first upgrade your community instance to the latest 3.5.x (.20 at the moment) and THEN upgrade to the enterprise version?

Thanks for your reply. Both my instances are on enterprise already so this might be a bit late. I think I could roll back to the original community 3.5.16, test, go to community 3.5.20, test, and then do enterprise 3.5.20.

If I roll back to community 3.5.16 and the issue is still present, what could this mean? Have you seen a similar problem before?

I suspect a bug in how Enterprise parses the CONTAINS argument.

Try using STARTS WITH instead:
https://neo4j.com/docs/cypher-manual/current/clauses/where/#match-string-start

MATCH (n:AWSTag)
WHERE n.key STARTS WITH "aws:autos"
RETURN n.key, n.value
ORDER BY n.key
LIMIT 300

Gave that a try, 0 results.

Here's some other weird behavior:

match (a:AWSTag) where a.key starts with "aws:auto" return distinct a.key
returns all these other results that don't start with 'aws:auto':

If I try an exact field search with
match (a:AWSTag{key:"aws:autoscaling:groupName"}) return a and match (a:AWSTag{key:"aws:ec2:fleet-id"}) return a,


the former returns nothing, and the latter works.
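To be clear about what should happen here: CONTAINS, STARTS WITH, and exact equality are plain string operations, so a key like aws:autoscaling:groupName should match all three filters used above. A quick Python sketch of the expected semantics, using a made-up key list:

```python
# Hypothetical sample of :AWSTag key values
keys = ["aws:autoscaling:groupName", "aws:ec2:fleet-id", "Name"]

# CONTAINS "aws:autos"  ->  substring test
contains = [k for k in keys if "aws:autos" in k]

# STARTS WITH "aws:auto"  ->  prefix test
starts = [k for k in keys if k.startswith("aws:auto")]

# Exact equality on the full key
exact = [k for k in keys if k == "aws:autoscaling:groupName"]

print(contains, starts, exact)
```

All three lists should contain "aws:autoscaling:groupName", which is exactly what the server is failing to return.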

This is more than a little bizarre. I don't know where to start.

Have you tried explicitly defining and querying an index?

Yes, there is an id field on my :AWSTag node that I have created an index for.

I originally discovered this issue when I ran a query to find all tags of an EC2Instance node with
match(n:EC2Instance{publicipaddress:"1.2.3.4"})--(a:AWSTag) return a.id

and noticed that queries on a.id with prefixes of aws:autoscaling:groupName* returned nothing, even though those nodes show up when reached through connected nodes. For example, this returns 0 results even though I know the node exists:

match (t:AWSTag{id:"aws:autoscaling:groupName:MY_GROUP_NAME"}) return t

Alex, that looks like a bug, can you please raise it as an issue at

And link to this thread as well?

Did you try to drop and recreate the index?

Could you run: CALL db.stats.retrieve("GRAPH COUNTS") either via cypher-shell or via http:

curl -H accept:application/json -H content-type:application/json -d '{"statements":[{"statement":"CALL db.stats.retrieve(\"GRAPH COUNTS\")"}]}' http://________:7474/db/data/transaction/commit > graphCounts.json

and share the results

can you please raise it as an issue

Sure, will do!

Did you try to drop and recreate the index?

Yup. With and without the index I get the same behavior.

Could you run: CALL db.stats.retrieve("GRAPH COUNTS")

Here are the count stats; I'm only including the ones related to AWSTags:

{
  "relationships": [
    {
      "count": 8882403
    },
    { ... snip ... },
    {
      "relationshipType": "TAGGED",
      "count": 97277,
      "endLabel": "AWSTag"
    },
    {
      "relationshipType": "TAGGED",
      "count": 97277,
      "endLabel": "Tag"
    },
    { ... snip ... }
  ],
  "nodes": [
    {
      "count": 2855520
    },
    { ... snip ... },
    {
      "count": 66181,
      "label": "AWSTag"
    },
    {
      "count": 66181,
      "label": "Tag"
    },
    { ... snip ... }
  ],
  "indexes": [
    { ... snip ... },
    {
      "updatesSinceEstimation": 2134,
      "totalSize": 64047,
      "properties": [
        "id"
      ],
      "labels": [
        "AWSTag"
      ],
      "estimatedUniqueSize": 64047
    },
    { ... snip ... }
  ],
  "constraints": []
}

Submitted https://github.com/neo4j/neo4j/issues/12552.

Hi there, I faced the same issue and ended up using regular expressions instead, though I'm not sure how much more resource-intensive this is.

Look at the CASE WHEN clause:

	FOREACH (ignoreMe IN CASE WHEN (m.body =~ '(?ms).+Esta informaci.n te result. .til.+') AND (m)-[:SENT_BY]->(:Nubot) THEN [1] ELSE [] END |
		SET m:Recommendation
	)

I hope this helps, in the meantime.
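For reference, Cypher's =~ uses Java regex syntax; the inline (?ms) flags turn on MULTILINE and DOTALL so that . also matches newlines, and the single . wildcards stand in for accented characters (the Spanish message means roughly "Was this information useful to you?"). Python's re module accepts the same inline flags, so the behavior can be sketched like this (the message body is made up):

```python
import re

# The same pattern the workaround uses: (?m) = MULTILINE, (?s) = DOTALL,
# with "." standing in for accented characters such as "ó" and "ú".
pattern = r'(?ms).+Esta informaci.n te result. .til.+'

# Hypothetical multi-line message body
body = "Hola!\n¿Esta información te resultó útil?\nGracias"

match = re.search(pattern, body)
print(bool(match))
```

Note that with DOTALL the leading and trailing .+ each require at least one extra character around the phrase, so the pattern only matches when the phrase sits inside a larger message.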

Cheers,

Just to add another data point, I noticed that this query does not work in Neo4j:

But it does work in a Linkurious instance connected to that same Neo4j server!

I saw this other post (Differences in Full Text between 3.5 and 4.1 in Bloom) on query result differences in Bloom. Since Linkurious also uses a full text index, I wonder if this is related.

I looked into this issue and found the problem: we were removing ":auto" from Cypher queries. The fix will make it into this or the next release. Thanks for reporting this issue!

For a little more context: in the Browser you can prefix a query with :auto to change how it's executed (this is required when you run a USING PERIODIC COMMIT LOAD CSV kind of query). Since :auto is a command that only the Browser should interpret, it has to be stripped from the query before the Cypher is sent to the server, and it looks like we were a little careless about how we removed it.
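That explanation matches the symptoms in this thread exactly: stripping every occurrence of ":auto" turns contains "aws:autos" into contains "awss", which matches nothing. A minimal Python sketch of the difference (the function names are hypothetical, not Browser's actual code):

```python
import re

QUERY = 'match (n:AWSTag) where n.key contains "aws:autos" return n.key'

def strip_auto_buggy(query: str) -> str:
    # Removes every ":auto" substring, mangling string literals too
    return query.replace(":auto", "")

def strip_auto_fixed(query: str) -> str:
    # Only removes a ":auto" command prefix at the start of the input
    return re.sub(r"^\s*:auto\s+", "", query)

print(strip_auto_buggy(QUERY))  # the literal becomes "awss"
print(strip_auto_fixed(QUERY))  # query is left unchanged
```

The anchored version still strips a genuine leading ":auto " command while leaving any ":auto" that happens to appear inside the query text alone.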