Neo4j version: 4.0.0
Driver: Python (py2neo for import)
We use py2neo to import nodes from a CSV using code like:
USING PERIODIC COMMIT 10000
LOAD CSV WITH HEADERS FROM $csv_path
AS line
FIELDTERMINATOR ','
WITH %s
MERGE (n:has_uri {uri: uri})
ON CREATE SET n :«labels», %s
ON MATCH SET n :«labels», %s
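The `%s` and `«labels»` placeholders are filled in from Python before the query is sent. Below is a simplified sketch of how such a template could be filled, not our actual code: `build_import_query`, `escape`, and the example column and label names are hypothetical, and it assumes every CSV column other than the key becomes a property of the same name.

```python
def escape(name):
    """Backtick-quote a Cypher identifier (label, property or variable name)."""
    return "`" + name.replace("`", "``") + "`"


def build_import_query(columns, labels, key="uri"):
    """Fill the LOAD CSV template for the given CSV columns and node labels.

    Hypothetical helper: `columns` are CSV header names, `labels` are the
    extra labels to set, and `key` is the column used in the MERGE.
    """
    if key not in columns:
        raise ValueError("key column %r is not in columns" % key)
    if not labels:
        raise ValueError("at least one label is required")

    # WITH line.`uri` AS `uri`, line.`name` AS `name`, ...
    with_clause = ", ".join(
        "line.{0} AS {0}".format(escape(c)) for c in columns
    )
    # n:`LabelA`:`LabelB`, n.`name` = `name`, ...
    label_expr = "".join(":" + escape(label) for label in labels)
    set_items = ["n" + label_expr]
    set_items += ["n.{0} = {0}".format(escape(c)) for c in columns if c != key]
    set_clause = ", ".join(set_items)

    return "\n".join([
        "USING PERIODIC COMMIT 10000",
        "LOAD CSV WITH HEADERS FROM $csv_path",
        "AS line",
        "FIELDTERMINATOR ','",
        "WITH " + with_clause,
        "MERGE (n:has_uri {{{0}: {0}}})".format(escape(key)),
        "ON CREATE SET " + set_clause,
        "ON MATCH SET " + set_clause,
    ])


print(build_import_query(["uri", "name"], ["Person"]))
```

Labels cannot be passed as Cypher parameters, which is why they are spliced into the query text (backtick-quoted) rather than sent alongside `$csv_path`.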
After the import there are some inconsistencies: some nodes are not returned by label lookups for labels that they do have.
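One way to list the affected nodes is to compare a label-based match with a full scan that filters on `labels(n)`. This is a rough sketch under two assumptions that should be checked with `EXPLAIN`: that the planner answers the first query with a label scan and the second with an all-nodes scan. The function name `nodes_missing_from_label_scan` is made up, and `evaluate` is any callable that runs a Cypher query and returns the first value of the first row.

```python
def nodes_missing_from_label_scan(evaluate, label):
    """Return ids of nodes that carry `label` but are not found via the label.

    `evaluate(cypher, **params)` must run the query and return the first
    column of the first row (here: a list of node ids).
    """
    quoted = "`" + label.replace("`", "``") + "`"

    # Expected to be answered from the label scan store (assumption).
    # Collecting ids rather than count(n) avoids a count-store shortcut.
    via_label = evaluate("MATCH (n:%s) RETURN collect(id(n))" % quoted)

    # Expected to read every node record and filter on its labels (assumption).
    via_scan = evaluate(
        "MATCH (n) WHERE $label IN labels(n) RETURN collect(id(n))",
        label=label,
    )

    return sorted(set(via_scan) - set(via_label))
```

With py2neo, `Graph.evaluate` has a compatible signature and could be passed as `evaluate`. On a large graph, collecting every id is expensive, so this is only suitable as a spot check on a few labels.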
When we run the consistency checker, we get a very large number of errors (we ran out of disk space at 10 GB) of two types:
2020-04-15 11:36:52.499+0000 ERROR [o.n.c.ConsistencyCheckService] This node record has a label that is not found in the label scan store entry for this node
and
ERROR: This node was not found in the expected index.
Rebuilding the label scan store, as suggested in "Creating a subset graph - #12 by andrew.bowman", had no effect.
Upgrading to 4.0.3 had no effect.
We build the database from scratch frequently and never experienced this issue before upgrading to Neo4j version 4. We have not yet successfully built a database on version 4, so I think the issue is probably due to a change in version 4, although we have not yet run an identical comparison between v3 and v4 to confirm.
Any help would be hugely appreciated. Thank you!