Hello all,
I'm running into an issue with the incremental import functionality of neo4j-admin in Neo4j 5.
I use the neo4j-admin tool for bulk data import into a Neo4j 5.5.0 DB.
The initial full import with the following command works like a charm, as it already did in the 4.x versions (the command differs slightly from the 4.x one, of course):
bin/neo4j-admin database import full --skip-bad-relationships --nodes=import/file1.csv --nodes=import/file2.csv ... --relationships=import/Relations.csv --ignore-empty-strings=true neo4j
This covers about 40 files. The initial import goes into the default neo4j DB.
Afterwards I followed the admin manual and created a second DB with:
CREATE DATABASE db0;
I activated this DB and created the required node property uniqueness constraints prior to the following import as defined here:
After shutting the DB down, I run the following incremental import from the terminal:
bin/neo4j-admin database import incremental --force --skip-bad-relationships --nodes=import/fileXYZ.csv --ignore-empty-strings=true db0
The header of this single file is extended as described in the manual:
Incremental import
When using incremental import, you must have node property uniqueness constraints in place for the property key and label combinations that form the primary key, or the uniquely identifiable nodes. For example, importing nodes with a Person label that are uniquely identified with a uuid property key, the format of the header should be uuid:ID{label:Person}.
My CSV header for the first ID column looks like this: id:ID{label:myLabel_Name}
I have created the node property uniqueness constraint described above for exactly this label.
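For reference, the constraint statement has roughly the following shape. This is only a sketch: the helper name constraint_for and the constraint naming scheme (label_property_unique) are made up for illustration, and the statement uses the Neo4j 5 FOR ... REQUIRE ... IS UNIQUE syntax. The script just prints the statement; it does not connect to the DBMS.

```shell
#!/bin/sh
# Print a Neo4j 5 uniqueness constraint statement for one label/property pair.
# The constraint name (<label>_<property>_unique) is an illustrative convention,
# not something the import tool requires.
constraint_for() {
    label=$1
    prop=$2
    printf 'CREATE CONSTRAINT %s_%s_unique IF NOT EXISTS FOR (n:%s) REQUIRE n.%s IS UNIQUE;\n' \
        "$label" "$prop" "$label" "$prop"
}

# Statement for the header id:ID{label:myLabel_Name}
constraint_for myLabel_Name id
```

The printed statement is then run against db0 (for example after :use db0 in Browser) before the DB is shut down for the import.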
Here is my issue...
When I run the incremental import on the second DB within the DBMS for just one CSV file, it works perfectly fine. It creates the new nodes in the second (non-default) DB, the run completes successfully with no errors reported, and all imported nodes are available in the second DB after starting the DBMS.
But when I run an incremental import command that covers more than one CSV file like this:
bin/neo4j-admin database import incremental --force --skip-bad-relationships --nodes=import/fileXYZ.csv --nodes=import/fileABC.csv --nodes=import/fileDEF.csv ... --relationships=import/RELATIONS.csv --ignore-empty-strings=true db0
I get the following error message:
Import error: Multiple indexes for group global id space
Caused by:Multiple indexes for group global id space
java.lang.IllegalStateException: Multiple indexes for group global id space
at org.neo4j.util.Preconditions.checkState(Preconditions.java:181)
at org.neo4j.internal.batchimport.input.csv.CsvInput.lambda$collectReferencedNodeSchemaFromHeader$12(CsvInput.java:417)
at java.base/java.util.Optional.ifPresent(Optional.java:178)
at org.neo4j.internal.batchimport.input.csv.CsvInput.collectReferencedNodeSchemaFromHeader(CsvInput.java:398)
at org.neo4j.internal.batchimport.input.csv.CsvInput.referencedNodeSchema(CsvInput.java:384)
at com.neo4j.internal.batchimport.ParallelIncrementalBatchImporter.prepare(ParallelIncrementalBatchImporter.java:343)
at org.neo4j.importer.CsvImporter.doImport(CsvImporter.java:238)
at org.neo4j.importer.CsvImporter.doImport(CsvImporter.java:182)
at org.neo4j.importer.ImportCommand$Base.doExecute(ImportCommand.java:380)
at org.neo4j.importer.ImportCommand$Incremental.execute(ImportCommand.java:532)
at org.neo4j.cli.AbstractCommand.call(AbstractCommand.java:92)
at org.neo4j.cli.AbstractCommand.call(AbstractCommand.java:37)
at picocli.CommandLine.executeUserObject(CommandLine.java:2041)
at picocli.CommandLine.access$1500(CommandLine.java:148)
at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2461)
at picocli.CommandLine$RunLast.handle(CommandLine.java:2453)
at picocli.CommandLine$RunLast.handle(CommandLine.java:2415)
at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2273)
at picocli.CommandLine$RunLast.execute(CommandLine.java:2417)
at picocli.CommandLine.execute(CommandLine.java:2170)
at org.neo4j.cli.AdminTool.execute(AdminTool.java:94)
at org.neo4j.cli.AdminTool.main(AdminTool.java:82)
Each CSV file carries nodes of a single label, and each file uses a different label. The CSV file headers are defined exactly as described in the manual:
id:ID{label:"label matching the content of the file"}, ..., ...
and the matching index for each id property | label combination is populated.
I think I'm missing something needed to make the incremental admin import deal with multiple CSV files in one go.
Any ideas on how to get this working for multiple CSV files in one go are greatly welcome. I could run each CSV file through its own import call, but for about 40 files that's a bit of work if it's needed on a regular basis.
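The per-file fallback I have in mind would look roughly like the sketch below. It only reuses the flags from my single-file command above; the function names import_one and import_all and the DB and DRY_RUN variables are made up for illustration. It covers node files only (I have not checked how the relationships file would fit into this), and the DB has to be shut down before running it for real. With DRY_RUN=1, which is the default here, it only prints the commands.

```shell
#!/bin/sh
# Run one incremental import per node CSV file instead of one combined call.
# DB and DRY_RUN are illustrative settings; DRY_RUN=1 only prints the commands.
DB=${DB:-db0}
DRY_RUN=${DRY_RUN:-1}

import_one() {
    file=$1
    # Same flags as the single-file command that works for me.
    set -- bin/neo4j-admin database import incremental --force \
        --skip-bad-relationships "--nodes=$file" \
        --ignore-empty-strings=true "$DB"
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

import_all() {
    for f in "$@"; do
        # Stop at the first failing file so the error is easy to attribute.
        import_one "$f" || { echo "import failed for $f" >&2; return 1; }
    done
}

# Example call; in practice this could be: import_all import/*.csv
import_all import/fileXYZ.csv import/fileABC.csv
```

This works around the error rather than explaining it, so I would still prefer a way to pass all files to a single incremental import call.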
Cheers
Krid