aanastasiou
Node Clone

Hello

I have set up two separate Neo4j servers, both running version 4.4.0.

I am using neo4j-admin dump on one server to create an archive, and then neo4j-admin load on the other, to transfer specific databases between them.

The tool fails with "Not a valid Neo4j archive" when attempting to load the data into the second instance. I have come across this "fix" but it does not work.

I have also tried to list the contents of the backup archive with the gzip tool and with a custom Python script. Both give the same error, which leads me to believe that the neo4j-admin tool is indeed producing an invalid file.

There is also the question of which compressor neo4j-admin ultimately uses: is it zstd, gzip, or something else?

If I try to list files with either the gzip or the zstd command line utility, I get the same kind of error from both ("No, this is not a gzip file", "No, this is not a zstd file").
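
One way to see which compressor a file appears to use, independently of the gzip and zstd command line tools, is to look at its first bytes. The following is a minimal Python sketch; the function name `sniff_compression` is hypothetical and not part of any Neo4j tooling. It relies only on the standard magic numbers (gzip streams start with 1f 8b, zstd frames start with 28 b5 2f fd) and assumes nothing about how neo4j-admin lays out its archives.

```python
# Standard magic numbers found at the start of each compressed stream.
GZIP_MAGIC = bytes([0x1F, 0x8B])
ZSTD_MAGIC = bytes([0x28, 0xB5, 0x2F, 0xFD])  # 0xFD2FB528 stored little-endian


def sniff_compression(path):
    """Return 'gzip', 'zstd', or 'unknown' based on the first bytes of the file.

    Hypothetical helper for illustration; it only inspects the file header
    and does not attempt to decompress anything.
    """
    with open(path, "rb") as fh:
        head = fh.read(4)
    if head.startswith(GZIP_MAGIC):
        return "gzip"
    if head.startswith(ZSTD_MAGIC):
        return "zstd"
    return "unknown"


if __name__ == "__main__":
    import sys

    # Usage: python sniff.py <path to dump file>
    for name in sys.argv[1:]:
        print(name, "->", sniff_compression(name))
```

A result of "unknown" would be consistent with both tools rejecting the file, but it does not by itself say whether the file is corrupt or simply begins with something other than a raw gzip or zstd stream.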

Is it possible to get some clarity on these issues?

There are no logs from the servers because they are shut down in both cases during the whole backup-restore process, but here is what --verbose from neo4j-admin says:

org.neo4j.cli.CommandFailedException: Not a valid Neo4j archive: ./backup
        at org.neo4j.commandline.dbms.LoadCommand.load(LoadCommand.java:155)
        at org.neo4j.commandline.dbms.LoadCommand.execute(LoadCommand.java:85)
        at org.neo4j.cli.AbstractCommand.call(AbstractCommand.java:60)
        at org.neo4j.cli.AbstractCommand.call(AbstractCommand.java:30)
        at picocli.CommandLine.executeUserObject(CommandLine.java:1743)
        at picocli.CommandLine.access$900(CommandLine.java:145)
        at picocli.CommandLine$RunLast.handle(CommandLine.java:2101)
        at picocli.CommandLine$RunLast.handle(CommandLine.java:2068)
        at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:1935)
        at picocli.CommandLine.execute(CommandLine.java:1864)
        at org.neo4j.cli.AdminTool.execute(AdminTool.java:78)
        at org.neo4j.cli.AdminTool.main(AdminTool.java:59)
Caused by: org.neo4j.dbms.archive.IncorrectFormat: ./backup
        at org.neo4j.dbms.archive.Loader.openArchiveIn(Loader.java:172)
        at org.neo4j.dbms.archive.Loader.load(Loader.java:74)
        at org.neo4j.commandline.dbms.LoadCommand.load(LoadCommand.java:131)
        ... 11 more
Caused by: java.io.IOException: Decompression error: Unknown frame descriptor
        at com.github.luben.zstd.ZstdInputStream.readInternal(ZstdInputStream.java:147)
        at com.github.luben.zstd.ZstdInputStream.read(ZstdInputStream.java:107)
        at java.base/java.io.FilterInputStream.read(FilterInputStream.java:107)
        at org.neo4j.dbms.archive.CompressionFormat$2.decompress(CompressionFormat.java:79)
        at org.neo4j.dbms.archive.CompressionFormat.decompress(CompressionFormat.java:148)
        at org.neo4j.dbms.archive.CompressionFormat.decompress(CompressionFormat.java:125)
        at org.neo4j.dbms.archive.Loader.openArchiveIn(Loader.java:156)
        ... 13 more
        Suppressed: java.util.zip.ZipException: Not in GZIP format
                at java.base/java.util.zip.GZIPInputStream.readHeader(GZIPInputStream.java:166)
                at java.base/java.util.zip.GZIPInputStream.<init>(GZIPInputStream.java:80)
                at java.base/java.util.zip.GZIPInputStream.<init>(GZIPInputStream.java:92)
                at org.neo4j.dbms.archive.CompressionFormat$1.decompress(CompressionFormat.java:52)
                at org.neo4j.dbms.archive.CompressionFormat.decompress(CompressionFormat.java:148)
                at org.neo4j.dbms.archive.CompressionFormat.decompress(CompressionFormat.java:132)
                ... 14 more

Any ideas on this one?

All the best
AA

Comments

Any updates or resolutions on this? I have the same exact situation.

dana_canzano
Neo4j

@eserkandogan

This was initially reported with Neo4j 4.4.0. Are you also using 4.4.0, or some other version?
Are you encountering the same stack trace as initially reported?
Does the user running neo4j-admin load have read access to the dump file?
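
The read-access question can be checked from the account that will run neo4j-admin load. This is a minimal sketch, assuming Python is available on that machine; `can_read_dump` is a hypothetical helper name, and the check only says whether the current user may open the file for reading, nothing about its contents.

```python
import os


def can_read_dump(path):
    """Return True if path is an existing regular file readable by the current user.

    Hypothetical helper for illustration. os.access checks against the
    real user ID of the running process, so run it as the same user
    that will run neo4j-admin load.
    """
    return os.path.isfile(path) and os.access(path, os.R_OK)


if __name__ == "__main__":
    import sys

    # Usage: python check_read.py <path to dump file>
    for name in sys.argv[1:]:
        print(name, "readable:", can_read_dump(name))
```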

I am using 4.3.4, Community Edition.
I tried changing the file owner to neo4j, but the problem persists.

Stack trace is pretty similar:
Picked up _JAVA_OPTIONS: -Djava.io.tmpdir=/var/lib/neo4j/import/temp
Selecting JVM - Version:11.0.12, Name:OpenJDK 64-Bit Server VM, Vendor:Oracle Corporation
Picked up _JAVA_OPTIONS: -Djava.io.tmpdir=/var/lib/neo4j/import/temp
neo4j 4.3.4
VM Name: OpenJDK 64-Bit Server VM
VM Vendor: Oracle Corporation
VM Version: 11.0.12+7
JIT compiler: HotSpot 64-Bit Tiered Compilers
VM Arguments: [-XX:+UseG1GC, -XX:-OmitStackTraceInFastThrow, -XX:+AlwaysPreTouch, -XX:+UnlockExperimentalVMOptions, -XX:+TrustFinalNonStaticFields, -XX:+DisableExplicitGC, -XX:MaxInlineLevel=15, -XX:-UseBiasedLocking, -Djdk.nio.maxCachedBufferSize=262144, -Dio.netty.tryReflectionSetAccessible=true, -Djdk.tls.ephemeralDHKeySize=2048, -Djdk.tls.rejectClientInitiatedRenegotiation=true, -XX:FlightRecorderOptions=stackdepth=256, -XX:+UnlockDiagnosticVMOptions, -XX:+DebugNonSafepoints, -Dlog4j2.disable.jmx=true, -Dfile.encoding=UTF-8, -Djava.io.tmpdir=/var/lib/neo4j/import/temp]
org.neo4j.cli.CommandFailedException: Not a valid Neo4j archive: ontologies-comparison-neo4j-Mar-16-2022-16_11_54.dump
        at org.neo4j.commandline.dbms.LoadCommand.load(LoadCommand.java:205)
        at org.neo4j.commandline.dbms.LoadCommand.loadDump(LoadCommand.java:127)
        at org.neo4j.commandline.dbms.LoadCommand.execute(LoadCommand.java:93)
        at org.neo4j.cli.AbstractCommand.call(AbstractCommand.java:71)
        at org.neo4j.cli.AbstractCommand.call(AbstractCommand.java:34)
        at picocli.CommandLine.executeUserObject(CommandLine.java:1953)
        at picocli.CommandLine.access$1300(CommandLine.java:145)
        at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2352)
        at picocli.CommandLine$RunLast.handle(CommandLine.java:2346)
        at picocli.CommandLine$RunLast.handle(CommandLine.java:2311)
        at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2179)
        at picocli.CommandLine.execute(CommandLine.java:2078)
        at org.neo4j.cli.AdminTool.execute(AdminTool.java:89)
        at org.neo4j.cli.AdminTool.main(AdminTool.java:67)
Caused by: org.neo4j.dbms.archive.IncorrectFormat: ontologies-comparison-neo4j-Mar-16-2022-16_11_54.dump
        at org.neo4j.dbms.archive.Loader.openArchiveIn(Loader.java:209)
        at org.neo4j.dbms.archive.Loader.load(Loader.java:77)
        at org.neo4j.commandline.dbms.LoadCommand.load(LoadCommand.java:181)
        ... 13 more
Caused by: java.io.IOException: Decompression error: Unknown frame descriptor
        at com.github.luben.zstd.ZstdInputStreamNoFinalizer.readInternal(ZstdInputStreamNoFinalizer.java:171)
        at com.github.luben.zstd.ZstdInputStreamNoFinalizer.read(ZstdInputStreamNoFinalizer.java:123)
        at com.github.luben.zstd.ZstdInputStream.read(ZstdInputStream.java:87)
        at java.base/java.io.FilterInputStream.read(FilterInputStream.java:107)
        at org.neo4j.dbms.archive.CompressionFormat$2.decompress(CompressionFormat.java:79)
        at org.neo4j.dbms.archive.CompressionFormat.decompress(CompressionFormat.java:143)
        at org.neo4j.dbms.archive.CompressionFormat.decompress(CompressionFormat.java:120)
        at org.neo4j.dbms.archive.Loader.openArchiveIn(Loader.java:193)
        ... 15 more
        Suppressed: java.util.zip.ZipException: Not in GZIP format
                at java.base/java.util.zip.GZIPInputStream.readHeader(GZIPInputStream.java:166)
                at java.base/java.util.zip.GZIPInputStream.<init>(GZIPInputStream.java:80)
                at java.base/java.util.zip.GZIPInputStream.<init>(GZIPInputStream.java:92)
                at org.neo4j.dbms.archive.CompressionFormat$1.decompress(CompressionFormat.java:52)
                at org.neo4j.dbms.archive.CompressionFormat.decompress(CompressionFormat.java:143)
                at org.neo4j.dbms.archive.CompressionFormat.decompress(CompressionFormat.java:127)
                ... 16 more

dana_canzano
Neo4j

@eserkandogan

Was the dump file created on a different machine or OS, transferred to a new machine via WinSCP, FTP, etc., and then loaded there with neo4j-admin load? If so, was the file transferred in binary mode?

What does

file <name of file>

report, replacing <name of file> with the name of the dump file? If the file was transferred, does this command report the same on both the source and the new machine?

What about

md5sum <name of file>
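
Where md5sum is not available on one of the machines, the same comparison can be done with a few lines of Python. This is a minimal sketch using only the standard library; `md5_of_file` is a hypothetical helper name. Running it on both the source and the target machine and comparing the two hex strings serves the same purpose as comparing md5sum output.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the MD5 hex digest of a file, reading it in chunks.

    Hypothetical helper for illustration. The file is opened in binary
    mode so that no newline translation alters the bytes being hashed.
    """
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    import sys

    # Usage: python md5_check.py <path to dump file>
    for name in sys.argv[1:]:
        print(md5_of_file(name), name)
```

Matching digests only show that the transfer did not change the file; they say nothing about whether the target version of neo4j-admin can read it.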

kaan
Node

I'm running into this issue while attempting to load a snapshot (taken from Community 4.4.5 running on Mac) into a GCP instance which was created with the latest public image (for 4.3.6; specifically, this: "neo4j-community-1-4-3-6-apoc").

I've verified that the md5 sums are the same on originating machine, as well as the GCP VM where the load attempt runs – using md5 on Mac and md5sum on GCP instance; both return "cbccec60523ddbae1a016a19afd3b785".

Here's what a load attempt looks like on the GCP VM:

$ whoami
neo4j

$ /usr/share/neo4j/bin/neo4j-admin load --database=neo4j --force --from=snapshot.dump

Selecting JVM - Version:11.0.15, Name:OpenJDK 64-Bit Server VM, Vendor:Private Build
Not a valid Neo4j archive: snapshot.dump

Perhaps this is due to version differences? I think my next step is to upgrade the GCP instance from 4.3.6 to 4.4.7 and try the load once more. That seems straightforward, though the pre-built Neo4j install is spread across several directories, which does not match the layout assumed by the upgrade steps (https://neo4j.com/docs/upgrade-migration-guide/current/upgrade/upgrade-4.4/deployment-upgrading/), so it might take some exploration and trial and error.

$ ls /usr/share/neo4j/
bin  data  lib logs  run  tools

$ ls /var/lib/neo4j/
certificates  conf  data  import  labs licensing  logs  metrics  plugins

$ ls /etc/neo4j
neo4j.conf  pre-neo4j.sh

Any other suggestions to try?

kaan
Node

This does appear to be an issue with version differences.

I don't know if there's a general behavior that newer snapshot versions will not work on older database versions, but these are my observations about what works / doesn't:

  • snapshot from 4.4.5 does not work with 4.3.6
  • snapshot from 4.4.5 works fine with 4.4.7
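
The two observations above can be turned into a quick pre-flight check before attempting a load. This is only a heuristic sketch based on those two data points, not a documented compatibility rule; the function names `parse_version` and `dump_may_load` are hypothetical, and the assumption that a dump loads whenever the target is at least as new as the source is exactly that, an assumption.

```python
def parse_version(text):
    """Turn a dotted version such as '4.4.5' into a comparable tuple (4, 4, 5)."""
    return tuple(int(part) for part in text.strip().split("."))


def dump_may_load(dump_version, server_version):
    """Heuristic: True when the target server is at least as new as the
    version that produced the dump.

    ASSUMPTION: this mirrors the observations in this thread only
    (4.4.5 -> 4.3.6 failed, 4.4.5 -> 4.4.7 worked); it is not an
    official Neo4j compatibility statement.
    """
    return parse_version(server_version) >= parse_version(dump_version)


if __name__ == "__main__":
    for source, target in [("4.4.5", "4.3.6"), ("4.4.5", "4.4.7")]:
        verdict = "may load" if dump_may_load(source, target) else "likely to fail"
        print(f"dump from {source} on server {target}: {verdict}")
```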

I got this working by setting up a new GCP instance – not from the public image but instead just an empty VM. After various setup steps – add instance to firewall group, create "neo4j" user, download my snapshot from GCS, download JDK 11 from jdk.java.net/archive, install JDK, download Neo4j community 4.4.7 – I was able to load the snapshot taken earlier from 4.4.5 community.

$ neo4j-admin load --database=neo4j --force --from=snapshot.dump 
Selecting JVM - Version:11.0.2+9, Name:OpenJDK 64-Bit Server VM, Vendor:Oracle Corporation
Done: 46 files, 1.416GiB processed.

 

Version history
Last update: 12-15-2021 09:05 AM