Neo4j container crashing on startup. transaction database 0 size

Hello, I wonder if anyone could help me out here with some approaches to troubleshooting an issue I am encountering.

I got Neo4j community edition running in a container in a k8s environment (OpenShift Container Platform).

We encountered an issue at a customer site, whereby Neo4j is failing to start. Full log attached. This extract points to issue reading the transaction database:

'org.neo4j.dbms.database.DefaultSystemGraphInitializer@d5b599d6' was successfully initialized, but failed to start. Please see the attached cause exception "Unable to read log version and last committed tx from '/data/transactions/system/neostore.transaction.db.0'. Was only able to read -1 bytes, but was expecting 8".

If we look on disk, we can see that the file /data/transactions/system/neostore.transaction.db.0 does exist, is readable, but it is 0 size.

This problem is only evident at one particular customer site. There are many hundreds of successful installations in house and at other customer sites.

If the pod that contains the neo4j container is started without persistence configured, the problem goes away. When persistence is configured, the problem comes back. The PV and PVC are successfully created and PVC is being written to by neo4j - so it doesn't look like a storage permissions problem - but that database transaction file is always 0 length.

We have been unable to reproduce the issue using the same storage type in house (IBM Spectrum Scale).

Any thoughts on this? Any reasons why the transactions file would have 0 size on startup? Any known possibly related issues?

Thanks a lot.

Would you mind re-attaching your log file in the migrated post?

Sounds like as if the volume was not properly mounted, .e.g. only the data/databases/* folder was mounted but not the data/transactions/* one so it found database files but no transaction logs?

Or an restore mishap? Or a misconfiguration where the tx logs directories were configured to someplace else (which is possible, but then needs to be made sure the tx-logs are also made available.

But the official docker container mounts the the folder under /data in the right place.

Thanks for the reply Michael. Seems like I'm not allowed to upload attachment ("New users not allowed to upload attachments") Let me paste a truncated version below.
In response to your questions:
It looks like the volume is mounted correctly. Its mounted to /data. On the PVC, I can see that the transactions, databases and dbms directories are successfully created. Many files are successfully created and written to during the startup, but the /data/transactions/system/neostore.transaction.db.0 remains at 0 size.

The issues happens on first time startup, so I don't think its anything to do with restore. Also, don't think its misconfig, the tx log is where it is supposed to be. Its created, but not written to and remains at 0 size.

Even with debug logging, I do no see any errors reported during the rite operation - only the error on read (when it tries to read from a 0 bytes file).

What about the "UseBiasedLocking" JVM option - it seems like that is disabled by default - Are there any known situations where it should be enabled? My thought is that is something around locking that is not supported by the underlying storage on the cluster. Are there any options around this that are worth exploring - turning on or off.

TRUNCATED DEBUG LOG:

                                                            --------------------------------------------------------------------------------
                                                                                            [ Library path ]                                
                                                            --------------------------------------------------------------------------------
                                                            /usr/lib/jdk-11.0.16.1+1/lib/default
                                                            /usr/lib/jdk-11.0.16.1+1/lib
                                                            /usr/lib64
                                                            /usr/lib
                                                            
                                                            --------------------------------------------------------------------------------
                                                                                         [ System properties ]                              
                                                            --------------------------------------------------------------------------------
                                                            awt.toolkit = sun.awt.X11.XToolkit
                                                            jdk.tls.rejectClientInitiatedRenegotiation = true
                                                            file.encoding.pkg = sun.io
                                                            jdk.extensions.version = 11.0.16.1
                                                            sun.jnu.encoding = ANSI_X3.4-1968
                                                            sun.arch.data.model = 64
                                                            com.ibm.vm.bitmode = 64
                                                            user.variant = 
                                                            com.ibm.oti.vm.library.version = 29
                                                            user.timezone = UTC
                                                            sun.java.launcher = SUN_STANDARD
                                                            user.country = US
                                                            sun.boot.library.path = /usr/lib/jdk-11.0.16.1+1/lib/default:/usr/lib/jdk-11.0.16.1+1/lib
                                                            sun.java.command = org.neo4j.server.CommunityEntryPoint --home-dir=/var/lib/neo4j --config-dir=/var/lib/neo4j/conf
                                                            io.netty.tryReflectionSetAccessible = true
                                                            sun.cpu.endian = little
                                                            user.home = /
                                                            user.language = en
                                                            ibm.signalhandling.rs = false
                                                            jdk.extensions.name = Extensions for OpenJDK for Eclipse OpenJ9
                                                            file.separator = /
                                                            com.ibm.util.extralibs.properties = 
                                                            sun.java.launcher.pid = 360
                                                            com.ibm.oti.configuration = scar
                                                            com.ibm.oti.shared.enabled = false
                                                            com.ibm.oti.vm.bootstrap.library.path = /usr/lib/jdk-11.0.16.1+1/lib/default:/usr/lib/jdk-11.0.16.1+1/lib
                                                            jdk.tls.ephemeralDHKeySize = 2048
                                                            com.ibm.cpu.endian = little
                                                            user.name = 1000930000
                                                            path.separator = :
                                                            jdk.nio.maxCachedBufferSize = 262144
                                                            ibm.signalhandling.sigint = true
                                                            file.encoding = UTF-8
                                                            jnidispatch.path = /tmp/jna2021555411582016347.tmp
                                                            ibm.signalhandling.sigchain = true
                                                            jna.loaded = true
                                                            com.ibm.system.agent.path = /usr/lib/jdk-11.0.16.1+1/lib
                                                            user.dir = /var/lib/neo4j
                                                            com.ibm.jcl.checkClassPath = 
                                                            ibm.system.encoding = ANSI_X3.4-1968
                                                            sun.io.unicode.encoding = UnicodeLittle
                                                            log4j2.disable.jmx = true
                                                            
                                                            --------------------------------------------------------------------------------
                                                                                  [ (IANA) TimeZone database version ]                      
                                                            --------------------------------------------------------------------------------
                                                              TimeZone version: 2022a (available for 601 zone identifiers)
                                                            
                                                            --------------------------------------------------------------------------------
                                                                                        [ Network information ]                             
                                                            --------------------------------------------------------------------------------
                                                            Interface eth0:
                                                                address: 
                                                                address: 
                                                            Interface lo:
                                                                address: 0:0:0:0:0:0:0:1%lo
                                                                address: 127.0.0.1
                                                            
                                                            --------------------------------------------------------------------------------
                                                                                     [ Native access information ]                          
                                                            --------------------------------------------------------------------------------
                                                            Native access details: Linux native access is available.
                                                            
                                                            --------------------------------------------------------------------------------
                                                                                            [ DBMS config ]                                 
                                                            --------------------------------------------------------------------------------
                                                            DBMS provided settings:
                                                            dbms.connector.bolt.enabled=true
                                                            dbms.connector.bolt.tls_level=OPTIONAL
                                                            dbms.connector.http.enabled=true
                                                            dbms.connector.https.enabled=true
                                                            dbms.default_listen_address=0.0.0.0
                                                            dbms.directories.import=/var/lib/neo4j/import
                                                            dbms.directories.logs=/logs
                                                            dbms.directories.neo4j_home=/var/lib/neo4j
                                                            dbms.jvm.additional=-XX:+UseG1GC

-XX:-OmitStackTraceInFastThrow
-XX:+AlwaysPreTouch
-XX:+UnlockExperimentalVMOptions
-XX:+TrustFinalNonStaticFields
-XX:+DisableExplicitGC
-XX:MaxInlineLevel=15
-XX:-UseBiasedLocking
-Djdk.nio.maxCachedBufferSize=262144
-Dio.netty.tryReflectionSetAccessible=true
-Djdk.tls.ephemeralDHKeySize=2048
-Djdk.tls.rejectClientInitiatedRenegotiation=true
-XX:FlightRecorderOptions=stackdepth=256
-XX:+UnlockDiagnosticVMOptions
-XX:+DebugNonSafepoints
-Dlog4j2.disable.jmx=true
dbms.logs.debug.level=DEBUG
dbms.memory.pagecache.size=512M
dbms.ssl.policy.bolt.base_directory=/var/lib/neo4j/certificates
dbms.ssl.policy.bolt.ciphers=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
dbms.ssl.policy.bolt.enabled=true
dbms.ssl.policy.bolt.private_key=/var/lib/neo4j/certificates/tls.key
dbms.ssl.policy.bolt.public_certificate=/var/lib/neo4j/certificates/tls.crt
dbms.ssl.policy.https.base_directory=/var/lib/neo4j/certificates
dbms.ssl.policy.https.ciphers=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
dbms.ssl.policy.https.enabled=true
dbms.ssl.policy.https.private_key=/var/lib/neo4j/certificates/tls.key
dbms.ssl.policy.https.public_certificate=/var/lib/neo4j/certificates/tls.crt
dbms.tx_log.rotation.retention_policy=100M size
dbms.tx_state.memory_allocation=ON_HEAP
dbms.windows_service_name=neo4j

2023-02-14 16:16:08.571+0000 INFO [o.n.s.c.SslPolicyLoader] Loaded SSL policy 'BOLT' = SslPolicy{keyCertChain=Subject: CN=.ta-neo4j.ibm-ta.svc, Issuer: CN=openshift-service-serving-signer@1663169102, ciphers=[TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384, TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256], tlsVersions=[TLSv1.2], clientAuth=OPTIONAL}
2023-02-14 16:16:08.571+0000 INFO [o.n.s.c.SslPolicyLoader] Loaded SSL policy 'HTTPS' = SslPolicy{keyCertChain=Subject: CN=
.ta-neo4j.ibm-ta.svc, Issuer: CN=openshift-service-serving-signer@1663169102, ciphers=[TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384, TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256], tlsVersions=[TLSv1.2], clientAuth=OPTIONAL}
2023-02-14 16:16:08.580+0000 INFO [o.n.d.i.DefaultIdentityModule] Found ServerId on disk: ServerId{af52aea0} (af52aea0-bf28-429d-b5b9-e2725cd1455f)
2023-02-14 16:16:08.581+0000 INFO [o.n.d.i.DefaultIdentityModule] This instance is ServerId{af52aea0} (af52aea0-bf28-429d-b5b9-e2725cd1455f)
2023-02-14 16:16:08.587+0000 INFO [o.n.g.f.EditionLocksFactories] Locking implementation 'forseti' selected.
2023-02-14 16:16:10.462+0000 INFO [o.n.s.CommunityNeoWebServer] ======== Neo4j 4.4.16 ========
2023-02-14 16:16:10.483+0000 INFO [o.n.d.d.DefaultDatabaseManager] Creating 'DatabaseId{00000000[system]}'.
2023-02-14 16:16:10.723+0000 DEBUG [i.n.h.s.JdkSslContext] Default protocols (JDK): [TLSv1.3, TLSv1.2]
2023-02-14 16:16:10.723+0000 DEBUG [i.n.h.s.JdkSslContext] Default cipher suites (JDK): [TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384, TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256, TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256, TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384, TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA, TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA, TLS_RSA_WITH_AES_128_GCM_SHA256, TLS_RSA_WITH_AES_128_CBC_SHA, TLS_RSA_WITH_AES_256_CBC_SHA, TLS_AES_128_GCM_SHA256, TLS_AES_256_GCM_SHA384]
2023-02-14 16:16:10.849+0000 DEBUG [i.n.u.i.NativeLibraryLoader] -Dio.netty.native.workdir: /tmp (io.netty.tmpdir)
2023-02-14 16:16:10.850+0000 DEBUG [i.n.u.i.NativeLibraryLoader] -Dio.netty.native.deleteLibAfterLoading: true
2023-02-14 16:16:10.850+0000 DEBUG [i.n.u.i.NativeLibraryLoader] -Dio.netty.native.tryPatchShadedId: true
2023-02-14 16:16:10.850+0000 DEBUG [i.n.u.i.NativeLibraryLoader] -Dio.netty.native.detectNativeLibraryDuplicates: true
2023-02-14 16:16:10.864+0000 DEBUG [i.n.u.i.NativeLibraryLoader] Successfully loaded the library /tmp/libnetty_transport_native_epoll_x86_6412630054574465331921.so
2023-02-14 16:16:10.868+0000 DEBUG [i.n.u.NetUtil] -Djava.net.preferIPv4Stack: false
2023-02-14 16:16:10.868+0000 DEBUG [i.n.u.NetUtil] -Djava.net.preferIPv6Addresses: false
2023-02-14 16:16:10.870+0000 DEBUG [i.n.u.NetUtilInitializations] Loopback interface: lo (lo, 0:0:0:0:0:0:0:1%lo)
2023-02-14 16:16:10.871+0000 DEBUG [i.n.u.NetUtil] /proc/sys/net/core/somaxconn: 128
2023-02-14 16:16:10.872+0000 INFO [o.n.b.BoltServer] Bolt server loaded
2023-02-14 16:16:10.877+0000 DEBUG [i.n.c.MultithreadEventLoopGroup] -Dio.netty.eventLoopThreads: 24
2023-02-14 16:16:10.925+0000 DEBUG [o.n.b.r.s.ExecutorBoltScheduler] Initialized bolt thread pool.
2023-02-14 16:16:10.984+0000 DEBUG [o.n.k.i.c.VmPauseMonitorComponent] Starting VM pause monitor
2023-02-14 16:16:10.987+0000 INFO [o.n.d.d.DefaultDatabaseManager] Starting 'DatabaseId{00000000[system]}'.
2023-02-14 16:16:11.309+0000 INFO [o.n.k.i.s.f.RecordFormatSelector] [system/00000000] Selected RecordFormat:PageAlignedV4_3[AF4.3.0] record format from store /data/databases/system
2023-02-14 16:16:11.311+0000 INFO [o.n.k.i.s.f.RecordFormatSelector] [system/00000000] Selected format 'RecordFormat:PageAlignedV4_3[AF4.3.0]' for existing store /data/databases/system with format 'RecordFormat:PageAlignedV4_3[AF4.3.0]'
2023-02-14 16:16:11.453+0000 WARN [o.n.k.d.Database] [system/00000000] Exception occurred while starting the database. Trying to stop already started components.
java.lang.RuntimeException: org.neo4j.kernel.impl.transaction.log.entry.IncompleteLogHeaderException: Unable to read log version and last committed tx from '/data/transactions/system/neostore.transaction.db.0'. Was only able to read -1 bytes, but was expecting 8
at org.neo4j.kernel.impl.transaction.log.files.checkpoint.DetachedLogTailScanner.findLogTail(DetachedLogTailScanner.java:102) ~[neo4j-kernel-4.4.16.jar:4.4.16]
at org.neo4j.kernel.impl.transaction.log.files.checkpoint.AbstractLogTailScanner.getTailInformation(AbstractLogTailScanner.java:246) ~[neo4j-kernel-4.4.16.jar:4.4.16]
at org.neo4j.kernel.impl.transaction.log.files.checkpoint.AbstractLogTailScanner.getTailInformation(AbstractLogTailScanner.java:237) ~[neo4j-kernel-4.4.16.jar:4.4.16]
... 23 more
2023-02-14 16:16:11.459+0000 ERROR [o.n.d.d.DefaultDatabaseManager] Failed to start DatabaseId{00000000[system]}
org.neo4j.dbms.api.DatabaseManagementException: An error occurred! Unable to start DatabaseId{00000000[system]}.
at org.neo4j.dbms.database.AbstractDatabaseManager.startDatabase(AbstractDatabaseManager.java:236) ~[neo4j-4.4.16.jar:4.4.16]
... 7 more
2023-02-14 16:16:12.480+0000 INFO [o.n.g.f.DatabaseManagementServiceFactory] Shutdown started
2023-02-14 16:16:12.480+0000 DEBUG [o.n.b.r.s.ExecutorBoltScheduler] Stopping connections
2023-02-14 16:16:12.480+0000 DEBUG [o.n.b.r.s.ExecutorBoltScheduler] Connections stopped
2023-02-14 16:16:12.480+0000 DEBUG [o.n.b.r.s.ExecutorBoltScheduler] Shutting down thread pool
2023-02-14 16:16:12.481+0000 DEBUG [o.n.b.r.s.ExecutorBoltScheduler] Thread pool shut down
2023-02-14 16:16:12.481+0000 DEBUG [o.n.b.t.NettyServer] Shutting down event loop group

It should definitely not happen and looks like a bug, can you please raise this as a GitHub issue here:

Thank you