I have a causal cluster running in GKE that was set up via the Marketplace.
I've also set up the backup and restore processes as explained in "How to backup Neo4j Running in Kubernetes" and "How to Restore Neo4j Backups on Kubernetes".
They work as expected when the initial backup and the subsequent restore are performed against the same causal cluster.
Now, I also have a second causal cluster (also set up via the GCP Marketplace) which I'd like to seed with the backup from the first cluster. When the restore process is configured in the new cluster's deployment file, it runs successfully and the new cluster is initialized as expected with the database from the first cluster. I am then able to read from the database as well as scale the core and replica deployments.
But the new causal cluster no longer accepts any writes.
To check whether this was an issue with the driver (neo4j-go-driver v1.8), I connected to the LEADER in the new cluster and tried to run a CREATE query via the cypher-shell tool -
create (t:Test) return t;
The terminal hangs indefinitely (I ended up closing the session as it had become unresponsive). I can then see the following output in the debug.log file -
2020-08-25 16:35:26.818+0000 WARN [c.n.c.c.s.m.t.ReplicatedTokenStateMachine] [neo4j] Ignored ReplicatedTokenRequest{type='LABEL', name='Test'} because already committed (14 <= 3673)
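To see how often this warning occurs and with which numbers, the index pairs can be pulled out of debug.log. This is only a sketch: `extract_ignored_indexes` is a hypothetical helper name, and I am assuming the two numbers in parentheses are the index of the incoming command and the last index the state machine already considers committed.

```shell
#!/bin/sh
# extract_ignored_indexes LOGFILE
# Prints "<first number> <second number>" for every line containing
# "because already committed (A <= B)". The meaning of A and B is an
# assumption (incoming command index vs. last committed index).
extract_ignored_indexes() {
    # -n together with the trailing p prints only lines where the
    # substitution matched, so unrelated log lines are dropped.
    sed -n 's/.*because already committed (\([0-9][0-9]*\) <= \([0-9][0-9]*\)).*/\1 \2/p' "$1"
}
```

It would be called with the path to the log file, for example `extract_ignored_indexes debug.log`.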
On reconnecting to the LEADER for verifying whether a write had occurred -
match (t:Test) return t;
I see the following output -
+---+
| t |
+---+
+---+

0 rows available after 802 ms, consumed after another 19 ms
Some additional information regarding my setup -
- Neo4j 4.1 (available via the GCP Marketplace) deployed with 3 Cores and 3 Replicas
- GCP cluster running 3 VM nodes, each providing 2 CPUs and 13 GB of RAM.
- In addition to the configuration defined in the post "How to Restore Neo4j Backups on Kubernetes", I've also added the following lines before the export statements
echo "Fixing '/data/transactions' folder permissions..."
chown -R neo4j:neo4j /data/transactions
## Disabling the following lines as they do not seem necessary to fix the AccessDenied error
# echo "Fixing '/data/databases' folder permissions..."
# chown -R neo4j:neo4j /data/databases
as I was otherwise facing this error on startup -
java.lang.RuntimeException: Error starting Neo4j database server at /data/databases
at org.neo4j.graphdb.facade.DatabaseManagementServiceFactory.startDatabaseServer(DatabaseManagementServiceFactory.java:198)
at org.neo4j.graphdb.facade.DatabaseManagementServiceFactory.build(DatabaseManagementServiceFactory.java:158)
at com.neo4j.causalclustering.core.CoreGraphDatabase.createManagementService(CoreGraphDatabase.java:38)
at com.neo4j.causalclustering.core.CoreGraphDatabase.<init>(CoreGraphDatabase.java:30)
at com.neo4j.server.enterprise.EnterpriseManagementServiceFactory.createManagementService(EnterpriseManagementServiceFactory.java:34)
at com.neo4j.server.enterprise.EnterpriseBootstrapper.createNeo(EnterpriseBootstrapper.java:20)
at org.neo4j.server.NeoBootstrapper.start(NeoBootstrapper.java:117)
at org.neo4j.server.NeoBootstrapper.start(NeoBootstrapper.java:87)
at com.neo4j.server.enterprise.EnterpriseEntryPoint.main(EnterpriseEntryPoint.java:25)
Caused by: org.neo4j.kernel.lifecycle.LifecycleException: Component 'com.neo4j.dbms.ClusteredDbmsReconcilerModule@49222187' was successfully initialized, but failed to start. Please see the attached cause exception "/data/transactions/system/neostore.transaction.db.0".
at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:463)
at org.neo4j.kernel.lifecycle.LifeSupport.start(LifeSupport.java:110)
at org.neo4j.graphdb.facade.DatabaseManagementServiceFactory.startDatabaseServer(DatabaseManagementServiceFactory.java:189)
... 8 more
Caused by: org.neo4j.dbms.api.DatabaseManagementException: A triggered DbmsReconciler job failed with the following cause
at com.neo4j.dbms.ReconcilerResult.join(ReconcilerResult.java:57)
at com.neo4j.dbms.StandaloneDbmsReconcilerModule.startInitialDatabases(StandaloneDbmsReconcilerModule.java:95)
at com.neo4j.dbms.StandaloneDbmsReconcilerModule.start(StandaloneDbmsReconcilerModule.java:85)
at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:442)
... 10 more
Caused by: org.neo4j.dbms.api.DatabaseManagementException: Unable to start databaseDatabaseId{00000000[system]}
at com.neo4j.dbms.database.ClusteredMultiDatabaseManager.startDatabase(ClusteredMultiDatabaseManager.java:68)
at com.neo4j.dbms.database.ClusteredMultiDatabaseManager.startDatabase(ClusteredMultiDatabaseManager.java:30)
at com.neo4j.dbms.database.MultiDatabaseManager.forSingleDatabase(MultiDatabaseManager.java:134)
at com.neo4j.dbms.database.MultiDatabaseManager.startDatabase(MultiDatabaseManager.java:119)
at com.neo4j.dbms.Transition$Prepared.doTransitionAction(Transition.java:101)
at com.neo4j.dbms.Transition$Prepared.doTransition(Transition.java:88)
at com.neo4j.dbms.DbmsReconciler.doTransitionStep(DbmsReconciler.java:346)
at com.neo4j.dbms.DbmsReconciler.doTransitionStep(DbmsReconciler.java:347)
at com.neo4j.dbms.DbmsReconciler.doTransitionStep(DbmsReconciler.java:347)
at com.neo4j.dbms.DbmsReconciler.lambda$doTransitions$11(DbmsReconciler.java:315)
at com.neo4j.dbms.DbmsReconciler.namedJob(DbmsReconciler.java:326)
at com.neo4j.dbms.DbmsReconciler.doTransitions(DbmsReconciler.java:316)
at com.neo4j.dbms.DbmsReconciler.lambda$doTransitions$9(DbmsReconciler.java:307)
at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1700)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: org.neo4j.kernel.lifecycle.LifecycleException: Component 'org.neo4j.kernel.database.Database@7e7fe0bf' was successfully initialized, but failed to start. Please see the attached cause exception "/data/transactions/system/neostore.transaction.db.0".
at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:463)
at org.neo4j.kernel.lifecycle.LifeSupport.start(LifeSupport.java:110)
at com.neo4j.causalclustering.common.ClusteredDatabase.start(ClusteredDatabase.java:39)
at com.neo4j.dbms.database.ClusteredMultiDatabaseManager.startDatabase(ClusteredMultiDatabaseManager.java:64)
... 18 more
Caused by: java.lang.RuntimeException: org.neo4j.kernel.lifecycle.LifecycleException: Component 'org.neo4j.kernel.impl.transaction.log.files.TransactionLogFiles@14069d4f' was successfully initialized, but failed to start. Please see the attached cause exception "/data/transactions/system/neostore.transaction.db.0".
at org.neo4j.kernel.database.Database.start(Database.java:497)
at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:442)
... 21 more
Caused by: org.neo4j.kernel.lifecycle.LifecycleException: Component 'org.neo4j.kernel.impl.transaction.log.files.TransactionLogFiles@14069d4f' was successfully initialized, but failed to start. Please see the attached cause exception "/data/transactions/system/neostore.transaction.db.0".
at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:463)
at org.neo4j.kernel.lifecycle.LifeSupport.start(LifeSupport.java:110)
at org.neo4j.kernel.database.Database.start(Database.java:481)
... 22 more
Caused by: java.nio.file.AccessDeniedException: /data/transactions/system/neostore.transaction.db.0
at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:90)
at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)
at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:116)
at java.base/sun.nio.fs.UnixFileSystemProvider.newFileChannel(UnixFileSystemProvider.java:182)
at java.base/java.nio.channels.FileChannel.open(FileChannel.java:292)
at org.neo4j.io.fs.DefaultFileSystemAbstraction.open(DefaultFileSystemAbstraction.java:71)
at org.neo4j.io.fs.DefaultFileSystemAbstraction.write(DefaultFileSystemAbstraction.java:102)
at org.neo4j.io.fs.DefaultFileSystemAbstraction.write(DefaultFileSystemAbstraction.java:53)
at org.neo4j.kernel.impl.transaction.log.files.TransactionLogChannelAllocator.allocateFile(TransactionLogChannelAllocator.java:143)
at org.neo4j.kernel.impl.transaction.log.files.TransactionLogChannelAllocator.createLogChannel(TransactionLogChannelAllocator.java:64)
at org.neo4j.kernel.impl.transaction.log.files.TransactionLogFiles.createLogChannelForVersion(TransactionLogFiles.java:230)
at org.neo4j.kernel.impl.transaction.log.files.TransactionLogFile.start(TransactionLogFile.java:89)
at org.neo4j.kernel.impl.transaction.log.files.TransactionLogFiles.start(TransactionLogFiles.java:79)
at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:442)
... 24 more
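For reference, the permission fix I added to the restore script could be written a little more defensively. This is a minimal sketch, assuming the restore container runs with enough privileges to change ownership and that the `neo4j` user and group exist in the image; `fix_owner` is a hypothetical helper, not something from the original restore script.

```shell
#!/bin/sh
# fix_owner DIR OWNER
# Recursively changes ownership of DIR to OWNER (e.g. neo4j:neo4j).
# A missing directory is reported on stderr and skipped instead of
# making the script fail.
fix_owner() {
    dir="$1"
    owner="$2"
    if [ -d "$dir" ]; then
        echo "Fixing '$dir' folder permissions..."
        chown -R "$owner" "$dir"
    else
        echo "Skipping '$dir': directory not found" >&2
    fi
}
```

In the restore script this would be called as `fix_owner /data/transactions neo4j:neo4j`, and likewise for `/data/databases` if that turns out to be needed.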
I'm failing to understand why the writes are being ignored by the cluster but reads have no issue. I've tried going through the docs several times but haven't managed to figure this out.
Any help would be greatly appreciated!