Neo4j 4.1.1 fails to start in a Kubernetes environment

Hi, we had been running Neo4j 4.0.2-enterprise in a development Amazon EKS cluster for some time.

I tore the cluster down and tried to rebuild it, this time bumping the container up to 4.1.1-enterprise.

This is what appears in the Neo4j logs:

Changed password for user 'neo4j'.
Directories in use:
  home:         /var/lib/neo4j
  config:       /var/lib/neo4j/conf
  logs:         /var/lib/neo4j/logs/
  plugins:      /var/lib/neo4j/plugins
  import:       /var/lib/neo4j/import
  data:         /var/lib/neo4j/data
  certificates: /var/lib/neo4j/certificates
  run:          /var/lib/neo4j/run
Starting Neo4j.
2020-07-16 21:42:00.784+0000 WARN  Unrecognized setting. No declared setting with name: PORT.7687.TCP.PORT
2020-07-16 21:42:00.795+0000 WARN  Unrecognized setting. No declared setting with name: PORT.7474.TCP.ADDR
2020-07-16 21:42:00.795+0000 WARN  Unrecognized setting. No declared setting with name: PORT.7687.TCP.PROTO
2020-07-16 21:42:00.795+0000 WARN  Unrecognized setting. No declared setting with name: SERVICE.PORT.BROWSER
2020-07-16 21:42:00.796+0000 WARN  Unrecognized setting. No declared setting with name: PORT.7474.TCP.PROTO
2020-07-16 21:42:00.796+0000 WARN  Unrecognized setting. No declared setting with name: PORT
2020-07-16 21:42:00.796+0000 WARN  Unrecognized setting. No declared setting with name: PORT.7474.TCP.PORT
2020-07-16 21:42:00.797+0000 WARN  Unrecognized setting. No declared setting with name: PORT.7687.TCP.ADDR
2020-07-16 21:42:00.797+0000 WARN  Unrecognized setting. No declared setting with name: PORT.7687.TCP
2020-07-16 21:42:00.797+0000 WARN  Unrecognized setting. No declared setting with name: PORT.7474.TCP
2020-07-16 21:42:00.798+0000 WARN  Unrecognized setting. No declared setting with name: SERVICE.PORT
2020-07-16 21:42:00.798+0000 WARN  Unrecognized setting. No declared setting with name: SERVICE.PORT.BOLT
2020-07-16 21:42:00.798+0000 WARN  Unrecognized setting. No declared setting with name: SERVICE.HOST
2020-07-16 21:42:00.801+0000 INFO  Starting...
2020-07-16 21:42:05.876+0000 INFO  ======== Neo4j 4.1.1 ========
2020-07-16 21:42:12.666+0000 ERROR Failed to start Neo4j on dbms.connector.http.listen_address, a socket address. If missing port or hostname it is acquired from dbms.default_listen_address. Error starting Neo4j database server at /data/databases
java.lang.RuntimeException: Error starting Neo4j database server at /data/databases
	at org.neo4j.graphdb.facade.DatabaseManagementServiceFactory.startDatabaseServer(DatabaseManagementServiceFactory.java:198)
	at org.neo4j.graphdb.facade.DatabaseManagementServiceFactory.build(DatabaseManagementServiceFactory.java:158)
	at com.neo4j.server.enterprise.EnterpriseManagementServiceFactory.createManagementService(EnterpriseManagementServiceFactory.java:38)
	at com.neo4j.server.enterprise.EnterpriseBootstrapper.createNeo(EnterpriseBootstrapper.java:20)
	at org.neo4j.server.NeoBootstrapper.start(NeoBootstrapper.java:117)
	at org.neo4j.server.NeoBootstrapper.start(NeoBootstrapper.java:87)
	at com.neo4j.server.enterprise.EnterpriseEntryPoint.main(EnterpriseEntryPoint.java:25)
Caused by: org.neo4j.kernel.lifecycle.LifecycleException: Component 'com.neo4j.dbms.StandaloneDbmsReconcilerModule@7f977fba' was successfully initialized, but failed to start. Please see the attached cause exception "Transaction logs are missing and recovery is not possible.".
	at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:463)
	at org.neo4j.kernel.lifecycle.LifeSupport.start(LifeSupport.java:110)
	at org.neo4j.graphdb.facade.DatabaseManagementServiceFactory.startDatabaseServer(DatabaseManagementServiceFactory.java:189)
	... 6 more
Caused by: org.neo4j.dbms.api.DatabaseManagementException: A triggered DbmsReconciler job failed with the following cause
	at com.neo4j.dbms.ReconcilerResult.join(ReconcilerResult.java:57)
	at com.neo4j.dbms.StandaloneDbmsReconcilerModule.startInitialDatabases(StandaloneDbmsReconcilerModule.java:95)
	at com.neo4j.dbms.StandaloneDbmsReconcilerModule.start(StandaloneDbmsReconcilerModule.java:85)
	at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:442)
	... 8 more
Caused by: org.neo4j.dbms.api.DatabaseManagementException: An error occurred! Unable to start database with name `system`.
	at org.neo4j.dbms.database.AbstractDatabaseManager.startDatabase(AbstractDatabaseManager.java:191)
	at com.neo4j.dbms.database.MultiDatabaseManager.forSingleDatabase(MultiDatabaseManager.java:134)
	at com.neo4j.dbms.database.MultiDatabaseManager.startDatabase(MultiDatabaseManager.java:119)
	at com.neo4j.dbms.Transition$Prepared.doTransitionAction(Transition.java:101)
	at com.neo4j.dbms.Transition$Prepared.doTransition(Transition.java:88)
	at com.neo4j.dbms.DbmsReconciler.doTransitionStep(DbmsReconciler.java:346)
	at com.neo4j.dbms.DbmsReconciler.doTransitionStep(DbmsReconciler.java:347)
	at com.neo4j.dbms.DbmsReconciler.doTransitionStep(DbmsReconciler.java:347)
	at com.neo4j.dbms.DbmsReconciler.lambda$doTransitions$11(DbmsReconciler.java:315)
	at com.neo4j.dbms.DbmsReconciler.namedJob(DbmsReconciler.java:326)
	at com.neo4j.dbms.DbmsReconciler.doTransitions(DbmsReconciler.java:316)
	at com.neo4j.dbms.DbmsReconciler.lambda$doTransitions$9(DbmsReconciler.java:307)
	at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1700)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: java.lang.RuntimeException: org.neo4j.kernel.lifecycle.LifecycleException: Component 'org.neo4j.kernel.recovery.Recovery$MissingTransactionLogsCheck@10502f66' failed to initialize. Please see the attached cause exception "Transaction logs are missing and recovery is not possible.".
	at org.neo4j.kernel.database.Database.start(Database.java:497)
	at org.neo4j.dbms.database.AbstractDatabaseManager.startDatabase(AbstractDatabaseManager.java:187)
	... 17 more
Caused by: org.neo4j.kernel.lifecycle.LifecycleException: Component 'org.neo4j.kernel.recovery.Recovery$MissingTransactionLogsCheck@10502f66' failed to initialize. Please see the attached cause exception "Transaction logs are missing and recovery is not possible.".
	at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.init(LifeSupport.java:424)
	at org.neo4j.kernel.lifecycle.LifeSupport.init(LifeSupport.java:65)
	at org.neo4j.kernel.lifecycle.LifeSupport.start(LifeSupport.java:101)
	at org.neo4j.kernel.recovery.Recovery.performRecovery(Recovery.java:384)
	at org.neo4j.kernel.database.Database.start(Database.java:389)
	... 18 more
Caused by: java.lang.RuntimeException: Transaction logs are missing and recovery is not possible.
	at org.neo4j.kernel.recovery.Recovery$MissingTransactionLogsCheck.checkForMissingLogFiles(Recovery.java:549)
	at org.neo4j.kernel.recovery.Recovery$MissingTransactionLogsCheck.init(Recovery.java:522)
	at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.init(LifeSupport.java:403)
	... 22 more
2020-07-16 21:42:12.669+0000 INFO  Neo4j Server shutdown initiated by request

My deployment spec looks like this:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: neo4j
  labels:
    app: neo4j
spec:
  selector:
    matchLabels:
      app: neo4j
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
    type: RollingUpdate
  replicas: 1
  template:
    metadata:
      labels:
        app: neo4j
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "2004"
    spec:
      nodeSelector:
        node.kubernetes.io/instance-type: "m4.large"
      schedulerName: stork
      containers:
        # configuration: https://hub.docker.com/_/neo4j/
        # source: https://github.com/neo4j/docker-neo4j
        - name: neo4j
          image: neo4j:4.1.1-enterprise
          imagePullPolicy: "IfNotPresent"
          ports:
            - containerPort: 7474
              name: browser
            - containerPort: 7687
              name: bolt
            - containerPort: 2004
              name: metrics
          resources:
            limits:
              memory: 8Gi
            requests:
              memory: 2Gi
          env:
            # https://neo4j.com/docs/operations-manual/current/reference/configuration-settings/
            - name: NEO4J_ACCEPT_LICENSE_AGREEMENT
              value: "yes"
            - name: NEO4J_AUTH
              value: "neo4j/Salido4u-2.78"
            - name: NEO4J_dbms_mode
              value: "SINGLE"
            - name: NEO4J_metrics_prometheus_enabled
              value: "true"
            - name: NEO4J_metrics_prometheus_endpoint
              value: "0.0.0.0:2004"
            - name: NEO4J_dbms_default__listen__address
              value: "0.0.0.0"
            - name: NEO4J_dbms_logs_query_threshold
              value: "2s"
            - name: NEO4J_dbms_logs_query_rotation_size
              value: "20m"
            - name: NEO4J_dbms_logs_query_rotation_keep__number
              value: "7"
            - name: NEO4J_dbms_logs_query_time__logging__enabled
              value: "true"
            - name: NEO4J_dbms_logs_query_page__logging__enabled
              value: "true"
            - name: NEO4J_dbms_directories_logs
              value: "/var/lib/neo4j/logs/"
            - name: NEO4J_dbms_memory_pagecache_size
              value: "2G"
            - name: NEO4J_dbms_memory_heap_max__size
              value: "8G"
          volumeMounts:
            - mountPath: /var/lib/neo4j/data/
              name: neo4jdata
              readOnly: false
            - mountPath: /var/lib/neo4j/logs/
              name: neo4jlogs
              readOnly: false
      volumes:
        - name: neo4jdata
          persistentVolumeClaim:
            claimName: px-neo4j-pvc
        - name: neo4jlogs
          emptyDir: {}

Can someone tell me what I need to do to resolve this? We are using the stork scheduler with a persistent volume claim backed by Portworx (portworx.com), which binds the PVC to the pod. That part seems fine: the PVC is bound. Something seems to be wrong with the config or settings, however.

Did you manage to solve this problem?
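
Not a confirmed fix, but two things in the log above can be addressed in the pod spec. The fragment below is a sketch under stated assumptions, not a verified solution.

First, the "Unrecognized setting" warnings (PORT.7687.TCP.PORT, SERVICE.HOST, and so on) match the environment variables Kubernetes injects for a Service named neo4j (for example NEO4J_PORT_7687_TCP_PORT). The image's entrypoint treats them as Neo4j settings because of the NEO4J_ prefix. The Service is not shown in the post, so its name is an assumption. Setting enableServiceLinks to false stops that injection.

Second, the root cause in the stack trace is "Transaction logs are missing and recovery is not possible." for the system database. That suggests the volume holds store files without their transaction logs, possibly left over from the 4.0.2 deployment (also an assumption, since the volume contents are not shown). Neo4j 4.x has a dbms.recovery.fail_on_missing_files setting that lets the database start anyway. Any changes that existed only in the missing logs are lost, so it only makes sense if the data is disposable or backed up.

```yaml
# Fragment of the Deployment above; only the changed fields are shown.
spec:
  template:
    spec:
      # Assumption: a Service named "neo4j" exists in this namespace.
      # Disabling service links stops Kubernetes from injecting
      # NEO4J_SERVICE_HOST, NEO4J_PORT_7687_TCP_PORT, etc. into the pod,
      # which the Neo4j image would otherwise read as config settings.
      enableServiceLinks: false
      containers:
        - name: neo4j
          env:
            # Assumption: the data on the volume is disposable or backed up.
            # Maps to dbms.recovery.fail_on_missing_files=false, which lets
            # Neo4j start without the missing transaction logs.
            - name: NEO4J_dbms_recovery_fail__on__missing__files
              value: "false"
```

If the data on the volume is not needed, starting from an empty volume may be a simpler option for a development cluster than forcing recovery.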