Hi, we had been running Neo4j 4.0.2-enterprise in a development Amazon EKS cluster for some time.
I tore the cluster down and tried to rebuild it, this time bumping the container image up to 4.1.1-enterprise.
This is what shows up in the Neo4j logs:
Changed password for user 'neo4j'.
Directories in use:
home: /var/lib/neo4j
config: /var/lib/neo4j/conf
logs: /var/lib/neo4j/logs/
plugins: /var/lib/neo4j/plugins
import: /var/lib/neo4j/import
data: /var/lib/neo4j/data
certificates: /var/lib/neo4j/certificates
run: /var/lib/neo4j/run
Starting Neo4j.
2020-07-16 21:42:00.784+0000 WARN Unrecognized setting. No declared setting with name: PORT.7687.TCP.PORT
2020-07-16 21:42:00.795+0000 WARN Unrecognized setting. No declared setting with name: PORT.7474.TCP.ADDR
2020-07-16 21:42:00.795+0000 WARN Unrecognized setting. No declared setting with name: PORT.7687.TCP.PROTO
2020-07-16 21:42:00.795+0000 WARN Unrecognized setting. No declared setting with name: SERVICE.PORT.BROWSER
2020-07-16 21:42:00.796+0000 WARN Unrecognized setting. No declared setting with name: PORT.7474.TCP.PROTO
2020-07-16 21:42:00.796+0000 WARN Unrecognized setting. No declared setting with name: PORT
2020-07-16 21:42:00.796+0000 WARN Unrecognized setting. No declared setting with name: PORT.7474.TCP.PORT
2020-07-16 21:42:00.797+0000 WARN Unrecognized setting. No declared setting with name: PORT.7687.TCP.ADDR
2020-07-16 21:42:00.797+0000 WARN Unrecognized setting. No declared setting with name: PORT.7687.TCP
2020-07-16 21:42:00.797+0000 WARN Unrecognized setting. No declared setting with name: PORT.7474.TCP
2020-07-16 21:42:00.798+0000 WARN Unrecognized setting. No declared setting with name: SERVICE.PORT
2020-07-16 21:42:00.798+0000 WARN Unrecognized setting. No declared setting with name: SERVICE.PORT.BOLT
2020-07-16 21:42:00.798+0000 WARN Unrecognized setting. No declared setting with name: SERVICE.HOST
2020-07-16 21:42:00.801+0000 INFO Starting...
2020-07-16 21:42:05.876+0000 INFO ======== Neo4j 4.1.1 ========
2020-07-16 21:42:12.666+0000 ERROR Failed to start Neo4j on dbms.connector.http.listen_address, a socket address. If missing port or hostname it is acquired from dbms.default_listen_address. Error starting Neo4j database server at /data/databases
java.lang.RuntimeException: Error starting Neo4j database server at /data/databases
at org.neo4j.graphdb.facade.DatabaseManagementServiceFactory.startDatabaseServer(DatabaseManagementServiceFactory.java:198)
at org.neo4j.graphdb.facade.DatabaseManagementServiceFactory.build(DatabaseManagementServiceFactory.java:158)
at com.neo4j.server.enterprise.EnterpriseManagementServiceFactory.createManagementService(EnterpriseManagementServiceFactory.java:38)
at com.neo4j.server.enterprise.EnterpriseBootstrapper.createNeo(EnterpriseBootstrapper.java:20)
at org.neo4j.server.NeoBootstrapper.start(NeoBootstrapper.java:117)
at org.neo4j.server.NeoBootstrapper.start(NeoBootstrapper.java:87)
at com.neo4j.server.enterprise.EnterpriseEntryPoint.main(EnterpriseEntryPoint.java:25)
Caused by: org.neo4j.kernel.lifecycle.LifecycleException: Component 'com.neo4j.dbms.StandaloneDbmsReconcilerModule@7f977fba' was successfully initialized, but failed to start. Please see the attached cause exception "Transaction logs are missing and recovery is not possible.".
at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:463)
at org.neo4j.kernel.lifecycle.LifeSupport.start(LifeSupport.java:110)
at org.neo4j.graphdb.facade.DatabaseManagementServiceFactory.startDatabaseServer(DatabaseManagementServiceFactory.java:189)
... 6 more
Caused by: org.neo4j.dbms.api.DatabaseManagementException: A triggered DbmsReconciler job failed with the following cause
at com.neo4j.dbms.ReconcilerResult.join(ReconcilerResult.java:57)
at com.neo4j.dbms.StandaloneDbmsReconcilerModule.startInitialDatabases(StandaloneDbmsReconcilerModule.java:95)
at com.neo4j.dbms.StandaloneDbmsReconcilerModule.start(StandaloneDbmsReconcilerModule.java:85)
at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:442)
... 8 more
Caused by: org.neo4j.dbms.api.DatabaseManagementException: An error occurred! Unable to start database with name `system`.
at org.neo4j.dbms.database.AbstractDatabaseManager.startDatabase(AbstractDatabaseManager.java:191)
at com.neo4j.dbms.database.MultiDatabaseManager.forSingleDatabase(MultiDatabaseManager.java:134)
at com.neo4j.dbms.database.MultiDatabaseManager.startDatabase(MultiDatabaseManager.java:119)
at com.neo4j.dbms.Transition$Prepared.doTransitionAction(Transition.java:101)
at com.neo4j.dbms.Transition$Prepared.doTransition(Transition.java:88)
at com.neo4j.dbms.DbmsReconciler.doTransitionStep(DbmsReconciler.java:346)
at com.neo4j.dbms.DbmsReconciler.doTransitionStep(DbmsReconciler.java:347)
at com.neo4j.dbms.DbmsReconciler.doTransitionStep(DbmsReconciler.java:347)
at com.neo4j.dbms.DbmsReconciler.lambda$doTransitions$11(DbmsReconciler.java:315)
at com.neo4j.dbms.DbmsReconciler.namedJob(DbmsReconciler.java:326)
at com.neo4j.dbms.DbmsReconciler.doTransitions(DbmsReconciler.java:316)
at com.neo4j.dbms.DbmsReconciler.lambda$doTransitions$9(DbmsReconciler.java:307)
at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1700)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: java.lang.RuntimeException: org.neo4j.kernel.lifecycle.LifecycleException: Component 'org.neo4j.kernel.recovery.Recovery$MissingTransactionLogsCheck@10502f66' failed to initialize. Please see the attached cause exception "Transaction logs are missing and recovery is not possible.".
at org.neo4j.kernel.database.Database.start(Database.java:497)
at org.neo4j.dbms.database.AbstractDatabaseManager.startDatabase(AbstractDatabaseManager.java:187)
... 17 more
Caused by: org.neo4j.kernel.lifecycle.LifecycleException: Component 'org.neo4j.kernel.recovery.Recovery$MissingTransactionLogsCheck@10502f66' failed to initialize. Please see the attached cause exception "Transaction logs are missing and recovery is not possible.".
at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.init(LifeSupport.java:424)
at org.neo4j.kernel.lifecycle.LifeSupport.init(LifeSupport.java:65)
at org.neo4j.kernel.lifecycle.LifeSupport.start(LifeSupport.java:101)
at org.neo4j.kernel.recovery.Recovery.performRecovery(Recovery.java:384)
at org.neo4j.kernel.database.Database.start(Database.java:389)
... 18 more
Caused by: java.lang.RuntimeException: Transaction logs are missing and recovery is not possible.
at org.neo4j.kernel.recovery.Recovery$MissingTransactionLogsCheck.checkForMissingLogFiles(Recovery.java:549)
at org.neo4j.kernel.recovery.Recovery$MissingTransactionLogsCheck.init(Recovery.java:522)
at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.init(LifeSupport.java:403)
... 22 more
2020-07-16 21:42:12.669+0000 INFO Neo4j Server shutdown initiated by request
My deployment spec looks like this:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: neo4j
  labels:
    app: neo4j
spec:
  selector:
    matchLabels:
      app: neo4j
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
    type: RollingUpdate
  replicas: 1
  template:
    metadata:
      labels:
        app: neo4j
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "2004"
    spec:
      nodeSelector:
        node.kubernetes.io/instance-type: "m4.large"
      schedulerName: stork
      containers:
        # configuration: https://hub.docker.com/_/neo4j/
        # source: https://github.com/neo4j/docker-neo4j
        - name: neo4j
          image: neo4j:4.1.1-enterprise
          imagePullPolicy: "IfNotPresent"
          ports:
            - containerPort: 7474
              name: browser
            - containerPort: 7687
              name: bolt
            - containerPort: 2004
              name: metrics
          resources:
            limits:
              memory: 8Gi
            requests:
              memory: 2Gi
          env:
            # https://neo4j.com/docs/operations-manual/current/reference/configuration-settings/
            - name: NEO4J_ACCEPT_LICENSE_AGREEMENT
              value: "yes"
            - name: NEO4J_AUTH
              value: "neo4j/Salido4u-2.78"
            - name: NEO4J_dbms_mode
              value: "SINGLE"
            - name: NEO4J_metrics_prometheus_enabled
              value: "true"
            - name: NEO4J_metrics_prometheus_endpoint
              value: "0.0.0.0:2004"
            - name: NEO4J_dbms_default__listen__address
              value: "0.0.0.0"
            - name: NEO4J_dbms_logs_query_threshold
              value: "2s"
            - name: NEO4J_dbms_logs_query_rotation_size
              value: "20m"
            - name: NEO4J_dbms_logs_query_rotation_keep__number
              value: "7"
            - name: NEO4J_dbms_logs_query_time__logging__enabled
              value: "true"
            - name: NEO4J_dbms_logs_query_page__logging__enabled
              value: "true"
            - name: NEO4J_dbms_directories_logs
              value: "/var/lib/neo4j/logs/"
            - name: NEO4J_dbms_memory_pagecache_size
              value: "2G"
            - name: NEO4J_dbms_memory_heap_max__size
              value: "8G"
          volumeMounts:
            - mountPath: /var/lib/neo4j/data/
              name: neo4jdata
              readOnly: false
            - mountPath: /var/lib/neo4j/logs/
              name: neo4jlogs
              readOnly: false
      volumes:
        - name: neo4jdata
          persistentVolumeClaim:
            claimName: px-neo4j-pvc
        - name: neo4jlogs
          emptyDir: {}
Can someone clue me in on what I need to do to resolve this? We are using the stork scheduler with a persistent volume claim, plus the tech from portworx.com, which binds the PVC to the pod. That all seems to work: the PVC is bound. However, something seems awry with the config or settings.
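Since the root cause in the trace is "Transaction logs are missing and recovery is not possible.", it may help to see what is actually on the volume. The following is only a sketch of a throwaway pod that mounts the same claim and lists its contents. The pod name, the busybox image, and the /data mount path are assumptions for illustration, and it assumes the claim can be mounted by a second pod (for example after scaling the Deployment to zero replicas). As far as I understand, Neo4j 4.x keeps store files under data/databases and transaction logs under data/transactions by default, so those are the directories it lists.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: neo4j-volume-debug        # hypothetical name, not part of the deployment above
spec:
  schedulerName: stork            # same scheduler as the Deployment
  restartPolicy: Never
  containers:
    - name: inspect
      image: busybox:1.36         # assumption: any small image with a shell would do
      # List the volume root plus the two directories Neo4j 4.x uses by default,
      # then stay alive for an hour so the output can be read.
      command: ["sh", "-c", "ls -la /data /data/databases /data/transactions; sleep 3600"]
      volumeMounts:
        - mountPath: /data        # arbitrary path inside the debug pod
          name: neo4jdata
  volumes:
    - name: neo4jdata
      persistentVolumeClaim:
        claimName: px-neo4j-pvc   # the claim from the deployment spec above
```

The listing should then be readable with `kubectl logs neo4j-volume-debug`. If databases/system exists but transactions/system does not, that would at least match the error above.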