We created an Enterprise cluster using GCP Deployment Manager.
Disk size: 1000 GB each
Neo4j version: Enterprise 4.1.0
We did some data loading, and the transaction logs filled up the storage space, causing everything to go down.
We cleaned up the transaction logs and restarted the database. Two nodes returned to a healthy state, but the database on one node had crashed. The logs indicated that it needed some more transaction logs, so we restored a few more until Neo4j stopped throwing the reconcile error.
Now it throws no errors in the logs, but the SHOW DATABASES query still shows the database as offline (Unable to start database).
I have attached the logs for detailed analysis.
debug_trimmed_0708.txt (1.8 MB)
debug_trimmed_0709.txt (3.9 MB)
Any help or suggestions for getting this database back online?
- Take the node offline.
- Wipe out the customer360 directory under
- Run neo4j-admin unbind on the node (this resets cluster state so it isn't out of sync with the removal of the database)
- Start up the node
The node should join the cluster and copy down the store for the customer360 db from the leader.
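For anyone who wants to script this, here is a rough sketch of the steps above as a shell function. It makes several assumptions that are not from this thread: the service is managed by systemctl under the name neo4j, an OS user called neo4j exists, and neo4j-admin is on the PATH. The helper names run and recover_node are made up for illustration, and the database directories are passed in as arguments rather than guessed. It only prints the commands unless DRY_RUN=0 is set.

```shell
#!/usr/bin/env bash
# Sketch only. Dry-run by default: each command is printed, not executed.
# Set DRY_RUN=0 to run the commands for real.

# run CMD [ARGS...]: print the command in dry-run mode, otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# recover_node DIR [DIR...]
# DIRs are the on-disk directories of the broken database on this node
# (hypothetical arguments; take the real paths from your own installation).
recover_node() {
  local dir
  if [ "$#" -eq 0 ]; then
    echo "usage: recover_node DIR [DIR...]" >&2
    return 1
  fi
  run sudo systemctl stop neo4j          # 1. take the node offline (assumes systemd)
  for dir in "$@"; do
    run sudo rm -rf -- "$dir"            # 2. wipe the database's directory
  done
  run sudo -u neo4j neo4j-admin unbind   # 3. reset this node's cluster state
  run sudo systemctl start neo4j         # 4. start the node so it can rejoin
}
```

Calling recover_node with the directories of the customer360 database prints the four steps in order, so they can be reviewed before running with DRY_RUN=0. Only run it on the one broken node, since it deletes that node's copy of the store.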
Thanks for the quick response.
We checked the database status 2 hours later and it was in a healthy state. The logs indicate that it downloaded some files from sibling nodes.
We reopened this 3 days later when it crashed again. We used your steps above and they work like a charm: a pretty simple way to recover a database on a cluster node.
A few knowledge-base questions:
- If 2 out of 3 nodes are down and we try to recover them by seeding from the remaining sibling node, would it work?
- We used sudo to start/stop the neo4j service and to wipe out files. Is it possible to switch to the user intended for this purpose? We tried logging in with
sudo -su neo4j but it asks for a password, and the database's default admin password didn't work.