Neo4j - Lastest Version - Can't restart - sudden failure

operations

(Ranlandau) #1

Hello,

I'm running neo4j enterprise 3.4.9 on Ubuntu.

neo4j@ip-10-0-32-44:/home/ubuntu$ ulimit -n
1045590

Looks good? Can't start whatever i do.

-- Logs begin at Mon 2018-11-12 10:27:09 UTC. --
Nov 12 10:27:13 ip-10-0-32-44 neo4j[1225]:   logs:         /var/log/neo4j
Nov 12 10:27:13 ip-10-0-32-44 neo4j[1225]:   plugins:      /var/lib/neo4j/plugins
Nov 12 10:27:13 ip-10-0-32-44 neo4j[1225]:   import:       /var/lib/neo4j/import
Nov 12 10:27:13 ip-10-0-32-44 neo4j[1225]:   data:         /var/lib/neo4j/data
Nov 12 10:27:13 ip-10-0-32-44 neo4j[1225]:   certificates: /var/lib/neo4j/certificates
Nov 12 10:27:13 ip-10-0-32-44 neo4j[1225]:   run:          /var/run/neo4j
Nov 12 10:27:13 ip-10-0-32-44 neo4j[1225]: Starting Neo4j.
Nov 12 10:27:17 ip-10-0-32-44 neo4j[1225]: 2018-11-12 10:27:17.807+0000 INFO  ======== Neo4j 3.4.9 ========
Nov 12 10:27:17 ip-10-0-32-44 neo4j[1225]: 2018-11-12 10:27:17.849+0000 INFO  Starting...
Nov 12 10:27:19 ip-10-0-32-44 neo4j[1225]: 2018-11-12 10:27:19.117+0000 INFO  Initiating metrics...
Nov 12 10:28:18 ip-10-0-32-44 neo4j[1225]: 2018-11-12 10:28:18.177+0000 INFO  Sending metrics to CSV file at /var/lib/neo4j/metrics
Nov 12 10:28:26 ip-10-0-32-44 neo4j[1225]: 2018-11-12 10:28:26.906+0000 ERROR Failed to start Neo4j: Starting Neo4j failed: Component 'org.neo4j.server.database.LifecycleManagingDatabase@3451f8ff' was successfully initialized, but failed to start. Please see the attached cause exception "timerfd_create() failed: Too many open files". Starting Neo4j failed: Component 'org.neo4j.server.database.LifecycleManagingDatabase@3451f8ff' was successfully initialized, but failed to start. Please see the attached cause exception "timerfd_create() failed: Too many open files".
Nov 12 10:28:26 ip-10-0-32-44 neo4j[1225]: org.neo4j.server.ServerStartupException: Starting Neo4j failed: Component 'org.neo4j.server.database.LifecycleManagingDatabase@3451f8ff' was successfully initialized, but failed to start. Please see the attached cause exception"timerfd_create() failed: Too many open files".
Nov 12 10:28:26 ip-10-0-32-44 neo4j[1225]:         at org.neo4j.server.exception.ServerStartupErrors.translateToServerStartupError(ServerStartupErrors.java:68)
Nov 12 10:28:26 ip-10-0-32-44 neo4j[1225]:         at org.neo4j.server.AbstractNeoServer.start(AbstractNeoServer.java:220)
Nov 12 10:28:26 ip-10-0-32-44 neo4j[1225]:         at org.neo4j.server.ServerBootstrapper.start(ServerBootstrapper.java:111)
Nov 12 10:28:26 ip-10-0-32-44 neo4j[1225]:         at org.neo4j.server.ServerBootstrapper.start(ServerBootstrapper.java:79)
Nov 12 10:28:26 ip-10-0-32-44 neo4j[1225]:         at com.neo4j.server.enterprise.CommercialEntryPoint.main(CommercialEntryPoint.java:22)
Nov 12 10:28:26 ip-10-0-32-44 neo4j[1225]: Caused by: org.neo4j.kernel.lifecycle.LifecycleException: Component 'org.neo4j.server.database.LifecycleManagingDatabase@3451f8ff' was successfully initialized, but failed to start. Please see the attached cause exception "timerfd_create() failed: Too many open files".
Nov 12 10:28:26 ip-10-0-32-44 neo4j[1225]:         at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:466)
Nov 12 10:28:26 ip-10-0-32-44 neo4j[1225]:         at org.neo4j.kernel.lifecycle.LifeSupport.start(LifeSupport.java:107)
Nov 12 10:28:26 ip-10-0-32-44 neo4j[1225]:         at org.neo4j.server.AbstractNeoServer.start(AbstractNeoServer.java:212)
Nov 12 10:28:26 ip-10-0-32-44 neo4j[1225]:         ... 3 more
Nov 12 10:28:26 ip-10-0-32-44 neo4j[1225]: Caused by: java.lang.RuntimeException: Error starting org.neo4j.kernel.impl.factory.GraphDatabaseFacadeFactory, /var/lib/neo4j/data/databases/graph.db
Nov 12 10:28:26 ip-10-0-32-44 neo4j[1225]:         at org.neo4j.kernel.impl.factory.GraphDatabaseFacadeFactory.initFacade(GraphDatabaseFacadeFactory.java:212)
Nov 12 10:28:26 ip-10-0-32-44 neo4j[1225]:         at org.neo4j.kernel.enterprise.EnterpriseGraphDatabase.<init>(EnterpriseGraphDatabase.java:39)
Nov 12 10:28:26 ip-10-0-32-44 neo4j[1225]:         at org.neo4j.server.enterprise.OpenEnterpriseNeoServer.lambda$static$1(OpenEnterpriseNeoServer.java:78)
Nov 12 10:28:26 ip-10-0-32-44 neo4j[1225]:         at org.neo4j.server.database.LifecycleManagingDatabase.start(LifecycleManagingDatabase.java:88)
Nov 12 10:28:26 ip-10-0-32-44 neo4j[1225]:         at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:445)
Nov 12 10:28:26 ip-10-0-32-44 neo4j[1225]:         ... 5 more
Nov 12 10:28:26 ip-10-0-32-44 neo4j[1225]: Caused by: org.neo4j.kernel.lifecycle.LifecycleException: Component 'org.neo4j.bolt.transport.NettyServer@2013a999' was successfully initialized, but failed to start. Please see the attached cause exception "timerfd_create() failed: Too many open files".
Nov 12 10:28:26 ip-10-0-32-44 neo4j[1225]:         at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:466)
Nov 12 10:28:26 ip-10-0-32-44 neo4j[1225]:         at org.neo4j.kernel.lifecycle.LifeSupport.start(LifeSupport.java:107)
Nov 12 10:28:26 ip-10-0-32-44 neo4j[1225]:         at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:445)
Nov 12 10:28:26 ip-10-0-32-44 neo4j[1225]:         at org.neo4j.kernel.lifecycle.LifeSupport.start(LifeSupport.java:107)
Nov 12 10:28:26 ip-10-0-32-44 neo4j[1225]:         at org.neo4j.kernel.extension.KernelExtensions.start(KernelExtensions.java:84)
Nov 12 10:28:26 ip-10-0-32-44 neo4j[1225]:         at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:445)
Nov 12 10:28:26 ip-10-0-32-44 neo4j[1225]:         at org.neo4j.kernel.lifecycle.LifeSupport.start(LifeSupport.java:107)
Nov 12 10:28:26 ip-10-0-32-44 neo4j[1225]:         at org.neo4j.kernel.impl.factory.GraphDatabaseFacadeFactory.initFacade(GraphDatabaseFacadeFactory.java:208)
Nov 12 10:28:26 ip-10-0-32-44 neo4j[1225]:         ... 9 more
Nov 12 10:28:26 ip-10-0-32-44 neo4j[1225]: Caused by: java.lang.IllegalStateException: failed to create a child event loop
Nov 12 10:28:26 ip-10-0-32-44 neo4j[1225]:         at io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:88)
Nov 12 10:28:26 ip-10-0-32-44 neo4j[1225]:         at io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:58)
Nov 12 10:28:26 ip-10-0-32-44 neo4j[1225]:         at io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:47)
Nov 12 10:28:26 ip-10-0-32-44 neo4j[1225]:         at io.netty.channel.MultithreadEventLoopGroup.<init>(MultithreadEventLoopGroup.java:59)
Nov 12 10:28:26 ip-10-0-32-44 neo4j[1225]:         at io.netty.channel.epoll.EpollEventLoopGroup.<init>(EpollEventLoopGroup.java:104)
Nov 12 10:28:26 ip-10-0-32-44 neo4j[1225]:         at io.netty.channel.epoll.EpollEventLoopGroup.<init>(EpollEventLoopGroup.java:91)
Nov 12 10:28:26 ip-10-0-32-44 neo4j[1225]:         at io.netty.channel.epoll.EpollEventLoopGroup.<init>(EpollEventLoopGroup.java:68)
Nov 12 10:28:26 ip-10-0-32-44 neo4j[1225]:         at org.neo4j.bolt.transport.configuration.EpollConfigurationProvider.createEventLoopGroup(EpollConfigurationProvider.java:40)
Nov 12 10:28:26 ip-10-0-32-44 neo4j[1225]:         at org.neo4j.bolt.transport.NettyServer.start(NettyServer.java:98)
Nov 12 10:28:26 ip-10-0-32-44 neo4j[1225]:         at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:445)
Nov 12 10:28:26 ip-10-0-32-44 neo4j[1225]:         ... 16 more
Nov 12 10:28:26 ip-10-0-32-44 neo4j[1225]:         Suppressed: org.neo4j.kernel.lifecycle.LifecycleException: Exception during graceful attempt to stop partially started component. Please use non suppressed exception to see original component failure.
Nov 12 10:28:26 ip-10-0-32-44 neo4j[1225]:                 at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:457)
Nov 12 10:28:26 ip-10-0-32-44 neo4j[1225]:                 ... 16 more
Nov 12 10:28:26 ip-10-0-32-44 neo4j[1225]:         Caused by: java.lang.NullPointerException
Nov 12 10:28:26 ip-10-0-32-44 neo4j[1225]:                 at org.neo4j.bolt.transport.NettyServer.stop(NettyServer.java:142)
Nov 12 10:28:26 ip-10-0-32-44 neo4j[1225]:                 at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:453)
Nov 12 10:28:26 ip-10-0-32-44 neo4j[1225]:                 ... 16 more
Nov 12 10:28:26 ip-10-0-32-44 neo4j[1225]: Caused by: io.netty.channel.ChannelException: timerfd_create() failed: Too many open files
Nov 12 10:28:26 ip-10-0-32-44 neo4j[1225]:         at io.netty.channel.epoll.Native.timerFd(Native Method)
Nov 12 10:28:26 ip-10-0-32-44 neo4j[1225]:         at io.netty.channel.epoll.Native.newTimerFd(Native.java:97)
Nov 12 10:28:26 ip-10-0-32-44 neo4j[1225]:         at io.netty.channel.epoll.EpollEventLoop.<init>(EpollEventLoop.java:104)
Nov 12 10:28:26 ip-10-0-32-44 neo4j[1225]:         at io.netty.channel.epoll.EpollEventLoopGroup.newChild(EpollEventLoopGroup.java:135)
Nov 12 10:28:26 ip-10-0-32-44 neo4j[1225]:         at io.netty.channel.epoll.EpollEventLoopGroup.newChild(EpollEventLoopGroup.java:35)
Nov 12 10:28:26 ip-10-0-32-44 neo4j[1225]:         at io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:84)
Nov 12 10:28:26 ip-10-0-32-44 neo4j[1225]:         ... 25 more
Nov 12 10:28:26 ip-10-0-32-44 neo4j[1225]: 2018-11-12 10:28:26.907+0000 INFO  Neo4j Server shutdown initiated by request
Nov 12 10:28:26 ip-10-0-32-44 systemd[1]: neo4j.service: Main process exited, code=exited, status=1/FAILURE
Nov 12 10:28:26 ip-10-0-32-44 systemd[1]: neo4j.service: Unit entered failed state.
Nov 12 10:28:26 ip-10-0-32-44 systemd[1]: neo4j.service: Failed with result 'exit-code'.
Nov 12 10:28:27 ip-10-0-32-44 systemd[1]: neo4j.service: Service hold-off time over, scheduling restart.
Nov 12 10:28:27 ip-10-0-32-44 systemd[1]: Stopped Neo4j Graph Database.
Nov 12 10:28:27 ip-10-0-32-44 systemd[1]: Started Neo4j Graph Database.
Nov 12 10:28:27 ip-10-0-32-44 neo4j[1706]: Active database: graph.db
Nov 12 10:28:27 ip-10-0-32-44 neo4j[1706]: Directories in use:
Nov 12 10:28:27 ip-10-0-32-44 neo4j[1706]:   home:         /var/lib/neo4j
Nov 12 10:28:27 ip-10-0-32-44 neo4j[1706]:   config:       /etc/neo4j
Nov 12 10:28:27 ip-10-0-32-44 neo4j[1706]:   logs:         /var/log/neo4j
Nov 12 10:28:27 ip-10-0-32-44 neo4j[1706]:   plugins:      /var/lib/neo4j/plugins
Nov 12 10:28:27 ip-10-0-32-44 neo4j[1706]:   import:       /var/lib/neo4j/import
Nov 12 10:28:27 ip-10-0-32-44 neo4j[1706]:   data:         /var/lib/neo4j/data
Nov 12 10:28:27 ip-10-0-32-44 neo4j[1706]:   certificates: /var/lib/neo4j/certificates
Nov 12 10:28:27 ip-10-0-32-44 neo4j[1706]:   run:          /var/run/neo4j
Nov 12 10:28:27 ip-10-0-32-44 neo4j[1706]: Starting Neo4j.
^C


(Ranlandau) #2

from my log

2018-11-12 11:03:12.830+0000 INFO [o.n.k.i.a.i.IndexingService] IndexingService.start: index 7229 on :Entity5713(word) is ONLINE
2018-11-12 11:03:12.830+0000 INFO [o.n.k.i.a.i.IndexingService] IndexingService.start: index 1925 on :Entity4454(id) is ONLINE
2018-11-12 11:03:12.830+0000 INFO [o.n.k.i.a.i.IndexingService] IndexingService.start: index 6265 on :Entity5358(index) is ONLINE
2018-11-12 11:03:12.830+0000 INFO [o.n.k.i.a.i.IndexingService] IndexingService.start: index 1447 on :Entity4313(index) is ONLINE
2018-11-12 11:03:12.830+0000 INFO [o.n.k.i.a.i.IndexingService] IndexingService.start: index 6747 on :Entity5535(lemmaLower) is ONLINE
2018-11-12 11:03:12.830+0000 INFO [o.n.k.i.a.i.IndexingService] IndexingService.start: index 3855 on :Entity4136(word) is ONLINE
2018-11-12 11:03:12.830+0000 INFO [o.n.k.i.a.i.IndexingService] IndexingService.start: index 8676 on :Entity6606(index) is ONLINE
2018-11-12 11:03:12.830+0000 INFO [o.n.k.i.a.i.IndexingService] IndexingService.start: index 9158 on :Entity6757(id) is ONLINE
2018-11-12 11:03:12.830+0000 INFO [o.n.k.i.a.i.IndexingService] IndexingService.start: index 7231 on :Entity5713(sentNum) is ONLINE
2018-11-12 11:03:12.830+0000 INFO [o.n.k.i.a.i.IndexingService] IndexingService.start: index 6267 on :Entity5358(lemma) is ONLINE
2018-11-12 11:03:12.830+0000 INFO [o.n.k.i.a.i.IndexingService] IndexingService.start: indexes not specifically mentioned above are FAILED
2018-11-12 11:03:12.900+0000 INFO [o.n.k.i.DatabaseHealth] Database health set to OK
2018-11-12 11:03:13.240+0000 INFO [o.n.m.MetricsExtension] Sending metrics to CSV file at /var/lib/neo4j/metrics
2018-11-12 11:03:13.351+0000 INFO [o.n.b.i.BackupServer] BackupServer communication server started and bound to /127.0.0.1:6362
2018-11-12 11:03:15.680+0000 INFO [o.n.b.i.BackupServer] BackupServer communication server shutting down and unbinding from  /127.0.0.1:6362
2018-11-12 11:03:15.699+0000 INFO [o.n.k.i.t.l.c.CheckPointerImpl] Checkpoint triggered by database shutdown @ txId: 4620940 checkpoint started...
2018-11-12 11:03:20.661+0000 INFO [o.n.k.i.s.c.CountsTracker] Rotated counts store at transaction 4620940 to [/var/lib/neo4j/data/databases/graph.db/neostore.counts.db.b], from [/var/lib/neo4j/data/databases/graph.db/neostore.counts.db.a].
2018-11-12 11:03:20.852+0000 INFO [o.n.k.i.t.l.c.CheckPointerImpl] Checkpoint triggered by database shutdown @ txId: 4620940 checkpoint completed in 5s 153ms
2018-11-12 11:03:20.854+0000 INFO [o.n.k.i.t.l.p.LogPruningImpl] No log version pruned, last checkpoint was made in version 205
2018-11-12 11:03:22.757+0000 INFO [o.n.k.i.DiagnosticsManager] --- STOPPING diagnostics START ---
2018-11-12 11:03:22.757+0000 INFO [o.n.k.i.DiagnosticsManager] --- STOPPING diagnostics END ---
2018-11-12 11:03:22.757+0000 INFO [o.n.k.i.f.GraphDatabaseFacadeFactory] Shutdown started


(Andrew Bowman) #3

You're going to have to increase your open file limit. Here's the knowledge base article on this one:

It's possible that you only set this for your current user (or current session), here's some documentation on how you can set this system wide:

https://neo4j.com/docs/operations-manual/current/installation/linux/tarball/#linux-open-files