Neo4j Fails to Start - suddenly - tons of data

operations

(Ranlandau) #1

ubuntu@ip-10-1-32-39:~$ journalctl -u neo4j -f
-- Logs begin at Tue 2018-10-09 13:47:02 UTC. --
Oct 09 14:11:42 ip-10-1-32-39 neo4j[11014]: plugins: /var/lib/neo4j/plugins
Oct 09 14:11:42 ip-10-1-32-39 neo4j[11014]: import: /var/lib/neo4j/import
Oct 09 14:11:42 ip-10-1-32-39 neo4j[11014]: data: /var/lib/neo4j/data
Oct 09 14:11:42 ip-10-1-32-39 neo4j[11014]: certificates: /var/lib/neo4j/certificates
Oct 09 14:11:42 ip-10-1-32-39 neo4j[11014]: run: /var/run/neo4j
Oct 09 14:11:42 ip-10-1-32-39 neo4j[11014]: Starting Neo4j.
Oct 09 14:11:46 ip-10-1-32-39 neo4j[11014]: 2018-10-09 14:11:46.879+0000 INFO ======== Neo4j 3.4.1 ========
Oct 09 14:11:46 ip-10-1-32-39 neo4j[11014]: 2018-10-09 14:11:46.905+0000 INFO Starting...
Oct 09 14:11:47 ip-10-1-32-39 neo4j[11014]: 2018-10-09 14:11:47.812+0000 INFO Initiating metrics...
Oct 09 14:12:02 ip-10-1-32-39 neo4j[11014]: 2018-10-09 14:12:02.254+0000 INFO Sending metrics to CSV file at /var/lib/neo4j/metrics
Oct 09 14:12:27 ip-10-1-32-39 neo4j[11014]: 2018-10-09 14:12:27.890+0000 ERROR Failed to start Neo4j: Starting Neo4j failed: Component 'org.neo4j.server.database.LifecycleManagingDatabase@5dc651b3' was successfully initialized, but failed to start. Please see the attached cause exception "eventfd() failed: Too many open files". Starting Neo4j failed: Component 'org.neo4j.server.database.LifecycleManagingDatabase@5dc651b3' was successfully initialized, but failed to start. Please see the attached cause exception "eventfd() failed: Too many open files".
Oct 09 14:12:27 ip-10-1-32-39 neo4j[11014]: org.neo4j.server.ServerStartupException: Starting Neo4j failed: Component 'org.neo4j.server.database.LifecycleManagingDatabase@5dc651b3' was successfully initialized, but failed to start. Please see the attached cause exception "eventfd() failed: Too many open files".
Oct 09 14:12:27 ip-10-1-32-39 neo4j[11014]: at org.neo4j.server.exception.ServerStartupErrors.translateToServerStartupError(ServerStartupErrors.java:68)
Oct 09 14:12:27 ip-10-1-32-39 neo4j[11014]: at org.neo4j.server.AbstractNeoServer.start(AbstractNeoServer.java:220)
Oct 09 14:12:27 ip-10-1-32-39 neo4j[11014]: at org.neo4j.server.ServerBootstrapper.start(ServerBootstrapper.java:111)
Oct 09 14:12:27 ip-10-1-32-39 neo4j[11014]: at org.neo4j.server.ServerBootstrapper.start(ServerBootstrapper.java:79)
Oct 09 14:12:27 ip-10-1-32-39 neo4j[11014]: at com.neo4j.server.enterprise.CommercialEntryPoint.main(CommercialEntryPoint.java:22)
Oct 09 14:12:27 ip-10-1-32-39 neo4j[11014]: Caused by: org.neo4j.kernel.lifecycle.LifecycleException: Component 'org.neo4j.server.database.LifecycleManagingDatabase@5dc651b3' was successfully initialized, but failed to start. Please see the attached cause exception "eventfd() failed: Too many open files".
Oct 09 14:12:27 ip-10-1-32-39 neo4j[11014]: at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:466)
Oct 09 14:12:27 ip-10-1-32-39 neo4j[11014]: at org.neo4j.kernel.lifecycle.LifeSupport.start(LifeSupport.java:107)
Oct 09 14:12:27 ip-10-1-32-39 neo4j[11014]: at org.neo4j.server.AbstractNeoServer.start(AbstractNeoServer.java:212)
Oct 09 14:12:27 ip-10-1-32-39 neo4j[11014]: ... 3 more
Oct 09 14:12:27 ip-10-1-32-39 neo4j[11014]: Caused by: java.lang.RuntimeException: Error starting org.neo4j.kernel.impl.factory.GraphDatabaseFacadeFactory, /var/lib/neo4j/data/databases/graph.db
Oct 09 14:12:27 ip-10-1-32-39 neo4j[11014]: at org.neo4j.kernel.impl.factory.GraphDatabaseFacadeFactory.initFacade(GraphDatabaseFacadeFactory.java:212)
Oct 09 14:12:27 ip-10-1-32-39 neo4j[11014]: at org.neo4j.kernel.enterprise.EnterpriseGraphDatabase.<init>(EnterpriseGraphDatabase.java:39)
Oct 09 14:12:27 ip-10-1-32-39 neo4j[11014]: at org.neo4j.server.enterprise.OpenEnterpriseNeoServer.lambda$static$1(OpenEnterpriseNeoServer.java:78)
Oct 09 14:12:27 ip-10-1-32-39 neo4j[11014]: at org.neo4j.server.database.LifecycleManagingDatabase.start(LifecycleManagingDatabase.java:88)
Oct 09 14:12:27 ip-10-1-32-39 neo4j[11014]: at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:445)
Oct 09 14:12:27 ip-10-1-32-39 neo4j[11014]: ... 5 more
Oct 09 14:12:27 ip-10-1-32-39 neo4j[11014]: Caused by: org.neo4j.kernel.lifecycle.LifecycleException: Component 'org.neo4j.bolt.transport.NettyServer@450d6dbe' was successfully initialized, but failed to start. Please see the attached cause exception "eventfd() failed: Too many open files".
Oct 09 14:12:27 ip-10-1-32-39 neo4j[11014]: at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:466)
Oct 09 14:12:27 ip-10-1-32-39 neo4j[11014]: at org.neo4j.kernel.lifecycle.LifeSupport.start(LifeSupport.java:107)
Oct 09 14:12:27 ip-10-1-32-39 neo4j[11014]: at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:445)
Oct 09 14:12:27 ip-10-1-32-39 neo4j[11014]: at org.neo4j.kernel.lifecycle.LifeSupport.start(LifeSupport.java:107)
Oct 09 14:12:27 ip-10-1-32-39 neo4j[11014]: at org.neo4j.kernel.extension.KernelExtensions.start(KernelExtensions.java:84)
Oct 09 14:12:27 ip-10-1-32-39 neo4j[11014]: at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:445)
Oct 09 14:12:27 ip-10-1-32-39 neo4j[11014]: at org.neo4j.kernel.lifecycle.LifeSupport.start(LifeSupport.java:107)
Oct 09 14:12:27 ip-10-1-32-39 neo4j[11014]: at org.neo4j.kernel.impl.factory.GraphDatabaseFacadeFactory.initFacade(GraphDatabaseFacadeFactory.java:208)
Oct 09 14:12:27 ip-10-1-32-39 neo4j[11014]: ... 9 more
Oct 09 14:12:27 ip-10-1-32-39 neo4j[11014]: Caused by: java.lang.IllegalStateException: failed to create a child event loop
Oct 09 14:12:27 ip-10-1-32-39 neo4j[11014]: at io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:88)
Oct 09 14:12:27 ip-10-1-32-39 neo4j[11014]: at io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:58)
Oct 09 14:12:27 ip-10-1-32-39 neo4j[11014]: at io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:47)
Oct 09 14:12:27 ip-10-1-32-39 neo4j[11014]: at io.netty.channel.MultithreadEventLoopGroup.<init>(MultithreadEventLoopGroup.java:59)
Oct 09 14:12:27 ip-10-1-32-39 neo4j[11014]: at io.netty.channel.epoll.EpollEventLoopGroup.<init>(EpollEventLoopGroup.java:104)
Oct 09 14:12:27 ip-10-1-32-39 neo4j[11014]: at io.netty.channel.epoll.EpollEventLoopGroup.<init>(EpollEventLoopGroup.java:91)
Oct 09 14:12:27 ip-10-1-32-39 neo4j[11014]: at io.netty.channel.epoll.EpollEventLoopGroup.<init>(EpollEventLoopGroup.java:68)
Oct 09 14:12:27 ip-10-1-32-39 neo4j[11014]: at org.neo4j.bolt.transport.configuration.EpollConfigurationProvider.createEventLoopGroup(EpollConfigurationProvider.java:40)
Oct 09 14:12:27 ip-10-1-32-39 neo4j[11014]: at org.neo4j.bolt.transport.NettyServer.start(NettyServer.java:98)
Oct 09 14:12:27 ip-10-1-32-39 neo4j[11014]: at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:445)
Oct 09 14:12:27 ip-10-1-32-39 neo4j[11014]: ... 16 more
Oct 09 14:12:27 ip-10-1-32-39 neo4j[11014]: Suppressed: org.neo4j.kernel.lifecycle.LifecycleException: Exception during graceful attempt to stop partially started component. Please use non suppressed exception to see original component failure.
Oct 09 14:12:27 ip-10-1-32-39 neo4j[11014]: at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:457)
Oct 09 14:12:27 ip-10-1-32-39 neo4j[11014]: ... 16 more
Oct 09 14:12:27 ip-10-1-32-39 neo4j[11014]: Caused by: java.lang.NullPointerException
Oct 09 14:12:27 ip-10-1-32-39 neo4j[11014]: at org.neo4j.bolt.transport.NettyServer.stop(NettyServer.java:142)
Oct 09 14:12:27 ip-10-1-32-39 neo4j[11014]: at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:453)
Oct 09 14:12:27 ip-10-1-32-39 neo4j[11014]: ... 16 more
Oct 09 14:12:27 ip-10-1-32-39 neo4j[11014]: Caused by: io.netty.channel.ChannelException: eventfd() failed: Too many open files
Oct 09 14:12:27 ip-10-1-32-39 neo4j[11014]: at io.netty.channel.epoll.Native.eventFd(Native Method)
Oct 09 14:12:27 ip-10-1-32-39 neo4j[11014]: at io.netty.channel.epoll.Native.newEventFd(Native.java:93)
Oct 09 14:12:27 ip-10-1-32-39 neo4j[11014]: at io.netty.channel.epoll.EpollEventLoop.<init>(EpollEventLoop.java:98)
Oct 09 14:12:27 ip-10-1-32-39 neo4j[11014]: at io.netty.channel.epoll.EpollEventLoopGroup.newChild(EpollEventLoopGroup.java:135)
Oct 09 14:12:27 ip-10-1-32-39 neo4j[11014]: at io.netty.channel.epoll.EpollEventLoopGroup.newChild(EpollEventLoopGroup.java:35)
Oct 09 14:12:27 ip-10-1-32-39 neo4j[11014]: at io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:84)
Oct 09 14:12:27 ip-10-1-32-39 neo4j[11014]: ... 25 more
Oct 09 14:12:27 ip-10-1-32-39 neo4j[11014]: 2018-10-09 14:12:27.891+0000 INFO Neo4j Server shutdown initiated by request
Oct 09 14:12:27 ip-10-1-32-39 systemd[1]: neo4j.service: Main process exited, code=exited, status=1/FAILURE
Oct 09 14:12:27 ip-10-1-32-39 systemd[1]: neo4j.service: Unit entered failed state.
Oct 09 14:12:27 ip-10-1-32-39 systemd[1]: neo4j.service: Failed with result 'exit-code'.
Oct 09 14:12:28 ip-10-1-32-39 systemd[1]: neo4j.service: Service hold-off time over, scheduling restart.
Oct 09 14:12:28 ip-10-1-32-39 systemd[1]: Stopped Neo4j Graph Database.
Oct 09 14:12:28 ip-10-1-32-39 systemd[1]: Started Neo4j Graph Database.
Oct 09 14:12:28 ip-10-1-32-39 neo4j[11294]: Active database: graph.db
Oct 09 14:12:28 ip-10-1-32-39 neo4j[11294]: Directories in use:
Oct 09 14:12:28 ip-10-1-32-39 neo4j[11294]: home: /var/lib/neo4j
Oct 09 14:12:28 ip-10-1-32-39 neo4j[11294]: config: /etc/neo4j
Oct 09 14:12:28 ip-10-1-32-39 neo4j[11294]: logs: /var/log/neo4j
Oct 09 14:12:28 ip-10-1-32-39 neo4j[11294]: plugins: /var/lib/neo4j/plugins
Oct 09 14:12:28 ip-10-1-32-39 neo4j[11294]: import: /var/lib/neo4j/import
Oct 09 14:12:28 ip-10-1-32-39 neo4j[11294]: data: /var/lib/neo4j/data
Oct 09 14:12:28 ip-10-1-32-39 neo4j[11294]: certificates: /var/lib/neo4j/certificates
Oct 09 14:12:28 ip-10-1-32-39 neo4j[11294]: run: /var/run/neo4j
Oct 09 14:12:28 ip-10-1-32-39 neo4j[11294]: Starting Neo4j.
Oct 09 14:12:32 ip-10-1-32-39 neo4j[11294]: 2018-10-09 14:12:32.576+0000 INFO ======== Neo4j 3.4.1 ========
Oct 09 14:12:32 ip-10-1-32-39 neo4j[11294]: 2018-10-09 14:12:32.611+0000 INFO Starting...
Oct 09 14:12:33 ip-10-1-32-39 neo4j[11294]: 2018-10-09 14:12:33.548+0000 INFO Initiating metrics...


(Michael Hunger) #2

Did you configure the open files setting as indicated in the docs?

https://neo4j.com/docs/operations-manual/3.4/installation/linux/tarball/#linux-open-files
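
One way to check which limit the running Neo4j process actually has, rather than the limit of an interactive shell, is to read /proc/<pid>/limits on Linux. Below is a minimal sketch of that check. The function names are hypothetical helpers, not part of Neo4j, and the PID has to be taken from your own system (for example, the number in brackets in the journalctl lines). Keep in mind that a service started by systemd generally gets its limit from the unit's LimitNOFILE setting rather than from limits.conf, so the two values can differ.

```python
from pathlib import Path


def parse_open_files_limit(limits_text):
    """Return (soft, hard) from the 'Max open files' row of /proc/<pid>/limits text.

    Values are returned as strings because the kernel may report 'unlimited'.
    Returns None if the row is not present.
    """
    label = "Max open files"
    for line in limits_text.splitlines():
        if line.startswith(label):
            # Row layout: "Max open files  <soft>  <hard>  files"
            fields = line[len(label):].split()
            return fields[0], fields[1]
    return None


def read_open_files_limit(pid):
    """Read the open-files limit of a running process (Linux only)."""
    return parse_open_files_limit(Path(f"/proc/{pid}/limits").read_text())
```

For example, read_open_files_limit(pid) with the PID of the Neo4j Java process returns the soft and hard limits that process is really running with. If they are much lower than what ulimit -n reports in your shell, the service is not picking up the setting you configured.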

Please let us know if that helped.

Michael


(Ranlandau) #3

limits.conf and ulimit are both configured with very high values.
That is not the cause.


(Ranlandau) #4

core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 257594
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 1000000
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 1000000
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited


(Ranlandau) #5

-- Logs begin at Tue 2018-10-09 13:47:02 UTC. --
Oct 10 08:03:15 ip-10-1-32-39 neo4j[72762]: import: /var/lib/neo4j/import
Oct 10 08:03:15 ip-10-1-32-39 neo4j[72762]: data: /var/lib/neo4j/data
Oct 10 08:03:15 ip-10-1-32-39 neo4j[72762]: certificates: /var/lib/neo4j/certificates
Oct 10 08:03:15 ip-10-1-32-39 neo4j[72762]: run: /var/run/neo4j
Oct 10 08:03:15 ip-10-1-32-39 neo4j[72762]: Starting Neo4j.
Oct 10 08:03:19 ip-10-1-32-39 neo4j[72762]: 2018-10-10 08:03:19.994+0000 INFO ======== Neo4j 3.4.1 ========
Oct 10 08:03:20 ip-10-1-32-39 neo4j[72762]: 2018-10-10 08:03:20.022+0000 INFO Starting...
Oct 10 08:03:20 ip-10-1-32-39 neo4j[72762]: 2018-10-10 08:03:20.959+0000 INFO Initiating metrics...
Oct 10 08:03:35 ip-10-1-32-39 neo4j[72762]: 2018-10-10 08:03:35.307+0000 INFO Sending metrics to CSV file at /var/lib/neo4j/metrics
Oct 10 08:03:38 ip-10-1-32-39 systemd[1]: Started Neo4j Graph Database.
Oct 10 08:04:08 ip-10-1-32-39 neo4j[72762]: 2018-10-10 08:04:08.054+0000 ERROR Failed to start Neo4j: Starting Neo4j failed: Component 'org.neo4j.server.database.LifecycleManagingDatabase@5dc651b3' was successfully initialized, but failed to start. Please see the attached cause exception "timerfd_create() failed: Too many open files". Starting Neo4j failed: Component 'org.neo4j.server.database.LifecycleManagingDatabase@5dc651b3' was successfully initialized, but failed to start. Please see the attached cause exception "timerfd_create() failed: Too many open files".
Oct 10 08:04:08 ip-10-1-32-39 neo4j[72762]: org.neo4j.server.ServerStartupException: Starting Neo4j failed: Component 'org.neo4j.server.database.LifecycleManagingDatabase@5dc651b3' was successfully initialized, but failed to start. Please see the attached cause exception "timerfd_create() failed: Too many open files".
Oct 10 08:04:08 ip-10-1-32-39 neo4j[72762]: at org.neo4j.server.exception.ServerStartupErrors.translateToServerStartupError(ServerStartupErrors.java:68)
Oct 10 08:04:08 ip-10-1-32-39 neo4j[72762]: at org.neo4j.server.AbstractNeoServer.start(AbstractNeoServer.java:220)
Oct 10 08:04:08 ip-10-1-32-39 neo4j[72762]: at org.neo4j.server.ServerBootstrapper.start(ServerBootstrapper.java:111)
Oct 10 08:04:08 ip-10-1-32-39 neo4j[72762]: at org.neo4j.server.ServerBootstrapper.start(ServerBootstrapper.java:79)
Oct 10 08:04:08 ip-10-1-32-39 neo4j[72762]: at com.neo4j.server.enterprise.CommercialEntryPoint.main(CommercialEntryPoint.java:22)
Oct 10 08:04:08 ip-10-1-32-39 systemd[1]: neo4j.service: Main process exited, code=exited, status=1/FAILURE
Oct 10 08:04:08 ip-10-1-32-39 systemd[1]: neo4j.service: Unit entered failed state.
Oct 10 08:04:08 ip-10-1-32-39 systemd[1]: neo4j.service: Failed with result 'exit-code'.
Oct 10 08:04:08 ip-10-1-32-39 systemd[1]: neo4j.service: Service hold-off time over, scheduling restart.
Oct 10 08:04:08 ip-10-1-32-39 systemd[1]: Stopped Neo4j Graph Database.
Oct 10 08:04:08 ip-10-1-32-39 systemd[1]: Started Neo4j Graph Database.


(Ranlandau) #6

Upgrading to 3.4.8 fixed it.
Thanks.


(Michael Hunger) #7

Perhaps it was related to the user who ran the database?


(Ranlandau) #8

No, I checked that too.
I verified everything related to the neo4j user.