Hi,
I am debugging why certain queries cause neo4j to become unresponsive. I was finding that the same query run with `profile cypher runtime=interpreted cypher replan=force` (and after resetting the database each time) takes vastly different amounts of time (of order 20 seconds versus 100 milliseconds).
On investigating further, I can see that there is a periodic huge increase in CPU usage every 5 mins or so lasting about 10 seconds. There are no queries being submitted here just an idle db instance. At the end of this spike of activity, I can always see a stop the world pause. This is not a result of checkpointing since that is logged elsewhere.
This is clearly impacting my ability to write and read. What is this? Is there anything I can do?
debug.log:
2022-11-15 11:06:42.940+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=12398, gcTime=0, gcCount=0}
2022-11-15 11:12:47.553+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=12733, gcTime=12818, gcCount=2}
2022-11-15 11:18:43.723+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=13377, gcTime=13546, gcCount=2}
2022-11-15 11:23:46.394+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=12506, gcTime=12664, gcCount=2}
2022-11-15 11:30:42.660+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=12325, gcTime=12461, gcCount=2}
2022-11-15 11:36:42.464+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=12271, gcTime=12290, gcCount=2}
2022-11-15 11:43:39.413+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=12805, gcTime=12963, gcCount=2}
2022-11-15 11:48:43.079+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=12803, gcTime=12901, gcCount=2}
2022-11-15 11:53:41.940+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=11640, gcTime=11711, gcCount=2}
2022-11-15 11:59:40.950+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=13284, gcTime=13379, gcCount=2}
2022-11-15 12:05:40.255+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=12184, gcTime=12234, gcCount=2}
2022-11-15 12:11:21.063+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=12493, gcTime=12642, gcCount=2}
2022-11-15 12:17:02.112+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=12988, gcTime=13104, gcCount=2}
2022-11-15 12:21:42.700+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=12379, gcTime=12491, gcCount=2}
2022-11-15 12:21:49.179+0000 ERROR [o.n.b.t.p.MessageDecoder] Failed to read an inbound message:
b00f
2022-11-15 12:21:49.182+0000 ERROR [o.n.b.t.p.HouseKeeper] Fatal error occurred when handling a client connection: [id: 0xc7476cb4, L:/127.0.0.1:7687 - R:/127.0.0.1:58596]
java.lang.NullPointerException: null
at org.neo4j.bolt.transaction.StatementProcessorTxManager.interrupt(StatementProcessorTxManager.java:132) ~[neo4j-bolt-4.4.4.jar:4.4.4]
at org.neo4j.bolt.runtime.statemachine.impl.AbstractBoltStateMachine.interrupt(AbstractBoltStateMachine.java:189) ~[neo4j-bolt-4.4.4.jar:4.4.4]
at org.neo4j.bolt.runtime.DefaultBoltConnection.interrupt(DefaultBoltConnection.java:344) ~[neo4j-bolt-4.4.4.jar:4.4.4]
at org.neo4j.bolt.v3.messaging.decoder.ResetMessageDecoder.decode(ResetMessageDecoder.java:57) ~[neo4j-bolt-4.4.4.jar:4.4.4]
at org.neo4j.bolt.messaging.BoltRequestMessageReader.doRead(BoltRequestMessageReader.java:90) ~[neo4j-bolt-4.4.4.jar:4.4.4]
at org.neo4j.bolt.messaging.BoltRequestMessageReader.read(BoltRequestMessageReader.java:60) ~[neo4j-bolt-4.4.4.jar:4.4.4]
at org.neo4j.bolt.transport.pipeline.MessageDecoder.channelRead0(MessageDecoder.java:60) ~[neo4j-bolt-4.4.4.jar:4.4.4]
at org.neo4j.bolt.transport.pipeline.MessageDecoder.channelRead0(MessageDecoder.java:36) ~[neo4j-bolt-4.4.4.jar:4.4.4]
at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:99) ~[netty-transport-4.1.73.Final.jar:4.1.73.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.73.Final.jar:4.1.73.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.73.Final.jar:4.1.73.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [netty-transport-4.1.73.Final.jar:4.1.73.Final]
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:327) [netty-codec-4.1.73.Final.jar:4.1.73.Final]
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:299) [netty-codec-4.1.73.Final.jar:4.1.73.Final]
at org.neo4j.bolt.transport.pipeline.MessageAccumulator.channelRead(MessageAccumulator.java:62) [neo4j-bolt-4.4.4.jar:4.4.4]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.73.Final.jar:4.1.73.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.73.Final.jar:4.1.73.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [netty-transport-4.1.73.Final.jar:4.1.73.Final]
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:327) [netty-codec-4.1.73.Final.jar:4.1.73.Final]
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:299) [netty-codec-4.1.73.Final.jar:4.1.73.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.73.Final.jar:4.1.73.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.73.Final.jar:4.1.73.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [netty-transport-4.1.73.Final.jar:4.1.73.Final]
at org.neo4j.bolt.transport.pipeline.WebSocketFrameTranslator.channelRead(WebSocketFrameTranslator.java:43) [neo4j-bolt-4.4.4.jar:4.4.4]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.73.Final.jar:4.1.73.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.73.Final.jar:4.1.73.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [netty-transport-4.1.73.Final.jar:4.1.73.Final]
at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103) [netty-codec-4.1.73.Final.jar:4.1.73.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.73.Final.jar:4.1.73.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.73.Final.jar:4.1.73.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [netty-transport-4.1.73.Final.jar:4.1.73.Final]
at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103) [netty-codec-4.1.73.Final.jar:4.1.73.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.73.Final.jar:4.1.73.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.73.Final.jar:4.1.73.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [netty-transport-4.1.73.Final.jar:4.1.73.Final]
at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:93) [netty-transport-4.1.73.Final.jar:4.1.73.Final]
at io.netty.handler.codec.http.websocketx.Utf8FrameValidator.channelRead(Utf8FrameValidator.java:82) [netty-codec-http-4.1.73.Final.jar:4.1.73.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.73.Final.jar:4.1.73.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.73.Final.jar:4.1.73.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [netty-transport-4.1.73.Final.jar:4.1.73.Final]
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:327) [netty-codec-4.1.73.Final.jar:4.1.73.Final]
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:299) [netty-codec-4.1.73.Final.jar:4.1.73.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.73.Final.jar:4.1.73.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.73.Final.jar:4.1.73.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [netty-transport-4.1.73.Final.jar:4.1.73.Final]
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) [netty-transport-4.1.73.Final.jar:4.1.73.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.73.Final.jar:4.1.73.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.73.Final.jar:4.1.73.Final]
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) [netty-transport-4.1.73.Final.jar:4.1.73.Final]
at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:795) [netty-transport-classes-epoll-4.1.73.Final.jar:4.1.73.Final]
at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:480) [netty-transport-classes-epoll-4.1.73.Final.jar:4.1.73.Final]
at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:378) [netty-transport-classes-epoll-4.1.73.Final.jar:4.1.73.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986) [netty-common-4.1.73.Final.jar:4.1.73.Final]
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-common-4.1.73.Final.jar:4.1.73.Final]
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [netty-common-4.1.73.Final.jar:4.1.73.Final]
at java.lang.Thread.run(Thread.java:829) [?:?]
2022-11-15 12:27:43.465+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=13254, gcTime=13255, gcCount=2}
2022-11-15 12:33:03.503+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=13386, gcTime=13529, gcCount=2}
2022-11-15 12:38:42.647+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=12092, gcTime=12229, gcCount=2}
2022-11-15 12:44:42.702+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=12509, gcTime=12531, gcCount=2}
2022-11-15 12:49:42.761+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=12396, gcTime=12552, gcCount=2}
2022-11-15 12:55:43.363+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=13075, gcTime=13157, gcCount=2}
2022-11-15 13:00:43.686+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=13285, gcTime=13480, gcCount=2}
2022-11-15 13:06:05.018+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=12574, gcTime=12590, gcCount=2}
2022-11-15 13:12:42.878+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=12593, gcTime=12706, gcCount=2}
2022-11-15 13:17:43.266+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=12880, gcTime=13076, gcCount=2}
2022-11-15 13:22:42.858+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=12482, gcTime=12664, gcCount=2}
2022-11-15 13:28:43.300+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=12948, gcTime=13107, gcCount=2}
2022-11-15 13:34:27.204+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=12701, gcTime=12700, gcCount=2}
2022-11-15 13:39:47.333+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=12427, gcTime=12438, gcCount=2}
2022-11-15 13:45:48.433+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=12973, gcTime=13115, gcCount=2}
2022-11-15 13:51:44.088+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=13730, gcTime=13874, gcCount=2}
dbms.memory.heap.initial_size=24g
dbms.memory.heap.max_size=24g