Restarting graph due to failure

ppoliani
Node Link

I've set up a causal cluster using Neo4j 4.0.1 enterprise. I have three core members and two read replicas. The cluster appears to have formed successfully: I can see the correct members in the routing table, and I have successfully sent read and write transactions. Everything seems to work like a charm.

However, when I checked the read replica logs (i.e. logs/debug.log), I noticed the following warnings:

2020-03-18 09:46:35.824+0000 WARN [a.s.Materializer] [outbound connection to [akka://cc-discovery-actor-system@10.42.5.184:5000], control stream] Upstream failed, cause: StreamTcpException: The connection has been aborted
2020-03-18 09:46:35.824+0000 WARN [a.s.s.RestartWithBackoffFlow] Restarting graph due to failure. stack_trace:  (akka.stream.StreamTcpException: The connection has been aborted)
2020-03-18 09:46:35.832+0000 WARN [a.s.Materializer] [outbound connection to [akka://cc-discovery-actor-system@10.42.9.147:5000], control stream] Upstream failed, cause: StreamTcpException: The connection has been aborted

The IPs in the above logs, i.e. 10.42.5.184 and 10.42.9.147, belong to the pods of the core members.

I can see the same warnings in the core member logs:

2020-03-18 09:46:38.406+0000 WARN [a.s.Materializer] [outbound connection to [akka://cc-discovery-actor-system@10.42.9.146:5000], message stream] Upstream failed, cause: StreamTcpException: The connection has been aborted

Here, 10.42.9.146 belongs to the read replica pod.
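To see which peers these warnings point at, and how often, the log lines can be tallied per peer address. The following is only a rough sketch: the function name `count_aborted_connections` and the regular expression are my own assumptions based on the line format shown above, not anything provided by Neo4j or Akka, and the sample lines are the ones quoted in this post.

```python
import re
from collections import Counter

# Assumed pattern, derived from the warnings quoted above:
# "outbound connection to [akka://<actor-system>@<ip>:<port>]"
PEER_RE = re.compile(r"outbound connection to \[akka://[^@\]]+@([0-9.]+):(\d+)\]")


def count_aborted_connections(lines):
    """Count StreamTcpException warnings per peer IP address."""
    counts = Counter()
    for line in lines:
        # Only consider the aborted-connection warnings.
        if "StreamTcpException" not in line:
            continue
        match = PEER_RE.search(line)
        # Lines without a peer address (e.g. RestartWithBackoffFlow) are skipped.
        if match:
            counts[match.group(1)] += 1
    return counts


# Sample lines copied from the read replica's debug.log excerpt above.
SAMPLE = [
    "2020-03-18 09:46:35.824+0000 WARN [a.s.Materializer] [outbound connection to [akka://cc-discovery-actor-system@10.42.5.184:5000], control stream] Upstream failed, cause: StreamTcpException: The connection has been aborted",
    "2020-03-18 09:46:35.824+0000 WARN [a.s.s.RestartWithBackoffFlow] Restarting graph due to failure. stack_trace:  (akka.stream.StreamTcpException: The connection has been aborted)",
    "2020-03-18 09:46:35.832+0000 WARN [a.s.Materializer] [outbound connection to [akka://cc-discovery-actor-system@10.42.9.147:5000], control stream] Upstream failed, cause: StreamTcpException: The connection has been aborted",
]

for peer, n in count_aborted_connections(SAMPLE).most_common():
    print(peer, n)
```

On a real node, the same function could be fed the lines of logs/debug.log instead of `SAMPLE`, which would show whether the aborted connections cluster around one particular pod.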

The cluster as a whole doesn't seem to restart or malfunction, but the warnings are very confusing and I'm not sure what is causing them.

3 REPLIES

JnMik
Node

Same issue here on 4.0.3-enterprise

I have 3 core members and 1 read replica; the messages occur on core-2 and the read replica.
My core-2 is considered the Leader.

jasperblues
Graph Buddy

Getting the same on a read replica, Neo4j 4.4.9.

Ellesudo
Node

This happens quite often, on both read replicas and core nodes, on our 4.4.14 enterprise cluster.