We're experiencing a persistent memory leak in our Neo4j cluster (currently 5.26.0, also tested with 2026.06.2) running in Kubernetes pods. Pod memory climbs continuously until it reaches critical levels, at which point we intervene and restart the pod; the restart frees memory temporarily, but growth resumes immediately. We've checked RSS, process smaps, and JVM native memory tracking, and all of them show only minimal increases that don't account for the pod-level memory growth. We've also tried reducing the heap and page cache sizes below the neo4j-admin recommendations, but the leak persists on both versions tested. We're currently running a 4-node cluster with 9.5 GB of memory per node.
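For context, this is roughly the comparison we keep coming back to (a minimal sketch only, assuming cgroup v2 and that it runs inside the Neo4j container; the process lookup and paths are illustrative, not our exact tooling):

```python
#!/usr/bin/env python3
"""Sketch: compare pod (cgroup) memory against the Neo4j JVM's RSS to
quantify how much of the growth is unaccounted for by the process itself."""
import re
import time
from pathlib import Path

# cgroup v2 path; cgroup v1 would be /sys/fs/cgroup/memory/memory.usage_in_bytes
CGROUP_MEM = Path("/sys/fs/cgroup/memory.current")


def neo4j_pid() -> int:
    """Find the Neo4j JVM by scanning /proc cmdlines (illustrative match)."""
    for proc in Path("/proc").iterdir():
        if not proc.name.isdigit():
            continue
        try:
            cmdline = (proc / "cmdline").read_bytes().replace(b"\0", b" ")
        except OSError:
            continue
        if b"java" in cmdline and b"neo4j" in cmdline:
            return int(proc.name)
    raise RuntimeError("Neo4j JVM process not found")


def rss_kib(pid: int) -> int:
    """Resident set size of the JVM in KiB, from /proc/<pid>/status."""
    status = Path(f"/proc/{pid}/status").read_text()
    return int(re.search(r"VmRSS:\s+(\d+) kB", status).group(1))


if __name__ == "__main__":
    pid = neo4j_pid()
    while True:
        # memory.current also counts kernel page cache charged to the container,
        # not just process memory.
        cgroup_mib = int(CGROUP_MEM.read_text()) / 1024 / 1024
        rss_mib = rss_kib(pid) / 1024
        gap = cgroup_mib - rss_mib  # this gap is what JVM monitoring doesn't explain for us
        print(f"cgroup={cgroup_mib:.0f} MiB  rss={rss_mib:.0f} MiB  gap={gap:.0f} MiB")
        time.sleep(60)
```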
Additionally, we're seeing intermittent write-timeout incidents where all write operations time out while read operations continue normally with no increase in latency. The only thing that resolves it is manually restarting the leader node. We're unsure whether the leader is rejecting writes or the replicas are failing to commit.
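To make the leader-versus-replica question concrete, this is roughly the incident-time check we'd like to automate (a minimal sketch; the hostnames, credentials, ports, and probe label are placeholders, and it assumes the official Neo4j Python driver and the role/status columns exposed by `SHOW DATABASES` in Neo4j 5):

```python
"""Sketch: during a write-timeout incident, ask each cluster member how it
sees the roles for the 'neo4j' database, then time a trivial routed write."""
import time
from neo4j import GraphDatabase

MEMBERS = ["core-0", "core-1", "core-2", "core-3"]  # placeholder pod hostnames
AUTH = ("neo4j", "password")                        # placeholder credentials


def report_roles() -> None:
    """Query each member directly; disagreement about which address holds the
    leader/primary role points at a leadership or Raft problem."""
    for host in MEMBERS:
        with GraphDatabase.driver(f"bolt://{host}:7687", auth=AUTH) as driver:
            with driver.session(database="system") as session:
                result = session.run(
                    "SHOW DATABASES YIELD name, address, role, currentStatus "
                    "WHERE name = 'neo4j'"
                )
                for rec in result:
                    print(host, rec.data())


def timed_write_probe() -> None:
    """Send a trivial write through the routing driver and record how long it
    takes, or which error code comes back, while reads are still healthy."""
    with GraphDatabase.driver("neo4j://core-0:7687", auth=AUTH) as driver:
        with driver.session(database="neo4j") as session:
            start = time.monotonic()
            try:
                session.run("MERGE (:HealthProbe {id: 1})").consume()
                print(f"write ok in {time.monotonic() - start:.2f}s")
            except Exception as exc:  # surfaces Neo.ClientError / Neo.TransientError codes
                print(f"write failed after {time.monotonic() - start:.2f}s: {exc}")


if __name__ == "__main__":
    report_roles()
    timed_write_probe()
```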
We need advice on:
(1) advanced memory-leak detection techniques for Neo4j in Kubernetes, since standard JVM monitoring isn't revealing the source;
(2) how to definitively diagnose whether the write failures originate on the leader or on the replicas;
(3) any known configuration or environmental factors that could cause both symptoms. We have complete logs, configs, and metrics available on request.