We're using Neo4j as a read model in an event-sourced system. Its job is to process a stream of events and build a database for querying.
The events are translated into Cypher statements and sent to Neo4j using the Bolt driver, in batches of 1000-5000 statements per transaction (sequentially, single-threaded).
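To make the batching described above concrete, here is a minimal sketch of the loop, not our actual code. `BatchingProjector` and its `transactionSink` are hypothetical names: the sink stands in for whatever opens a transaction on the driver session, runs each statement in the batch, and commits.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper: groups Cypher statements into fixed-size batches and
// hands each batch to a sink that is assumed to run it in one transaction.
final class BatchingProjector {
    private final int batchSize;
    // Stand-in for the driver session; NOT a real Neo4j API.
    private final Consumer<List<String>> transactionSink;
    private final List<String> pending = new ArrayList<>();

    BatchingProjector(int batchSize, Consumer<List<String>> transactionSink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
        this.transactionSink = transactionSink;
    }

    // Queue one translated statement; flush as soon as the batch is full.
    void accept(String cypher) {
        pending.add(cypher);
        if (pending.size() >= batchSize) {
            flush();
        }
    }

    // Send whatever is queued as one batch (no-op when nothing is queued).
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        transactionSink.accept(new ArrayList<>(pending));
        pending.clear();
    }
}
```

Everything runs on a single thread: each batch is handed to the sink and only then does the next one start filling, with a final `flush()` at the end of the stream for the partial batch.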
This has been working great up to Neo4j 3.5.16, with Neo4j processing about 1000 events per second when rebuilding from scratch.
Now we have just updated to Neo4j 4.2.1, and I'm seeing the following strange behaviour when doing a full rebuild (i.e. starting from an empty DB and processing all events, approx. 3 million in total):
- after about 40000 events, processing starts to slow down
- after about 100000 events, processing is about 5x slower than it should be
It basically keeps going at this rate from then on.
Now the strange part:
- when I pause the event stream for about 10-15 seconds, then resume, it picks up at full speed again and doesn't slow down anymore. To be clear: this is without restarting neo4j or the client, just a dumb "sleep 10000".
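For reference, the pause can be reproduced with something like the sketch below. `OneTimePause` is a hypothetical helper, and the threshold at which the pause fires is an assumption chosen for illustration; only the sleep duration (10000 ms) comes from what I actually did.

```java
import java.util.function.LongConsumer;

// Hypothetical helper: counts processed events and pauses exactly once
// when a threshold is crossed. Neo4j and the client keep running.
final class OneTimePause {
    private final long thresholdEvents;
    private final long pauseMillis;
    private final LongConsumer sleeper; // injectable so the pause can be stubbed
    private long processed = 0;
    private boolean paused = false;

    OneTimePause(long thresholdEvents, long pauseMillis, LongConsumer sleeper) {
        this.thresholdEvents = thresholdEvents;
        this.pauseMillis = pauseMillis;
        this.sleeper = sleeper;
    }

    // Variant that really blocks the calling thread with Thread.sleep.
    static OneTimePause withThreadSleep(long thresholdEvents, long pauseMillis) {
        return new OneTimePause(thresholdEvents, pauseMillis, millis -> {
            try {
                Thread.sleep(millis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
            }
        });
    }

    // Call after each committed batch; returns true if this call paused.
    boolean afterBatch(int eventsInBatch) {
        processed += eventsInBatch;
        if (!paused && processed >= thresholdEvents) {
            paused = true;
            sleeper.accept(pauseMillis);
            return true;
        }
        return false;
    }
}
```

Usage would be something like `OneTimePause.withThreadSleep(100_000, 10_000)` with `afterBatch(batch.size())` called after every commit; the 100000 threshold is only an example value.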
It looks like Neo4j isn't able to keep up somehow, but after waiting a while, all is back to normal.
What could be causing this?
Things I tried:
- increasing/decreasing heap space/page cache settings
- switching to the http driver
- community/enterprise edition
The logs do not contain any related information.
Environment:
- macOS Big Sur
- neo4j 4.2.1 community
- SDN 5.2.6 / OGM 3.2.19
- plugins: apoc 220.127.116.11, spatial algorithms 0.2.4
- java 11.0.9