Read-after-write consistency

Hi,

In cluster mode, I have a problem when a READ transaction follows a WRITE transaction. The WRITE transaction is always executed on the LEADER node, but the READ transaction is executed on one of the two FOLLOWERS. Because replication is not immediate, the READ transaction does not find the data created by the preceding WRITE transaction.

Is there a way to tell the Java driver to run the READ transaction on the specific node that executed the WRITE transaction?

Thanks.

Hi Eric,

Please do have a look at Causal Consistency and the use of Bookmarks to assist with this.

Thanks,

Yes, I guess bookmarks are what I need.

However, I have a question about bookmarks: when we execute a READ transaction and pass a bookmark to the session that executes it, which node ends up running the query?
Is it always the node that previously executed the WRITE transaction (that is, the LEADER)? Or can it be a FOLLOWER, which in that case waits for the data to be replicated?

Thanks again.

Hi,

It could be a follower, in which case it waits until the write transaction has been replicated to it.

The amount of time waiting would usually not be very long because the leader sends out the write transaction to all followers when it receives it. So it could well be that the follower has already applied it by the time the READ query is executed.
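
To make the waiting behaviour concrete, here is a minimal toy sketch in plain Java. It is not Neo4j's implementation and does not use the driver API: the `BookmarkSketch`, `Leader` and `Follower` classes, the use of a plain transaction id as the bookmark, and the fixed replication delay are all assumptions made up for illustration. It only shows the idea that a follower holds a bookmarked read until it has applied the transaction the bookmark refers to.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Toy model only: hypothetical classes, not Neo4j's implementation or driver API.
class BookmarkSketch {

    /** A follower that applies replicated transactions in commit order. */
    static final class Follower {
        private final Map<String, String> data = new HashMap<>();
        private long lastAppliedTxId = 0;

        /** Called by the replication thread when a committed transaction arrives. */
        synchronized void apply(long txId, String key, String value) {
            data.put(key, value);
            lastAppliedTxId = txId;
            notifyAll(); // wake any reader waiting on a bookmark
        }

        /** Waits until the bookmarked transaction has been applied, then reads. */
        synchronized String read(String key, long bookmark, long timeoutMillis) {
            long deadline = System.currentTimeMillis() + timeoutMillis;
            while (lastAppliedTxId < bookmark) {
                long remaining = deadline - System.currentTimeMillis();
                if (remaining <= 0) {
                    throw new IllegalStateException(
                            "follower did not catch up to bookmark " + bookmark);
                }
                try {
                    wait(remaining);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting", e);
                }
            }
            return data.get(key);
        }
    }

    /** The leader assigns transaction ids and replicates after a simulated delay. */
    static final class Leader {
        private final List<Follower> followers;
        private final long replicationDelayMillis;
        private long nextTxId = 1;
        // Single daemon thread, so transactions reach followers in commit order.
        private final ScheduledExecutorService replicator =
                Executors.newSingleThreadScheduledExecutor(r -> {
                    Thread t = new Thread(r, "replicator");
                    t.setDaemon(true);
                    return t;
                });

        Leader(List<Follower> followers, long replicationDelayMillis) {
            this.followers = followers;
            this.replicationDelayMillis = replicationDelayMillis;
        }

        /** Commits a write and returns its transaction id, used here as the bookmark. */
        synchronized long write(String key, String value) {
            long txId = nextTxId++;
            replicator.schedule(() -> {
                for (Follower f : followers) {
                    f.apply(txId, key, value);
                }
            }, replicationDelayMillis, TimeUnit.MILLISECONDS);
            return txId;
        }
    }

    public static void main(String[] args) {
        Follower f1 = new Follower();
        Follower f2 = new Follower();
        Leader leader = new Leader(Arrays.asList(f1, f2), 200);

        long bookmark = leader.write("greeting", "hello");

        // Bookmark 0 means "no bookmark": the read does not wait and may be stale.
        System.out.println("without bookmark: " + f1.read("greeting", 0, 1000));
        // With the bookmark, the follower blocks until the write has been applied.
        System.out.println("with bookmark:    " + f2.read("greeting", bookmark, 1000));
    }
}
```

The point of the sketch is that the bookmark only says "at least this transaction", so any follower that has caught up can serve the read, and the wait is bounded by the replication lag rather than by a trip back to the leader.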

Cheers, Mark

Thanks. It is very important for our use case that the READ query can be executed on a FOLLOWER node. After each WRITE transaction, we need a way to run READ requests on up-to-date data that scales out. If we can only query the LEADER node, reads do not scale.

By the way, regarding the time spent waiting for data replication, what do you mean by 'not be very long'? I ran some tests a few months ago on a 3-node Neo4j cluster hosted on Amazon, and I had to wait 4-5 seconds for the data to be replicated from the LEADER to the FOLLOWER. 4-5 seconds is quite long for us. Do you think it is possible to get the wait below 1 second?

Thanks a lot.