Causal clustering plugins and write transactions


(Yurippe) #1

I am having a hard time figuring out how to programmatically detect whether I am in single-instance or clustering mode, and how to transparently proxy queries made with GraphDatabaseService to the leader of a cluster.

Is it possible to have a core always redirect writes using the Neo4j Java API (not a driver) from a plugin?

In reality what I need is read-only copies, but the HA clustering section says that HA clustering is somewhat deprecated. Should I use it regardless?


(Michael Hunger) #2

Could you explain what you need this functionality for?

For redirection you'd still need to use the Java driver between instances.
Also the Java API itself doesn't support causal clusters, because you cannot determine upfront if you're running a read- or write-transaction.


(Yurippe) #3

It is not a problem if I have to specify that it is a write operation; I'd just like to be able to transparently execute a write query on any member of the core cluster.

Right now I just want read copies, but as far as I know, CA clusters need a minimum of 2 cores, and should really have at least 3. Therefore, the extra complexity of writing to the database is not something I am a big fan of.

What is the recommended way of setting up a main database (reads / writes) with additional read-only slaves?
I do not need the guarantee of being able to write at all times, but I would like to have backups and the ability to offload some computationally heavy tasks to read-only slaves.


(Michael Hunger) #4

That's where and why you use a driver with the bolt+routing protocol: it is aware of the cluster topology and of whether you're running a read or a write transaction, and it routes accordingly (and even retries if the cluster topology changed during your operation).

It also takes care of load balancing between core and read-replica instances.
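To make the routing idea concrete, here is a minimal toy model in plain Java. It is not the driver's implementation or API; the class name `ToyRouter`, the member addresses, and the round-robin policy are all assumptions made for illustration. It only shows the principle: writes go to the leader, reads are spread over the other members.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: NOT the Neo4j driver's code or API.
final class ToyRouter {
    enum Access { READ, WRITE }

    private final String leader;
    private final List<String> readers = new ArrayList<>();
    private int nextReader = 0;

    // overview maps a member address to its role:
    // LEADER, FOLLOWER or READ_REPLICA.
    ToyRouter(Map<String, String> overview) {
        String foundLeader = null;
        for (Map.Entry<String, String> member : overview.entrySet()) {
            if ("LEADER".equals(member.getValue())) {
                foundLeader = member.getKey();
            } else {
                readers.add(member.getKey());
            }
        }
        if (foundLeader == null) {
            throw new IllegalStateException("no leader in overview");
        }
        this.leader = foundLeader;
    }

    // Writes always go to the leader; reads rotate over the
    // non-leader members (assumed round-robin policy).
    String route(Access access) {
        if (access == Access.WRITE || readers.isEmpty()) {
            return leader;
        }
        String target = readers.get(nextReader);
        nextReader = (nextReader + 1) % readers.size();
        return target;
    }

    public static void main(String[] args) {
        // Hypothetical addresses, for illustration only.
        Map<String, String> overview = new LinkedHashMap<>();
        overview.put("core1:7687", "LEADER");
        overview.put("core2:7687", "FOLLOWER");
        overview.put("replica1:7687", "READ_REPLICA");

        ToyRouter router = new ToyRouter(overview);
        System.out.println("write -> " + router.route(Access.WRITE));
        System.out.println("read  -> " + router.route(Access.READ));
        System.out.println("read  -> " + router.route(Access.READ));
    }
}
```

A real routing driver also refreshes its view of the topology and retries when the leader changes, which this sketch leaves out.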


(Yurippe) #5

So when making plugins, one should never use GraphDatabaseService to do transactions, but instead use a driver?

Also, for the setup I described, does it make sense to use a CA cluster? Or is there some other way of getting read-replicas?


(M. David Allen) #6

There's no need to transparently proxy things to the leader; this is done for you by a bolt+routing driver, as Michael says. But it is also OK to use a GraphDatabaseService to do transactions. I think the missing piece here is that if I did transactions inside of a GraphDatabaseService, I'd generally be only doing them on the leader, automatically, with no extra code needed, because whatever that code is would only be invoked on the leader.

For example: you write a stored procedure with the annotation @Procedure(name = "myPlugin.writeSomeStuff", mode = Mode.WRITE)

Inside of that procedure, you use a GraphDatabaseService to write some stuff, and then stream some results back. All good.

Now, that plugin is installed on all 3 nodes, but the procedure never gets called anywhere but the leader, because the client sends explicit write transactions (and autocommit transactions) to the leader. So calling the procedure from Cypher over bolt+routing basically abstracts this away, and you don't have to worry about it.

If you did manually call that write procedure on a follower, it would fail -- because followers cannot accept writes. Fortunately if you set things up right, this just won't arise. Extra cores and read replicas scale out your read workload.


(Yurippe) #7

This would not be true for asynchronous procedures though.

And what about transaction listeners? If a transaction listener mutated the database on certain queries, wouldn't that pose a problem? The last issue is hypothetical, and could be solved by calling dbms.cluster.overview() and checking the result, but that is overly complicated. There should be an easy way of telling whether you are the leader or a follower.

This also makes it more complicated to use the Neo4j browser, as you have to manually find the leader and then execute queries, which I think is a poor user experience. It requires knowledge of the underlying structure of the cluster, and even though the Neo4j instances may communicate on a network, that does not mean all replicas are reachable for all clients.


(M. David Allen) #8

To find whether a node is leader or follower, CALL dbms.cluster.role(); (Docs: https://neo4j.com/docs/operations-manual/current/monitoring/causal-cluster/procedures/#dbms.cluster.role)
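As a rough sketch of how a plugin might act on that result, the helper below interprets the role string in plain Java. The class and method names are hypothetical, and two things are assumptions here: that the procedure returns one of LEADER, FOLLOWER or READ_REPLICA, and that an instance where the procedure is not available can be treated as a single instance. Fetching the value (for example by running the CALL above) is left to the caller.

```java
import java.util.Locale;
import java.util.Optional;

// Hypothetical helper; it only interprets a role string and does not
// talk to Neo4j itself.
final class ClusterRoleCheck {

    // role: the value returned by CALL dbms.cluster.role(), or empty
    // if the procedure was not available (ASSUMED to mean the instance
    // is not part of a causal cluster).
    static boolean isClustered(Optional<String> role) {
        return role.isPresent();
    }

    // A single instance is treated as writable; in a cluster only the
    // leader accepts writes.
    static boolean canAcceptWrites(Optional<String> role) {
        return role
                .map(r -> "LEADER".equals(r.trim().toUpperCase(Locale.ROOT)))
                .orElse(true);
    }

    public static void main(String[] args) {
        System.out.println(canAcceptWrites(Optional.of("LEADER")));
        System.out.println(canAcceptWrites(Optional.of("FOLLOWER")));
        System.out.println(canAcceptWrites(Optional.<String>empty()));
    }
}
```

Keep in mind that the role can change at any time (a leader election may happen right after the check), so a check like this is advisory at best.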

The Neo4j browser can also accept bolt+routing as the address you connect to. Say that by default it attempts to connect to bolt://my-cluster. If you instead connect to bolt+routing://my-cluster, the cluster topology becomes transparent to you. You can run both read and write queries, and the browser will route them wherever is appropriate.

On the transaction listener, I'm not sure. I may look into this.


(Michael Hunger) #9

It's best to write your extensions as procedures; then you can call them from Cypher and they are executed in the right context (read vs. write) and transaction.

If you want to check within a procedure what state the current instance has, you can use something like this: