I’m currently working on a prototype that integrates Neo4j with a blockchain framework (Exonum in our case). The idea is to have distributed nodes that each run a local Neo4j database, and to use the blockchain consensus algorithm to agree on changes and to provide provenance by storing all modifications on the blockchain. This way, you can retrieve the modification history of a node from the blockchain.
Because we want to keep all nodes consistent, all database modifications need to be executed in the same order on every node. We provide a REST API call that an application can use to submit a transaction; the body of the transaction message contains an ordered list of queries. Before the transaction is added to a block, every node first verifies it by executing the queries and then rolling back the transaction. If a QueryExecutionError is thrown, we know that the transaction contains invalid queries, so it will not be stored on the blockchain. If no error is thrown, it is added to a block. When a new block arrives at a local node, all queries in it are executed in the order given by the block. This way, if no error occurs, the local databases are eventually consistent.
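To make the verify-then-rollback idea above concrete, here is a minimal sketch in plain Java. It is not our actual extension code and does not use the Neo4j API: `LocalTx`, `FakeTx` and `TransactionVerifier` are hypothetical names, and `FakeTx` simply treats an empty query string as invalid so that the flow can be run without a database.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a local database transaction (not the Neo4j API). */
interface LocalTx {
    void run(String query);   // assumed to throw a RuntimeException on an invalid query
    void commit();
    void rollback();
}

/** In-memory fake used only to exercise the flow; an empty query counts as invalid. */
final class FakeTx implements LocalTx {
    final List<String> log = new ArrayList<>();

    public void run(String query) {
        if (query.trim().isEmpty()) {
            throw new IllegalArgumentException("invalid query");
        }
        log.add("run:" + query);
    }

    public void commit() { log.add("commit"); }

    public void rollback() { log.add("rollback"); }
}

final class TransactionVerifier {
    /** Executes the queries in order and always rolls back; true if none failed. */
    static boolean verify(LocalTx tx, List<String> orderedQueries) {
        try {
            for (String query : orderedQueries) {
                tx.run(query);
            }
            return true;
        } catch (RuntimeException e) {
            return false;          // the transaction must not go on the blockchain
        } finally {
            tx.rollback();         // verification never leaves changes behind
        }
    }

    /** Executes the queries in order and commits; rolls back and rethrows on failure. */
    static void execute(LocalTx tx, List<String> orderedQueries) {
        try {
            for (String query : orderedQueries) {
                tx.run(query);
            }
            tx.commit();
        } catch (RuntimeException e) {
            tx.rollback();
            throw e;
        }
    }
}
```

In the real extension, `run` would execute a Cypher query inside a Neo4j transaction and the caught exception would be the QueryExecutionError mentioned above.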
To communicate between the blockchain service (a kind of smart contract in Exonum) that runs on every local node and the local Neo4j database, I’m creating an unmanaged extension. This extension runs a gRPC server with two commands, verify() and execute(), each of which takes a list of queries and runs them in one transaction. The execute() function should check that the queries do not modify the UUID property that we create on nodes and relationships, should assign UUID properties to newly created nodes and relationships, and should report all changes to the blockchain. We do not care about ‘return’ values, as applications can access the DB directly through the Bolt driver for read-only queries.
We built a prototype of the unmanaged extension, but we made some assumptions that we would like to verify and/or receive feedback on. We would also like to better understand the reasoning behind the Neo4j implementation.
Currently, following a suggestion found on Stack Overflow, we use a ThreadLocal variable to store the modifications collected in the TransactionEventHandler. We would prefer not to rely on implementation details, but if no API method is provided for this, we see no other way.
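For readers unfamiliar with the pattern, a minimal sketch of the ThreadLocal hand-off looks like this. It uses only the JDK; `ChangeCollector`, `record` and `drain` are hypothetical names, and the sketch assumes that the handler callback runs on the same thread that commits the transaction, which is exactly the implementation detail we would rather not depend on.

```java
import java.util.ArrayList;
import java.util.List;

final class ChangeCollector {
    // One list of changes per thread; assumes the commit hook and the caller share a thread.
    private static final ThreadLocal<List<String>> CHANGES =
            ThreadLocal.withInitial(ArrayList::new);

    /** Would be called from the transaction event handler for each observed change. */
    static void record(String change) {
        CHANGES.get().add(change);
    }

    /** Called by the request handler after the commit; returns and clears this thread's changes. */
    static List<String> drain() {
        List<String> copy = new ArrayList<>(CHANGES.get());
        CHANGES.remove();   // avoid leaking state into the next request on a pooled thread
        return copy;
    }
}
```

If the callback were ever invoked on a different thread, `drain()` would silently return an empty list, which is why an official API for this would be preferable.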
As the UUIDs we assign to each node/relationship need to be the same in every local database, we need to process the TransactionData in the same order in each local database.
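To illustrate why the processing order matters, here is one possible way to derive such UUIDs, shown as a sketch rather than as what our prototype necessarily does. It assumes (hypothetically) that every node knows the hash of the blockchain transaction and the position of the created entity within that transaction; `DeterministicUuid`, `forEntity`, `txHash` and `index` are illustrative names. It uses the JDK's name-based `UUID.nameUUIDFromBytes`.

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;

final class DeterministicUuid {
    /**
     * Derives a name-based (version 3) UUID from the transaction hash and the
     * position of the created entity within that transaction.
     */
    static UUID forEntity(String txHash, int index) {
        String name = txHash + ":" + index;
        return UUID.nameUUIDFromBytes(name.getBytes(StandardCharsets.UTF_8));
    }
}
```

With a scheme like this, two databases produce the same UUID for an entity only if they enumerate the created nodes/relationships in the same order, so that `index` matches on every node.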
Currently, an unmanaged extension can still make changes in the beforeCommit() method, because the transaction is not yet finished. However, these changes are not reflected in afterCommit(). I understand that you shouldn’t make any changes there, since they are not visible to other TransactionEventHandlers. But once the transaction is finalized, shouldn’t you receive the final set of changes? That would make it easier to reason about the changes; currently you have no guarantee that the TransactionData you retrieve contains all of them.
In my opinion, you should deprecate classes/methods only if there is an improved or alternative method. That way, you preserve backwards compatibility while encouraging people to use the new method. However, I can’t find another recommended way to retrieve settings. So now my editor/compiler warns me that I’m using a deprecated method, but the improved method is not yet available. Am I missing something?
For every test, an initial database is set up, and if a test fails it would be handy to have the log files created for that test. On Stack Overflow I found a workaround that copies the files of every test, but it is not a solution I would prefer. Perhaps there is another way to do it?