Neo4j unmanaged extension - blockchain integration


(Sjoerdwels) #1

Dear community,

I'm currently working on a prototype that integrates Neo4j with a blockchain framework (Exonum in our case). The idea is to have distributed nodes that each run a local Neo4j database, use blockchain consensus algorithms to agree on changes, and provide provenance by storing all modifications on the blockchain. This way, you can retrieve the history of modifications to a node via the blockchain.

Because we want to keep all nodes consistent, all database modifications need to be executed in the same order on every node. We provide a REST API call that an application can use to execute a transaction; the body of the transaction message contains an ordered list of queries. Before the transaction is added to a commit block, all nodes first have to agree on it. Each node does this by executing the queries and then rolling back the transaction. If a QueryExecutionException is thrown, we know that the transaction contains invalid queries, and it will not be stored on the blockchain. If no error is thrown, it is added to a block. When a new block arrives at a local node, all of its queries are executed in the order in which they appear in the block. This way, if no error occurs, the local databases are eventually consistent.
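To make the verify-then-apply flow concrete, here is a minimal sketch in plain Java. It is not our extension code: the `DryRunSketch` class, its `Query` interface and the in-memory map are hypothetical stand-ins for the local Neo4j database, and a `RuntimeException` stands in for the query error. With the embedded Neo4j API, the verification step would instead open a transaction, run the queries, and close the transaction without marking it successful.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DryRunSketch {
    /** Hypothetical query type: mutates the store, or throws if the query is invalid. */
    public interface Query {
        void apply(Map<String, String> store);
    }

    // Stand-in for the local database (assumption: the real prototype uses Neo4j here).
    private final Map<String, String> store = new HashMap<>();

    /** Runs the queries on a scratch copy and discards it, like a rolled-back transaction. */
    public boolean verify(List<Query> queries) {
        Map<String, String> scratch = new HashMap<>(store);
        try {
            for (Query q : queries) {
                q.apply(scratch);
            }
            return true;
        } catch (RuntimeException e) {
            // An invalid query means the transaction must not go into a block.
            return false;
        }
    }

    /** Applies the queries in list order; the store only changes if all of them succeed. */
    public void execute(List<Query> queries) {
        Map<String, String> scratch = new HashMap<>(store);
        for (Query q : queries) {
            q.apply(scratch);
        }
        store.clear();
        store.putAll(scratch);
    }

    /** Returns a copy of the current state, for inspection. */
    public Map<String, String> snapshot() {
        return new HashMap<>(store);
    }
}
```

Every node would call `verify` during consensus and `execute` once per transaction when a block arrives, always in block order.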

To communicate between the blockchain service that runs on every local node (a kind of smart contract in Exonum) and the local Neo4j database, I'm creating an unmanaged extension. This extension runs a gRPC server with two commands, verify() and execute(), each of which takes a list of queries and runs them in one transaction. The execute() function should check that the queries do not modify the UUID property (which we create) of any node or relationship, should assign UUID properties to newly created nodes and relationships, and should provide all the changes to the blockchain. We do not care about return values, because applications can access the database directly through the Bolt driver for read-only commands.

We built a prototype of the unmanaged extension, but we made some assumptions that we would like to verify and receive feedback on. We would also like to better understand the reasoning behind the Neo4j implementation.

  1. Why is there no official way to retrieve the TransactionData after executing a transaction? Is there a specific reason for that?

Currently, following a suggestion found on Stack Overflow, we use a ThreadLocal variable to store the modifications from the TransactionEventHandler. We would prefer not to rely on implementation details, but as long as no API method is provided we see no other way.
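For reference, the ThreadLocal hand-off looks roughly like the sketch below. It uses only the JDK; `ChangeCapture`, `record` and `drain` are hypothetical names, and the changes are reduced to strings. The sketch assumes that the event handler callback runs on the same thread that commits the transaction, which is exactly the implementation detail we would prefer not to depend on.

```java
import java.util.ArrayList;
import java.util.List;

public class ChangeCapture {
    // One list per thread, so concurrent transactions do not see each other's changes.
    private static final ThreadLocal<List<String>> CAPTURED =
            ThreadLocal.withInitial(ArrayList::new);

    /** Would be called from the event handler callback (assumed: on the committing thread). */
    public static void record(String change) {
        CAPTURED.get().add(change);
    }

    /** Called by the code that ran the transaction, on the same thread, after the commit. */
    public static List<String> drain() {
        List<String> changes = new ArrayList<>(CAPTURED.get());
        // Clear the slot so the next transaction on this thread starts empty.
        CAPTURED.remove();
        return changes;
    }
}
```

If the callback ever ran on a different thread, `drain` would silently return an empty list, which is why an official API would be preferable.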

  2. Is the TransactionData created deterministically? In other words, if I execute the same transaction on different Neo4j databases, e.g. on different architectures (64-bit vs. 32-bit x86), are the modified entities and properties always provided in the same order in the TransactionData?

Because the UUIDs we assign to each node and relationship must be identical in every local database, we need to process the TransactionData in the same order in each local database.
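One way to reduce the dependence on iteration order, shown here only as a sketch, would be to sort the new entities by a canonical key before assigning UUIDs and to derive each UUID from the blockchain transaction plus the position in that sorted list. The names `DeterministicUuids`, `assign`, `txHash` and `canonicalKeys` are hypothetical, and the sketch assumes that every new entity can be given a unique key that is the same on all nodes, which may not hold for entities that are otherwise identical.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

public class DeterministicUuids {
    /**
     * Maps each canonical key to a name-based UUID.
     * Assumption: keys are unique and identical on every node.
     */
    public static Map<String, UUID> assign(String txHash, List<String> canonicalKeys) {
        // Sort a copy so the result does not depend on the order the keys arrived in.
        List<String> sorted = new ArrayList<>(canonicalKeys);
        Collections.sort(sorted);

        Map<String, UUID> result = new LinkedHashMap<>();
        for (int i = 0; i < sorted.size(); i++) {
            // Same transaction hash and same position always give the same UUID.
            String name = txHash + ":" + i;
            result.put(sorted.get(i),
                    UUID.nameUUIDFromBytes(name.getBytes(StandardCharsets.UTF_8)));
        }
        return result;
    }
}
```

This does not answer whether TransactionData itself is ordered deterministically; it only moves the ordering decision into our own code.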

  3. Why does the afterCommit() method in the TransactionEventHandler not provide the final list of modifications?

Currently, unmanaged extensions can still make changes in the beforeCommit() method, because the transaction is not yet finished. However, these changes are not reflected in afterCommit(). I understand that you shouldn't make such changes, as they are not visible to other TransactionEventHandlers. But once the transaction is finalized, shouldn't you receive the final changes? That would make it easier to reason about the changes; currently there is no guarantee that the retrieved TransactionData contains all of them.

  4. The ‘org.neo4j.graphdb.config’ package is deprecated, as the settings API will be completely rewritten in version 4.0. The current version, however, is 3.5. Is there another way to read custom config settings in an unmanaged extension?

In my opinion, you should deprecate classes or methods only when an improved or alternative method is available. That way, you preserve backwards compatibility while encouraging use of the new method. However, I can't find any other recommended way to retrieve settings. So now my editor and compiler warn me that I'm using a deprecated method, while the improved method is not yet available. Am I missing something?

  5. Is there any way to keep the log files when using the Neo4j testing harness?

For every test, a fresh database is set up, and if a test fails it would be handy to have the log files that were created for that test. I found a workaround on Stack Overflow that copies the files for every test, but it is not really a solution I would prefer. Perhaps there is another way to do it?

https://stackoverflow.com/questions/45033699/neo4j-logging-in-a-server-extension-while-running-from-within-a-junit-test-in-in?r=SearchResults
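The copying step of such a workaround can be written with the JDK alone, roughly as sketched below. `LogArchiver` and `copyTree` are hypothetical names, and the sketch assumes the test knows the directory in which the harness writes its logs. In JUnit 4 it could be called from the `failed` hook of a `TestWatcher` rule, so that files are kept only for failing tests.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class LogArchiver {
    /** Recursively copies everything under source into target, overwriting existing files. */
    public static void copyTree(Path source, Path target) {
        try {
            List<Path> entries;
            // Files.walk lists a directory before its contents.
            try (Stream<Path> walk = Files.walk(source)) {
                entries = walk.collect(Collectors.toList());
            }
            for (Path entry : entries) {
                Path dest = target.resolve(source.relativize(entry).toString());
                if (Files.isDirectory(entry)) {
                    Files.createDirectories(dest);
                } else {
                    Path parent = dest.getParent();
                    if (parent != null) {
                        Files.createDirectories(parent);
                    }
                    Files.copy(entry, dest, StandardCopyOption.REPLACE_EXISTING);
                }
            }
        } catch (IOException e) {
            // Wrapped so the method can be called from hooks that cannot throw checked exceptions.
            throw new UncheckedIOException(e);
        }
    }
}
```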

Thanks