Neo4j to Kafka, log-based 'change data capture' (CDC)

neo4joe
Node Clone

It's exciting to see Neo4j Streams graduate from being a plugin to becoming a de-coupled bi-directional component that works with Aura!

In the blog article above, a polling interval sets how often data is pulled from the source, Neo4j. But polling means that I may miss intermediate changes between two polls. I would like to avoid polling and instead have log-based change data capture that fires every time the monitored entity changes, and only when it changes, whether that is 3 times per second or once per week.
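To make the concern concrete, here is a toy simulation of the difference between polling and reading a change log. It is only a sketch: it does not use Neo4j or Kafka Connect, and `ToyStore`, `simulate`, and `poll_every` are hypothetical names invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyStore:
    """Hypothetical in-memory store that also keeps an append-only change log."""
    value: int = 0
    log: list = field(default_factory=list)

    def write(self, new_value: int) -> None:
        self.value = new_value
        self.log.append(new_value)  # every committed change is recorded


def simulate(writes_per_tick, poll_every):
    """Apply writes tick by tick; poll the current value every `poll_every` ticks."""
    store = ToyStore()
    polled = []
    for tick, writes in enumerate(writes_per_tick, start=1):
        for v in writes:
            store.write(v)
        if tick % poll_every == 0:
            # A poll sees only the latest state, recorded if it differs from the last poll.
            if not polled or polled[-1] != store.value:
                polled.append(store.value)
    return polled, list(store.log)


polled, logged = simulate([[1, 2, 3], [], [4], [5, 6]], poll_every=2)
print("polling saw:", polled)   # polling saw: [3, 6]
print("log saw:", logged)       # log saw: [1, 2, 3, 4, 5, 6]
```

In this sketch the poller never observes the values 1, 2, 4, or 5, while a reader of the log sees every change exactly once and does no work on the tick where nothing changes.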


My question is: does the new Kafka Connect Source support this? If not, is there an approach that does not use the Kafka Connect Source?


I am able to create something with triggers, but I fear that too many triggers will place an undue burden on the DBMS.


The meaning of 'log-based' is described well in the following text:

If you want to go “the whole hog” with integrating your database with Kafka, then log-based Change-Data-Capture (CDC) is the route to go. Done properly, CDC basically enables you to stream every single event from a database into Kafka. Broadly put, relational databases use a transaction log (also called a binlog or redo log depending on DB flavour), to which every event in the database is written. Update a row, insert a row, delete a row – it all goes to the database’s transaction log. CDC tools generally work by utilising this transaction log...

...from this Apache Kafka blog post:

1 ACCEPTED SOLUTION

A proper transaction-log-based approach would be preferred, yes, but it requires a number of product changes that are currently not prioritized.

We tried to emulate that with the database plugin, which tied into the transaction manager, but that led to its own set of problems.

So for the time being, polling it is.
