How to sync a PostgreSQL database on RDS to Neo4j on EC2?

How can I synchronise my PostgreSQL database, which runs on AWS RDS, with a Neo4j instance running on EC2? The data should be synchronised in real time: for every new entry added to the SQL database, a node should be created in my graph database.

This is the kind of data synchronisation problem addressed by products like Confluent, an event streaming platform built around Apache Kafka.

It requires some setup, but you can configure a component to watch the write-ahead log (WAL) of your Postgres database, send the changes to Kafka, and then sink those changes to Neo4j, all in real time.
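As a rough illustration of the last step (turning a change message into a node), here is a minimal Python sketch. The message shape (`op`, `table`, `row`) is an assumption made up for this example, not the format of any particular connector, and the function only builds the Cypher text and parameters rather than talking to Kafka or Neo4j.

```python
import json


def change_to_cypher(message: bytes):
    """Turn one change event into a (query, params) pair, or None.

    The event layout {"op", "table", "row"} is hypothetical; adapt it to
    whatever your change-capture component actually emits.
    """
    event = json.loads(message)
    if event["op"] != "insert":
        return None  # this sketch only handles newly inserted rows

    # The label is interpolated into the query text, so validate it first.
    label = event["table"].capitalize()
    if not label.isidentifier():
        raise ValueError(f"unsafe label: {label!r}")

    # MERGE on the primary key keeps the write idempotent if a message
    # is delivered more than once.
    query = f"MERGE (n:{label} {{id: $id}}) SET n += $props"
    params = {"id": event["row"]["id"], "props": event["row"]}
    return query, params
```

The returned pair can be handed to whatever executes Cypher for you, for example a session's `run(query, params)` in the Neo4j Python driver.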

A much simpler option would be to schedule a Cypher statement to execute (every minute, for example) that uses the JDBC connector to run a SELECT query against Postgres and turns the results into nodes and relationships. It isn't strictly real time, but your use case may be able to tolerate one-minute delays for the sake of a much easier implementation.
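The same polling idea can also be run from outside Neo4j. The sketch below shows one polling step in Python, assuming the table has a monotonically increasing `id` column to use as a watermark. `fetch_rows` and `write_node` are hypothetical callables standing in for your Postgres query and your Neo4j write.

```python
def poll_once(fetch_rows, write_node, last_id):
    """Copy rows with id > last_id and return the new watermark.

    fetch_rows(last_id) is assumed to return a list of dicts, e.g. the
    result of "SELECT ... WHERE id > %s ORDER BY id".
    write_node(row) is assumed to create or update one node.
    """
    for row in fetch_rows(last_id):
        write_node(row)
        # Advance the watermark only after the write succeeded, so a
        # failure causes the row to be retried on the next poll.
        last_id = max(last_id, row["id"])
    return last_id


# Usage with in-memory stand-ins for the two databases:
table = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}]
graph = []
watermark = poll_once(
    lambda since: [r for r in table if r["id"] > since],
    graph.append,
    0,
)
```

Calling `poll_once` from a scheduler every minute gives the one-minute delay described above. Note that a watermark on `id` only picks up new rows, not updates or deletes.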

Good luck!


Thank you Terry, will try with Kafka Streams 🙂

Agree with @terryfranklin82: if you're looking for real-time data, look into Apache Kafka as a distributed messaging system.

Otherwise, you could have the API that talks to PostgreSQL also write to Neo4j, sending each change to both data stores in a multicast style. It's not a solution that scales well in the microservice paradigm, but it is simpler to implement.
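A minimal sketch of that dual-write approach in Python follows. The two `save_*` callables and the `retry_queue` list are hypothetical placeholders for your own persistence code. The main thing the sketch tries to show is the weak spot: the two writes are not one transaction, so a failed Neo4j write has to be remembered and retried, or the stores drift apart.

```python
def create_entry(entry, save_to_postgres, save_to_neo4j, retry_queue):
    """Write an entry to both stores; return True if both succeeded."""
    # Postgres is treated as the source of truth: if this write fails,
    # the exception propagates and nothing is written to Neo4j.
    save_to_postgres(entry)
    try:
        save_to_neo4j(entry)
    except Exception:
        # Keep the entry so a background job can replay it later.
        retry_queue.append(entry)
        return False
    return True
```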

Thank you all for the responses. I have integrated Amazon Kinesis to create the data stream, along with Lambda to trigger the Cypher query that pushes the data into the graph database, and it is working fine.
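For anyone following the same route, a Lambda handler for this could look roughly like the Python sketch below. It is not the poster's actual code. It assumes each Kinesis record carries one JSON object with an `id` field, and the `Entry` label and the injected `run_cypher` callable are made up for illustration. The only part taken from AWS itself is the event layout, where each record's payload arrives base64-encoded under `Records[i].kinesis.data`.

```python
import base64
import json

# Hypothetical query: one node per row, keyed on the row's id.
CYPHER = "UNWIND $rows AS row MERGE (n:Entry {id: row.id}) SET n += row"


def rows_from_event(event):
    """Decode the base64 JSON payload of every Kinesis record."""
    return [
        json.loads(base64.b64decode(record["kinesis"]["data"]))
        for record in event["Records"]
    ]


def make_handler(run_cypher):
    """Build a Lambda handler around run_cypher(query, params).

    run_cypher is an assumption of this sketch; in practice it would wrap
    a Neo4j driver session created outside the handler for reuse.
    """
    def handler(event, context):
        rows = rows_from_event(event)
        if rows:
            # One batched write per invocation instead of one per record.
            run_cypher(CYPHER, {"rows": rows})
        return {"processed": len(rows)}

    return handler
```

Using `MERGE` rather than `CREATE` keeps the write idempotent, which matters because Lambda may retry a batch of Kinesis records after a failure.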