How to get staging data from MongoDB to Neo4j? Any Java connector available?


(Amit G) #1

I have around 100 million data sets stored as documents in MongoDB. Is there a real-time, Java-based pipeline available to load those MongoDB documents into Neo4j?


(Michael Hunger) #2

Do you have an example for what you're looking for?

There is the mongo-connector example here: https://neo4j.com/developer/mongodb/

There are also APOC procedures for connecting to MongoDB and retrieving data from it:
https://neo4j-contrib.github.io/neo4j-apoc-procedures/#_interacting_with_mongodb

You can probably also use Kettle with the Neo4j Cypher steps.


(Amit G) #3

Thank you Michael for the details.

I receive 5-6 GB feed files in different formats such as CSV, fixed-length, XML, and JSON. Previously I read the files directly in Java and built the Neo4j model myself, because the incoming CSV data is not straightforward enough to use with Neo4j's CSV import utility. I am using Neo4j 3.1.2 Enterprise in embedded mode.

With this approach, we have no control over file tracing or failed records. I want to stage the data first and then load it into the Neo4j model using server mode instead of embedded mode.

Currently we are trying to split the large files with NiFi and dump them into MongoDB for staging. The next step is a process that reads the data from MongoDB and loads it into Neo4j.

So we are looking for some direction: is there an event-based pipeline available from MongoDB to Neo4j? As soon as data is inserted into MongoDB, we need to load it into Neo4j. Alternatively, is there a way to create a poller in a Neo4j plugin that reads the data from MongoDB and loads it into Neo4j?
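
For the poller option, a minimal sketch in plain Java might look like the following. It is not an existing connector. `StagingPoller`, `StagingSource`, and the monotonically increasing `seq` field on each staged document are hypothetical names assumed for illustration, and the sink stands in for whatever code writes a batch to Neo4j. The idea is to keep a checkpoint of the last record forwarded and to advance it only after the sink has accepted a batch, so a failed batch is retried on the next poll.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Polls a staging store for records newer than a checkpoint and forwards them in batches. */
public class StagingPoller {

    /** Hypothetical abstraction over the staging store (for example, a MongoDB collection). */
    public interface StagingSource {
        /** Returns up to {@code limit} records whose "seq" is greater than {@code afterSeq}, ordered by "seq" ascending. */
        List<Map<String, Object>> fetchAfter(long afterSeq, int limit);
    }

    private final StagingSource source;
    private final Consumer<List<Map<String, Object>>> sink;
    private final int batchSize;
    private long lastSeenSeq;

    public StagingPoller(StagingSource source,
                         Consumer<List<Map<String, Object>>> sink,
                         int batchSize,
                         long startAfterSeq) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.source = source;
        this.sink = sink;
        this.batchSize = batchSize;
        this.lastSeenSeq = startAfterSeq;
    }

    /** Runs one poll cycle and returns the number of records forwarded to the sink. */
    public int pollOnce() {
        int forwarded = 0;
        while (true) {
            List<Map<String, Object>> batch = source.fetchAfter(lastSeenSeq, batchSize);
            if (batch.isEmpty()) {
                return forwarded;
            }
            // If the sink throws, the checkpoint is not advanced,
            // so the same batch is fetched again on the next poll.
            sink.accept(batch);
            Map<String, Object> last = batch.get(batch.size() - 1);
            lastSeenSeq = ((Number) last.get("seq")).longValue();
            forwarded += batch.size();
        }
    }

    /** Returns the "seq" of the last record that was forwarded successfully. */
    public long getLastSeenSeq() {
        return lastSeenSeq;
    }
}
```

In a real setup, `fetchAfter` would query the staging collection and the sink would write each batch to Neo4j in a single transaction. `pollOnce` could then be called on a schedule, and the checkpoint would need to be persisted so that a restart does not reload everything.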