Neo4j Data Pipeline

Hey everyone! I've been working on a project to find a way to upload a lot of data into Neo4j and modify the data on its way in. I didn't find an existing solution, so I decided to build my own. As it stands, the project can take in large JSON and CSV files and dynamically add nodes/relationships. The nodes and relationships are defined in a mapping file, which lets you break the data apart in a lot of different ways. You can even include some basic conditional logic!
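To give a feel for the idea, here's a minimal sketch of how a mapping might drive node creation from one record. The mapping format below is purely illustrative (my own guess, not the project's actual schema), with a simple conditional thrown in:

```python
# Hypothetical mapping: which label to create, which column is the key,
# how row columns map to node properties, and an optional condition.
mapping = {
    "nodes": [
        {
            "label": "Person",
            "key": "id",
            "properties": {"name": "full_name", "age": "age"},
            # basic conditional logic: skip rows that don't qualify
            "condition": lambda row: int(row["age"]) >= 18,
        }
    ]
}

def row_to_nodes(row, mapping):
    """Apply a mapping to one CSV/JSON record, returning node dicts."""
    nodes = []
    for spec in mapping["nodes"]:
        cond = spec.get("condition")
        if cond and not cond(row):
            continue  # conditional logic filtered this row out
        props = {prop: row[col] for prop, col in spec["properties"].items()}
        nodes.append({"label": spec["label"],
                      "key": row[spec["key"]],
                      "props": props})
    return nodes

sample = {"id": "42", "full_name": "Ada", "age": "36"}
print(row_to_nodes(sample, mapping))
```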

I wanted to see if there would be anyone interested in using such a tool. My team and I are considering open sourcing the project soon.

Looking for any feedback or questions!


Hi,
Your project could be very interesting. Do you have any metrics on import speed?

Best regards

So we currently haven't done much performance testing yet, but that will be a priority in the near future! What kind of metrics would you be interested in?

The most important metrics would be nodes written per second, and the same for relationships.

In inugami-project-analysis-maven-plugin-parent I write a lot of nodes when the plugin scans a project (800 nodes and 1,300 relationships for one basic Spring Boot application; with bigger applications it can be much more). The analysis phase is very fast; most of the time is lost writing the results into Neo4j, so a good example of massive node import would be helpful.
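For what it's worth, the usual way to speed up massive node writes is to batch rows into a single parameterized `UNWIND` statement instead of one statement per node. A minimal sketch with the official Python driver (the Cypher, labels, and connection details here are illustrative placeholders, not the plugin's actual code):

```python
def chunks(items, size):
    """Split a list of rows into fixed-size batches for UNWIND writes."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

# One parameterized statement per batch: Neo4j expands $rows server-side,
# which is far faster than issuing a MERGE per node.
CYPHER = """
UNWIND $rows AS row
MERGE (n:Component {name: row.name})
SET n += row.props
"""

def write_nodes(session, rows, batch_size=1000):
    """Write node rows to Neo4j in batches of batch_size."""
    for batch in chunks(rows, batch_size):
        session.run(CYPHER, rows=batch)

# Usage with the official driver (URI and credentials are placeholders):
# from neo4j import GraphDatabase
# driver = GraphDatabase.driver("bolt://localhost:7687",
#                               auth=("neo4j", "password"))
# with driver.session() as session:
#     write_nodes(session, [{"name": "app", "props": {"type": "spring-boot"}}])
```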

Thank you! Yeah we can definitely look into getting those metrics. I’m sure we have a lot of room for optimization but we will keep working on it!

Hello, I am working on a project where I am facing a data ingestion issue.

I'm very interested!
Does it work with Aura instances?