Hey everyone! I've been working on a project to upload a lot of data into Neo4j and modify the data on its way in. I couldn't find an existing solution, so I decided to make my own. As it stands, the project can take in large JSON and CSV data files and dynamically add nodes and relationships. The nodes and relationships are defined in a mapping file, which lets you break the data apart in a lot of different ways. You can even include some basic conditional logic!
I wanted to see if there would be anyone interested in using such a tool. My team and I are considering open sourcing the project soon.
We haven't done much performance testing yet, but that will be a priority in the near future! What kind of metrics would you be interested in?
The most important metrics would be the number of nodes written per second, and the same for relationships.
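To make those two numbers concrete, here is a minimal sketch of how they could be measured around any import run. The names `WriteStats` and `timed_import` are made up for this example and are not part of the tool discussed in this thread; the only assumption is that the import can report how many nodes and relationships it wrote.

```python
import time
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class WriteStats:
    """Counts and elapsed wall-clock time for one import run (hypothetical helper)."""
    nodes: int
    relationships: int
    seconds: float

    @property
    def nodes_per_second(self) -> float:
        # Guard against a zero duration so we never divide by zero.
        return self.nodes / self.seconds if self.seconds > 0 else 0.0

    @property
    def relationships_per_second(self) -> float:
        return self.relationships / self.seconds if self.seconds > 0 else 0.0


def timed_import(run_import: Callable[[], Tuple[int, int]]) -> WriteStats:
    """Time a callable that performs the import.

    `run_import` must return a (nodes_written, relationships_written) tuple.
    """
    start = time.perf_counter()
    nodes, relationships = run_import()
    elapsed = time.perf_counter() - start
    return WriteStats(nodes, relationships, elapsed)
```

If the import goes through the official Neo4j Python driver, the counts could probably be taken from the result summary (`result.consume().counters.nodes_created` and `relationships_created`) rather than tracked by hand.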
On inugami-project-analysis-maven-plugin-parent I write a lot of nodes when the plugin scans a project (800 nodes and 1300 relationships for one basic Spring Boot application; with a bigger application it can be much more). The analysis phase is very fast. Most of the time is lost writing the results into Neo4j, so a good example of importing a large number of nodes would be helpful.
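For what it's worth, one common way to cut write time in a case like this is to send rows in batches with Cypher's `UNWIND` instead of issuing one statement per node. The sketch below only builds the query and parameter pairs, so it runs without a database. The `Artifact` label, the `id` key, the `props` field, and the batch size of 1000 are all assumptions made for illustration, not something taken from the plugin or from the tool in this thread.

```python
from itertools import islice
from typing import Any, Dict, Iterable, Iterator, List, Tuple

# Hypothetical label and key property, chosen only for this example.
NODE_QUERY = (
    "UNWIND $rows AS row "
    "MERGE (n:Artifact {id: row.id}) "
    "SET n += row.props"
)


def chunked(rows: Iterable[Dict[str, Any]], size: int) -> Iterator[List[Dict[str, Any]]]:
    """Yield successive lists of at most `size` rows."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def batched_statements(
    rows: Iterable[Dict[str, Any]], batch_size: int = 1000
) -> Iterator[Tuple[str, Dict[str, Any]]]:
    """Yield one (query, parameters) pair per batch of rows."""
    for batch in chunked(rows, batch_size):
        yield NODE_QUERY, {"rows": batch}
```

With the official Neo4j Python driver, each pair could then be passed to `session.run(query, params)`. A uniqueness constraint or index on the key property generally helps `MERGE` stay fast as the graph grows, and the best batch size is something to measure rather than assume.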