Hi All,
This is a great conversation. I'm also at a point where I need to begin scaling my graph database in order to efficiently update millions of nodes and relationships. I wrote a couple of Python scripts that pull data from two different sources, clean it up, and transform it into a format compatible with neo4j-admin import. The management team at my company is now coming up with ideas for adding more data to our graph, so the number of nodes and relationships will continue to grow.
Kettle, Apache Airflow, and NiFi are some of the tools suggested in this thread. If I were to spend this weekend digging into just one of them, which would you all recommend?
I'm leaning towards Apache Airflow based on what I hear from the industry and how popular it has become. I had never heard of Kettle or NiFi until this month.
What do you all think is the best approach?
Thanks,
Tony