I'm new to Neo4j, and have been using Spark for ETL.
I have been experimenting with Neo4j Community Edition for the past two weeks. However, once the graph reaches about 1 billion nodes with multiple relationships each, queries become very slow, especially in streaming applications. Our team is therefore considering using Spark's GraphX for ETL and pushing only the resulting sub-graph to Neo4j. Basically, we want to use GraphX to handle node creation, Cypher-style queries, graph algorithms, and ML, and then push just the final results/sub-graphs to Neo4j for visualization.
From my online research, there used to be a project called "Mazerunner" that aimed to do what I need. However, development seems to have stopped in 2015, and its usage seems to have been limited. Neo4j also used to list Mazerunner on its website as one of the recommended APIs for connecting Neo4j and Spark, but that listing has since been removed. It seems that Mazerunner is no longer supported. My questions are:
Is the 2015 version of Mazerunner still usable with current Neo4j releases? What can I do with Mazerunner now?
Are there any newer projects similar to Mazerunner that can help me achieve my goals?
Is it recommended to use GraphX for ETL and push the results to Neo4j? Is it efficient and effective?
Does GraphX offer the same processing capabilities as Neo4j, such as node creation, Cypher queries, graph algorithms, and ML?
We are considering TigerGraph, too. Would that be a good option?