Hey Guys,
I have two requirements right now where I could really use some guidance (I am new to Neo4j as well):
So, for one of my projects, I need to build ETL pipelines within GCP for:
1. Transferring historical data from BigQuery to Neo4j
2. Transferring incremental data from BigQuery to Neo4j
I have already built my pipelines in BigQuery for the incremental load.
How should I proceed?
I have come across a couple of suggested approaches:
1. Airflow. Will it work for a huge amount of data, and will it scale? I also suspect that loading into Neo4j will be slow if the data volume is large. (The client doesn't have an existing Cloud Composer instance, so convincing them to set one up will be painful; if there is no other option, I will push for it.) A rough sketch of what I have in mind is shown after this list.
2. Cloud Dataproc, where Apache Spark jobs can be written with the Neo4j Spark Connector (see the second sketch after this list).
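
For reference, here is roughly what I was picturing for the Airflow option: a PythonOperator that queries BigQuery and pushes batched UNWIND/MERGE statements through the Neo4j Python driver. The table, label, connection details, and batch size below are all made up for illustration, so please treat this as a sketch rather than a working pipeline.

```python
# Sketch of the Airflow approach: BigQuery -> batched MERGE into Neo4j.
# All names, credentials, and the query are hypothetical placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator
from google.cloud import bigquery
from neo4j import GraphDatabase

NEO4J_URI = "neo4j+s://<host>:7687"   # placeholder
NEO4J_AUTH = ("neo4j", "<password>")  # placeholder
BQ_SQL = "SELECT id, name FROM `my_project.my_dataset.customers_incremental`"  # hypothetical table
BATCH_SIZE = 10_000

MERGE_QUERY = (
    "UNWIND $rows AS row "
    "MERGE (c:Customer {id: row.id}) "
    "SET c.name = row.name"
)

def bq_to_neo4j():
    bq = bigquery.Client()
    driver = GraphDatabase.driver(NEO4J_URI, auth=NEO4J_AUTH)
    batch = []
    with driver.session() as session:
        for row in bq.query(BQ_SQL).result():
            batch.append(dict(row))
            if len(batch) >= BATCH_SIZE:
                # Batched UNWIND + MERGE keeps each transaction small
                session.run(MERGE_QUERY, rows=batch)
                batch = []
        if batch:
            session.run(MERGE_QUERY, rows=batch)
    driver.close()

with DAG(
    dag_id="bq_to_neo4j_incremental",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    PythonOperator(task_id="load_bq_to_neo4j", python_callable=bq_to_neo4j)
```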
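And here is my rough understanding of the Dataproc option: read from BigQuery with the spark-bigquery connector and write to Neo4j with the Neo4j Spark Connector, merging on a node key so incremental runs stay idempotent. Again, the table, label, and credentials are placeholders, and the connector jar would need to be supplied to the cluster.

```python
# Sketch of the Dataproc/Spark approach with the Neo4j Spark Connector.
# Table name, label, node key, and credentials are hypothetical.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("bq-to-neo4j")
    # The Neo4j connector jar would be added via --packages or cluster config.
    .getOrCreate()
)

# Read the incremental table from BigQuery (spark-bigquery connector)
df = (
    spark.read.format("bigquery")
    .option("table", "my_project.my_dataset.customers_incremental")  # hypothetical
    .load()
)

# Write rows as :Customer nodes, merging on `id` ("Overwrite" performs a MERGE on node.keys)
(
    df.write.format("org.neo4j.spark.DataSource")
    .mode("Overwrite")
    .option("url", "neo4j+s://<host>:7687")            # placeholder
    .option("authentication.basic.username", "neo4j")  # placeholder
    .option("authentication.basic.password", "<password>")
    .option("labels", ":Customer")
    .option("node.keys", "id")
    .save()
)
```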
Any suggestions on which approach I should follow? If there is another approach besides these, feel free to let me know; it would help me a ton!
Regards,
Pinakin