Loading data into a Neo4j instance on AWS EC2

I'm trying to set up an EC2 server with Neo4j and a GraphQL API that our development team can access. I could hack together a solution, but I'd like to know what others recommend in case I'm missing anything.

We have data stored in Google Sheets which I import directly by running a Cypher query on my local machine. However, I need to automate the initial loading, or seeding, of the database. How is this usually done?

The two options I see are 1) run the loading script through cypher-shell once the EC2 instance and Neo4j are running, or 2) download a prebuilt graph.db into data/databases through the EC2 User Data script. I'm using a public AMI running Neo4j. Am I going about this correctly?
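For reference, a rough sketch of what option 1 could look like if the seeding step is a small Python script called from User Data. It waits until the Bolt port accepts connections and then pipes a Cypher file into cypher-shell on stdin. The script path, credentials, and the default port 7687 are placeholders I made up for illustration, and an open port is only an approximate signal that Neo4j is ready to accept queries.

```python
import socket
import subprocess
import time


def wait_for_port(host: str, port: int, timeout: float = 120.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=2.0):
                return True
        except OSError:
            # Neo4j is probably still starting; retry shortly.
            time.sleep(1.0)
    return False


def cypher_shell_command(user: str, password: str,
                         address: str = "bolt://localhost:7687") -> list:
    """Build the cypher-shell argument list; the script itself is fed on stdin."""
    return ["cypher-shell", "-a", address, "-u", user, "-p", password]


def seed(script_path: str, user: str, password: str) -> None:
    """Wait for Bolt, then run the Cypher script (hypothetical path) once."""
    if not wait_for_port("localhost", 7687):
        raise RuntimeError("Neo4j Bolt port did not open in time")
    with open(script_path, "rb") as script:
        subprocess.run(cypher_shell_command(user, password),
                       stdin=script, check=True)
```

Calling something like seed("/opt/seed/seed.cypher", "neo4j", "change-me") at the end of the User Data script would then run the import once per instance launch.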

You could also use the "export to csv" link of your Google Sheet and pass that URL to LOAD CSV WITH HEADERS FROM <google-csv-url>.
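A minimal sketch of how that URL and statement could be put together, assuming the sheet is shared so that it can be fetched without logging in. The sheet ID, the `Person` label, and the `id` column are hypothetical; substitute whatever your sheet actually contains.

```python
def sheet_csv_url(sheet_id: str, gid: int = 0) -> str:
    """Build the CSV export URL for one tab (gid) of a Google Sheet."""
    return (
        f"https://docs.google.com/spreadsheets/d/{sheet_id}"
        f"/export?format=csv&gid={gid}"
    )


def load_csv_statement(url: str, label: str, key: str) -> str:
    """Return a Cypher statement that merges one node per CSV row.

    label and key are illustrative: key must be a column header in the sheet.
    """
    return (
        f"LOAD CSV WITH HEADERS FROM '{url}' AS row\n"
        f"MERGE (n:{label} {{{key}: row.{key}}})\n"
        "SET n += row;"
    )


if __name__ == "__main__":
    # Placeholder sheet ID; print the statement so it can be saved to a file.
    print(load_csv_statement(sheet_csv_url("YOUR_SHEET_ID"), "Person", "id"))
```

The printed statement can be saved to a file and fed to cypher-shell, so the same query you run locally becomes the seed script.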

Yes, that is how it's loaded now: a Cypher query with the URL link. I might not have been clear, though. How can I automate this when starting an EC2 instance with Neo4j?

You could use APOC Cypher initializers, see https://neo4j.com/docs/labs/apoc/current/operational/init-script/
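To make that a bit more concrete, here is a small sketch that renders the config lines for a list of seed statements. It assumes the numbered `apoc.initializer.cypher.<n>` key form, which is how I remember that page describing it for the older APOC releases. Newer releases use a per-database key instead, so check the page for your version before relying on this. The example statement is made up.

```python
def apoc_initializer_lines(statements):
    """Render one numbered initializer line per Cypher statement.

    Assumption: the key form is apoc.initializer.cypher.<n>; verify it
    against the APOC documentation for the version you run.
    """
    return [
        f"apoc.initializer.cypher.{i}={stmt}"
        for i, stmt in enumerate(statements, start=1)
    ]


if __name__ == "__main__":
    # Hypothetical seed statement; replace with your own LOAD CSV query.
    for line in apoc_initializer_lines(["CREATE INDEX ON :Person(id)"]):
        print(line)
```

The resulting lines would go into the Neo4j or APOC config file before the server starts, for example appended from the User Data script.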

Hi, you could use several things, such as shell scripts, but I think Ansible could be your best fit. Here is a reference that you could use to deploy Neo4j