I want to write Python code to manage huge volumes of bibliographic data. Specifically, I want to write and efficiently execute Python code that:

1. checks dataset quality on huge volumes of bibliographic data arriving in streaming JSON format,
2. produces CSV files from that JSON,
3. quickly builds a Neo4j database from the CSV files, accessible over the internet, and
4. uses a GPU in my execution runtimes to speed things up.

The whole workload must run in a fully automated way. What infrastructure can implement this kind of workload efficiently? To make the question concrete, below are minimal sketches of the steps I have in mind.
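For the streaming JSON quality check and CSV export (steps 1–2), this is the kind of pass I mean: a minimal sketch assuming newline-delimited JSON, where the file names and the `doi`/`title`/`year` fields are placeholders for my real schema.

```python
import csv
import json

# Placeholder schema: the fields my quality check requires.
REQUIRED_FIELDS = ("doi", "title", "year")

def records(path):
    """Stream one parsed record per line instead of loading the whole file."""
    with open(path, encoding="utf-8") as f:
        for line_no, line in enumerate(f, 1):
            try:
                yield json.loads(line)
            except json.JSONDecodeError:
                print(f"line {line_no}: malformed JSON, skipped")

def is_valid(record):
    """Basic quality check: every required field is present and non-empty."""
    return all(record.get(field) for field in REQUIRED_FIELDS)

with open("records.csv", "w", newline="", encoding="utf-8") as out:
    writer = csv.DictWriter(out, fieldnames=REQUIRED_FIELDS, extrasaction="ignore")
    writer.writeheader()
    for record in records("records.jsonl"):
        if is_valid(record):
            writer.writerow(record)
```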
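For step 3, a sketch of loading the CSV with the official `neo4j` Python driver and `LOAD CSV`; the URI, credentials, and column names are placeholders, and the CSV has to sit in the server's import directory.

```python
from neo4j import GraphDatabase

URI = "neo4j://localhost:7687"  # placeholder connection details
AUTH = ("neo4j", "password")

# Upsert one Paper node per CSV row, keyed on DOI.
LOAD_PAPERS = """
LOAD CSV WITH HEADERS FROM 'file:///records.csv' AS row
MERGE (p:Paper {doi: row.doi})
SET p.title = row.title, p.year = toInteger(row.year)
"""

with GraphDatabase.driver(URI, auth=AUTH) as driver:
    driver.execute_query(LOAD_PAPERS, database_="neo4j")
```

For the initial bulk load I could instead automate the `neo4j-admin database import` tool, which builds a database offline from CSV files much faster than transactional loading; advice on which fits an automated pipeline better is welcome.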
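For the GPU part (step 4), I was thinking of something like RAPIDS cuDF, which can parse newline-delimited JSON and write CSV on an NVIDIA GPU; again just a sketch, with the same placeholder file and field names.

```python
import cudf  # RAPIDS cuDF; requires an NVIDIA GPU

# Parse newline-delimited JSON on the GPU, apply the same quality check
# (drop rows missing any required field), and write the CSV.
gdf = cudf.read_json("records.jsonl", lines=True)
gdf = gdf.dropna(subset=["doi", "title", "year"])
gdf.to_csv("records.csv", index=False)
```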
Thanks in advance for your time!