Creating masses of new nodes takes a very long time
Hi There,
I am a little desperate with the following problem:
I have written a Python routine that executes 4.8 million MERGE statements to create nodes in a database. The script uses the neo4j Python module to run the MERGE clauses.
The statements look roughly like this:
</>MERGE (f:flights { date: date({year: 2015, month: 1, day: 1}), day_of_week: "Thursday", airline: "EV", flight_number: "5103", tail_number: "846AS", origin: "ABE", destination: "DTW", scheduled_departure: time({hour: 6, minute: 0}), real_departure: time({hour: 5, minute: 52}), taxi_out: toInteger(12), air_duration: toInteger(84), taxi_in: toInteger(5), scheduled_arrival: time({hour: 7, minute: 53})}) RETURN f </>
I did the following:
- When I started the Python script, I realized that at an execution time of 100 ms per statement it would take about five and a half days to create all the nodes (4.8 million x 0.1 s = 480,000 s, roughly 133 hours).
- Then I tried exporting the MERGE statements to a text file and running that file from cypher-shell. No performance improvement.
- Then I found the apoc.periodic.commit procedure, but could not get it to work (the only use cases I could find were combined with LOAD CSV). I tried something like this:
</>
call apoc.periodic.commit("
CREATE (:flights { date: date({year: 2015, month: 1, day: 1}), day_of_week: 'Thursday', airline: 'EV', time({hour: 7, minute: 53})})
CREATE (:flights { date: date({year: 2015, month: 4, day: 14}), day_of_week: 'Tuesday', airline: 'WN', time({hour: 19, minute: 28})})
", limit:1000)
</>
which produced the following error:
Invalid input '1000': expected "%", "(", "YIELD" or an identifier (line 4, column 10 (offset: 845))
"", limit:1000) YIELD *"
That is as far as I got. Does anybody have an idea how to improve the performance (or how to fix the error, if the commit procedure would be a solution)?
Thanks in advance for any response,
Alex
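
For context, one direction often suggested for this kind of bulk load is to send many rows per statement with UNWIND and a list parameter, instead of one MERGE statement per node. The following is only a minimal sketch of that idea, not a tested solution for this dataset. The names BATCH_QUERY, batches and load_in_batches are hypothetical, the choice of properties used as the MERGE key is an assumption, and dates are assumed to be passed as ISO strings such as "2015-01-01". The run argument is any callable that accepts a query string plus keyword parameters.

```python
from itertools import islice

# Hypothetical query: a single statement handles a whole batch of rows.
# The properties used as the MERGE key are an assumption for illustration.
BATCH_QUERY = """
UNWIND $rows AS row
MERGE (f:flights {date: date(row.date),
                  airline: row.airline,
                  flight_number: row.flight_number,
                  origin: row.origin})
SET f.destination = row.destination
"""


def batches(rows, size):
    """Yield consecutive lists of at most `size` items from `rows`."""
    it = iter(rows)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def load_in_batches(run, rows, batch_size=1000):
    """Call `run` once per batch and return the number of rows sent."""
    total = 0
    for batch in batches(rows, batch_size):
        # One round trip per batch instead of one per node.
        run(BATCH_QUERY, rows=batch)
        total += len(batch)
    return total
```

With the neo4j Python module, run could be the run method of an open session, so that each call sends one batch as the $rows parameter. Whether this is fast enough will also depend on whether an index or constraint exists on the properties used in the MERGE pattern, which is not shown here.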