Loading data from pandas dataframe into Neo4j using Py2Neo or official neo4j driver


Can I use the official neo4j driver to load data from a pandas dataframe into Neo4j on a daily basis? If not, then can I use the py2neo connector to also efficiently execute cypher queries that create nodes and relationships, and/or delete nodes? According to the py2neo docs, it seems like the py2neo driver is the way to go for me when deciding between these two drivers.

I'm about to start loading data from a pandas dataframe into our neo4j database and py2neo seems to be the way to go based on these stackoverflow questions:

I was just curious to know the experiences of neo4j users who have implemented this python driver approach.


I can't edit my post above any more. I need to add another question here:

Which of these two python drivers is the better and faster approach to load data into Neo4j?

  1. Loading data using the official Python driver? I would think you have to pass a string in Python that tells the official neo4j driver something like:
query = """
LOAD CSV WITH HEADERS FROM 'file:///data.csv' AS row
MERGE (p:Person {id: toInteger(row.id)})
"""
  2. Using the py2neo driver and loading the data from the pandas dataframe?
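For what it's worth, the official driver does not need a CSV file at all: a common pattern is to turn the dataframe into a list of dicts and send it in batches as a query parameter that Cypher unpacks with UNWIND. Below is a minimal sketch of that idea, not a definitive implementation. The `name` column, the batch size, and the helper names `batches` and `load_dataframe` are assumptions made up for illustration; only the `Person` label and `id` property come from the query above.

```python
# Cypher that receives a whole batch of rows as a single parameter.
# The `name` property is a hypothetical column, used only for illustration.
MERGE_PEOPLE = """
UNWIND $rows AS row
MERGE (p:Person {id: row.id})
SET p.name = row.name
"""


def batches(rows, size):
    """Yield consecutive slices of `rows`, each with at most `size` items."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def load_dataframe(df, session, batch_size=1000):
    """Send the dataframe rows to Neo4j in batches; return rows sent.

    `df` is expected to be a pandas DataFrame (anything with
    `to_dict("records")` works) and `session` a neo4j driver session
    (anything with a `run(query, **params)` method works).
    """
    rows = df.to_dict("records")  # one dict per dataframe row
    total = 0
    for batch in batches(rows, batch_size):
        session.run(MERGE_PEOPLE, rows=batch)
        total += len(batch)
    return total


# Usage with the official driver (URI and credentials are placeholders):
#
#   from neo4j import GraphDatabase
#   driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))
#   with driver.session() as session:
#       load_dataframe(df, session)
#   driver.close()
```

Batching keeps each transaction to a bounded size instead of sending one query per row or one huge query for the whole dataframe. The best batch size depends on the data, so treat 1000 as a starting guess rather than a recommendation.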

I found this Medium article very interesting; it compares the different Python drivers for Neo4j.

That said, my experience with Neo4j's official Python driver has been good: it's easy to use, and I haven't had problems with low data transfer speeds.

Thank you! Great article, and this sums it up nicely:

My recommendation? Definitely py2neo is not an option. Although it is user-friendly in many respects, it is too slow for counting queries. Neo4jrestclient is not bad, but sometimes it returns nested list structure which we have to deal with using some trick (e.g. “sum(temp,)”), which I want to avoid. So I think I would go with the Neo4j Python driver. After all it is the only official release supported by Neo4j. What is your recommendation?

I'll follow up on this post with which driver I end up using.

Wouldn't it be cool if the official neo4j driver also supported pandas dataframes as a source of data? :thinking:

Pandas is specific to Python, and processing a pandas dataframe in other frameworks may not be as efficient as using the library itself in Python.
So, my answer to this: not really, because the trend for data types used in web services and apps is toward formats like JSON (a standard that works everywhere). In other cases, such as non-developer use, spreadsheets are very common, and that is where CSV comes in: compact and easy to generate.