Database Management in Python

Hello,

I am building a Neo4j graph which will eventually be the central data-store for various reports. The operating model for updating the graph + creating these reports will be (at least initially) something like:

Table data (Cloud/SQL) -> Python -> Cypher CREATE/MERGE -> Neo4j -> Cypher MATCH/RETURN -> Python

My team are Python users; we do not have any experience of, or appetite for, e.g., Java. In addition, we are mainly data scientists and do not have a lot of experience with DBMS.

My question is about best practice for developing RWI operations in Python for a graph that will grow in size, have an evolving schema, and may need periodic data backfills. To be more specific, I am thinking the following may be useful:

  • CREATE/MERGE operation for all nodes/relationships
  • CREATE/MERGE operation for individual node types (in the event of only needing to update one node type)
  • Node properties of insert/update time
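
To make the three points above concrete, here is a minimal sketch of what I have in mind. The registry NODE_KEYS, the labels, the key properties, and the property names inserted_at/updated_at are all hypothetical placeholders, not an existing schema. It only builds the Cypher text; each statement expects a $rows parameter holding a list of dicts.

```python
import re

# Hypothetical registry: node label -> property that uniquely identifies a node.
NODE_KEYS = {"Person": "person_id", "Company": "company_id"}

# Labels and property keys cannot be passed as Cypher parameters, so they are
# validated before being interpolated into the query text.
_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_merge_query(label, key):
    """Return a Cypher statement that upserts a batch of nodes of one type."""
    for name in (label, key):
        if not _NAME.fullmatch(name):
            raise ValueError(f"unsafe Cypher identifier: {name!r}")
    return (
        "UNWIND $rows AS row\n"
        f"MERGE (n:{label} {{{key}: row.{key}}})\n"
        # inserted_at is set once, updated_at on every later match.
        "ON CREATE SET n += row, n.inserted_at = datetime()\n"
        "ON MATCH SET n += row, n.updated_at = datetime()"
    )


def merge_queries(labels=None):
    """Build queries for every registered node type, or only the given labels."""
    if labels is None:
        selected = NODE_KEYS
    else:
        selected = {label: NODE_KEYS[label] for label in labels}
    return {label: build_merge_query(label, key) for label, key in selected.items()}
```

Calling merge_queries() would cover the "all nodes" case and merge_queries(["Person"]) the "one node type" case. I assume a uniqueness constraint on each key property would also be wanted so that MERGE stays fast and does not create duplicates.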

I am wondering whether there are guidelines for doing this cleanly, with appropriate code reuse and testing, and whether there are further DBMS "must haves" that I may be missing.

Any thoughts/comments appreciated.

You could use the Python driver to interact with your Neo4j database. Create methods to create, read, update, and delete your entities, then reuse them to update Neo4j as needed. You can then write queries that you execute through the driver to get data for your analysis.

https://neo4j.com/developer/python/
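
As a rough sketch of that idea (not a definitive implementation), the class below wraps the driver behind a couple of reusable methods. GraphStore, connect, the Person label, and its id/name properties are made up for illustration. It assumes version 5 of the neo4j package, where sessions expose execute_write and execute_read; older 4.x releases call these write_transaction and read_transaction.

```python
from typing import Any, Dict, List


class GraphStore:
    """Thin wrapper so loaders and reports share one set of graph operations."""

    def __init__(self, driver: Any) -> None:
        # Any object exposing the driver's session() API works here, which
        # lets tests pass in a stand-in instead of a live database.
        self._driver = driver

    def upsert_people(self, rows: List[Dict[str, Any]]) -> None:
        """Create or update Person nodes from a list of property dicts."""
        query = (
            "UNWIND $rows AS row "
            "MERGE (p:Person {id: row.id}) "
            "SET p += row"
        )
        with self._driver.session() as session:
            session.execute_write(lambda tx: tx.run(query, rows=rows).consume())

    def people_by_name(self, name: str) -> List[Dict[str, Any]]:
        """Return matching people as plain dicts, ready for a DataFrame."""
        query = "MATCH (p:Person {name: $name}) RETURN p.id AS id, p.name AS name"
        with self._driver.session() as session:
            return session.execute_read(lambda tx: tx.run(query, name=name).data())


def connect(uri: str, user: str, password: str) -> GraphStore:
    """Build a GraphStore backed by a real database (needs `pip install neo4j`)."""
    from neo4j import GraphDatabase  # imported here so tests need no driver

    return GraphStore(GraphDatabase.driver(uri, auth=(user, password)))
```

Because the driver is injected, the same methods can be unit tested against a small fake driver and then pointed at a real instance with something like connect("bolt://localhost:7687", "neo4j", "password"), where the URI and credentials are placeholders.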