Neo4j vs PostgreSQL for the new project

I'm starting a new project and have the core logic (some kind of business analytics) implemented with Neo4j and Spring Boot/SDN5.

Right now I'm struggling with the following architectural question: is it a good idea to build the entire application on Neo4j only, or should I, for example, leave the analytical part on Neo4j and additionally implement the operational part of the application with an RDBMS like PostgreSQL? I hope that Neo4j is mature enough to hold all kinds of application data… I mean, the data that I typically store in PostgreSQL could also be stored in Neo4j. Please advise: should I go with Neo4j only or not?

Hi @myshareit

These two databases are quite different in terms of data types and functions.

Personally, if Neo4j poses no problems for your use case, I would choose Neo4j alone.

This is because exchanging data between two databases is possible, but not simple.

Where I might split up the data is if a portion of it fits the traditional RDBMS model and there's a TON of it.

E.g., suppose there is a huge amount of real-time log or transaction data that's very table-oriented. You can run a PostgreSQL process that packages up a summary of that data, which can then be periodically exported into Neo4j. A mature RDBMS also has ways of being efficient with storage allocation, whereas Neo4j's schema flexibility makes it less efficient. It's not clear whether that would make a significant difference in your use case.

Otherwise, node properties in Neo4j can mimic tables pretty well (although Neo4j doesn't come with the same level of safety features as an RDBMS to prevent "dumb" mistakes). Neo4j Enterprise does come with some constraints, but they're not quite as good as what an RDBMS offers.
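
To make the "summarize in PostgreSQL, export to Neo4j" idea concrete, here is a minimal Python sketch of the roll-up step. Everything in it is an assumption for illustration: the sample rows stand in for a PostgreSQL result set, and the `Customer` label and the `txCount`/`txTotal` properties are hypothetical. In a real setup the aggregation would more likely be a `GROUP BY` in SQL, and the resulting rows would be handed to a Neo4j driver as the `$rows` parameter.

```python
from collections import defaultdict

# Hypothetical raw rows standing in for a PostgreSQL result set:
# (customer_id, amount)
RAW_TRANSACTIONS = [
    ("c1", 10.0),
    ("c2", 5.0),
    ("c1", 2.5),
    ("c2", 7.5),
    ("c3", 1.0),
]

# Parameterized Cypher for the periodic export. The Customer label and
# the property names are assumptions, not part of the original thread.
UPSERT_SUMMARY = (
    "UNWIND $rows AS row "
    "MERGE (c:Customer {id: row.customer_id}) "
    "SET c.txCount = row.tx_count, c.txTotal = row.tx_total"
)


def summarize(transactions):
    """Collapse per-transaction rows into one summary row per customer."""
    totals = defaultdict(lambda: {"tx_count": 0, "tx_total": 0.0})
    for customer_id, amount in transactions:
        totals[customer_id]["tx_count"] += 1
        totals[customer_id]["tx_total"] += amount
    # Sort by customer id so the export payload is deterministic.
    return [{"customer_id": cid, **agg} for cid, agg in sorted(totals.items())]


# Only these few summary rows, not every raw transaction, would go to Neo4j.
summary_rows = summarize(RAW_TRANSACTIONS)
```

The point of the split is that the graph only ever sees one small row per customer, while the bulky table-shaped data stays in the RDBMS.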

It would help if you specified the amount of data you are talking about... e.g. how many records, how many bytes per record, how fast does the data grow, etc.

The other thing is how ad hoc the queries are going to be. Cypher is really great when somebody starts wanting to make queries that nobody anticipated. In an RDBMS, going off the beaten path can become a nightmare.

Another consideration is whether, in the course of building out your schema, you discover something about the nature of the data that you hadn't anticipated. Most typically, it's all too easy to make simplifying assumptions to keep an RDBMS schema easy, only to discover that a misunderstanding about the data has produced an unanticipated many-to-many relationship. That means a schema migration plus rewritten queries that have to use ugly JOINs.

I would say a lot depends on your application, its size, and its purpose. Is your application going to be so large that you're going to want to consider a microservice architecture? Also, what are the queries you're going to be asking of the data? For a query like "what is the average height of all the actors in Hollywood?", a graph database isn't the right choice. If your query is "what is the shortest path between Tom Hanks and Kevin Bacon?", that would be a good query for a graph. Use the right technology for the right situation.
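
The difference between those two query shapes can be sketched in a few lines of Python. This is only an illustration under made-up data: the names, heights, and "acted together" links are invented, and the breadth-first search stands in for what a graph database does natively. The average needs nothing but a flat scan over one column, while the shortest path has to follow relationships hop by hop.

```python
from collections import deque
from statistics import mean

# Made-up toy data: heights in cm and "acted together" links.
HEIGHTS_CM = {"alice": 170, "bob": 180, "carol": 160, "dave": 190}
ACTED_WITH = {
    "alice": ["bob"],
    "bob": ["alice", "carol"],
    "carol": ["bob", "dave"],
    "dave": ["carol"],
}


def average_height(heights):
    """Table-style aggregate: a single pass over one column."""
    return mean(heights.values())


def shortest_path(graph, start, goal):
    """Graph-style traversal: breadth-first search over relationships."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None  # no connection between start and goal
```

In SQL the first function is a one-line `AVG`, whereas the second would need recursive self-joins, which is the kind of query a graph model is meant for.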

All of the answers provided above make a lot of sense to me! Thank you very much! I'll go with a polyglot database setup and use both Neo4j and PostgreSQL.

It also depends on where this project is going...

If it's intended for serious production work, I'd start small to test out your ideas. It would be a shame to build out a full-scale system only to discover some glitch that invalidates your initial approach, leaving your early efforts completely wasted.

If it's not in the critical path, then trying out the whole thing in Neo4j might not be a bad approach, as it would give you a chance to learn more about Cypher and the Neo4j system, including their possible drawbacks.

There are definitely some interesting subtleties in Cypher!

Hi @myshareit, I would agree with @mike_r_black. You need to understand the nature of the database to be used for the application. The application's use cases drive which database to choose, not the other way around.

Why do you want to go polyglot, with Neo4j + PostgreSQL? I would always start with an RDBMS, understand the query patterns, and decide based on that which one to choose. How many queries need JOINs, and how many need to be graph-related queries?

Based on your initial post, have you finished creating the conceptual design and gotten approval for all the entities?

Being a Neo4j advisor, do remember that RDBMSs like Oracle, SQL Server, and PostgreSQL still exist in enterprise organizations and are always at the top of the list in any database selection. You have to prove your use case in order to choose "a" NoSQL database.