Neo4j Performance Doubts?

I have built an entire application on your platform, only to find the following info on someone's site:

Why Neo4J Sucks

  • issues in cluster mode and scaling
  • not built for distributed data (or even big data)
  • issues with bulk loading, indexing (e.g. range, sort, etc), slow upsert
  • bottleneck with large volume of writes due to slave/master topology
  • extremely expensive to replicate entire graph across each node
  • confusing license terms for production use
  • heap allocation and GC cause out-of-memory errors (buggy configs)
  • practically only useful for small dataset reads that require visualization
  • not appropriate for high read/write/index searches on KG
  • re-indexing can be very slow, require elasticsearch/solr as added dependency step

I have started experiencing some of these, especially the out-of-memory exceptions.

What is your defense? Have I wasted my time on software that's not scalable, or am I doing something wrong?

Let me try to explain what I have seen from my experience.

  • issues in cluster mode and scaling

This is such a generic statement that there is nothing to say here. If there is a specific issue about clustering, it can be answered. It's like saying, "I have something in my mind and your system does not do it that way." I have worked with environments that involve large clusters with large amounts of data. As with any complex environment, configurations like that require careful tuning and monitoring.

  • not built for distributed data (or even big data)

If you are looking for sharded data similar to Elasticsearch, then you are better off with any document database. Since each document is independent and not related to others anyway, distributing the data with eventual consistency is good enough for that aspect. But what you gain there in data distribution and write performance you will lose when you want to traverse connected data.

Let me give you a simple example. A customer was using Elasticsearch to store their manufacturing hierarchy. Most of the reads from that system were good and working well. Then came a scenario where they needed to pick a part and find the hierarchy 3 levels up. The data was not a lot, a few hundred thousand records. It was taking a lot of time and not meeting their SLAs. They brought in Elastic engineers to see how this could be solved, and they couldn't do it. The problem here was using the wrong tool for the job. When they loaded the data into Neo4j it was way faster because of the way the data is stored. It is about using the right tool to solve a given problem.
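To make that kind of traversal concrete, here is a minimal sketch using the official Python driver. The `Part` label, `CHILD_OF` relationship, and `partNumber` property are hypothetical stand-ins, not the customer's actual model; a variable-length pattern (`*1..3`) walks up to 3 levels of the hierarchy in a single query.

```python
from neo4j import GraphDatabase  # official Neo4j Python driver

# Hypothetical model: (:Part)-[:CHILD_OF]->(:Part); names are illustrative.
driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

PARENT_HIERARCHY = """
MATCH (p:Part {partNumber: $part_number})-[:CHILD_OF*1..3]->(ancestor:Part)
RETURN ancestor.partNumber AS partNumber, ancestor.name AS name
"""

with driver.session() as session:
    # Walk up to three levels from the chosen part in one traversal.
    for record in session.run(PARENT_HIERARCHY, part_number="A-1042"):
        print(record["partNumber"], record["name"])

driver.close()
```

Because the relationships are stored as pre-joined pointers, the traversal only touches the parts reachable from the starting node instead of re-querying the whole collection at each level.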

  • issues with bulk loading, indexing (e.g. range, sort, etc), slow upsert

Again, I think you are comparing document databases with a graph database here. A graph database optimizes traversals. It cannot be as fast as a document database for writes, but it can be lightning fast for traversals, which document databases cannot match in any form or fashion. So look at your use cases and see what works best for you. Also, the data in Neo4j is pre-joined, so due to ACID compliance the locking can be a bit heavier than in the RDBMS world. Models can be adjusted for read/write optimization.
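As an illustration of the upsert side, here is a minimal sketch with the Python driver: create an index on the merge key, then use MERGE so each write updates an existing node or creates it if missing. The label, property names, and index syntax (shown in the Neo4j 4.x form) are assumptions for the example; very large initial loads are usually better served by the offline neo4j-admin import tool.

```python
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

# Neo4j 4.x index syntax; label and property names are illustrative.
CREATE_INDEX = "CREATE INDEX part_number IF NOT EXISTS FOR (p:Part) ON (p.partNumber)"

# MERGE acts as an upsert: it matches on the indexed key and only creates
# the node when it does not exist yet.
UPSERT_PART = """
MERGE (p:Part {partNumber: $partNumber})
SET p.name = $name, p.updatedAt = timestamp()
"""

with driver.session() as session:
    session.run(CREATE_INDEX)
    session.run(UPSERT_PART, partNumber="A-1042", name="Bracket")

driver.close()
```

Without the index on the merge key, every MERGE has to scan the label, which is usually what makes upserts feel slow.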

  • bottleneck with large volume of writes due to slave/master topology

Clustering provides availability and redundancy. There is a cost involved in synchronizing transactions. If the servers are close from a networking perspective, the cost is very small. If you have distributed the servers geographically and the networking between them is really bad, then it can have a negative effect. I have worked with customers who are able to process 5,000 messages per second (creating 4 nodes and 6 relationships for each message) in a 3-node cluster.

  • extremely expensive to replicate entire graph across each node

Why would you want to replicate to a clean node? Take the latest backup, start the new node from that backup, and let it catch up on the latest transactions. Even if you are looking to rebuild the whole cluster, this is way faster. I have worked with a customer who had a database of around 1.5 TB. With that database we got a 3-node cluster up and running in 30 minutes.

  • confusing license terms for production use

I guess you are referring to the community version. It is what it is.

  • heap allocation and GC cause out-of-memory errors (buggy configs)

As with any system, you need to tune the memory for your use case and be aware of how your model and queries can affect it. I have built a patient claims database with 100 million nodes and 1 billion relationships. We built the model and wrote the queries such that generating a Sankey chart of which procedures are performed in the 90 days after a condition is identified, across all patients, takes around 3-4 seconds and uses less than 10 MB of heap.
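To give a flavour of that kind of query, here is a hedged sketch against a hypothetical claims model; the labels, relationship types, and date properties are illustrative (and assumed to be Cypher date values), not the actual schema.

```python
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

# Hypothetical model: (:Patient)-[:HAS_CONDITION {date}]->(:Condition) and
# (:Patient)-[:HAD_PROCEDURE {date}]->(:Procedure).
FOLLOW_UP_PROCEDURES = """
MATCH (c:Condition {code: $condition_code})<-[hc:HAS_CONDITION]-(pt:Patient)
MATCH (pt)-[hp:HAD_PROCEDURE]->(proc:Procedure)
WHERE hp.date > hc.date AND hp.date <= hc.date + duration({days: 90})
RETURN proc.code AS procedure, count(*) AS patients
ORDER BY patients DESC
"""

with driver.session() as session:
    # Aggregate which procedures follow the condition within 90 days.
    for record in session.run(FOLLOW_UP_PROCEDURES, condition_code="E11"):
        print(record["procedure"], record["patients"])

driver.close()
```

Because the query only streams aggregated counts back, the heap footprint stays small even though it touches many patients.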

The most likely reason GC causes issues is very badly written queries.

  • practically only useful for small dataset reads that require visualization

Most of the databases I have worked with are > 50 million nodes and > 500 million relationships, and all of them are in production. If that is a small dataset in your case, then I guess you need a specialized implementation.

  • re-indexing can be very slow, require elasticsearch/solr as added dependency step

I have never heard of anything like that.

2 Likes

I've got a strong feeling this is an older complaint, possibly several years old and out of date. I can address some of these; some of our other users have addressed other parts as well.

bottleneck with large volume of writes due to slave/master topology

Our older HA clustering used slave/master topology. That has been deprecated for some time, and was removed with our Neo4j 4.0 major release. We've been using causal clustering for years now, which is based on the Raft protocol and doesn't fit the description they provided.

issues with bulk loading, indexing (e.g. range, sort, etc), slow upsert

Our indexing has improved over the years. We migrated away from Lucene indexing (which was slow for insertion, possibly the reason for the older complaint) to a native indexing structure that is faster for index writes. We have also added composite indexes and a fulltext index for more complex substring matching.
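For reference, this is roughly what those index types look like from the Python driver. The exact syntax depends on your version: the `CREATE INDEX` form and the fulltext procedure shown here are the Neo4j 4.x flavour (Neo4j 5 has a `CREATE FULLTEXT INDEX` statement instead), and the labels and properties are illustrative.

```python
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

with driver.session() as session:
    # Native index on a single property (Neo4j 4.x syntax).
    session.run("CREATE INDEX person_name IF NOT EXISTS "
                "FOR (p:Person) ON (p.name)")

    # Composite index over two properties of the same label.
    session.run("CREATE INDEX person_city_age IF NOT EXISTS "
                "FOR (p:Person) ON (p.city, p.age)")

    # Fulltext index (Neo4j 4.x procedure form) for fuzzy/substring matching.
    session.run("CALL db.index.fulltext.createNodeIndex("
                "'personSearch', ['Person'], ['name', 'bio'])")

    # Query the fulltext index.
    for record in session.run(
            "CALL db.index.fulltext.queryNodes('personSearch', 'neo*') "
            "YIELD node, score RETURN node.name AS name, score"):
        print(record["name"], record["score"])

driver.close()
```

The fulltext index is built into the database, so no external Elasticsearch or Solr dependency is needed for this kind of search.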

confusing license terms for production use

We've addressed this over the past few years; it is simplified now.

heap allocation and GC cause out-of-memory errors

Heap and GCs are usually due to huge transactions. Neo4j is an ACID database, and as such, transactions must be committed atomically. This means any pending changes in a transaction must be held in memory at the same time and applied all at once, so it's possible to craft queries (by accident or design) that eat up all your heap space. You may need to break down larger transactions or batch changes if there's too much to apply in a single transaction. Managing that and understanding how this works is mostly the user's responsibility, but more capabilities have been added in our recent versions to restrict how much heap memory is allowed to be used per query, to keep things under control. We'll continue to add capabilities to manage this. Note that if you're comparing to a non-ACID database, they may not have this issue to deal with, but as a consequence you may not have atomicity in transactions, and you may have other issues due to the database being non-ACID. Choose the right tool for the job and your needs.
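As an illustration of breaking a large write into smaller transactions from the client side, here is a minimal Python sketch; the `Item` label and generated data are hypothetical, and similar batching can also be done server-side (for example with APOC's periodic execution procedures).

```python
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

# Illustrative data; in practice rows would come from a file or upstream system.
rows = [{"id": i, "name": f"item-{i}"} for i in range(1_000_000)]
BATCH_SIZE = 10_000  # keep each committed transaction bounded

UPSERT_BATCH = """
UNWIND $rows AS row
MERGE (i:Item {id: row.id})
SET i.name = row.name
"""

with driver.session() as session:
    for start in range(0, len(rows), BATCH_SIZE):
        batch = rows[start:start + BATCH_SIZE]
        # Each call runs in its own transaction, so only one batch's worth
        # of pending changes has to be held in heap at a time.
        session.run(UPSERT_BATCH, rows=batch).consume()

driver.close()
```

Committing in chunks like this trades a single atomic change for bounded heap usage, which is usually the right trade-off for large loads.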

practically only useful for small dataset reads that require visualization

We're used by some of the top companies from around the world, so clearly that statement is not true.

If you have issues with out of memory exceptions, reach out to us. If you have an enterprise license with us, you probably have an enterprise support contract, so leverage that. If not, ask on the forums. We're happy to assist.

Neo4j is a powerful tool, but as with all powerful tools, you need to know how to use it well, and there is learning involved. We're happy to lend a hand.

1 Like