I want to test Neo4j for our customer data.
As a first step, I am trying to estimate capacity for our graph data as follows.
I have about 10M nodes and 13B relationships between them, and each node and relationship has about 10 properties.
In this case, following the example in the sizing guide, I estimate we need approximately 5 TB of memory.
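For reference, here is a back-of-envelope disk estimate based on Neo4j's fixed store record sizes (roughly 15 B per node, 34 B per relationship, and 41 B per property record in the standard store format; actual sizes vary by format and property type, so treat this as a rough sketch rather than an exact figure):

```python
# Rough Neo4j store size estimate from fixed record sizes.
# Record sizes (bytes) are approximate values for the standard
# store format; real usage varies with property types and format.
NODE_BYTES = 15
REL_BYTES = 34
PROP_BYTES = 41

nodes = 10_000_000
rels = 13_000_000_000
props_per_entity = 10  # on both nodes and relationships

store_bytes = (
    nodes * NODE_BYTES
    + rels * REL_BYTES
    + (nodes + rels) * props_per_entity * PROP_BYTES
)
print(f"~{store_bytes / 1e12:.1f} TB on disk")  # → ~5.8 TB on disk
```

Note that the property records dominate here: 13B relationships times 10 properties each accounts for the bulk of the total, which is why trimming relationship properties has the biggest effect.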
Does that make sense? Could you give some guidance on managing and operating a graph of this size effectively?
Thank you for your help. 🙂
I ran your numbers through a calculator that uses RAM-to-storage ratios and fixed cores-per-RAM values. We use it internally at Neo4j for a first estimate of the resources needed for a particular setup.
My calculation shows the following for your numbers above:
3 TB of RAM, 384 cores, 6.5 TB of disk space.
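The exact ratios in that internal calculator aren't public, but the idea can be sketched as follows. The disk-to-RAM ratio and RAM-per-core value below are illustrative assumptions of mine, not Neo4j's actual figures:

```python
# Sketch of a ratio-based first sizing estimate.
# Both ratios are illustrative assumptions, not Neo4j's
# internal calculator values.
DISK_TO_RAM = 2.17    # assumed disk:RAM ratio
GB_RAM_PER_CORE = 8   # assumed GB of RAM per core

disk_tb = 6.5                             # estimated on-disk store size
ram_tb = disk_tb / DISK_TO_RAM            # ≈ 3 TB of RAM
cores = ram_tb * 1000 / GB_RAM_PER_CORE   # ≈ 375 cores

print(f"RAM: {ram_tb:.1f} TB, cores: {cores:.0f}")
```

With ratios in this ballpark you land close to the 3 TB / 384-core / 6.5 TB figures quoted above; the point is that once the store size is estimated, RAM and cores follow mechanically from fixed ratios.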
This is already a large configuration, but what caught my eye is the huge relationship-to-node ratio, which seems a bit uncommon in my experience. Ten properties per relationship also seems high. I don't know your data model and use case, but refactoring a data model can often save a good amount of resources, so I would highly recommend doing that first, before going into resource calculations.
Thank you for your kind reply.
Our use case for testing Neo4j is the banking domain, specifically corporate banking customer data. We have 10M corporate customers, and each makes 10–50 banking transactions every day. I limited retention to 90 days of transactions because of capacity.
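A quick check of the volume those stated rates imply over a 90-day retention window:

```python
# Relationship-count check for 10M customers making 10-50
# transactions per day, retained for 90 days.
customers = 10_000_000
days = 90

low = customers * 10 * days   # lower bound
high = customers * 50 * days  # upper bound
print(f"{low / 1e9:.0f}B to {high / 1e9:.0f}B transaction relationships")
# → 9B to 45B transaction relationships
```

So the 13B relationships quoted earlier sit near the low end of that range, and the count grows linearly with both the retention window and the per-customer transaction rate.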
Following your guidance, I will look into refactoring our data model before constructing the knowledge graph.
I appreciate your answer again.