How to estimate Neo4j and system configuration for huge graphs

techiebits10 · March 1, 2021, 6:16am

Hello guys, I'm trying to work out what the system configuration should be for my project.
I have to implement Neo4j for entity relationships, with an estimated volume of more than 2B nodes and 4B relationships. I also want to run Cypher pattern-based reasoning over the initial graph to enrich the knowledge. I understand we can use an APOC estimation query to estimate the memory required for a GDS algorithm, but is there anything similar for Cypher pattern queries, and what would the performance metric be in terms of time?

How can I set up a Neo4j instance for this volume of data, and what system and Neo4j configuration would it need?
How can I speed up Cypher pattern-based reasoning?

How can I achieve good performance for real-time analysis?

I would appreciate it if anyone could provide information on the above questions. Thanks.

dana_canzano · March 2, 2021, 12:56am

Regarding sizing, it's not such a simple task. Even if we knew there were 2 billion nodes, do those nodes average 3 properties each or 40 properties each? And what are the data types of these properties? Are they all integers, for example, or are they all strings, where each string property could be anywhere from 1 to 5000 characters? The same can be said for relationships.
You might want to first import a subset of the data, e.g. 100 million nodes and 200 million relationships, get a rough estimate of the graph size, and then extrapolate up from there.
And in a perfect world it would be great to have as much RAM as the size of the database itself.
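
The extrapolation suggested above can be sketched in a few lines of Python. This is only a rough illustration: the 60 GiB sample size is a made-up placeholder (measure your own store after importing the subset), the function name is hypothetical, and it assumes store size grows roughly linearly with node and relationship counts, which may not hold for indexes or transaction logs.

```python
GIB = 1024 ** 3


def extrapolate_store_size(sample_bytes, sample_nodes, sample_rels,
                           target_nodes, target_rels):
    """Scale a measured sample store size up to the target graph size.

    Assumes linear growth; uses the larger of the node and relationship
    ratios so the estimate errs on the high side.
    """
    scale = max(target_nodes / sample_nodes, target_rels / sample_rels)
    return sample_bytes * scale


# Hypothetical measurement: the 100M-node / 200M-relationship subset
# occupied 60 GiB on disk. Replace with your real number.
sample_bytes = 60 * GIB

estimate = extrapolate_store_size(
    sample_bytes,
    sample_nodes=100_000_000,
    sample_rels=200_000_000,
    target_nodes=2_000_000_000,
    target_rels=4_000_000_000,
)

# In the "perfect world" case, this is also the amount of RAM to aim for.
print(f"Estimated store size / ideal RAM: {estimate / GIB:.0f} GiB")
```

With property counts and data types similar to the sample, the result gives a starting point for both disk and memory sizing; if the full data set has a different property mix, the sample will not be representative.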

Also, regarding the recommended system, again that would depend on the expected response time, the complexity of the queries, and the number of concurrent queries. For example, 100 near-concurrent runs of match (n) return n limit 1; are not going to be as performance-intensive as match (n:Person {id:1})-[r*1..9]->(n2:Person) where n2.status='active' with n,n2, collect(r) as rels .... .... .... ....
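
To get a feel for why the variable-length pattern above is so much heavier than a simple lookup, here is a back-of-the-envelope sketch in Python. It assumes every node has the same out-degree, which real graphs rarely do (a few hub nodes can make the real number far larger), and it ignores any pruning the planner may do, so treat the result as an order-of-magnitude hint only. The function name is hypothetical.

```python
def candidate_paths(avg_out_degree, min_hops, max_hops):
    """Rough count of paths explored from one start node for a
    variable-length pattern such as -[r*1..9]->, assuming a uniform
    out-degree at every hop."""
    return sum(avg_out_degree ** k for k in range(min_hops, max_hops + 1))


# Average out-degree taken from the numbers in the question:
# 4B relationships spread over 2B nodes.
avg_out_degree = 4_000_000_000 // 2_000_000_000

print(candidate_paths(avg_out_degree, 1, 9))  # paths at average degree 2
print(candidate_paths(10, 1, 9))              # same pattern, degree 10
```

The count grows exponentially with the hop limit, so tightening the upper bound of the pattern or filtering earlier in the query usually matters more than adding hardware.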
