Hi everyone, I'm working on my thesis project, where I'm developing a system that processes exports from business registries. The goal is to clean and store the data in a Neo4j database, with a strong focus on accurate entity matching and creating relationships between companies and individuals to …

What is the size of your infrastructure? It is obvious that performance will degrade - if it is linear (A companies = X time / 2A companies = 2X time) it is probably the nature of the queries you are running over the nodes/edges. If it is exponential, it is either lack of infrastructure or a probl…

Thanks for the response. My setup is:
CPU: AMD Ryzen 5 5600X
RAM: 32GB DDR4
Storage: Samsung 980 PRO 1TB (SSD)
OS: Windows 11 Pro

It feels more exponential than linear, so I suspect it's a query or indexing issue rather than a hardware limitation.

Or you are running out of RAM and moving in a cycle of SSD DB -> RAM -> virtual memory (which is also on the SSD). Have a look at your memory and CPU usage when you run the query.

The red flag for me is the mention of neomodel. It shouldn't take that long to import 25k nodes and relationships. Can you share your code and/or Cypher queries?
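For comparison, here is a minimal sketch of a batched import through a driver session, which is one common way to keep an import of this size fast. It is not the poster's code: the `Person` label, the `match_key` and `name` properties, and the batch size are assumptions for illustration. `session` is expected to behave like a Neo4j Python driver session, whose `run` method accepts query parameters as keyword arguments.

```python
from itertools import islice

# Hypothetical schema: Person nodes identified by a precomputed match_key.
MERGE_PERSONS = """
UNWIND $rows AS row
MERGE (p:Person {match_key: row.match_key})
SET p.name = row.name
"""


def batched(rows, size):
    """Yield lists of at most `size` items from any iterable."""
    it = iter(rows)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def import_persons(session, rows, batch_size=1000):
    """Send one UNWIND query per batch instead of one query per row."""
    sent = 0
    for batch in batched(rows, batch_size):
        session.run(MERGE_PERSONS, rows=batch)
        sent += len(batch)
    return sent
```

A `MERGE` on a property without an index or uniqueness constraint has to scan every node with that label, so each insert gets slower as the database grows. A constraint on the merge key (here the hypothetical `match_key`) is usually the first thing to check.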

Hey, at first I was only using Cypher but I struggled with a robust bullet-proof matching logic, since the data are not very reliable in terms of same formatting and correct matching of same entities is very important. I refactored to neomodel and I was able to do what I wanted much quicker. I coul…

"the data are not very reliable in terms of same formatting and correct matching"

That's probably another reason for your performance handicap... you should have created a uniform model beforehand.

I can't account for your design. If you have to do those conversions, you have to do them. I am just trying to explain what the possible sources of performance bottlenecks are (and how it can get exponentially slower).

I see. Do you think it would be better to process the data and match entities correctly in Postgres, for example, then export to CSV and import that into Neo4j? Is it a bad approach to do complex Cypher/neomodel queries for correct matching?
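A minimal sketch of the "match first, then export" idea, using only the standard library. It assumes each record already carries a precomputed `match_key` (a hypothetical column name) and keeps the first record per key. The file name `persons.csv` and the `Person` label in the Cypher string are also assumptions.

```python
import csv
import io


def export_persons(records, out):
    """Write one CSV row per distinct match_key; return the row count."""
    writer = csv.DictWriter(out, fieldnames=["match_key", "name"])
    writer.writeheader()
    seen = set()
    written = 0
    for rec in records:
        key = rec["match_key"]
        if key in seen:
            # Duplicate entity: already resolved before reaching Neo4j.
            continue
        seen.add(key)
        writer.writerow({"match_key": key, "name": rec["name"]})
        written += 1
    return written


# Cypher that could load the exported file (assumed file name and label).
LOAD_PERSONS = """
LOAD CSV WITH HEADERS FROM 'file:///persons.csv' AS row
MERGE (p:Person {match_key: row.match_key})
SET p.name = row.name
"""

if __name__ == "__main__":
    buf = io.StringIO()
    export_persons(
        [
            {"match_key": "john smith", "name": "John Smith PhD."},
            {"match_key": "john smith", "name": "Smith John"},
        ],
        buf,
    )
    print(buf.getvalue())
```

With this split, the fuzzy comparison happens once in ordinary code, and the graph import reduces to exact-key merges.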

Performance issues as database gets bigger


grejty (Lukigrejtak) March 21, 2025, 11:31pm 9

Well, what I meant is that sometimes a person has a name like this:

“John Smith PhD.”, which needs to be correctly matched to a Person node with name “Smith John”
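One way to handle this case is to reduce each name to an order-insensitive key before it reaches the database, so the match becomes an exact lookup. The sketch below is only an illustration: the `TITLES` set and the function name `name_key` are hypothetical, and sorting the tokens means two different people with swapped first and last names would also collide.

```python
import re
import unicodedata

# Hypothetical list of academic titles to ignore; extend as needed.
TITLES = {"phd", "mba", "ing", "mgr", "dr", "prof", "bc"}


def name_key(raw: str) -> str:
    """Return a normalized, order-insensitive key for a person's name."""
    # Decompose accented characters and drop the combining marks.
    text = unicodedata.normalize("NFKD", raw)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Remove dots so "Ph.D." and "PhD." both collapse to "phd".
    text = text.lower().replace(".", "")
    # Keep runs of ASCII letters only.
    tokens = re.findall(r"[a-z]+", text)
    # Drop titles, then sort so word order no longer matters.
    return " ".join(sorted(t for t in tokens if t not in TITLES))


if __name__ == "__main__":
    print(name_key("John Smith PhD."))
    print(name_key("Smith John"))
```

Storing such a key as an indexed property lets the database match with a plain equality check instead of comparing each incoming name against every existing node.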

Or things like different address formatting:
Fashion Street 123/45, London, UK
Fashion Street 45, London, UK

There has to be some logic in the matching process here, right? I'm not using any AI to parse everything into the same form beforehand. However, I'm doing some pre-processing of the data: cleaning it and trying to put addresses into a uniform form like so:
Street StreetNumber, PostalCode City, Country
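The address case could be sketched like this: parse the street, the house number part, and the remainder, then treat two addresses as candidates for the same place when street and remainder agree and the number parts overlap. The regular expression and the rule that "123/45" may match "45" are assumptions. The thread does not say which part of a slashed number is authoritative, so a real registry may need a stricter rule.

```python
import re
from typing import NamedTuple, Optional


class Address(NamedTuple):
    street: str
    numbers: frozenset
    rest: str


# Assumed layout: "<street> <number>[/<number>], <rest>".
ADDRESS_RE = re.compile(
    r"^\s*(?P<street>.+?)\s+(?P<numbers>\d+\w*(?:/\d+\w*)?)\s*,\s*(?P<rest>.+)$"
)


def parse_address(raw: str) -> Optional[Address]:
    """Split an address into comparable parts, or return None if it does not fit."""
    m = ADDRESS_RE.match(raw)
    if m is None:
        return None
    street = " ".join(m["street"].lower().split())
    numbers = frozenset(m["numbers"].lower().split("/"))
    rest = ", ".join(part.strip().lower() for part in m["rest"].split(","))
    return Address(street, numbers, rest)


def may_be_same_address(a: str, b: str) -> bool:
    """True when street and remainder match and the number parts overlap."""
    pa, pb = parse_address(a), parse_address(b)
    if pa is None or pb is None:
        return False
    return (
        pa.street == pb.street
        and pa.rest == pb.rest
        and bool(pa.numbers & pb.numbers)
    )


if __name__ == "__main__":
    print(may_be_same_address(
        "Fashion Street 123/45, London, UK",
        "Fashion Street 45, London, UK",
    ))
```

Because the parsed street and remainder are exact values, they can also serve as a blocking key, so the number comparison only runs against addresses on the same street rather than the whole database.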

Beyond that, I tried to write robust “algorithms” for comparing slightly different nodes accurately, as the data won't always be 100% clean. Do you think this is the wrong approach?
