
Memory requirements seem to depend on the number of nodes

sr01
Node

Hi,
I'm trying to load a graph of 5M nodes into the database. For various reasons I'm RAM-constrained and can allocate only 4G to the database. I'm running transactions that each create a batch of 500 nodes.
At the beginning I get a throughput of about 2,000 nodes created per second, but after loading around 250k nodes the performance collapses and I eventually get an OutOfMemory error.
I've tried various memory settings; this is the latest:
----
dbms.memory.heap.initial_size=1g
dbms.memory.heap.max_size=1g
dbms.memory.pagecache.size=1400m
# Limit the amount of memory that all of the running transactions can consume.
dbms.memory.transaction.global_max_size=500m
# Limit the amount of memory that a single transaction can consume.
dbms.memory.transaction.max_size=500m
# Transaction state location. It is recommended to use ON_HEAP.
dbms.tx_state.memory_allocation=OFF_HEAP
dbms.tx_state.max_off_heap_memory=400m
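As a rough sanity check (the arithmetic below uses the values from the settings above; the overhead estimate is my own assumption, not something stated in the thread), the fixed allocations do leave some headroom within a 4G budget:

```java
// Rough memory budget for the neo4j.conf settings above (values in MiB).
public class MemoryBudget {
    public static void main(String[] args) {
        int heap = 1024;       // dbms.memory.heap.max_size=1g
        int pageCache = 1400;  // dbms.memory.pagecache.size=1400m
        int txOffHeap = 400;   // dbms.tx_state.max_off_heap_memory=400m

        int fixed = heap + pageCache + txOffHeap;
        System.out.println(fixed + " MiB fixed");  // 2824 MiB
        System.out.println((4096 - fixed) + " MiB left for JVM and native overhead");
    }
}
```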

I don't understand why this doesn't scale. Is there additional caching taking place? Is there a memory leak somewhere? Fundamentally, my queries are independent of each other (just node creations), so memory use should not grow with the number of nodes...

5 REPLIES

sr01
Node

As a follow-up, I've tried pausing the creation of the graph (by putting a breakpoint in the debugger). Once the program performing the queries was paused, I stopped and restarted Neo4j.

This solves the problem, and I'm able to load 250k more nodes... until it gets slow again. So there is apparently a leak somewhere. Am I forgetting something after the transaction runs? Should I close something? Here's my code:

try (Session session = driver.session()) {

    session.writeTransaction(tx -> {
        tx.run(query);
        return null;
    });
    // No explicit session.close() needed: try-with-resources already closes the session.
}

So... I've actually found a solution.

Diving into the heap dump, I found out that my queries were being cached. So I'm now calling

CALL db.clearQueryCaches()

every 50 queries, and it seems to do the job!

Strange, though, that this procedure isn't better documented.

Hi @sr01,

Try running your transaction with parameters instead of concatenating the values into the query string. That way Neo4j will plan your query just once. Also, do you have all the indexes needed to perform your task?
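For instance (the `Item` label and `batch` parameter name below are placeholders of my own, not from the thread), a single parameterized statement can create a whole batch, so the plan is compiled once and then reused from the query cache:

```
UNWIND $batch AS row
CREATE (n:Item)
SET n = row
```

From the Java driver, `$batch` can be passed as a list of maps, e.g. `tx.run(query, parameters("batch", rows))` using `Values.parameters` from the driver.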


Regards


Can you share the query? As a note, I don't see a need to return a result.

tard_gabriel
Ninja

Your memory settings are way too low.
