Hi all.
I'm using Neo4j 4.4.4 Community Edition, running in Docker and deployed on Kubernetes. The application is written in Python.
Resources reserved for the pod: 80 GB RAM, 10 CPUs, 80 GB ephemeral storage. In neo4j.conf: 46 GB page cache and automatically allocated heap size.
The problem is the following: when the DB starts up it consumes around 28 GB; under workload, memory usage keeps growing until it hits the pod's memory limit. Should I strictly follow the suggestions from neo4j-admin memrec?
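For comparison, here is a neo4j.conf sketch that pins the heap explicitly instead of leaving it automatic. The numbers below are illustrative only, not actual memrec output, and would need tuning for this pod:

```properties
# Illustrative values only; run `neo4j-admin memrec --memory=80g` for real numbers.
dbms.memory.heap.initial_size=24g
dbms.memory.heap.max_size=24g
dbms.memory.pagecache.size=46g
# Cap total memory available to transactions so queries cannot grow unbounded (4.2+):
dbms.memory.transaction.global_max_size=8g
```

With heap, page cache, and transaction memory all capped, the remaining headroom in the 80 GB pod is left for OS page cache and native overhead.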
I'm using the HTTP API with transactional commit, which gives us pretty good performance for sending a large number of small simultaneous requests. The query is basically always the same, but the input data (starting nodes) changes constantly.
Here is an example of the query:
MATCH (source_node:Person) WHERE source_node.name IN $inputs
MATCH (source_node)-[r]->(child_id:InternalId)
WHERE r.valid_from <= datetime($actualdate) < r.valid_to
WITH [type(r), toString(date(r.valid_from)), child_id.id] as child_path, child_id, false as filtered
OPTIONAL MATCH p_path = (child_id)-[:HAS_PARENT_ID*0..50]->(parent_id:InternalId)
WHERE all(a in relationships(p_path) WHERE a.valid_from <= datetime($actualdate) < a.valid_to) AND
NOT EXISTS{ MATCH (parent_id)-[q:HAS_PARENT_ID]->() WHERE q.valid_from <= datetime($actualdate) < q.valid_to}
WITH DISTINCT last(nodes(p_path)) as i_source,
reduce(st = [], q IN relationships(p_path) | st + [type(q), toString(date(q.valid_from)), endNode(q).id])
as parent_path, CASE WHEN length(p_path) = 0 THEN NULL ELSE parent_id END as parent_id, child_path
OPTIONAL MATCH (i_source)-[r:HAS_ISSUER_ID]->(issuer_id:IssuerId)
WHERE r.valid_from <= datetime($actualdate) < r.valid_to
RETURN DISTINCT CASE WHEN issuer_id IS NULL THEN child_path + parent_path + [type(r), NULL, "NOT FOUND IN RELATION"]
ELSE child_path + parent_path + [type(r), toString(date(r.valid_from)), toInteger(issuer_id.id)]
END as full_path, issuer_id, issuer_id IS NULL as filtered
And an example of the request (note that requests takes keyword arguments, i.e. json=... and headers=...):
import requests

result = requests.post(
    "http://neo4j.hostname.com:7474/db/neo4j/tx/commit",
    json=json_data,   # transactional payload: {"statements": [...]}
    headers=headers,  # e.g. Authorization and Content-Type
).json()
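For completeness, this is how I build the json_data payload. The transactional endpoint expects a list of statements, each with its own parameters map; the query string and parameter values below are shortened placeholders for the full query above:

```python
def build_tx_payload(query, params):
    # /db/neo4j/tx/commit accepts {"statements": [...]}; several statements
    # can be batched into a single request if needed.
    return {"statements": [{"statement": query, "parameters": params}]}

# Placeholder query and inputs -- substitute the full query from above.
json_data = build_tx_payload(
    "MATCH (source_node:Person) WHERE source_node.name IN $inputs RETURN source_node",
    {"inputs": ["Alice"], "actualdate": "2023-01-01T00:00:00"},
)
```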
When memory consumption hits the limit, performance drops rapidly.
1. Could you please explain why exactly this happens and how to avoid the memory growth? Does the performance drop because of GC?
2. Can I use the Python driver, with its transaction-commit option, instead of the HTTP API?
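Regarding question 2, a minimal sketch of what I have in mind with the official Neo4j Python driver (pip install neo4j). The URI, credentials, and the shortened query string are placeholders, not our real values:

```python
def build_params(inputs, actual_date):
    # Same parameter names the HTTP payload uses ($inputs, $actualdate).
    return {"inputs": inputs, "actualdate": actual_date}

def run_query(uri, user, password, inputs, actual_date):
    # Official Neo4j Python driver; imported inside the function so the
    # pure helper above stays usable without the package installed.
    from neo4j import GraphDatabase

    # Shortened placeholder -- substitute the full query from the post.
    query = ("MATCH (source_node:Person) WHERE source_node.name IN $inputs "
             "RETURN source_node")
    driver = GraphDatabase.driver(uri, auth=(user, password))
    try:
        with driver.session(database="neo4j") as session:
            # read_transaction runs the work in a managed transaction and
            # retries on transient errors (4.x driver API).
            return session.read_transaction(
                lambda tx: tx.run(query, build_params(inputs, actual_date)).data()
            )
    finally:
        driver.close()
```

The driver uses the binary Bolt protocol with connection pooling, so it should also handle many small concurrent requests well.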