Docker performance problems

Hi all,

I am seeing very strange performance behavior when running Neo4j Community in Docker. I am importing JSON data into a Neo4j database using BloodHound (https://github.com/BloodHoundAD/BloodHound). BloodHound uses a JS file for the import. The import itself works fine, but it behaves very differently depending on whether I run Neo4j natively or as a Docker container.

When I import the data into a native Neo4j database, it takes about 1 minute, consistently across multiple attempts.

All files imported in 66.52 s.

The memory settings are set to the following values:

dbms.memory.heap.max_size=5G
dbms.memory.pagecache.size=5G

Since I need multiple instances of Neo4j for my tests, I came up with the idea of dockerizing this and creating one container per Neo4j instance. When I start the Docker container (with the same settings as the native Neo4j database), the import takes significantly longer and the times vary widely:

started with:
sudo docker run --name "testneo4j" --rm -p 7687:7687 -p 7474:7474 --env NEO4J_AUTH=neo4j/test --env NEO4J_dbms_memory_pagecache_size=5G --env NEO4J_dbms_memory_heap_max__size=5G neo4j:4.2.3


Results:
All files imported in 1052.01 s.
All files imported in 383.83 s.
All files imported in 415.76 s.
All files imported in 661.72 s.
All files imported in 710.43 s.
All files imported in 415.76 s.
All files imported in 642.58 s.

Since Docker adds some overhead, I expected a longer runtime. However, the fact that the times increase by roughly a factor of 10 seems strange to me. In addition, the import times are not stable, which I cannot really explain.

Now to my actual question: are there any settings on the Docker side or in the Neo4j Docker container that can cause such a slowdown (a factor of 10)? Also, can anyone explain why the times are not constant?
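To put numbers on the slowdown, the timings listed above can be summarized with a short awk call. This is only a sketch: the function name summarize_slowdown is made up for illustration, and the figures are the ones reported in this post.

```shell
#!/bin/sh
# summarize_slowdown BASELINE TIME...
# Prints the mean of the given run times and the min/mean/max slowdown
# relative to BASELINE. (Hypothetical helper name, not part of any tool.)
summarize_slowdown() {
    baseline=$1
    shift
    printf '%s\n' "$@" | awk -v base="$baseline" '
        { v = $1 + 0 }                      # force numeric interpretation
        NR == 1 { min = v; max = v }
        { sum += v; if (v < min) min = v; if (v > max) max = v }
        END {
            mean = sum / NR
            printf "mean %.2f s, slowdown min %.1fx, mean %.1fx, max %.1fx\n",
                   mean, min / base, mean / base, max / base
        }'
}

# Native run (66.52 s) vs. the seven Docker runs listed above.
summarize_slowdown 66.52 1052.01 383.83 415.76 661.72 710.43 415.76 642.58
```

On these numbers the mean Docker run is about 9.2 times slower than the native one, with individual runs between roughly 5.8 and 15.8 times slower.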

I have already tried several things with the parallelization of the import, as well as the order of the files to be imported, but the behavior itself has not changed.

Thanks in advance, I am open to any idea :)

If all things are equal, Neo4j in Docker shouldn't be orders of magnitude slower.
Is Docker running on the same physical hardware (or VM) as the native test?
If you look in the Neo4j logs of the Docker container, do you see GC stop-the-world pauses?
I usually mount my Neo4j data and logs directories to a drive on the host.
If it is slower, the cause is almost always a slow disk or not enough memory (does Docker really get 10 GB of RAM from the host?). The DB cache (page cache) should reflect the size of the data: if it is only 2G, give it 2G. The heap reflects data in flight, i.e. your transactions need to fit in there. Example launch script:

#!/bin/sh

NEO4J_VERSION=4.2.6
NEO4J_PASSWORD=xxxxx

# Small page cache, fixed 5G heap; data, logs and import are mounted from the host.
docker run \
  --rm \
  -e NEO4J_AUTH="neo4j/${NEO4J_PASSWORD}" \
  -e NEO4J_dbms_memory_pagecache_size=1G \
  -e NEO4J_dbms_memory_heap_initial__size=5G \
  -e NEO4J_dbms_memory_heap_max__size=5G \
  -e NEO4J_ACCEPT_LICENSE_AGREEMENT=yes \
  -e NEO4JLABS_PLUGINS='["apoc"]' \
  -e NEO4J_dbms_default__listen__address=0.0.0.0 \
  -p 7474:7474 \
  -p 7687:7687 \
  -v "$PWD/data":/var/lib/neo4j/data \
  -v "$PWD/logs":/var/lib/neo4j/logs \
  -v "$PWD/import":/var/lib/neo4j/import \
  "neo4j:${NEO4J_VERSION}"

Hello David,

Thanks for replying. Regarding your questions:

Is Docker running on the same physical hardware (or VM) as the native test?

Yes, it runs on the same VM.

If you look in the Neo4j logs of the Docker container, do you see GC stop-the-world pauses?

No, I do not see this message, only entries like the following:

 [o.n.c.i.ExecutionEngine] [neo4j/771f7be2] Discarded stale query from the query cache after 90 seconds. Reason: CardinalityByLabelsAndRelationshipType(None,Some(RelTypeId(2)),None) changed from 128.0 to 519.0, which is a divergence of 0.7533718689788054 which is greater than threshold 0.7347342123050334. Query: UNWIND $props AS prop
 ...
 [o.n.c.i.ExecutionEngine] [neo4j/771f7be2] Discarded stale query from the query cache after 99 seconds. Reason: NodesWithLabelCardinality(Some(LabelId(0))) changed from 76.0 to 2791.0, which is a divergence of 0.9727696166248656 which is greater than threshold 0.7331233776116441. Query: UNWIND $props AS prop
 ...
 [o.n.c.i.ExecutionEngine] [neo4j/771f7be2] Discarded stale query from the query cache after 94 seconds. Reason: NodesAllCardinality changed from 10.0 to 2778.0, which is a divergence of 0.9964002879769619 which is greater than threshold 0.7340463519819572. Query: UNWIND $props AS prop
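A quick way to double-check for GC pauses is to count the matching lines in debug.log rather than scanning it by eye. The sketch below assumes that Neo4j's pause warnings contain the phrase "stop-the-world", that the container is named testneo4j as in the original post, and that debug.log sits under /var/lib/neo4j/logs inside the container; count_gc_pauses is a made-up helper name.

```shell
#!/bin/sh
# count_gc_pauses: reads log text on stdin and prints how many lines
# mention a stop-the-world pause. (Hypothetical helper name.)
count_gc_pauses() {
    # grep -c prints 0 but exits non-zero when nothing matches,
    # so "|| true" keeps the function's exit status clean.
    grep -ci 'stop-the-world' || true
}

# Usage against the running container (container name and log path are
# assumptions, adjust them to your setup):
#   sudo docker exec testneo4j cat /var/lib/neo4j/logs/debug.log | count_gc_pauses
```

A result of 0 would agree with the observation that no such messages appear; a large count would point towards heap pressure instead.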

does docker really get 10GB of RAM from the host?

With Docker, there should be no memory restrictions by default. When I check, memory is used as expected (with this dataset only about 1 GB, both natively and in Docker).

I left out the directory mounts above for clarity. I have also tested different Docker options, all with the same result, and played around with different Neo4j settings such as NEO4J_dbms_memory_heap_initial__size and others. Unfortunately, this did not help either.
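Whether a limit is actually in effect can be read from the container itself. The sketch below relies on docker inspect reporting .HostConfig.Memory in bytes, with 0 meaning that no limit was set; describe_mem_limit is a made-up helper name, and testneo4j is the container name from the original post.

```shell
#!/bin/sh
# describe_mem_limit BYTES
# Interprets the .HostConfig.Memory value reported by `docker inspect`:
# 0 means no memory limit was set on the container.
# (Hypothetical helper name.)
describe_mem_limit() {
    if [ "$1" -eq 0 ]; then
        echo "no memory limit"
    else
        echo "limit: $(( $1 / 1024 / 1024 )) MiB"
    fi
}

# Usage (requires the container to be running):
#   describe_mem_limit "$(sudo docker inspect --format '{{.HostConfig.Memory}}' testneo4j)"
#   sudo docker stats --no-stream testneo4j   # current usage next to the limit
```

If this prints "no memory limit", the container can use as much RAM as the VM provides, so the remaining question is how much the VM itself has free during the import.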