While diagnosing performance issues with parallelized writes on a Kubernetes cluster, I found that calling Runtime.getRuntime().availableProcessors() in a stored procedure returns 1, even though the pod is running on a node with 8 cores. Running nproc in the container returns 8, so the container appears to have access to all of the node's cores, but the JVM is not using them.
I have tried setting JAVA_OPTS="-XX:ActiveProcessorCount=8" to force the JVM to use all cores, but no luck.
Are there any recommendations for tuning Neo4j CPU performance from within a Kubernetes cluster?
What have you specified for the pod's cpuRequest and cpuLimit?
Good call: setting resources.limits.cpu to 8 causes Runtime.getRuntime().availableProcessors() to return 8. For example:
containers:
  - name: neo4j
    image: neo4j:3.5-enterprise
    resources:
      requests:
        memory: "10G"
        cpu: "4"
      limits:
        memory: "16G"
        cpu: "8"
(note that setting the env variable NEO4J_dbms_jvm_additional to -XX:ActiveProcessorCount=8 also achieves the same thing)
However, the same can't be said for memory. Despite setting the memory request to 10G and the limit to 16G, Runtime.getRuntime().maxMemory() returns only 512.0 MiB. Is there a way to set the initial and max heap size via environment variables? If possible, I'd like to avoid maintaining a custom neo4j.conf file in a Kubernetes ConfigMap. It appears that in many cases config settings can be overridden with env variables when the setting name is snake-cased and prefixed with NEO4J_ (e.g. dbms.security.procedures.whitelist -> NEO4J_dbms_security_procedures_whitelist or dbms.directories.plugins -> NEO4J_dbms_directories_plugins). However, setting NEO4J_dbms_memory_heap_initial_size or NEO4J_dbms_memory_heap_max_size to 10G mangles the resulting JVM option and fails with the error Invalid initial heap size: -Xms10G 512M.
Any thoughts on getting around this? The mangling of the heap initial/max size looks like a bug, but as I didn't see it documented anywhere, it might not be an official feature (would be useful though). In the meantime, I'll probably specify neo4j.conf as a ConfigMap.
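A note on the naming scheme discussed above: as far as I know, the Neo4j Docker image documentation describes the mapping as "prefix with NEO4J_, write each underscore twice, and convert periods to underscores". If that is the scheme in effect for this image, the single underscore in initial_size might explain the mangled value, and the variable would need to be spelled NEO4J_dbms_memory_heap_initial__size. Below is a minimal Java sketch of that mapping. The class and method names (EnvNames, toEnvName) are hypothetical and only for illustration, and the convention itself is an assumption to verify against the documentation for your image version.

```java
class EnvNames {
    /**
     * Converts a neo4j.conf setting name to the env variable name the Docker
     * image is assumed to expect: existing underscores are doubled first,
     * then periods become single underscores, then NEO4J_ is prefixed.
     */
    public static String toEnvName(String setting) {
        return "NEO4J_" + setting.replace("_", "__").replace(".", "_");
    }

    public static void main(String[] args) {
        // No underscore in the setting name: dots simply become underscores.
        System.out.println(toEnvName("dbms.security.procedures.whitelist"));
        // Underscore in the setting name: it is doubled in the env variable.
        System.out.println(toEnvName("dbms.memory.heap.initial_size"));
    }
}
```

Under this assumed scheme the two lines printed are NEO4J_dbms_security_procedures_whitelist and NEO4J_dbms_memory_heap_initial__size.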
Thanks for the pointer on resources.requests and resources.limits.
(changed the thread title to better reflect the questions)
Figured out how to set heap size via env variable: NEO4J_dbms_jvm_additional=-Xms10g -Xmx10g, which seems preferable to maintaining the ConfigMap as mentioned above. In all, the StatefulSet definition looks like:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: neo
spec:
  serviceName: neo
  replicas: 1
  selector:
    matchLabels:
      app: db
  template:
    metadata:
      labels:
        app: db
    spec:
      terminationGracePeriodSeconds: 10
      containers:
        - name: neo4j
          image: neo4j:3.5-enterprise
          env:
            - name: NEO4J_ACCEPT_LICENSE_AGREEMENT
              value: "yes"
            - name: NEO4J_dbms_security_procedures_whitelist
              value: "apoc.coll.*,apoc.load.*,apoc.*"
            - name: NEO4J_dbms_security_procedures_unrestricted
              value: "apoc.*"
            - name: NEO4J_dbms_jvm_additional
              value: "-Xms10g -Xmx10g"
          resources:
            requests:
              memory: "10G"
              cpu: "4"
            limits:
              memory: "16G"
              cpu: "8"
          ports:
            - name: http
              containerPort: 7474
            - name: https
              containerPort: 7473
            - name: bolt
              containerPort: 7687
          volumeMounts:
            - name: data
              mountPath: /var/lib/neo4j/data/
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: neo-pvc
With the above config, the JDK has access to 8 cores and a total memory of 9.6 GiB.
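For anyone who wants to check the same two values outside a stored procedure, here is a minimal standalone sketch built on the Runtime calls mentioned in this thread. It is not the procedure I used; the class name JvmResources and its helper methods are made up for illustration. To get comparable numbers, run it inside the container with the same JVM flags that Neo4j is started with.

```java
class JvmResources {
    /** Number of processors the JVM reports as available to it. */
    public static int availableCores() {
        return Runtime.getRuntime().availableProcessors();
    }

    /** Maximum amount of memory the JVM will attempt to use, in MiB. */
    public static double maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        System.out.printf("availableProcessors = %d%n", availableCores());
        System.out.printf("maxMemory = %.1f MiB%n", maxHeapMiB());
    }
}
```

The reported maxMemory is typically somewhat below the -Xmx value, which is consistent with seeing 9.6 GiB for a 10g heap.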
Thanks David Allen for the help.