I am experiencing performance issues while reading data from disk. Here are some details about the dataset and environment.
- Graph size is roughly 34G, of which 6G is indexes
- Total nodes in the db: 24M; total relationships: 61M
- Page cache size is 12G
- We are using an Azure Premium SSD (P30) for persistence (https://azure.microsoft.com/en-us/pricing/details/managed-disks/), which offers 5000 IOPS per disk and a throughput of 200MB/s.
- Neo4j is Community Edition 3.4.5, running as a single pod on a k8s cluster on Azure.
I am running a Cypher query on one of the indexed fields, which is expected to return at most 1000 records of maximum size 4MB.
I understand that since my graph is bigger than the page cache, some of the data will be read from disk. But the read takes more than 30 seconds in some cases.
Is that normal behaviour when Neo4j reads indexed data from disk? Any help would be appreciated.
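To put rough numbers on this, here is a back-of-envelope sketch in Python using the figures above. The 8 KiB page size is my assumption about the page cache, not something I have measured, and the function names are only illustrative.

```python
# Figures taken from the details listed above.
GRAPH_GB = 34          # total store size, including indexes
PAGE_CACHE_GB = 12     # configured page cache
DISK_IOPS = 5000       # P30 limit per disk
DISK_MB_PER_S = 200    # P30 throughput limit
QUERY_SECONDS = 30     # slow-case query duration

# ASSUMPTION: each page fault reads one 8 KiB page from disk.
PAGE_SIZE_BYTES = 8 * 1024


def cache_coverage(cache_gb, data_gb):
    """Fraction of the data that can be resident in the page cache at once."""
    return min(1.0, cache_gb / data_gb)


def max_random_reads(iops, seconds):
    """Upper bound on random disk reads in the given time at the IOPS limit."""
    return iops * seconds


def max_sequential_mb(mb_per_s, seconds):
    """Upper bound on data read in the given time at the throughput limit."""
    return mb_per_s * seconds


if __name__ == "__main__":
    coverage = cache_coverage(PAGE_CACHE_GB, GRAPH_GB)
    reads = max_random_reads(DISK_IOPS, QUERY_SECONDS)
    random_mb = reads * PAGE_SIZE_BYTES / 1_000_000
    seq_mb = max_sequential_mb(DISK_MB_PER_S, QUERY_SECONDS)

    print(f"page cache covers about {coverage:.0%} of the store")
    print(f"at most {reads} random reads in {QUERY_SECONDS}s")
    print(f"which is about {random_mb:.0f} MB if every read is one page")
    print(f"throughput limit allows up to {seq_mb} MB in {QUERY_SECONDS}s")
```

By this estimate roughly a third of the store fits in the page cache, and a 30 second query corresponds to at most about 150,000 random reads at the disk's IOPS limit (around 1.2 GB under the assumed page size), which is far more than the result set itself.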