Indexing a string property is taking several days and it doesn't seem near to completion

toni · April 18, 2019, 7:15am

Hi,

We are indexing a property of a node and the server is doing something (I see the physical disk size of the index increasing), but it has been running for several days (more than a week) and doesn't seem near completion. To me, several days of indexing on a powerful machine means that something is wrong, but I don't know what, so I'm asking for help here to diagnose the problem.

The number of nodes is 999x10^6, roughly a billion.
The property is a 64-character string hash made of completely random characters.
We are using Neo4J Enterprise Causal Cluster created from Google Cloud Marketplace template. The cluster is just 3 core members:
- The leader has 32 CPU cores and 128 GB of RAM.
- Followers have 4 CPU cores and 128 GB of RAM.
Memory is configured as:

dbms.memory.heap.initial_size=30600m
dbms.memory.heap.max_size=30600m
dbms.memory.pagecache.size=74800m

as was recommended by neo4j-admin memrec

Memory usage for the java process in Leader is reported as 64%
CPU usage is just at 10% consistently
Index size keeps growing although slowly

neo4j-enterprise-causal-cluster-1-core-vm-1:/var/lib/neo4j/data/databases/graph.db$ while true; do echo "$(date -Iseconds) $(du -ck schema/index/native-btree-1.0/* | grep total)" ; sleep 60; done
2019-04-18T07:06:07+00:00 145382404     total
2019-04-18T07:07:07+00:00 145384212     total
2019-04-18T07:08:07+00:00 145386312     total
2019-04-18T07:09:07+00:00 145388496     total
2019-04-18T07:10:07+00:00 145390420     total
2019-04-18T07:11:07+00:00 145392748     total
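
To put a number on "slowly", the samples above can be turned into a growth rate. This is only a rough sketch: it assumes `du -k` is reporting 1 KiB blocks and that growth is steady across the five-minute window. The names `SAMPLES` and `growth_rate_kib_per_s` are made up for illustration.

```python
from datetime import datetime

# Samples copied from the du loop output above: (timestamp, total size in KiB).
SAMPLES = [
    ("2019-04-18T07:06:07+00:00", 145382404),
    ("2019-04-18T07:07:07+00:00", 145384212),
    ("2019-04-18T07:08:07+00:00", 145386312),
    ("2019-04-18T07:09:07+00:00", 145388496),
    ("2019-04-18T07:10:07+00:00", 145390420),
    ("2019-04-18T07:11:07+00:00", 145392748),
]


def growth_rate_kib_per_s(samples):
    """Average growth between the first and last sample, in KiB per second."""
    t0 = datetime.fromisoformat(samples[0][0])
    t1 = datetime.fromisoformat(samples[-1][0])
    elapsed = (t1 - t0).total_seconds()
    return (samples[-1][1] - samples[0][1]) / elapsed


rate = growth_rate_kib_per_s(SAMPLES)
# Convert KiB/s to GiB/day (86400 seconds per day, 1024**2 KiB per GiB).
print(f"{rate:.1f} KiB/s, about {rate * 86400 / 1024**2:.2f} GiB/day")
```

On these six samples the index files grow by roughly 34 KiB/s, a little under 3 GiB per day.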

There is only one index being created at the moment and it has been running for 6 days.

I don't know if these numbers are "normal" for a problem of this size, but I would appreciate any help in diagnosing whether this is expected, or whether I should tweak some parameter or check some log to discover what is causing this slowdown.

michael.hunger · April 18, 2019, 8:14am

Hello Toni,

No, this is definitely not normal.

Which Neo4j version is this running?
What kind of disk setup do you have? Do you have any info about the disk performance?
How did you create the index?

michael.hunger · April 18, 2019, 9:02am

How is the I/O load? What kind of disk did you provision?

How big is that store on disk?

Can you add this to your config?

dbms.jvm.additional=-Dorg.neo4j.kernel.impl.index.schema.GenericNativeIndexPopulator.blockBasedPopulation=true

toni · April 18, 2019, 9:19am

Hi Michael,

I'm using Neo4J 3.5.3 from Google Cloud Marketplace (link)
For disks I'm using standard persistent disks from Google. You can check the specs here: Storage options | Compute Engine Documentation | Google Cloud
I created the index with a Cypher query from Neo4J Desktop: "CREATE INDEX ON :Transactions(hash)"
The store size of the DB can be seen below; the disk size is 2 terabytes.

neo4j-enterprise-causal-cluster-1-core-vm-1:/var/lib/neo4j/data/databases$ du -h *
8.0K    graph.db/schema/index/native-btree-1.0/1/profiles
41G     graph.db/schema/index/native-btree-1.0/1
1.6M    graph.db/schema/index/native-btree-1.0/3/profiles
99G     graph.db/schema/index/native-btree-1.0/3
139G    graph.db/schema/index/native-btree-1.0
139G    graph.db/schema/index
139G    graph.db/schema
156K    graph.db/profiles
4.0K    graph.db/index
736G    graph.db
0       store_lock

The DB was created using neo4j-import tool and after that I launched index creation.

How could I check I/O load for the DB?

I'll add that config to the DB and I will report back.
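
On the I/O question, one rough way to get a number on Linux is to read `/proc/diskstats` twice and compare the counters. The sketch below is only an illustration under that assumption, not an official Neo4j tool; the function names (`parse_diskstats`, `io_load`, `sample`) are hypothetical. It reports read and write throughput plus the share of time each device was busy. Tools such as `iostat` report similar figures.

```python
import time

SECTOR_BYTES = 512  # /proc/diskstats counts sectors in 512-byte units


def parse_diskstats(text):
    """Return {device: (sectors_read, sectors_written, ms_doing_io)}."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 14:
            continue  # skip blank or unexpected lines
        # fields: major, minor, name, then the per-device counters
        stats[fields[2]] = (int(fields[5]), int(fields[9]), int(fields[12]))
    return stats


def io_load(before, after, seconds):
    """Per device: (read MB/s, write MB/s, percent of time busy)."""
    load = {}
    for dev, (r1, w1, b1) in after.items():
        if dev not in before:
            continue
        r0, w0, b0 = before[dev]
        load[dev] = (
            (r1 - r0) * SECTOR_BYTES / seconds / 1e6,
            (w1 - w0) * SECTOR_BYTES / seconds / 1e6,
            (b1 - b0) / (seconds * 1000) * 100,
        )
    return load


def sample(interval=10, path="/proc/diskstats"):
    """Take two snapshots `interval` seconds apart and return the load."""
    with open(path) as fh:
        before = parse_diskstats(fh.read())
    time.sleep(interval)
    with open(path) as fh:
        after = parse_diskstats(fh.read())
    return io_load(before, after, interval)
```

Calling `sample()` on the leader while the index is populating gives one tuple per block device. A device that is busy close to 100% of the time while CPU usage stays low would suggest that the disk is the bottleneck.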

toni · April 19, 2019, 7:33am

Ok, yesterday I restarted the DB with

dbms.jvm.additional=-Dorg.neo4j.kernel.impl.index.schema.GenericNativeIndexPopulator.blockBasedPopulation=true

At the beginning it was very fast when I was checking call db.indexes, but now it is stuck. I have two clusters for testing ideas about how to solve this: one has a leader with 128 GB of RAM and the other with 64 GB.

In the first cluster, index creation is stuck (although slowly advancing) at 37%.
In the second cluster, index creation is stuck (although slowly advancing) at 15%.

They were both launched during the same hour, and index creation progress seems to follow the memory ratio. Does that ring a bell for you? I'm running out of ideas. I don't know if it is relevant, but this machine doesn't have any swap, just RAM.

michael.hunger · June 26, 2019, 10:37am

Sorry for the delay, answer from our team:

Took a quick look - looks like they're on 3.5.3? I think they need to be on 3.5.5 or higher to get all the index population fixes.

(and will still need dbms.jvm.additional=-Dorg.neo4j.kernel.impl.index.schema.GenericNativeIndexPopulator.blockBasedPopulation=true)

dsolow · June 27, 2019, 11:55pm

Hi Michael. I'm running into a similar problem where my index building is taking a prohibitively long time (~7 hours to index a node property on 200M nodes). I've tried using this blockBasedPopulation setting, but building the index with it enabled causes an out-of-memory crash. The index builds for a while (around an hour) with low memory usage. Then memory usage spikes, and the DB gets OOM-killed.

Any ideas on how this could be resolved? I'm using 3.5.6

dsolow · July 3, 2019, 3:35pm

It looks like this commit in 3.5.7 might fix my issue: Better reuse of buffers in BlockBasedIndexPopulator · neo4j/neo4j@2095b37 · GitHub

Giving it a try now.

dsolow · July 3, 2019, 4:43pm

It works! Index that was taking 7 hours to build now takes 45 minutes, and uses pretty much constant memory.

david_rosenblum · July 3, 2019, 9:39pm

The type of disk, OS and drivers/firmware also affect this.
Data imported on Ubuntu and then indexed took a few hours.
The same data and database imported onto AWS Linux or CentOS 7.4 took less than 10 minutes.
Same SSD and I/O provisioning on both OSes.
Some have better drivers than others.

benjamin.squire · July 8, 2019, 4:13am

This was happening to my database as well, with 488 GB of RAM on an AWS i3.16xlarge: after 45 minutes the entire DB was being killed. We are moving to 3.5.7 now to see if it fixes the issue.

Update - 3.5.7 has fixed it, the machine no longer dies on index creation.
