How are ONLINE indices created?

aanastasiou · November 23, 2018, 11:22am

Is it possible to get some sort of overview of the way ONLINE indices are created and used in Neo4j?

I am importing a large number of items (on the order of millions) and I was wondering whether, in the long term, it would be worth triggering the indexing manually or breaking the data import into batches, so that the indices include the latest items loaded.

The specific questions that I have are:

1. Is it possible to manually trigger the indexing process? (So that it starts working in the background earlier than it would be scheduled.)
2. Are the indices block or incremental? That is, do indices need to operate on a block of data to be more efficient, or do they operate incrementally?
3. Is there any list that outlines what sort of indices are used for specific data types? (E.g. the spatial indices are space-filling curves. Is it safe to assume that anything else is probably a B-Tree?)

All the best

michael.hunger · November 28, 2018, 10:52pm

Hi,

If you create an index and have no data, it becomes online immediately.
From then on, all operations/transactions will use the index for reading and writing.

If you already have data, the index goes through a concurrent background population phase, after which it is switched online.

So if you import data that needs to read from the index (e.g. to create relationships or assert uniqueness), create the index up front, before that part of the import, and make sure it is online.

There are procedures like db.awaitIndexes(timeout) to wait for that.
And for resampling indexes (which affects the selectivity they report to the Cypher planner) you can use
call db.resampleOutdatedIndexes()
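
As a rough sketch of that sequence for a Neo4j 3.x import: the label `Item` and property `id` below are hypothetical placeholders, the 300-second timeout is an arbitrary choice, and the statements are meant to be run one at a time. Later Neo4j versions use a different `CREATE INDEX` syntax, so treat this as illustrative only.

```cypher
// 1. Create the index before the part of the import that reads from it.
//    (:Item and id are hypothetical names used only for illustration.)
CREATE INDEX ON :Item(id);

// 2. Block until all indexes are online, waiting at most 300 seconds.
CALL db.awaitIndexes(300);

// 3. Optionally check the reported state of each index (e.g. ONLINE or POPULATING).
CALL db.indexes();

// 4. After a large import, refresh the selectivity statistics used by the planner.
CALL db.resampleOutdatedIndexes();
```

If the data is already loaded when the index is created, step 2 is where the background population phase described above would be waited on.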

Not sure what you mean by (2).
(3) yes.

HTH

aanastasiou · November 28, 2018, 11:24pm

Hi Michael

Thanks, this is useful, although it won't help too much with the specific problem that motivated this question.

What I mean by (2) is: does the index get updated with each write operation, or does it wait for N write operations and then process them together? But if they are B-trees, then they probably get updated on each write.

All the best

michael.hunger · November 29, 2018, 9:02am

The index is updated with each operation, as Neo4j is transactional and gives the transactional guarantee.

Only the fulltext index in 3.5 can be configured to be eventually consistent.
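
For illustration, a minimal sketch of how that option might be set when creating a fulltext index in 3.5. The index name `itemText`, the label `Item`, and the property `description` are hypothetical, and the exact configuration keys should be checked against the documentation for the version in use.

```cypher
// Create a fulltext index whose updates are applied in the background
// rather than as part of each committing transaction.
// (itemText, Item and description are hypothetical names.)
CALL db.index.fulltext.createNodeIndex(
  "itemText", ["Item"], ["description"],
  {eventually_consistent: "true"}
);

// Wait until queued background updates have been applied,
// e.g. before querying right after a bulk write.
CALL db.index.fulltext.awaitEventuallyConsistentIndexRefresh();
```

The trade-off is that writes commit without waiting for the fulltext index, at the cost of queries briefly not seeing the most recent changes.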
