Index Store

Hello community, I have one question.

We have one node type that gets more than 5 million new nodes per month. To meet our license terms, we have to let our customers browse the last six months, and we have to be able to give them the last twelve months on request. Since that data is updated constantly, we have 3 indexes on it (the minimum we were able to reduce it to, given "indexes will fix speed").

Let's say the node label is "CustomerStats". We would like to keep all the data in the store, including data older than twelve months. So we decided to create another label, "CustomerStatsArchive", and put a single index on it (on creation time only).

My question is, if I do something like this:

MATCH (s:CustomerStats)
WHERE $from <= s.createdAt <= $to
SET s:CustomerStatsArchive
REMOVE s:CustomerStats

will all the indexes be removed, or do they stay even if I change the label (remove it from the node)?

Indexes are only removed when you submit an explicit DROP INDEX ON ....
You can also view the currently configured indexes via CALL db.indexes();.

Finally, your example that sets the new label :CustomerStatsArchive and removes the current label :CustomerStats is syntactically valid. However, it does not move the index, presuming there was an index on :CustomerStats. Presumably you would want to create a new index on :CustomerStatsArchive.
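As a rough sketch of that, assuming Neo4j 3.5 syntax and that createdAt is the property you want indexed on the archive label (as described in your question), it could look something like this:

```cypher
// List the indexes that are currently configured (Neo4j 3.5 procedure).
CALL db.indexes();

// Create the single index for the archive label.
CREATE INDEX ON :CustomerStatsArchive(createdAt);

// Indexes on :CustomerStats stay in place until they are dropped explicitly, e.g.:
// DROP INDEX ON :CustomerStats(createdAt);
```

The statements are meant to be run one at a time (or in cypher-shell, which accepts them separated by semicolons). The DROP is left commented out since you presumably still need that index for the nodes that keep the :CustomerStats label.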

If you're talking about removing the node itself from any indexes that are on :CustomerStats nodes, then yes. Since you're removing the :CustomerStats label from that node, it will be removed from any index that is present for :CustomerStats nodes, and subsequently any index lookup using :CustomerStats will not find or return the node.

Thanks Andrew, that's the info I need.

No, it's not working. When I remove the label from a node, the index is not removed from the index store. Now, index store is completely destroyed and I have to rebuild index from ground.

35 million records were changed and the index store grew from 30GB to 40GB. The index for the new label was created; the old ones were not removed.

I think we're talking about different things here. When a label is removed from a node and that transaction is committed, that node should not be able to be looked up from any index that was associated with the label.

Now, index store is completely destroyed and I have to rebuild index from ground.

What do you mean by this? Destroyed? What exactly are you doing here?

There's some kind of misunderstanding going on here. Can you describe in more detail what you're doing, what an "index store" is to you, and what you mean by the store being destroyed?

So let me start by showing what I'm talking about.

If you have an index on :Person(name), then when you use a MATCH or a MERGE on a node with a :Person label and include the appropriate predicate on the name property, depending on the rest of the query, there's the possibility for an index to be used.

If you remove the :Person label from a certain node (and I mean using the REMOVE clause, not just removing the label from the node in the query), then you cannot MATCH to it (or MERGE to it and have it match an existing node) by label, because that label no longer exists on that node. And since the presence of the label in the query itself is needed for an index lookup to occur, it shouldn't be able to be looked up via index (or it could try, but it wouldn't find it).
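To make that concrete, here is a minimal sketch in Neo4j 3.5 syntax. The name value 'Alice' is made up for illustration, and whether the planner actually picks the index depends on the rest of the query:

```cypher
// Assumed setup: an index on :Person(name).
CREATE INDEX ON :Person(name);

// Label plus a predicate on the indexed property: the planner can use the index here.
MATCH (p:Person) WHERE p.name = 'Alice' RETURN p;

// Remove the label from that node with the REMOVE clause.
MATCH (p:Person) WHERE p.name = 'Alice' REMOVE p:Person;

// The same lookup now returns nothing, because the node no longer carries :Person.
MATCH (p:Person) WHERE p.name = 'Alice' RETURN p;
```

The node itself still exists after the REMOVE. It just can't be reached through :Person or through any index defined on :Person.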

Like I said, we are removing one label from the nodes and adding a new label (to use as an archive). When the old label is removed, the index store (index folder) stays the same size as it was before we removed the label from the nodes.

When I said "destroyed", I mean search by index is now increased a lot, like it's not "live" but it is.

It's no problem, we will rebuild the indexes (remove the index folder and restart the database), but to me it looks like there is a problem with indexes when you remove a label from a node.

You likely wouldn't see the index shrink, especially if you're just adding another label (which could be associated with other indexes).

I really think you need to include the queries you're trying here. Again, there seems to be some fundamental misunderstanding about how indexes work, and we need to see some concrete queries and query plans to determine what it is.

For example, where you say "search by index is now increased a lot", can you add an explanation with the queries you're referring to and their query plans, along with the behavior you see vs what you would expect?

USING INDEX :CustomerStats(createdAt)

When I removed the label :CustomerStats and added :CustomerStatsArchive, filtering the nodes that remained with :CustomerStats took about triple the time. A query that used to take 3-5 seconds now takes 10-12.

After I removed the index "/schema/index/native-btree-1.0/xxx" and left Neo4j to rebuild it, the query is back to normal. No idea what's happening, but I just wanted to share my findings with you.

Maybe it's a problem with our DB installation (we are on 3.5.4), but who knows. Anyway, it's sorted now. It took time, but we will have to pay attention to something like this in the future.

Anyway, Andrew, thanks for the help.

Did you only try one execution, or multiple? What did the query look like? At this point we don't know where the time was spent on this, whether it was in planning, in index lookup, execution, or if there were pagefaults.

We would need more info to determine if this was really the fault of the index or something else. You should be profiling your query, and maybe turning on query logging (with time logging enabled for it) so you can see where the time was spent.
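For the profiling part, a minimal sketch, assuming the slow query looks like the range filter from the first post (the actual query hasn't been shared, so the shape and the count(s) return are guesses):

```cypher
// Hypothetical query shape, based on the range filter in the original question.
// $from and $to must be supplied as parameters when running it.
PROFILE
MATCH (s:CustomerStats)
WHERE $from <= s.createdAt <= $to
RETURN count(s);
```

The plan shows which operator was used to find the nodes (an index seek vs a label scan) and the db hits per operator. Run it several times, since the first execution includes planning and a cold page cache. For query logging on 3.5, the relevant neo4j.conf settings should be dbms.logs.query.enabled=true and dbms.logs.query.time_logging_enabled=true, plus dbms.logs.query.page_logging_enabled=true if you want page hits and page faults in the log.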