Why so many DB hits?

I have a Cypher query:

MATCH (x:My_Label)
WHERE x.DateProperty.epochMillis <= 1679443200000
RETURN count(x)
  • There are approximately 7M nodes in my DB.
  • My_Label refers to approximately 500k nodes.
  • Of the 500k nodes, about 45k match the where clause.
  • The query takes about 10 seconds to run from cold and about 4 seconds on reruns.
  • The query planner shows 500k db hits for stage 1 of the query (NodeByLabelScan), but then shows 28M db hits for stage 2 (Filter). See the PROFILE snippet after this list.
  • The server has 128 GB of RAM and only runs Neo4j.
  • The server is not under any particular load.
  • Each node has a few small properties on it.
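
For reference, the db hit figures above come from running the query under PROFILE, i.e. something along these lines:

PROFILE
MATCH (x:My_Label)
WHERE x.DateProperty.epochMillis <= 1679443200000
RETURN count(x)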

So here are my questions:

  1. With all that free RAM, I'd expect the initial result set of 500k nodes to be loaded into memory. Is this the case? If not, why not?
  2. Why are we seeing 28M db hits for a filter on a single property? If we have a 500k node dataset and we need to compare a property on each of those nodes, surely that's only around 1M db hits (roughly one hit to touch each node plus one to read the property)?

I've redacted and altered a couple of bits of info from the plan, but this is essentially what I'm seeing:

Help plix?

Do you have an index on the date property? That will improve range comparisons on large datasets like this:

CREATE RANGE INDEX FOR (t:My_Label) ON (t.DateProperty)
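
One thing to be aware of, though: I don't believe the planner will use that index while the predicate is on the epochMillis sub-component rather than on the property itself. Assuming DateProperty is stored as a datetime, rewriting the comparison against a temporal value should let the index back the filter:

MATCH (x:My_Label)
// compare the stored temporal value directly so the range index can be used
WHERE x.DateProperty <= datetime({epochMillis: 1679443200000})
RETURN count(x)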

Thanks. I'll try this, but I still don't get why I'm seeing circa 30M db hits.