I have a set of nodes with a creationDate property that is indexed, and I am trying to count how many nodes were created for every day in a given date range, including days for which none were created. To do this for the month of December, I tried:
// Query 1
MATCH (m:item)
WHERE datetime('2019-12-01') <= m.creationDate < datetime('2020-01-01')
RETURN date(m.creationDate) AS date, COUNT(*) AS count
ORDER BY date
However, this query does not return dates for which there are no nodes. I managed to get this working:
// Query 2
WITH date('2019-12-01') AS startDate, date('2019-12-31') AS endDate
WITH startDate, duration.inDays(startDate, endDate).days AS days
WITH [day in range(0, days) | startDate + duration({days: day})] AS dates
UNWIND dates AS date
MATCH (m:item)
WHERE date(m.creationDate) = date
RETURN date, COUNT(*) AS count
The problem is that query 2 takes 27 seconds, while query 1 takes 1 second. This makes sense–query 2 is essentially running query 1 once for each day in the range.
I know I can post-process the data in my code to fill in the gaps, but I would prefer not to have to do the extra step if possible. Is there a way to do this in Cypher without the time penalty of rerunning query 1 once for each day in the range?