Counting nodes based on date

mventimi · February 1, 2020, 2:58am

I have a set of nodes with a creationDate property that is indexed, and I am trying to count how many nodes were created for every day in a given date range, including days for which none were created. To do this for the month of December, I tried:

// Query 1
MATCH (m:item)
WHERE datetime('2019-12-01') <= m.creationDate < datetime('2020-01-01')
RETURN date(m.creationDate) AS date, COUNT(*) AS count
ORDER BY date

However, this query does not return dates for which there are no nodes. I managed to get this working:

// Query 2
WITH date('2019-12-01') AS startDate, date('2019-12-31') AS endDate
WITH startDate, duration.inDays(startDate, endDate).days AS days
WITH [day in range(0, days) | startDate + duration({days: day})] AS dates
UNWIND dates AS date
MATCH (m:item)
WHERE date(m.creationDate) = date
RETURN date, COUNT(*) AS count

The problem is that query 2 takes 27 seconds, while query 1 takes 1 second. This makes sense–query 2 is essentially running query 1 once for each day in the range.

I know I can post-process the data in my code to fill in the gaps, but I would prefer not to have to do the extra step if possible. Is there a way to do this in Cypher without the time penalty of rerunning query 1 once for each day in the range?

Umer · July 14, 2022, 12:43pm

just faced the same issue and got a solution:

MATCH (m:item)
return date(m.creationDate) as date,count(*) as total
UNION
WITH date('2022-06-01') AS startDate, date('2022-07-07') AS endDate
WITH startDate, duration.inDays(startDate, endDate).days AS days
WITH [day in range(0, days) | startDate + duration({days: day})] AS dates
UNWIND dates AS date
MATCH (m:item)
where date(m.creationDate)<>date
return date,0 as total

glilienfield · July 15, 2022, 2:27am

This seems to work. It may look a little convoluted, but it does not require a second search to fill in the missing dates. Instead it does a little processing to get the missing dates and creates a map using a apoc procedure. I borrowed @Umer's logic for generating all the dates between two given dates.

with date('2020-01-01') AS startDate, date('2020-01-10') AS endDate
with duration.inDays(startDate, endDate).days AS days, startDate, endDate
with [day in range(0, days) | toString(startDate + duration({days: day}))] AS allDates, startDate, endDate
match (s:Sample) where startDate <= s.creationDate <= endDate
with toString(s.creationDate) as date, count(s) as dateCounts, allDates
with collect(date) as dateList, collect(dateCounts) as countList, allDates
with [i in allDates where not i in dateList | i] as zeroDates, dateList, countList
with zeroDates, [i in zeroDates | 0] as zeros, dateList, countList
return apoc.map.fromLists(dateList + zeroDates, countList + zeros) as result

Test Data (I used @daveb's script):

unwind(['2020-01-01', '2020-01-01', '2020-01-02', '2020-01-03', '2020-01-06', '2020-01-09', '2020-01-09', '2020-01-02', '2020-01-03', '2020-01-06', '2020-01-09']) as dt 
create (:Sample { creationDate: date(dt) });

Result:

{
  "2020-01-10": 0,
  "2020-01-01": 2,
  "2020-01-02": 2,
  "2020-01-03": 2,
  "2020-01-04": 0,
  "2020-01-05": 0,
  "2020-01-06": 2,
  "2020-01-07": 0,
  "2020-01-08": 0,
  "2020-01-09": 3
}

glilienfield · July 15, 2022, 12:37pm

I refactored the query using a different apoc function. I think it is slightly less complicated as a result.

with date('2020-01-01') AS startDate, date('2020-01-10') AS endDate
WITH duration.inDays(startDate, endDate).days AS days, startDate, endDate
WITH [day in range(0, days) | toString(startDate + duration({days: day}))] AS allDates, startDate, endDate
match (s:Sample) where startDate <= s.creationDate <= endDate
with toString(s.creationDate) as date, count(s) as dateCounts, allDates
with collect([date, dateCounts]) as dateCounts, collect(date) as datesWithCounts, allDates
with [i in allDates where not i in datesWithCounts | [i, 0]] as datesWithZeroCounts, dateCounts
return apoc.map.fromPairs(dateCounts + datesWithZeroCounts) as result

Topic		Replies	Views
Counting nodes based on date Cypher	2	1508	February 2, 2020
Getting the node counts between 10 minutes periods Cypher cypher	1	222	October 15, 2021
Cypher for counting node property that meet certain conditions Cypher	4	280	April 19, 2021
Creating date nodes by looping through a start date and an end date Procedures & APOC	3	862	October 22, 2020
Count nodes by property value Cypher querying , cypher	2	445	July 29, 2021

Counting nodes based on date

Related topics