Count and nodes for each day of week and each hour

I'd like to query and count orders in my database and return a row for each day of the week (Mon-Sun) and 24 columns for each hour of the day (0:00 - 23:00). The return should be like this:

DayOfWeek	| 0:00 |	 1:00 |	 2:00 |	 3:00 |	 4:00 |	 5:00 |	 ... |	 22:00 |	 23:00 | 
Mon						0			0			 1			2				0			1			...			 0	 		5	
Tue						1			0			 0			0				0			2			...			 2	 		2
Wed						0			0			 1			3				5			0			...			 4	 		3
Thu						4			0			 1			0				0			1			...			 3	 		9
Fri						0			0			 0			2				2			6			...			10	 		6
Sat						2			0			 0			6				0			1			...			12	 		9
Sun						1			0			 1			2				1			4			...			 0	 		2

My cypher currently looks like this:

MATCH (o:Order)
WITH datetime(o.createdAt) as dt, o 
WITH ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"][dt.dayOfWeek] as WeekDay, dt.hour as Hour, o
RETURN WeekDay, Hour, count(o) as Orders

Which returns the number of orders for a single day and hour.

I need to transform this query that I have a fixed result of 7 rows (for each day) and add the results for the count of order into the respective columns 0-23. Furthermore, I'd need to fill the rows with zeros if there are no order for some hours.

Any ideas how I could achieve this without implementing 7*24 subquerys for each hour of the day?

It depends on how much data you have, but you could consider using list comprehensions to do your filtering. Something like this: (You will have to adjust)

UNWIND [{ "name": "Mon", idx: 0 }, { name: "Tue", idx: 1 } /* ... */ ] AS WeekDay
MATCH (o:Order)
WHERE datetime(o.createdAt).dayOfWeek = WeekDay.idx
WITH WeekDay, collect(datetime(o.created_at)) as OrderTimestamps
RETURN 
     WeekDay.name as DayName,
    size([ts IN OrderTimestamps WHERE ts.hour = 0]) as zeroHour,
    size([ts IN OrderTimestamps WHERE ts.hour = 1]) as oneHour,
 /* ...  */

The way this query works (will need some adjustment) is that it starts with the day names, and collects up all orders that happened on that day into a list of timestamps (OrderTimestamps)

Then, to output how many are in zero hour, 1st hour, etc -- it uses a list comprehension to repeatedly filter the list of timestamps to only a list of those in the zero hour. It then counts the size of that list with size(), so you get counts per-hour, per-day.

2 Likes

Thank you, that helped me quite a lot!

I continued with your example:

WITH [{idx: 1, name: 'Mon'}, {idx: 2, name: 'Tue'}, {idx: 3, name: 'Wed'}, {idx: 4, name: 'Thu'}, {idx: 5, name: 'Fri'}, {idx: 6, name: 'Sat'}, {idx: 7, name: 'Sun'} ] AS WeekDays
UNWIND WeekDays as WeekDay
MATCH (o:Order)
WHERE datetime(o.createdAt).dayOfWeek = WeekDay.idx
WITH WeekDay, collect(datetime(o.createdAt)) as OrderTimestamps
RETURN 
	WeekDay.name as DayName,
	size([ts IN OrderTimestamps WHERE ts.hour = 0]) as h00,
	size([ts IN OrderTimestamps WHERE ts.hour = 1]) as h01,
	size([ts IN OrderTimestamps WHERE ts.hour = 2]) as h02,
	size([ts IN OrderTimestamps WHERE ts.hour = 3]) as h03,
	size([ts IN OrderTimestamps WHERE ts.hour = 4]) as h04,
	size([ts IN OrderTimestamps WHERE ts.hour = 5]) as h05,
	size([ts IN OrderTimestamps WHERE ts.hour = 6]) as h06,
	size([ts IN OrderTimestamps WHERE ts.hour = 7]) as h07,
	size([ts IN OrderTimestamps WHERE ts.hour = 8]) as h08,
	size([ts IN OrderTimestamps WHERE ts.hour = 9]) as h09,
	size([ts IN OrderTimestamps WHERE ts.hour = 10]) as h10,
	size([ts IN OrderTimestamps WHERE ts.hour = 11]) as h11,
	size([ts IN OrderTimestamps WHERE ts.hour = 12]) as h12,
	size([ts IN OrderTimestamps WHERE ts.hour = 13]) as h13,
	size([ts IN OrderTimestamps WHERE ts.hour = 14]) as h14,
	size([ts IN OrderTimestamps WHERE ts.hour = 15]) as h15,
	size([ts IN OrderTimestamps WHERE ts.hour = 16]) as h16,
	size([ts IN OrderTimestamps WHERE ts.hour = 17]) as h17,
	size([ts IN OrderTimestamps WHERE ts.hour = 18]) as h18,
	size([ts IN OrderTimestamps WHERE ts.hour = 19]) as h19,
	size([ts IN OrderTimestamps WHERE ts.hour = 20]) as h20,
	size([ts IN OrderTimestamps WHERE ts.hour = 21]) as h21,
	size([ts IN OrderTimestamps WHERE ts.hour = 22]) as h22,
	size([ts IN OrderTimestamps WHERE ts.hour = 23]) as h23

Instead of executing 7 MATCH queries during UNWIND, I changed it to 1 MATCH query for all orders and do the filtering of DayOfWeek and Hour in the list comprehension on the timestamps. I'm not sure it is actually faster, I would just guess so?

MATCH (o:Order)
WITH collect(datetime(o.createdAt)) as OrderTimestamps, [{idx: 1, name: 'Mon'}, {idx: 2, name: 'Tue'}, {idx: 3, name: 'Wed'}, {idx: 4, name: 'Thu'}, {idx: 5, name: 'Fri'}, {idx: 6, name: 'Sat'}, {idx: 7, name: 'Sun'} ] AS WeekDays
UNWIND WeekDays as WeekDay
RETURN 
	WeekDay.name as DayName,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and ts.hour = 0]) as h00,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and ts.hour = 1]) as h01,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and ts.hour = 2]) as h02,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and  ts.hour = 3]) as h03,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and ts.hour = 4]) as h04,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and ts.hour = 5]) as h05,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and  ts.hour = 6]) as h06,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and  ts.hour = 7]) as h07,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and  ts.hour = 8]) as h08,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and  ts.hour = 9]) as h09,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and  ts.hour = 10]) as h10,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and  ts.hour = 11]) as h11,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and  ts.hour = 12]) as h12,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and  ts.hour = 13]) as h13,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and  ts.hour = 14]) as h14,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and  ts.hour = 15]) as h15,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and  ts.hour = 16]) as h16,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and  ts.hour = 17]) as h17,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and  ts.hour = 18]) as h18,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and  ts.hour = 19]) as h19,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and  ts.hour = 20]) as h20,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and  ts.hour = 21]) as h21,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and  ts.hour = 22]) as h22,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and  ts.hour = 23]) as h23

Is there any way to further improve this cypher query?

I further improved my Cypher query with a variable StartDate and calculated EndDate, which will be calculated by adding X days/weeks/months etc. onto the StartDate. Moreover, instead of providing a static list of WeekDays like in my example above, I now generate a list of days starting from the StartDate and iteratively add 0, 1, 2, ..., 6 days. So my week does not necessarily start at Monday, but at the given StartDate and ends 6 days later (the EndDate is irrelevant here).

I found this post as a very helpful: Creating date nodes by looping through a start date and an end date - #3 by tony.chiboucas

WITH datetime("2021-03-01") AS StartDate
WITH StartDate, StartDate + duration({days: 7}) AS EndDate // Add 7 days because we use < symbol in the WHERE clause
MATCH (o:Order)
WHERE datetime(o.createdAt) >= StartDate AND datetime(o.createdAt) < EndDate //IMPORTANT: use less than (<) EndDate 
WITH StartDate, EndDate, [i IN range(0, 6) | StartDate+duration({days: i})] as WeekDays, collect(datetime(o.createdAt)) as OrderTimestamps
UNWIND WeekDays AS WeekDay
RETURN 
	apoc.date.format(WeekDay.epochMillis, 'ms', 'E') as WeekDay,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 0]) as h00,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 1]) as h01,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 2]) as h02,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 3]) as h03,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 4]) as h04,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 5]) as h05,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 6]) as h06,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 7]) as h07,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 8]) as h08,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 9]) as h09,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 10]) as h10,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 11]) as h11,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 12]) as h12,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 13]) as h13,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 14]) as h14,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 15]) as h15,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 16]) as h16,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 17]) as h17,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 18]) as h18,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 19]) as h19,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 20]) as h20,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 21]) as h21,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 22]) as h22,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 23]) as h23