Select top 3 within each group


I am just beginning to learn Cypher on Neo4j Desktop.

If I have a data set that looks like this:

year subject num_students
2014 python 24
2014 java 29
2014 c# 12
2015 python 34
2105 java 31
2015 c# 23
2015 angular 32
2015 react 51
2015 nodejs 12
2016 php 13
2016 python 44
2016 java 41
2016 go 23

How do I select the top 3 programming courses (subject) for each year based on the number of students enrolled. I tried LIMIT but it limits the whole result set.

Thank you!

There's a knowledge base article on limiting results per row that can help you out here.

You could use an APOC subquery, which would allow you to use LIMIT in the subquery. Or you can take the slice of a collection.

Here's the collection approach, assuming we have the given rows already using the headers as variables currently in-scope:

WITH year, subject, num_students
ORDER BY num_students DESC
RETURN year, collect(subject)[..3] as topThree

First the results are ordered, and that order is preserved when we collect() per year, and we take the top slice of 3 from the ordered collection.

We have some Cypher tricks in the to-be-announced 4.x which will give an easier approach for these kinds of queries.

1 Like

Collect did the trick. Thanks a ton for the explanation. Liking the product very much!