My team and I are trying to write a query that counts the Info nodes held by more than a given number of people. In Cypher terms, here's our query:
MATCH (info:Info)
WITH info, size((:Person)-[:HAS_INFO]->(info)) as peopleCount
WHERE peopleCount > 3
RETURN count(info)
We currently have around 150,000 Info nodes in the database, and the profile of this query is pretty terrible. Here's what we can see:
Try this:
MATCH (p:Person)-[:HAS_INFO]->(i:Info)
WITH id(i) as ID, count(distinct p) as Cnt where Cnt >= 3
RETURN ID as infoID, Cnt as peopleCount ORDER BY peopleCount DESC LIMIT 20
Thank you for your reply. Your suggestion is indeed a little faster (around 150 ms), but that still makes me wonder how it will perform with a much larger number of Info nodes (millions of them).
I had to edit your query to get what I want out of it, so here's what I have:
MATCH (p:Person)-[:HAS_INFO]->(info:Info)
WITH id(info) as ID, count(p) as peopleCount
WHERE (peopleCount >= 3)
RETURN count(ID)
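A variation that may be worth profiling, as a sketch only: it assumes every HAS_INFO relationship pointing at an Info node starts from a Person node, which is an assumption about the data model and not something confirmed in this thread. When the pattern inside size() leaves the other end unlabeled, Neo4j can usually read the relationship count stored on the node itself instead of expanding to each Person:
// Assumption: only Person nodes have outgoing HAS_INFO relationships,
// so the incoming HAS_INFO degree of an Info node equals its people count.
MATCH (info:Info)
WITH info, size((info)<-[:HAS_INFO]-()) AS peopleCount
WHERE peopleCount >= 3
RETURN count(info)
On Neo4j 5, where size() over a pattern is no longer supported, COUNT { (info)<-[:HAS_INFO]-() } would be the equivalent expression. Whether this actually helps should be checked with PROFILE on the real data.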
Also, the actual query is a little bigger than the simplified one above; I tried to reduce the problem by showing only part of it. If you want the real query, here it is:
MATCH (info:Info)-[:MATCHES]->(pattern:Pattern)-[:PART_OF]->(patternGroup:PatternGroup)
WHERE ($sha256 = [] OR info.sha256 IN $sha256) AND
($patternGroups IS NULL OR patternGroup.id IN $patternGroups) AND
(info.likelihood >= 0.5)
WITH info, pattern, patternGroup, size((:Person)-[:HAS_INFO]->(info)) as peopleCount
WHERE ($minPeopleCount IS NULL OR peopleCount >= $minPeopleCount) AND
($maxPeopleCount IS NULL OR peopleCount < $maxPeopleCount)
RETURN count(info)
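One possible restructuring of the real query, offered as a sketch under assumptions and not as a measured improvement: filter on the Info properties and the people count first, so the count is computed once per Info node, and only then expand to Pattern and PatternGroup. It assumes that only Person nodes point at Info through HAS_INFO, and that each Info should be counted once even if it matches several patterns (hence count(DISTINCT info), which differs from the original count(info)):
// Assumption: incoming HAS_INFO relationships come only from Person nodes.
MATCH (info:Info)
WHERE info.likelihood >= 0.5
  AND ($sha256 = [] OR info.sha256 IN $sha256)
WITH info, size((info)<-[:HAS_INFO]-()) AS peopleCount
WHERE ($minPeopleCount IS NULL OR peopleCount >= $minPeopleCount)
  AND ($maxPeopleCount IS NULL OR peopleCount < $maxPeopleCount)
// Expand to patterns only for the Info nodes that survived the filters.
MATCH (info)-[:MATCHES]->(:Pattern)-[:PART_OF]->(patternGroup:PatternGroup)
WHERE $patternGroups IS NULL OR patternGroup.id IN $patternGroups
// Assumption: each Info is counted once, not once per matching pattern.
RETURN count(DISTINCT info)
If the per-pattern row count of the original query is what is actually wanted, count(info) can be kept in the last line instead.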