Getting a total, maximum and ratio in query

One of these will probably do it:

...but first, I need to reformat your Cypher so I can make better sense of it.

MATCH (n:Allele)
WITH
    n,
    size((n)<-[:HETEROZYGOUS]-()) AS hetCount,
    2*(size((n)<-[:HOMOZYGOUS]-())) AS homCount,
    (size((n)<-[:HETEROZYGOUS]-()) + 2*(size((n)<-[:HOMOZYGOUS]-()))) AS totCount
WHERE
    totCount > 0
RETURN
    n, hetCount, homCount, totCount,
    max(totCount) AS outOf,
    totCount / max(totCount) AS ratio
ORDER BY totCount ASC
LIMIT 100

Your root issue here is the max in the return clause has an implicit GROUP BY each row. To fix that, I'm sad to say, you're gonna have to run the query twice. Once to get the real totals, and a second time to get the values you're looking for, that needs the total and max to do the math.

MATCH (n:Allele)
WITH max((size((n)<-[:HETEROZYGOUS]-()) + 2*(size((n)<-[:HOMOZYGOUS]-())))) AS totCountMax
MATCH (n:Allele)
WITH
    n, totCountMax,
    size((n)<-[:HETEROZYGOUS]-()) AS hetCount,
    2*(size((n)<-[:HOMOZYGOUS]-())) AS homCount,
    hetCount + homCount AS totCount
WHERE
    totCount > 0
RETURN
    n, hetCount, homCount, totCount,
    totCountMax AS outOf,
    totCount / totCountMax AS ratio
ORDER BY totCount ASC
LIMIT 100

A better way.

Rather than doing all the path matching like this, it might be worth mutating the data a little, to make it all run more efficiently.

MATCH (n:Allele)
SET n.hetCount = size((n)<-[:HETEROZYGOUS]-())
SET n.homCount= 2*size((n)<-[:HOMOZYGOUS]-())
;

# Then, this should run better
MATCH (n:Allele)
WITH n, max(n.hetCount + n.homCount) as totCountMax
. . .

Lastly, a few references that might help with similar problems:

https://neo4j.com/developer/cypher/aggregation-returns-functions/

2 Likes