What sampling method is used when calculating standard deviation using "stDev()" function

khanhtran.ta.kt · June 6, 2025, 6:03am

On this Neo4j Documentation page, it says that the function stDev():

Returns the standard deviation for the given value over a group for a sample of a population.

But it does not say anything about the sampling method. So, my question is: which sampling method is being used?

glilienfield · June 6, 2025, 11:05pm

It is an aggregating function. As such, it calculates the standard deviation of the given expression over the rows in the query result.

If you have a grouping value, it will aggregate over the rows for each grouping. Here are examples.

Sample Data:

create
  (:Employee {id: 0, salary: 45000, dept: "IT"}),
  (:Employee {id: 1, salary: 110000, dept: "IT"}),
  (:Employee {id: 2, salary: 75000, dept: "IT"}),
  (:Employee {id: 3, salary: 95000, dept: "HR"}),
  (:Employee {id: 4, salary: 225000, dept: "HR"}),
  (:Employee {id: 5, salary: 150000, dept: "FINANCE"}),
  (:Employee {id: 6, salary: 200000, dept: "FINANCE"}),
  (:Employee {id: 7, salary: 125000, dept: "FINANCE"})

Data:

match(n:Employee)
return n.dept, n.salary

Result of aggregating over the dept:

match(n:Employee)
return n.dept, stdev(n.salary)

Result of aggregating over the entire company (no grouping):

match(n:Employee)
return stdev(n.salary)

This is how all the aggregation functions behave.

khanhtran.ta.kt · June 10, 2025, 3:07am

Hi Glilienfield,

Thanks for your answer. However, I'm interested in knowing about the sampling method used in the function itself.

john.stegeman · June 11, 2025, 11:21am

It's not that Neo4j takes a sample. The formula for standard deviation has two versions: one for when your data represents the whole population, and another for when your data represents a sample (a subset) of the population.
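
For reference, the two versions of the formula can be written out as below. This is the standard textbook notation rather than anything taken from Neo4j's source: N is the number of values in the group, x_i are the values, and the symbols s, σ, x̄ and μ are the usual conventions, introduced here only for illustration.

```latex
% Sample standard deviation: divide by N - 1 (Bessel's correction)
s = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}\left(x_i - \bar{x}\right)^2}

% Population standard deviation: divide by N
\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(x_i - \mu\right)^2}
```

As far as the Cypher manual describes it, stDev() corresponds to the first form (N - 1 in the denominator), and stDevP() is the companion function for the second form, to be used when the rows cover the entire population. So "sample" in the documentation refers to which divisor is used, not to Neo4j selecting a subset of your rows.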
