Making Sense of Sample Standard Deviation
I had an opportunity to explain sample standard deviation to a colleague last week. There is a lot of technical jargon surrounding it, like unbiased estimator and Bessel's correction, but it's not that hard to build an intuition for it.
Consider a set of $n$ samples $x_1, x_2, \ldots, x_n$ drawn independently from a distribution with true mean $\mu$ and true variance $\sigma^2$. The variance is defined as the expected squared deviation from the true mean:

$$\sigma^2 = E\left[(X - \mu)^2\right].$$

An important thing to notice here is that we usually don't know the true mean $\mu$; all we can compute from the data is the sample mean:

$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i.$$

If we simply applied this formula to estimate the variance $v$, with the sample mean standing in for the true mean, as in:

$$v = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2,$$

we would systematically underestimate the variance, because of the following theorem.
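To make the formula concrete, here is a minimal sketch of that naive estimate in Python (my own illustration, assuming NumPy; the distribution and variable names are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)

# Draw n samples from a distribution whose true variance we know:
# a normal distribution with mean 5 and standard deviation 2 (true variance 4).
n = 10
samples = rng.normal(loc=5.0, scale=2.0, size=n)

# Sample mean and the naive variance estimate that divides by n.
sample_mean = samples.sum() / n
naive_variance = ((samples - sample_mean) ** 2).sum() / n

print("sample mean   :", sample_mean)
print("naive variance:", naive_variance)  # tends to come out below the true 4.0
```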
Theorem: the sum of squares of differences between the samples and x has its unique minimum where x is equal to the sample mean #
This was a hard lemma for me to prove. I had a sleepless night or two, but I couldn't come up with a proof for the life of me. Then Min Xu introduced me to a well-known proof, which goes as follows:
Let

$$f(x) = \sum_{i=1}^{n} (x_i - x)^2.$$

Expanding each term inside the sum yields:

$$f(x) = \sum_{i=1}^{n} \left( x_i^2 - 2 x_i x + x^2 \right).$$

Using the commutative property of addition, we get:

$$f(x) = \sum_{i=1}^{n} x_i^2 - 2x \sum_{i=1}^{n} x_i + n x^2.$$

Now, substitute the definition of the sample mean, $\sum_{i=1}^{n} x_i = n\bar{x}$, and divide by $n$ to get:

$$\frac{f(x)}{n} = \frac{1}{n} \sum_{i=1}^{n} x_i^2 - 2\bar{x} x + x^2.$$
Since this is a second-order polynomial in $x$ with a positive leading coefficient, the function diverges to positive infinity as $x$ goes to either negative or positive infinity, and it has a unique minimum where its first derivative is zero.
Take the derivative of this function and set it to zero:

$$\frac{d}{dx} \left( \frac{f(x)}{n} \right) = -2\bar{x} + 2x = 0 \quad \Longrightarrow \quad x = \bar{x}.$$

Since $f(x)$ and $f(x)/n$ are minimized at the same point, $f(x)$ attains its unique minimum at $x = \bar{x}$, the sample mean.
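If you prefer a numerical sanity check to the algebra, the following sketch (my own, not part of the original proof) evaluates the sum of squares over a grid of candidate values of x and confirms that the minimizer sits at the sample mean:

```python
import numpy as np

rng = np.random.default_rng(1)
samples = rng.normal(loc=5.0, scale=2.0, size=20)

def sum_of_squares(x, samples):
    """f(x) = sum_i (x_i - x)^2, the quantity the theorem is about."""
    return ((samples - x) ** 2).sum()

# Evaluate f on a fine grid around the data and locate its minimum.
grid = np.linspace(samples.min() - 1.0, samples.max() + 1.0, 100_001)
values = np.array([sum_of_squares(x, samples) for x in grid])
minimizer = grid[values.argmin()]

print("grid minimizer:", minimizer)
print("sample mean   :", samples.mean())  # agrees with the minimizer up to grid spacing
```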
Corollary: the sum of the squares of the differences between samples and their mean divided by the sample size is a biased estimator of the variance #
Proof:
It follows from the previous theorem that, for the true mean $\mu$,

$$\sum_{i=1}^{n} (x_i - \bar{x})^2 \le \sum_{i=1}^{n} (x_i - \mu)^2,$$

with equality only when the sample mean happens to equal the true mean. Dividing by $n$ and taking expectations, the right-hand side becomes exactly $\sigma^2$, so

$$E\left[ \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2 \right] \le \sigma^2,$$

and the inequality is strict whenever $\bar{x}$ differs from $\mu$ with positive probability. The estimator therefore underestimates the true variance on average, which is what it means for it to be biased.
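A small simulation (again a sketch of my own, with arbitrary parameters) makes the bias visible: averaged over many repeated experiments, the divide-by-n estimate settles noticeably below the true variance.

```python
import numpy as np

rng = np.random.default_rng(2)

true_variance = 4.0   # samples come from normal(mean=5, std=2)
n = 5                 # small sample size makes the bias easy to see
trials = 200_000

estimates = np.empty(trials)
for t in range(trials):
    samples = rng.normal(loc=5.0, scale=2.0, size=n)
    sample_mean = samples.mean()
    estimates[t] = ((samples - sample_mean) ** 2).sum() / n  # divide by n

print("true variance         :", true_variance)
print("average naive estimate:", estimates.mean())  # close to 4 * (n - 1) / n = 3.2
```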
Bessel's correction #
Now that we've convinced ourselves that dividing the sum of squared differences by $n$ underestimates the variance, the fix is easy to motivate. With Bessel's correction, you divide the sum by $n - 1$ instead of $n$ to account for the fact that the sum is smaller than it is supposed to be:

$$s^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})^2.$$

The sample standard deviation $s$ is the square root of this corrected estimate.
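If NumPy is available, its `ddof` argument ("delta degrees of freedom") switches between the two conventions: `np.var(samples)` divides by n, while `np.var(samples, ddof=1)` applies Bessel's correction and divides by n − 1. A quick sketch of my own:

```python
import numpy as np

rng = np.random.default_rng(3)
samples = rng.normal(loc=5.0, scale=2.0, size=10)

n = len(samples)
sum_sq = ((samples - samples.mean()) ** 2).sum()

print("divide by n    :", sum_sq / n, "==", np.var(samples))                # ddof=0, the default
print("divide by n - 1:", sum_sq / (n - 1), "==", np.var(samples, ddof=1))  # Bessel's correction
print("sample standard deviation:", np.std(samples, ddof=1))
```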
Note that the correction matters less and less as $n$ grows: the ratio $n / (n - 1)$ approaches 1, and at the same time the sample mean itself gets closer and closer to the true mean, so there is less underestimation to compensate for in the first place. Here's why.
Recall Chebyshev's inequality: for any random variable $X$ with mean $\mu$ and finite variance $\sigma^2$, and for any $\epsilon > 0$,

$$P\left(|X - \mu| \ge \epsilon\right) \le \frac{\sigma^2}{\epsilon^2}.$$
Suppose we had $n$ independent and identically distributed random variables $X_1, X_2, \ldots, X_n$, each with mean $\mu$ and variance $\sigma^2$. Their sum $S_n = \sum_{i=1}^{n} X_i$ has mean $n\mu$ and variance $n\sigma^2$, so Chebyshev's inequality applied to $S_n$ gives:

$$P\left(|S_n - n\mu| \ge \epsilon\right) \le \frac{n\sigma^2}{\epsilon^2}.$$

Dividing each side of the inequality inside the probability by $n$ yields:

$$P\left(\left|\bar{X} - \mu\right| \ge \frac{\epsilon}{n}\right) \le \frac{n\sigma^2}{\epsilon^2},$$

where $\bar{X} = S_n / n$ is the sample mean.
Now, let $\delta = \epsilon / n$, so that $\epsilon = n\delta$:

$$P\left(\left|\bar{X} - \mu\right| \ge \delta\right) \le \frac{\sigma^2}{n\delta^2}.$$

In other words, for any fixed $\delta > 0$, the probability that the sample mean deviates from the true mean by more than $\delta$ goes to zero as $n$ goes to infinity. For large $n$, the sample mean is almost certainly very close to the true mean, the underestimation from using $\bar{x}$ in place of $\mu$ becomes negligible, and dividing by $n$ or by $n - 1$ makes hardly any difference.
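The concentration is easy to see empirically. The sketch below (my own, with an arbitrary distribution, threshold, and sample sizes) estimates how often the sample mean lands more than a fixed distance from the true mean and compares that frequency with the Chebyshev bound above:

```python
import numpy as np

rng = np.random.default_rng(4)

mu, sigma = 5.0, 2.0   # true mean and standard deviation
delta = 0.5            # fixed deviation threshold
trials = 1_000

for n in (10, 100, 1_000, 10_000):
    # One row per trial, n samples per trial; take the sample mean of each row.
    sample_means = rng.normal(loc=mu, scale=sigma, size=(trials, n)).mean(axis=1)
    observed = np.mean(np.abs(sample_means - mu) >= delta)
    bound = sigma**2 / (n * delta**2)  # Chebyshev bound sigma^2 / (n * delta^2)
    print(f"n={n:6d}  observed {observed:.4f}  <=  Chebyshev bound {min(bound, 1.0):.4f}")
```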