statistics

The normal distribution

Ph.D. Topics : Statistics

The normal distribution, or bell curve, is probably the most important probability distribution in statistics. Many quantities we observe are roughly normally distributed; the central limit theorem provides a mathematical explanation for this.

The probability density function of a normal distribution with mean $\mu$ and variance $\sigma^2$ is given by: $f(x)=\frac{1}{\sqrt{2\pi\sigma^2}}e^{-\frac{(x-\mu)^2}{2\sigma^2}}$
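As a minimal sketch, the density formula can be translated term by term into Python using only the standard library. The function name `normal_pdf` and the example parameters (mean 100, standard deviation 15) are illustrative assumptions, not part of the original text; the result is cross-checked against `statistics.NormalDist`, which ships with Python 3.8 and later.

```python
import math
from statistics import NormalDist


def normal_pdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """Density at x of a normal distribution with mean mu and standard deviation sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    variance = sigma ** 2
    coefficient = 1.0 / math.sqrt(2.0 * math.pi * variance)  # 1 / sqrt(2*pi*sigma^2)
    exponent = -((x - mu) ** 2) / (2.0 * variance)           # -(x - mu)^2 / (2*sigma^2)
    return coefficient * math.exp(exponent)


# Cross-check the hand-written formula against the standard library.
# The parameters below are made up for illustration.
reference = NormalDist(mu=100.0, sigma=15.0)
for x in (70.0, 100.0, 130.0):
    print(x, normal_pdf(x, 100.0, 15.0), reference.pdf(x))
```

The two printed densities on each line should agree to floating-point precision, and the values at 70 and 130 should match each other because the curve is symmetric about the mean.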

The central limit theorem

Why are so many quantities we measure in nature approximately normally distributed? The central limit theorem (CLT), a cornerstone of probability theory, says that the average or sum of a large number of independent and identically distributed random variables with finite variance will be approximately normally distributed, whether or not the underlying variables are themselves normally distributed. Many outcomes we measure (someone’s height, their math aptitude, the temperature in New Orleans on a summer day) can be thought of as the sum of many roughly independent factors; the CLT is a mathematical explanation of why these quantities follow a roughly bell-shaped distribution.
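For readers who want the statement in symbols, here is one standard formulation (the classical Lindeberg-Lévy version). The notation $X_1, \dots, X_n$ and $\bar{X}_n$ is introduced here for illustration; $\mu$ and $\sigma^2$ denote the mean and variance of each individual variable. Suppose $X_1, \dots, X_n$ are independent and identically distributed with mean $\mu$ and finite variance $\sigma^2 > 0$, and let $\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i$. Then

$$\frac{\sqrt{n}\,(\bar{X}_n - \mu)}{\sigma} \xrightarrow{d} N(0, 1) \quad \text{as } n \to \infty.$$

Informally, for large $n$ the sample mean $\bar{X}_n$ is approximately normal with mean $\mu$ and variance $\sigma^2/n$, and the sum $\sum_{i=1}^{n} X_i$ is approximately normal with mean $n\mu$ and variance $n\sigma^2$.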

The CLT also justifies null hypothesis tests of means and mean differences. It tells us that, whatever the underlying distribution of the quantity we’re measuring (provided its variance is finite), the sampling distribution of the mean will look approximately normal, so long as the sample is large enough.
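A small simulation can make this concrete. The sketch below is only an illustration under assumptions of our own choosing: the underlying quantity is taken to be exponential with rate 1 (a strongly right-skewed distribution with mean 1 and standard deviation 1), and the helper names `sample_means` and `skewness`, the sample sizes, and the number of repetitions are all hypothetical. For each sample size it draws many samples, records each sample's mean, and summarizes the resulting distribution of means.

```python
import random
import statistics


def sample_means(n: int, reps: int, rng: random.Random) -> list[float]:
    """Draw `reps` samples of size `n` from Exponential(1) and return their means."""
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(reps)
    ]


def skewness(values: list[float]) -> float:
    """Sample skewness; close to 0 for a symmetric distribution such as the normal."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean(((v - mean) / sd) ** 3 for v in values)


rng = random.Random(0)  # fixed seed so that repeated runs give the same numbers
results = {}
for n in (1, 5, 30, 200):
    means = sample_means(n, reps=5000, rng=rng)
    # Store (mean of the means, standard deviation of the means, skewness of the means).
    results[n] = (statistics.fmean(means), statistics.stdev(means), skewness(means))
    print(n, results[n])
```

In theory, the mean of the sample means stays at 1, their standard deviation shrinks like $1/\sqrt{n}$, and their skewness shrinks like $2/\sqrt{n}$, so the printed skewness should drift toward 0 as $n$ grows. That drift is the distribution of means losing the asymmetry of the underlying exponential and taking on a more bell-shaped form.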

Understanding the central limit theorem

Here’s an easy-to-understand statement of the central limit theorem:

The distribution of an average tends to be normal, even when the distribution from which the average is computed is decidedly non-normal.

Let’s dig into this statement a little.