
Normal Distribution | Acknowledgments to Y. Brandvain for code used in some slides
2025-12-01
Because most quantitative variables are sums (or averages) of a bunch of things, the normal distribution is incredibly common!
For example:
Why that distribution? Why is it special? Why not some other distribution? Are there other statistical distributions where this happens?
Teaser: yes, there are other distributions that are special in the same way as the Normal distribution. The Normal distribution is still the most special because:



\[f(x)=\frac{1}{\sqrt{2\pi\sigma^2}}e^{-\frac{(x-\mu)^2}{2\sigma^2}}\]
* Because \(f(x)\) is a probability density function (PDF), it integrates to 1 over the entire sample space:
\[\int_{-\infty}^{+\infty}f(x)\,dx=\int_{-\infty}^{+\infty}\frac{1}{\sqrt{2\pi\sigma^2}}e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx=1\]
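As a quick check of the density formula, the sketch below writes the PDF out by hand and compares it with R's built-in `dnorm()`, then integrates it numerically. The helper name `normal_pdf` is an assumption made for illustration, and the integral is taken for the standard normal (\(\mu=0\), \(\sigma=1\)) because numerical integration over an infinite range is most reliable when the peak sits near zero.

```r
# Normal density written out from the formula
# (normal_pdf is a hypothetical helper name, not part of base R)
normal_pdf <- function(x, mu = 0, sigma = 1) {
  1 / sqrt(2 * pi * sigma^2) * exp(-(x - mu)^2 / (2 * sigma^2))
}

# The hand-written density should agree with dnorm() at any x
x <- c(160, 177, 190)
pdf_matches <- all.equal(normal_pdf(x, mu = 177, sigma = 7.1),
                         dnorm(x, mean = 177, sd = 7.1))

# Numerically integrate the standard normal density over the real line
total_area <- integrate(normal_pdf, lower = -Inf, upper = Inf)$value

pdf_matches  # TRUE if the two densities agree
total_area   # should be very close to 1
```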
\(N(\mu, \sigma^2)\): These parameters - mean and variance (or standard deviation, \(\sigma\)) - fully specify a normal distribution
\(X\sim N(\mu, \sigma^2)\): \(X\) is normally distributed, and the distribution is specified by a mean \(\mu\) and a variance \(\sigma^2\) (equivalently, a standard deviation \(\sigma\))
\[P[a<X<b]=\int_{a}^{b}\frac{1}{\sqrt{2\pi\sigma^2}}e^{-\frac{(x-\mu)^2}{2\sigma^2}}dx\]
Ouch…
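The integral has no closed form, but it does not have to be solved by hand. A minimal sketch, assuming illustrative values \(a=170\), \(b=185\), \(\mu=177\) and \(\sigma=7.1\) (chosen here for the example only): numerical integration of the density and a difference of two `pnorm()` calls should give the same probability.

```r
# Illustrative values (assumptions for this example)
mu <- 177
sigma <- 7.1
a <- 170
b <- 185

# P[a < X < b] by numerically integrating the density from a to b
p_integral <- integrate(dnorm, lower = a, upper = b, mean = mu, sd = sigma)$value

# The same probability as a difference of cumulative probabilities
p_pnorm <- pnorm(b, mean = mu, sd = sigma) - pnorm(a, mean = mu, sd = sigma)

c(integral = p_integral, pnorm = p_pnorm)
```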
The standard normal distribution is a normal with \(\mu=0\) and \(\sigma=1\)
The \(Z\)-distribution describes the probability that a random draw from the standard normal is greater than a given value.
In R: pnorm(q = 1.5, lower.tail = FALSE) returns 0.0668
Figure 10.3-2 from textbook
The standard normal is symmetric about zero, so: \[P[X < -Z]=P[X > Z]\], i.e. the probability that a random sample, \(X\), is less than \(-Z\) equals the probability that a random sample is greater than \(Z\).
The normal integrates to one, so: \[P[X < Z]=1 - P[X > Z]\], i.e. the probability that a random sample is less than \(Z\) equals one minus the probability that a random sample is greater than \(Z\).
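Both rules can be checked directly with `pnorm()`. The sketch below uses \(Z=1.5\) as an arbitrary example value; the variable names are illustrative.

```r
z <- 1.5  # arbitrary example value

lower_neg <- pnorm(-z)                     # P[X < -z]
upper_pos <- pnorm(z, lower.tail = FALSE)  # P[X > z]
lower_pos <- pnorm(z)                      # P[X < z]

# Symmetry: the lower tail below -z equals the upper tail above z
all.equal(lower_neg, upper_pos)

# Complement: the lower and upper tails at z sum to one
all.equal(lower_pos, 1 - upper_pos)
```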
\[Z=\frac{X-\mu}{\sigma}\]
* E.g., by a \(Z\) transform, a value of \(X=0.4\) from a normal distribution with \(\mu=0.5\) and \(\sigma=0.1\) becomes \(Z=\frac{0.4-0.5}{0.1}=\frac{-0.1}{0.1}=-1\)
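The same transform in R, as a small sketch (the helper name `z_score` is an assumption for illustration). It also shows that a probability computed on the original scale matches the one computed on the standardized scale.

```r
# Z transform: distance from the mean in units of standard deviations
# (z_score is a hypothetical helper name)
z_score <- function(x, mu, sigma) (x - mu) / sigma

z <- z_score(0.4, mu = 0.5, sigma = 0.1)
z  # -1, up to floating-point rounding

# P[X < 0.4] on the original scale equals P[Z < -1] on the standard scale
p_raw <- pnorm(0.4, mean = 0.5, sd = 0.1)
p_std <- pnorm(z)
all.equal(p_raw, p_std)
```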
The benefits are similar to using the coefficient of variation (CV) to compare standard deviations across different scales
The other advantage: the standard Normal table
Let’s work on this!
\(\mu=177\)
\(\sigma=7.1\)
What are we looking for? \(P[\text{height}>180.3|\mu=177,\sigma=7.1]=?\)
Make a rough sketch:
\[P\left[Z>\frac{180.3-177}{7.1}\right]=P[Z>0.46]=?\]
| Z | 0.00 | 0.01 | 0.02 | 0.03 | 0.04 | 0.05 | 0.06 | 0.07 | 0.08 | 0.09 |
|---|---|---|---|---|---|---|---|---|---|---|
| 0.3 | 0.382 | 0.378 | 0.374 | 0.371 | 0.367 | 0.363 | 0.359 | 0.356 | 0.352 | 0.348 |
| 0.4 | 0.345 | 0.341 | 0.337 | 0.334 | 0.330 | 0.326 | 0.323 | 0.319 | 0.316 | 0.312 |
| 0.5 | 0.309 | 0.305 | 0.302 | 0.298 | 0.295 | 0.291 | 0.288 | 0.284 | 0.281 | 0.278 |
Conclude that \(\sim 32.3\%\) of British men are too tall to be a spy.
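The table lookup can be reproduced step by step in R. This is a sketch using the values from the example; the variable names are illustrative. Rounding \(Z\) to two decimals, as the table requires, explains the small difference between the table answer and the exact `pnorm()` result.

```r
mu <- 177
sigma <- 7.1
cutoff <- 180.3

# Step 1: standardize the cutoff
z <- (cutoff - mu) / sigma

# Step 2: round to two decimals, as when reading the table (row 0.4, column 0.06)
z_table <- round(z, 2)

# Step 3: upper-tail probability at the rounded Z, matching the table entry
p_table <- pnorm(z_table, lower.tail = FALSE)

# For comparison: the probability without rounding Z
p_exact <- pnorm(cutoff, mean = mu, sd = sigma, lower.tail = FALSE)

round(c(z = z, p_table = p_table, p_exact = p_exact), 3)
```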
# Probability of a random man in the UK being >= 180.3 cm tall.
pnorm(q = 180.3, mean = 177, sd = 7.1, lower.tail = FALSE)
## [1] 0.321
# probability of being <= 180.3 cm tall
pnorm(q = 180.3, mean = 177, sd = 7.1, lower.tail = TRUE)
## [1] 0.679
# or 1- P(>180.3)
1 - pnorm(q = 180.3, mean = 177, sd = 7.1, lower.tail = FALSE)
## [1] 0.679
\[\mu_{\bar{Y}}=\mu, \quad \sigma_{\bar{Y}}=\frac{\sigma}{\sqrt{n}}\]
- The mean of the sample means equals \(\mu\). The standard deviation of the sample means is the standard error, and equals \(\frac{\sigma}{\sqrt{n}}\)
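A small simulation can illustrate the standard error. This is a sketch under assumed settings: the sample size `n = 25`, the number of replicates, and the seed are arbitrary choices for the example, and the population values reuse \(\mu=177\) and \(\sigma=7.1\) from the height example.

```r
set.seed(1)  # arbitrary seed, for reproducibility

mu <- 177
sigma <- 7.1
n <- 25          # assumed sample size
n_reps <- 10000  # assumed number of simulated samples

# Draw many samples of size n and record each sample mean
sample_means <- replicate(n_reps, mean(rnorm(n, mean = mu, sd = sigma)))

mean_of_means <- mean(sample_means)  # should be close to mu
sd_of_means <- sd(sample_means)      # should be close to sigma / sqrt(n)
theoretical_se <- sigma / sqrt(n)

c(mean_of_means = mean_of_means,
  sd_of_means = sd_of_means,
  theoretical_se = theoretical_se)
```

Increasing `n` shrinks the spread of the sample means in proportion to \(1/\sqrt{n}\), so quadrupling the sample size halves the standard error.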