General framework | Acknowledgments: Y. Brandvain’s code snippets
2025-11-05
In addition to estimation, hypothesis testing is a major goal of statistics.
Example Hypotheses:
Statistical hypothesis testing automates decision making
Statistical hypothesis (null and alternative) \(\neq\) Scientific hypothesis (statements about the existence and possible causes of natural phenomena)
We take samples to understand the world out there.
So, can our estimates be simply explained by chance, or are they special?
We can only take a sample and make estimates.
We can imagine taking infinitely many samples from a population.
Build a sampling dist. from a boring population we can describe.
Where would our estimate fall on this distribution?
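The idea of building a sampling distribution from a boring population can be sketched by simulation. This is a minimal Python sketch under stated assumptions: the boring population is one where each observation is a "success" with probability 0.5, each sample has 20 observations, and the observed count of 15 is hypothetical. The function name, seed, and number of replicates are illustrative choices.

```python
import random

def simulate_null_counts(n=20, p_null=0.5, n_reps=10_000, seed=1):
    """Draw many samples from the boring population; return the success count of each."""
    rng = random.Random(seed)
    counts = []
    for _ in range(n_reps):
        # One sample: n independent observations, each a success with probability p_null
        counts.append(sum(rng.random() < p_null for _ in range(n)))
    return counts

null_counts = simulate_null_counts()

# Where would a (hypothetical) observed count fall on this distribution?
n, observed = 20, 15
expected = n * 0.5
# Fraction of simulated samples at least as far from the expectation as the observation
frac_as_extreme = sum(
    abs(c - expected) >= abs(observed - expected) for c in null_counts
) / len(null_counts)
print(frac_as_extreme)
```

Working with counts rather than proportions keeps the "at least as extreme" comparison free of floating-point rounding surprises.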
Hypothesis testing assumes random sampling – or that we account for non-random sampling in building the null model.
Repeat after me:
Hypothesis tests account for sampling error, NOT for sampling bias.
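A small simulation can illustrate why this matters. The sketch below assumes a truly boring population (half successes, half failures) and compares random sampling with a hypothetical biased scheme in which successes are twice as likely to be sampled as failures. The sample size of 20, \(\alpha = 0.05\), the two-sided binomial p-value, and all function names are assumptions made for illustration.

```python
import random
from math import comb

def two_sided_p(k, n, p0=0.5):
    """Null probability of counts at least as far from n*p0 as k."""
    expected = n * p0
    return sum(
        comb(n, j) * p0**j * (1 - p0) ** (n - j)
        for j in range(n + 1)
        if abs(j - expected) >= abs(k - expected)
    )

def rejection_rate(p_sampled, n=20, alpha=0.05, n_reps=5000, seed=1):
    """Fraction of simulated samples in which H0 (p = 0.5) is rejected."""
    rng = random.Random(seed)
    p_by_count = [two_sided_p(k, n) for k in range(n + 1)]  # p-value for each possible count
    rejections = 0
    for _ in range(n_reps):
        k = sum(rng.random() < p_sampled for _ in range(n))
        if p_by_count[k] <= alpha:
            rejections += 1
    return rejections / n_reps

# Random sampling: a sampled individual is a success with probability 0.5
random_rate = rejection_rate(p_sampled=0.5)

# Biased sampling: successes carry weight 2 and failures weight 1, so a sampled
# individual is a success with probability (0.5 * 2) / (0.5 * 2 + 0.5 * 1) = 2/3
biased_rate = rejection_rate(p_sampled=2 / 3)

print(random_rate, biased_rate)
```

Under these assumptions the null is true in both cases, yet the biased scheme rejects it far more often: the test has no way to tell sampling bias from a real effect.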

We conduct hypothesis testing by checking if our estimate is surprising (unlikely) under the null model.
The hypothesis is about the population - not about your sample.
The null hypothesis (\(H_0\)) skeptically argues that data come from a boring population described by the null model.
Building a sampling distribution from the appropriate null model is key to hypothesis testing.
Is the population from which we sampled different from a boring population?
Null hypothesis: a specific statement about a boring population made for the sake of argument (aka — the skeptical view).
A Good 😒 Null Hypothesis: 😒
Reflects all aspects of the null / boring population, except those posed by \(H_0\)
Asks “Can the results be easily explained by chance?”
Would be interesting if proven wrong.
Reflects the process of sampling.
What does this mean?
It means that a null hypothesis specifies a model that can be used to build a sampling distribution
By contrast the alternative hypothesis is less specific.
Our goal is to learn about the World Out There (the population) from our finite view (the sample)
Rejecting the null hypothesis is not our goal (although it can be satisfying / exciting).
“Real world” question: Does wearing a red shirt help win a wrestling match?
First: Turn this into a more “statistical” question, gather some data, and get to work!

Question: Does red influence the outcome of wrestling, taekwondo, and boxing?
Clearer (statistical) question: Are the colors of outfits worn by wrestlers predictors of victory/defeat?
“Scientific” question: could be something about the impact of the color red on aggression levels, hormones, neurotransmitters, etc.
Data / experiment:
1. State \(H_0\) and \(H_A\).
2. Calculate a test statistic.
3. Generate the null distribution.
4. Find the critical value at the specified \(\alpha\), and the p-value.
5. Decide:
   Reject \(H_0\) if the test stat is \(\geq\) the critical value (\(p\leq\alpha\)).
   OR
   Fail to reject \(H_0\) if the test stat is \(<\) the critical value (\(p>\alpha\)).
Notice that step 5 makes this a YES/NO kind of answer in the end, rather than a quantification of the effect of red on victories.
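The five steps above can be sketched in Python. This is a minimal sketch, not a definitive implementation. It assumes a binomial null model with success probability `p0`, a test statistic equal to the absolute difference between observed and expected counts, and a two-sided test. The function names, the example data (14 successes in 20 trials), and \(\alpha = 0.05\) are hypothetical.

```python
from math import comb

def null_distribution(n, p0=0.5):
    """Step 3: null probability of each possible test-statistic value |k - n*p0|."""
    expected = n * p0
    dist = {}
    for k in range(n + 1):
        stat = abs(k - expected)
        prob = comb(n, k) * p0**k * (1 - p0) ** (n - k)
        dist[stat] = dist.get(stat, 0.0) + prob
    return dist

def run_test(k_obs, n, p0=0.5, alpha=0.05):
    # Step 1: H0: p = p0;  HA: p != p0
    # Step 2: test statistic = distance between observed and expected counts
    stat_obs = abs(k_obs - n * p0)
    # Step 3: null distribution of the test statistic
    dist = null_distribution(n, p0)

    def tail(s):
        """Null probability of a test statistic at least as large as s."""
        return sum(pr for st, pr in dist.items() if st >= s)

    # Step 4: critical value = smallest statistic whose tail probability is <= alpha
    # (infinity if no value qualifies), plus the p-value of the observed statistic
    critical = min((s for s in dist if tail(s) <= alpha), default=float("inf"))
    p_value = tail(stat_obs)
    # Step 5: the yes/no decision
    reject = stat_obs >= critical
    return stat_obs, critical, p_value, reject

stat_obs, critical, p_value, reject = run_test(k_obs=14, n=20)
print(stat_obs, critical, round(p_value, 4), reject)
```

Comparing the test statistic to the critical value and comparing the p-value to \(\alpha\) are two routes to the same decision.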
Are the colors of outfits worn by wrestlers predictors of victory/defeat?
\(H_0\): Red and blue-shirted athletes are equally likely to win (proportion of red among winners \(= 0.5\)). (notice how this is very specific)
\(H_A\): Red and blue-shirted athletes are not equally likely to win (proportion of red among winners \(\neq 0.5\)). (not specific)
One of these must be true. One is very specific, the other is not. Which one do the data support?
16 of 20 rounds had more red-shirted than blue-shirted winners.
Our test statistic here is the difference between our observation (16 of 20 wins) and our expectation (10 of 20 wins).
Note: Test statistics are numbers calculated from the data and used to compare our results with those expected under the null hypothesis. They differ depending on the type of test we use. E.g., the statistic \(\chi^2\) (chi-square) is the test statistic used in the \(\chi^2\) test; the one-sample t-test uses the \(t\) statistic, etc.
The null distribution here is a binomial distribution, which we will learn about next week.
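As a preview, the null distribution for this example can be computed directly from the binomial formula. This is a minimal Python sketch. It assumes the two-sided p-value is the total null probability of outcomes at least as far from the expected 10 as the observed 16. The names `binom_pmf` and `null_dist` are illustrative.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p0, k_obs = 20, 0.5, 16       # 16 of 20 rounds favored red; H0 says p = 0.5
expected = n * p0                # 10 wins expected under H0

# Null distribution: probability of each possible number of red-favoring rounds
null_dist = {k: binom_pmf(k, n, p0) for k in range(n + 1)}

# Two-sided p-value: outcomes at least as far from 10 as the observed 16
# (that is, 16 or more, or 4 or fewer)
p_value = sum(
    pr for k, pr in null_dist.items()
    if abs(k - expected) >= abs(k_obs - expected)
)
print(round(p_value, 4))  # 0.0118
```

If we take \(\alpha = 0.05\) (an assumed, conventional threshold), this p-value falls below it, so we would reject \(H_0\).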
B215: Biostatistics with R