Quantifying associations between two categorical variables
Relative risk
Odds ratio
Testing associations between categorical variables
\(\chi^2\) contingency test
G-test
Fisher’s test
We Often Care About Associations
Does eating chocolate decrease the chance you’ll have a bad day?
Does the home team have an advantage?
Is the use of “bath salts” associated with cannibalism?*
Does fertilizing tomatoes increase the chance they set fruit?
Does a given drug affect the disease outcome in patients?
Two kinds of chi-squared tests (1/2)
Chi-square goodness-of-fit tests
one variable (observed vs. expected)
the chi-square distribution is used in tests where we compare observed frequencies for one variable in a sample to a theoretical expectation based on a probability model (e.g., the proportional model, the Poisson model, etc.)
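A minimal sketch of the goodness-of-fit statistic, using made-up counts and an equal-proportions ("proportional") model for the expected frequencies:

```python
# Hypothetical example: 100 observations across four categories assumed
# equally likely under the proportional model; all counts are illustrative.
observed = [25, 15, 35, 25]
total = sum(observed)
expected = [total / len(observed)] * len(observed)  # 25 per category

# Goodness-of-fit statistic: sum of (O - E)^2 / E over categories
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(chi2)  # compare to a chi-square distribution with k - 1 = 3 df
```

With a Poisson (or other fitted) model, the expected counts would instead come from the fitted distribution, and the degrees of freedom drop by one for each estimated parameter.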
Two kinds of chi-squared tests (2/2)
Chi-square contingency (independence) tests (2x2)
the chi-square statistic in tests that compare frequencies/counts for two categorical variables
the \(\chi^2\) contingency test is a special application of the more general goodness-of-fit test for which the probability model being tested is the independence of variables
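A minimal sketch of the contingency version with hypothetical counts: under the independence model, each cell's expected count is (row total × column total) / grand total, and the statistic sums the same \((O-E)^2/E\) terms over all cells:

```python
# Hypothetical 2x2 table: rows = groups, columns = outcome yes/no
observed = [[30, 10],
            [20, 40]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected counts under the independence model
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi2 = sum(
    (observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
    for i in range(2) for j in range(2)
)
print(chi2)  # compare to chi-square with (2-1)*(2-1) = 1 df
```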
Quantifying Associations Between Categorical Variables
Relative risk
Odds ratio
Relative Risk: Definition
Relative risk is the probability of an undesired outcome in the treatment group divided by the probability of that outcome in the control group.
Alternatively, even without \(\hat{p}_{w}\) and \(\hat{p}_{m}\), we can compute the relative risk directly from the cell counts of the 2×2 table:
\[\widehat{RR} = \frac{a/(a+b)}{c/(c+d)} = 0.0766782\]
where \(a, b, c, d\) are the four cell counts of the table.
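A minimal numeric sketch with hypothetical counts, using the usual 2×2 cell labels a–d (all numbers here are illustrative, not from the example above):

```python
# Hypothetical 2x2 counts:
#             outcome  no outcome
# treatment    a=15      b=85
# control      c=30      d=70
a, b = 15, 85
c, d = 30, 70

p_treatment = a / (a + b)   # estimated risk in the treatment group
p_control = c / (c + d)     # estimated risk in the control group
relative_risk = p_treatment / p_control
print(relative_risk)  # 0.15 / 0.30 = 0.5
```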
Odds Ratio: Pros and Cons
Good:
The denominator drops out of the top and bottom, meaning this will work even if we don't have true absolute probabilities.
This will be the case in studies of rare outcomes in which it makes sense to increase the number of these rare outcomes in our sample (as we do in case-control studies).
Bad:
Not intuitive at all.
The odds ratio does not equal the relative risk.
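A minimal sketch with the same kind of hypothetical 2×2 counts. Note that the odds ratio reduces to the cross-product \(ad/bc\): the group-size denominators cancel, which is why oversampling a rare outcome (as in a case-control design) leaves it unchanged:

```python
# Hypothetical 2x2 counts (same layout as the relative-risk example)
a, b = 15, 85   # treatment: outcome / no outcome
c, d = 30, 70   # control:   outcome / no outcome

odds_treatment = a / b          # odds of the outcome in the treatment group
odds_control = c / d            # odds of the outcome in the control group
odds_ratio = odds_treatment / odds_control   # equals (a*d) / (b*c)

print(odds_ratio)
```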