Hypothesis testing begins with framing the question we want to answer into two competing statements: the null hypothesis and the alternative hypothesis. Think of this as setting up a statistical "court case." The null hypothesis represents the initial assumption or the status quo (like "innocent until proven guilty"), and the alternative hypothesis represents the claim or effect we're trying to find evidence for.
The Null Hypothesis (H0)
The null hypothesis, denoted as H0, is a statement of no effect, no difference, or no relationship. It often represents the default state of belief or a baseline against which we test our evidence. It's the hypothesis that we tentatively assume to be true and seek to find evidence against.
Mathematically, the null hypothesis typically involves an equality (=) or includes equality in its statement (≤ or ≥).
Common Examples in Machine Learning:
- Model Comparison: Suppose we've developed a new classification algorithm and want to compare its average accuracy (μnew) to an existing one (μold). The null hypothesis might be that there is no difference in accuracy:
H0:μnew=μold
Or, if the default assumption is that the new model is at least as good as the old one and we are looking for evidence that it is worse, it could be:
H0:μnew≥μold
- Feature Importance: We want to test if a specific feature has any linear relationship with the target variable. The null hypothesis would state that the correlation coefficient (ρ) is zero:
H0:ρ=0
- A/B Testing: A company tests a new website design to see if it improves the user conversion rate (p). The null hypothesis might state that the new design (pnew) is no better than the old design (pold):
H0:pnew≤pold
The goal of hypothesis testing isn't to prove H0 is true, but rather to determine if there's enough statistical evidence in our sample data to reject it in favor of an alternative explanation.
The Alternative Hypothesis (H1 or Ha)
The alternative hypothesis, denoted as H1 (or sometimes Ha), is a statement that contradicts the null hypothesis. It represents the outcome we are actually interested in detecting. It's the claim we accept only when the sample data provide enough evidence against H0.
The alternative hypothesis typically involves an inequality (≠, <, or >). It must be mutually exclusive with the null hypothesis, meaning H0 and H1 cannot both be true. Together, they should ideally cover all possible outcomes for the parameter being tested.
Examples Corresponding to the H0 Above:
- Model Comparison:
- If we want to know if the accuracies are simply different:
H1:μnew≠μold (This is a two-tailed test because we're looking for a difference in either direction).
- If we specifically hypothesize the new model is better (this pairs with the null H0:μnew≤μold):
H1:μnew>μold (This is a one-tailed or directional test).
- Feature Importance:
- If we want to know if there is any linear relationship (positive or negative):
H1:ρ≠0 (Two-tailed test).
- A/B Testing:
- If the company hopes the new design improves conversion rate:
H1:pnew>pold (One-tailed test).
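As a rough illustration of how the A/B testing pair (H0:pnew≤pold against H1:pnew>pold) turns into a calculation, here is a minimal sketch of a pooled two-proportion z-test using only the Python standard library. The function name `ab_test_one_sided` and the visitor counts are hypothetical, invented for this example, and the normal approximation is an assumption that is only reasonable for fairly large samples. The test statistic and p-value themselves are covered later; the point here is only where the direction of H1 enters.

```python
from math import sqrt
from statistics import NormalDist

def ab_test_one_sided(conv_old, n_old, conv_new, n_new):
    """Return (z, p_value) for H0: p_new <= p_old vs H1: p_new > p_old."""
    p_old_hat = conv_old / n_old   # sample proportion, old design
    p_new_hat = conv_new / n_new   # sample proportion, new design
    # Pooled rate: the common conversion rate estimated at the boundary
    # of H0, where p_new = p_old.
    p_pool = (conv_old + conv_new) / (n_old + n_new)
    std_err = sqrt(p_pool * (1 - p_pool) * (1 / n_old + 1 / n_new))
    z = (p_new_hat - p_old_hat) / std_err
    # H1 points to the right (p_new > p_old), so only the upper tail counts.
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value

# Hypothetical counts, invented for illustration only.
z, p_value = ab_test_one_sided(conv_old=100, n_old=1000, conv_new=130, n_new=1000)
print(f"z = {z:.2f}, one-sided p-value = {p_value:.4f}")
```

Note that the hypotheses are written in terms of the population rates pnew and pold, while the code can only compute the sample proportions from the observed counts.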
The choice between a one-tailed and a two-tailed alternative hypothesis depends entirely on the research question. Are you interested in detecting any difference, or a difference only in a specific direction?
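To make the practical consequence of that choice concrete, the sketch below shows how one and the same test statistic leads to different p-values depending on the form of H1. It assumes a test statistic that follows a standard normal distribution under H0; the helper name `p_value_from_z`, its `alternative` labels, and the value z = 1.8 are hypothetical choices made for this example.

```python
from statistics import NormalDist

def p_value_from_z(z, alternative):
    """Map a z statistic to a p-value for the chosen form of H1."""
    cdf = NormalDist().cdf
    if alternative == "two-sided":   # H1: parameter != null value
        return 2 * (1 - cdf(abs(z)))
    if alternative == "greater":     # H1: parameter > null value
        return 1 - cdf(z)
    if alternative == "less":        # H1: parameter < null value
        return cdf(z)
    raise ValueError(f"unknown alternative: {alternative!r}")

z = 1.8  # hypothetical test statistic
p_values = {alt: p_value_from_z(z, alt) for alt in ("two-sided", "greater", "less")}
for alt, p in p_values.items():
    print(f"{alt:>9}: p = {p:.4f}")
```

For a positive z, the two-sided p-value is exactly twice the "greater" one, which is one reason the direction should be fixed by the research question rather than chosen after looking at the data.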
Formulating Hypotheses: Best Practices
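- Define Before Data Collection: It's essential to formulate H0 and H1 before you collect or analyze your data. This prevents the data from influencing the hypotheses; choosing the hypotheses (or the direction of a one-tailed test) after seeing the results makes spurious findings more likely.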
- Focus on Population Parameters: Hypotheses are statements about population parameters (like population mean μ, population proportion p, or correlation ρ), not sample statistics (like sample mean x̄ or sample proportion p̂). We use sample statistics to make inferences about these population parameters.
- Ensure Mutually Exclusive and Exhaustive (Often): H0 and H1 should not overlap, and ideally, they should cover all possibilities for the parameter value. For example, if H0:μ=10, a two-tailed H1 is μ≠10. If H0:μ≤10, the corresponding H1 must be μ>10.
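The "mutually exclusive and exhaustive" requirement can be checked mechanically. The following is a small sketch, not a standard library routine: the helper `exclusive_and_exhaustive` and the grid of candidate values are hypothetical, and the check simply asks whether exactly one of H0 and H1 holds for each candidate value of the parameter.

```python
import operator

def exclusive_and_exhaustive(h0, h1, null_value, candidates):
    """True if exactly one of H0 and H1 holds for every candidate value."""
    return all(h0(v, null_value) != h1(v, null_value) for v in candidates)

# Hypothetical grid of possible values for the population mean.
candidates = [9.0, 9.5, 10.0, 10.5, 11.0]

# H0: mu = 10 vs H1: mu != 10 (two-tailed pair)
two_tailed = exclusive_and_exhaustive(operator.eq, operator.ne, 10.0, candidates)
# H0: mu <= 10 vs H1: mu > 10 (one-tailed pair)
one_tailed = exclusive_and_exhaustive(operator.le, operator.gt, 10.0, candidates)
# H0: mu = 10 vs H1: mu > 10 leaves values below 10 covered by neither
with_gap = exclusive_and_exhaustive(operator.eq, operator.gt, 10.0, candidates)

print(two_tailed, one_tailed, with_gap)
```

The last pairing is still mutually exclusive, but it is not exhaustive, which is the situation the "(Often)" qualifier above allows for.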
Clearly defining the null and alternative hypotheses is the foundational first step in the hypothesis testing process. It sets the stage for collecting evidence and making a statistical decision about the claim being investigated. Once the hypotheses are set, the next steps involve choosing a significance level, collecting data, calculating a test statistic, and comparing it to a critical value or using a p-value, which we will cover next.