Hypothesis Testing : ANOVA


ANOVA (Analysis of Variance) generalizes the difference-of-means test to more than two populations. We have several populations, and we want to know whether any of the population means differs from the others. The null hypothesis is therefore that ALL the population means are equal.

An example: suppose everyone who visits our retail website gets either one of two promotional offers or no promotion at all. We want to see whether making the promotional offers makes a difference. (The null hypothesis is that neither promotion makes a difference. Whether offer 1 is better than offer 2 is a different question.)

We can also do multi-way ANOVA (not to be confused with MANOVA, multivariate ANOVA, which handles several outcome variables at once). For instance, analyzing offers and day of week simultaneously would be a two-way ANOVA. Multi-way ANOVA is usually done by fitting a linear regression on the outcome, using each of the (categorical) treatments as an input variable. Here, we will only talk about one-way ANOVA.
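To make the one-way case concrete, here is a minimal sketch of the F statistic for the promotional-offer example. The purchase amounts are made-up illustrative numbers, not real data, and the function name `one_way_f` is hypothetical. The statistic compares the variation between group means to the variation within groups; a large value is evidence against the null hypothesis that all means are equal.

```python
from statistics import mean

def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` is a list of lists, one list of observations per treatment.
    """
    k = len(groups)                          # number of treatments
    n_total = sum(len(g) for g in groups)    # total number of observations
    grand_mean = mean(x for g in groups for x in g)

    # Between-group sum of squares: how far each group mean is from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical purchase amounts per visitor (illustrative only).
no_promo = [20, 22, 19, 21, 18]
offer_1 = [24, 26, 25, 23, 27]
offer_2 = [21, 20, 23, 22, 24]

f_stat, df_b, df_w = one_way_f([no_promo, offer_1, offer_2])
print(f"F = {f_stat:.4f} with ({df_b}, {df_w}) degrees of freedom")
```

The p-value is the upper-tail probability of this F value under an F distribution with those degrees of freedom. If SciPy is available, `scipy.stats.f_oneway` returns both the statistic and the p-value in one call.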


Suppose we know that a properly functioning disk drive manufacturing process will produce between 9 and 17 defective disk drives per 1000 disk drives manufactured, 98% of the time. On one of our regularly scheduled inspections of a plant, we inspect 1000 randomly selected drives. If we find 13 defective drives, we can’t reject the assumption that the plant is functioning properly, because 13 defects is “in bounds” for our process. What if we find 25 defects? We know that this would happen less than 2% of the time in a properly functioning plant, so we reject the null hypothesis that the plant is functioning properly in favor of the alternative hypothesis that it is not, and we inspect the plant for problems.
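The decision rule in this example can be sketched in a few lines. The function name `reject_null` and its parameters are hypothetical; the bounds 9 and 17 come from the text. Treating a count below 9 as out of bounds too is an assumption of this sketch, since the text only discusses a count that is too high.

```python
def reject_null(defects, lower=9, upper=17):
    """Return True if the defect count falls outside the 98% in-bounds range.

    True means we reject the null hypothesis that the plant is
    functioning properly; False means we fail to reject it.
    """
    return defects < lower or defects > upper

for count in (13, 25):
    verdict = "reject null, inspect plant" if reject_null(count) else "cannot reject null"
    print(f"{count} defects per 1000: {verdict}")
```

Note that failing to reject is not proof that the plant is fine. It only means the observed count is consistent with a properly functioning process.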