This applet illustrates the fundamental principles of statistical hypothesis testing through the simplest example: the test for the mean of a single normal population, variance known (the Z test).
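The basic set-up of the test is this: using only $n$ independent observations $X_1, \ldots, X_n$ from a normal distribution with unknown mean $\mu$ (but known variance $\sigma^2$), the task is to decide whether to accept a null hypothesis $H_0: \mu = \mu_0$ for a specified value of $\mu_0$, or to reject the null hypothesis in favor of some alternative hypothesis $H_1$. In most applications, there are only three alternative hypotheses of interest:
$$H_1: \mu < \mu_0, \qquad H_1: \mu > \mu_0, \qquad H_1: \mu \neq \mu_0.$$
The test is carried out by computing the test statistic
$$Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$$
and then rejecting the null hypothesis if the appropriate condition is satisfied. In the order the alternative hypotheses are given above, the null hypothesis is rejected if
$$Z < -z_\alpha, \qquad Z > z_\alpha, \qquad |Z| > z_{\alpha/2},$$
where $\bar{X}$ is the sample mean, $\alpha$ is the chosen significance level, and $z_\alpha$ is the point that a standard normal variable exceeds with probability $\alpha$.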
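As a minimal sketch of this decision rule, assuming the usual statistic $Z = (\bar{X} - \mu_0)/(\sigma/\sqrt{n})$ and standard normal critical points, the test might be coded as follows. The function name `z_test` and the labels `"less"`, `"greater"` and `"two-sided"` are illustrative choices, not part of the applet.

```python
from math import sqrt
from statistics import NormalDist


def z_test(xs, mu0, sigma, alpha=0.05, alternative="two-sided"):
    """Return (z, reject) for a one-sample Z test with known sigma.

    alternative is one of "less" (mu < mu0), "greater" (mu > mu0)
    or "two-sided" (mu != mu0); these labels are hypothetical.
    """
    n = len(xs)
    xbar = sum(xs) / n
    z = (xbar - mu0) / (sigma / sqrt(n))

    std_normal = NormalDist()  # mean 0, standard deviation 1
    if alternative == "less":
        reject = z < -std_normal.inv_cdf(1 - alpha)
    elif alternative == "greater":
        reject = z > std_normal.inv_cdf(1 - alpha)
    elif alternative == "two-sided":
        reject = abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    else:
        raise ValueError(f"unknown alternative: {alternative!r}")
    return z, reject
```

For example, four observations all equal to 12 with $\mu_0 = 10$ and $\sigma = 2$ give $Z = 2$, which exceeds the two-sided 5% cut-off of roughly 1.96, so `z_test([12.0] * 4, mu0=10.0, sigma=2.0)` reports a rejection.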
This hypothesis testing procedure is set up to give the null hypothesis ``the benefit of the doubt;'' that is, to accept the null hypothesis unless there is strong evidence to support the alternative. If $H_0$ is true, the above test statistic follows a standard normal distribution, so the probability of erroneously rejecting $H_0$ is just $\alpha$. If $H_1$ is true, however, the test statistic $Z$ does not follow a standard normal distribution -- it follows a normal distribution with a different mean, and thus the probability of (correctly) rejecting the null hypothesis is larger than $\alpha$. This probability is known as the ``power'' of the test, and it depends on the true value of $\mu$. (Clearly, a test would have more power for an extreme value of $\mu$ than for a $\mu$ that is very close to $\mu_0$.)
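To make the dependence of power on the true mean concrete: when the true mean is $\mu$, the statistic $Z$ is normal with mean $\delta = (\mu - \mu_0)/(\sigma/\sqrt{n})$ and unit variance, so the power is the probability that such a variable falls in the rejection region. The sketch below computes that probability for the three alternatives. The name `z_test_power` and the alternative labels are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(mu, mu0, sigma, n, alpha=0.05, alternative="two-sided"):
    """Probability of rejecting H0: mu = mu0 when the true mean is mu."""
    std_normal = NormalDist()
    phi = std_normal.cdf
    # Under the true mean, Z ~ Normal(delta, 1).
    delta = (mu - mu0) / (sigma / sqrt(n))

    if alternative == "less":
        c = std_normal.inv_cdf(1 - alpha)
        return phi(-c - delta)                 # P(Z < -c)
    if alternative == "greater":
        c = std_normal.inv_cdf(1 - alpha)
        return 1 - phi(c - delta)              # P(Z > c)
    if alternative == "two-sided":
        c = std_normal.inv_cdf(1 - alpha / 2)
        return phi(-c - delta) + 1 - phi(c - delta)  # P(|Z| > c)
    raise ValueError(f"unknown alternative: {alternative!r}")
```

At $\mu = \mu_0$ the function returns $\alpha$, as it should. With $\mu_0 = 10$, $\sigma = 2$, $n = 4$ and the alternative $\mu > \mu_0$, a true mean of 11 gives a power of about 0.26, and the value grows as $\mu$ moves further from $\mu_0$.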
To use this applet, you must specify the null-hypothesized mean $\mu_0$, the true mean $\mu$, the value of $\alpha$, and select the appropriate alternative hypothesis. Clicking on the ``Show it!'' button will give a plot. The black curve represents the distribution of the test statistic when the null hypothesis is true. The portion shaded in red represents the probability of being beyond the cut-off point(s) when the null hypothesis is true (the Type I error rate, or $\alpha$). The blue curve represents the distribution of the test statistic under the particular value of $\mu$ you gave. The blue shaded area represents the power of the test for that particular value of $\mu$. Note that the region shaded both blue and red appears purple.
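The quantities behind such a plot can be sketched numerically. The code below is not the applet's implementation. It assumes the curves are drawn on the scale of the test statistic with a standard error `se` (a hypothetical parameter, defaulting to 1), builds the black and blue curves on a grid, marks the rejection region, and integrates each curve over that region to recover the red and blue areas. All function names are illustrative.

```python
from statistics import NormalDist


def plot_data(mu0, mu, alpha=0.05, alternative="two-sided",
              se=1.0, lo=-6.0, hi=6.0, steps=2400):
    """Return grid points, null density, true density and a rejection mask."""
    null = NormalDist(0.0, 1.0)              # black curve: Z under H0
    true = NormalDist((mu - mu0) / se, 1.0)  # blue curve: Z under the true mean

    if alternative == "two-sided":
        c = null.inv_cdf(1 - alpha / 2)
        in_region = lambda z: abs(z) > c
    elif alternative == "greater":
        c = null.inv_cdf(1 - alpha)
        in_region = lambda z: z > c
    elif alternative == "less":
        c = null.inv_cdf(1 - alpha)
        in_region = lambda z: z < -c
    else:
        raise ValueError(f"unknown alternative: {alternative!r}")

    h = (hi - lo) / steps
    zs = [lo + i * h for i in range(steps + 1)]
    black = [null.pdf(z) for z in zs]
    blue = [true.pdf(z) for z in zs]
    shaded = [in_region(z) for z in zs]
    return zs, black, blue, shaded


def shaded_area(zs, curve, shaded):
    """Trapezoid-rule area under curve over cells lying fully in the region."""
    total = 0.0
    for i in range(len(zs) - 1):
        if shaded[i] and shaded[i + 1]:
            total += 0.5 * (curve[i] + curve[i + 1]) * (zs[i + 1] - zs[i])
    return total


zs, black, blue, shaded = plot_data(mu0=0.0, mu=1.0)
red_area = shaded_area(zs, black, shaded)   # approximates alpha
blue_area = shaded_area(zs, blue, shaded)   # approximates the power
```

Up to discretization error, `red_area` approximates $\alpha$ and `blue_area` approximates the power at the chosen $\mu$. The overlap of the two shaded regions corresponds to the purple area in the plot.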