This applet illustrates the fundamental principles of statistical hypothesis testing through the simplest example: the test for the mean of a single normal population, variance known (the Z test).
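The basic set-up of the test is this: using only $n$ independent
observations $X_1, \ldots, X_n$ from a normal distribution with unknown
mean $\mu$ (but known variance $\sigma^2$), the task is to decide
whether to accept a null hypothesis $H_0: \mu = \mu_0$ for a specified
value of $\mu_0$, or to reject the null hypothesis in favor of some
alternative hypothesis. In most applications, there are only three
alternative hypotheses of interest:
$$H_a: \mu < \mu_0, \qquad H_a: \mu > \mu_0, \qquad H_a: \mu \neq \mu_0.$$
The test is carried out by computing the test statistic
$$Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$$
and then rejecting the null hypothesis if the appropriate condition is
satisfied. In the order the alternative hypotheses are given above, the
null hypothesis is rejected if
$$Z < -z_{\alpha}, \qquad Z > z_{\alpha}, \qquad |Z| > z_{\alpha/2},$$
where $z_{\alpha}$ denotes the upper-$\alpha$ point of the standard
normal distribution.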
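As a rough illustration of the procedure just described, the following Python sketch computes the Z statistic and applies the rejection rule. It assumes the usual form of the statistic, $Z = (\bar{X} - \mu_0)/(\sigma/\sqrt{n})$, and the standard cut-offs taken from the standard normal distribution. The function name `z_test` and the labels `"less"`, `"greater"`, and `"two-sided"` are hypothetical choices made for this example; they are not part of the applet.

```python
from math import sqrt
from statistics import NormalDist, fmean


def z_test(sample, mu0, sigma, alpha=0.05, alternative="two-sided"):
    """One-sample Z test with known sigma (illustrative sketch).

    Returns the pair (z, reject), where reject is True when the null
    hypothesis mu = mu0 is rejected at level alpha.
    """
    n = len(sample)
    # Standardize the sample mean using the known standard deviation.
    z = (fmean(sample) - mu0) / (sigma / sqrt(n))
    std_normal = NormalDist()  # mean 0, standard deviation 1

    if alternative == "less":          # H_a: mu < mu0
        reject = z < std_normal.inv_cdf(alpha)
    elif alternative == "greater":     # H_a: mu > mu0
        reject = z > std_normal.inv_cdf(1.0 - alpha)
    elif alternative == "two-sided":   # H_a: mu != mu0
        reject = abs(z) > std_normal.inv_cdf(1.0 - alpha / 2.0)
    else:
        raise ValueError("alternative must be 'less', 'greater' or 'two-sided'")
    return z, reject


# Made-up data, for illustration only.
z, reject = z_test([5.1, 4.9, 5.3, 5.2, 5.0], mu0=5.0, sigma=0.2)
print(round(z, 3), reject)
```

The same sample tested against `mu0=4.5` gives a much larger statistic and leads to rejection, which is the "strong evidence" case.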
This hypothesis testing procedure is set up to give the null
hypothesis ``the benefit of the doubt;'' that is, to accept the null
hypothesis unless there is strong evidence to support the alternative.
If $H_0$ is true, the above test statistic follows a standard normal
distribution, so the probability of erroneously rejecting $H_0$ is
just $\alpha$. If $H_a$ is true, however, the test statistic $Z$ does
not follow a standard normal distribution; it follows a normal
distribution with a different mean, and thus the probability of
(correctly) rejecting the null hypothesis is larger than $\alpha$.
This probability is known as the ``power'' of the test, and it depends
on the true value of $\mu$. (Clearly, a test would have more power
for an extreme value of $\mu$ than for a $\mu$ that is very close to
$\mu_0$.)
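The dependence of the power on the true mean can be made concrete with a short calculation. The sketch below assumes that, when the true mean is $\mu$, the statistic $Z$ is normal with mean $(\mu - \mu_0)/(\sigma/\sqrt{n})$ and unit variance, and it evaluates the probability of landing in the rejection region. The function name `z_test_power` and its argument names are hypothetical, chosen for this example only.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(mu, mu0, sigma, n, alpha=0.05, alternative="two-sided"):
    """Probability of rejecting H_0: mean = mu0 when the true mean is mu."""
    std_normal = NormalDist()
    # Mean of Z under the true mean; its variance remains 1.
    shift = (mu - mu0) / (sigma / sqrt(n))

    if alternative == "less":          # reject when Z < -z_alpha
        return std_normal.cdf(std_normal.inv_cdf(alpha) - shift)
    if alternative == "greater":       # reject when Z > z_alpha
        return 1.0 - std_normal.cdf(std_normal.inv_cdf(1.0 - alpha) - shift)
    if alternative == "two-sided":     # reject when |Z| > z_{alpha/2}
        cut = std_normal.inv_cdf(1.0 - alpha / 2.0)
        return std_normal.cdf(-cut - shift) + 1.0 - std_normal.cdf(cut - shift)
    raise ValueError("alternative must be 'less', 'greater' or 'two-sided'")


# Power grows as the true mean moves away from mu0 = 0 (sigma = 1, n = 1).
for true_mean in (0.0, 0.5, 1.0, 2.0, 3.0):
    print(true_mean, round(z_test_power(true_mean, 0.0, 1.0, 1, 0.05, "greater"), 4))
```

When the true mean equals $\mu_0$, the value returned is just $\alpha$, which matches the Type I error rate described above.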
To use this applet, you must specify the null-hypothesized mean
$\mu_0$, the true mean $\mu$, the value of $\alpha$, and select the
appropriate alternative hypothesis. Clicking on the ``Show it!''
button will give a plot -- the black curve represents the
distribution of the test statistic when the null hypothesis is true.
The portion shaded in red represents the probability of being beyond
the cut-off point(s) when the null hypothesis is true (the Type I
error rate, or $\alpha$). The blue curve represents the distribution
of the test statistic under the particular value of $\mu$ you gave.
The blue shaded area represents the power of the test for that
particular value of $\mu$. Note that the region shaded both blue and
red appears purple.
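For readers who want to reproduce a plot of this kind outside the applet, the sketch below generates the underlying numbers: the cut-off point(s), the density of the test statistic under the null hypothesis (the black curve), and the density under the chosen true mean (the blue curve). This is not the applet's own code. The function name `applet_curves`, the `points` argument, and the standard error argument `se` (defaulting to 1) are assumptions made for illustration, since the text does not say how the applet scales the shift.

```python
from statistics import NormalDist


def applet_curves(mu0, mu, alpha, alternative="two-sided", se=1.0, points=201):
    """Data for the two curves and the cut-off point(s) (illustrative sketch)."""
    null = NormalDist(0.0, 1.0)             # Z when the null hypothesis is true
    alt = NormalDist((mu - mu0) / se, 1.0)  # Z under the chosen true mean

    if alternative == "less":
        cutoffs = [null.inv_cdf(alpha)]
    elif alternative == "greater":
        cutoffs = [null.inv_cdf(1.0 - alpha)]
    elif alternative == "two-sided":
        upper = null.inv_cdf(1.0 - alpha / 2.0)
        cutoffs = [-upper, upper]
    else:
        raise ValueError("alternative must be 'less', 'greater' or 'two-sided'")

    # Grid wide enough to cover both curves (four standard deviations each way).
    lo = min(0.0, alt.mean) - 4.0
    hi = max(0.0, alt.mean) + 4.0
    step = (hi - lo) / (points - 1)
    grid = [lo + i * step for i in range(points)]

    return {
        "z": grid,
        "null_pdf": [null.pdf(z) for z in grid],  # black curve
        "alt_pdf": [alt.pdf(z) for z in grid],    # blue curve
        "cutoffs": cutoffs,                       # edges of the shaded regions
    }


curves = applet_curves(mu0=0.0, mu=2.0, alpha=0.05)
print([round(c, 3) for c in curves["cutoffs"]])
```

The red region is the area under `null_pdf` beyond the cut-offs, and the blue region is the area under `alt_pdf` beyond the same cut-offs; any plotting library can shade them from these lists.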