COMMON MISTAKES IN USING STATISTICS: Spotting and Avoiding Them
Overview of Frequentist Hypothesis Testing
Most commonly used frequentist hypothesis tests involve the following elements:
- Model assumptions (e.g., for the t-test for the mean, the model assumptions can be phrased as: simple random sample¹ of a random variable with a normal distribution)
- Null and alternative hypotheses
- A test statistic. This needs to have the property that extreme values of the test statistic cast doubt on the null hypothesis.
- A mathematical theorem saying, "If the model assumptions and the null hypothesis are both true, then the sampling distribution of the test statistic has this particular form."²
The exact details of these four elements will depend on the particular hypothesis test. We will illustrate with an example.
Example: In the case of the large-sample z-test for the mean, the elements are:
1. Model assumptions: We are dealing with simple random samples of the random variable X, which has a normal distribution.¹
2. Null hypothesis: The mean of the random variable in question is a certain value, µ₀. The alternative hypothesis could be one of three statements: "The mean of the random variable X is not µ₀," "The mean of the random variable X is less than µ₀," or "The mean of the random variable X is greater than µ₀." For this example, we will use the first alternative, "The mean of the random variable X is not µ₀." (This is called the two-sided alternative.)
3. Test statistic: x-bar, the sample mean. We now step back and consider all possible simple random samples of X of size n. For each simple random sample of X of size n, we get a value of x-bar. We thus have a new random variable X-bar. (X-bar stands for the new random variable; x-bar stands for the value of X-bar for a particular sample of size n.) The distribution of X-bar is called the sampling distribution of X-bar.
4. The theorem states: If the model assumptions are true and if the mean of X is µ₀, then the sampling distribution of X-bar is normal, with mean µ₀ and standard deviation σ/√n, where σ (sigma) is the standard deviation of the random variable X. (Note: σ is called the population standard deviation of X; it is not the same as the sample standard deviation s, although s is an estimate of σ.) A simulation sketch illustrating this conclusion appears just below.
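To make the conclusion of the theorem concrete, here is a minimal simulation sketch in Python (using NumPy; the values µ₀ = 10, σ = 2, and n = 50 are illustrative assumptions, not part of the original example). It draws many simple random samples of size n from a normal distribution with mean µ₀, computes x-bar for each, and checks that the resulting values of X-bar have mean close to µ₀ and standard deviation close to σ/√n.

    import numpy as np

    # Illustrative values (assumptions for this sketch, not from the text)
    mu0 = 10.0             # hypothesized (and here, true) mean of X
    sigma = 2.0            # population standard deviation of X
    n = 50                 # sample size
    num_samples = 100_000  # number of simple random samples to draw

    rng = np.random.default_rng(0)

    # Each row is one simple random sample of X of size n;
    # each row's mean is one value x-bar of the random variable X-bar.
    samples = rng.normal(loc=mu0, scale=sigma, size=(num_samples, n))
    xbars = samples.mean(axis=1)

    print("mean of X-bar:", xbars.mean())   # close to mu0 = 10
    print("sd of X-bar:  ", xbars.std())    # close to sigma/sqrt(n), about 0.283
    print("sigma/sqrt(n):", sigma / np.sqrt(n))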
The validity of the hypothesis test depends on the truth of the conclusion of the theorem; the only way we know the conclusion is true is if we know the hypotheses of the theorem are true. Thus: If the model assumptions are not true, then we do not know that the conclusion of the theorem holds, so we do not know that the hypothesis test is valid. In the example, this translates to: If the sample is not a simple random sample, then the reasoning establishing the validity of the hypothesis test breaks down.
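As a rough illustration of this breakdown (a sketch under assumed conditions, not part of the original page; it uses NumPy and SciPy, and all numerical values are illustrative): the simulation below draws samples that are not simple random samples, because the observations arrive in clusters that share a common random shift, and then applies the two-sided z-test as if the data were a simple random sample. Even though the null hypothesis is true, the test rejects far more often than the nominal 5% of the time.

    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(1)

    # Illustrative values (assumptions for this sketch)
    mu0 = 10.0     # true mean equals the null value, so every rejection is a false alarm
    sigma = 2.0    # marginal standard deviation of each observation (treated as known)
    n_clusters, cluster_size = 10, 5   # 50 observations, but only 10 independent clusters
    n = n_clusters * cluster_size
    rho = 0.5      # fraction of variance coming from the shared cluster effect
    num_trials = 20_000

    rejections = 0
    for _ in range(num_trials):
        # Observations within a cluster share a common shift,
        # so this is NOT a simple random sample of X.
        cluster_effect = rng.normal(0.0, sigma * np.sqrt(rho), size=(n_clusters, 1))
        noise = rng.normal(0.0, sigma * np.sqrt(1 - rho), size=(n_clusters, cluster_size))
        x = mu0 + cluster_effect + noise
        z = (x.mean() - mu0) / (sigma / np.sqrt(n))   # z statistic, pretending the data are an SRS
        p = 2 * norm.sf(abs(z))                       # two-sided p-value
        if p < 0.05:
            rejections += 1

    print("false rejection rate:", rejections / num_trials)   # typically well above 0.05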
Comments:
- Different hypothesis tests have different model assumptions. Some tests apply to random samples that are not simple; see Other Types of Random Samples. For many tests, the model assumptions consist of several assumptions. If any one of these model assumptions is not true, we do not know that the test is valid.
- Many techniques are robust to departures from at least some model assumptions. This means that if the particular assumption is not too far from true, then the technique is still approximately valid. (A simulation sketch after this list gives one example.)
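As one example of robustness (again a simulation sketch with assumed values, using NumPy and SciPy; it is not from the original page): the z-test for the mean is fairly robust to non-normality when the sample size is moderately large, because the Central Limit Theorem makes the sampling distribution of X-bar approximately normal even when X is not. The sketch below applies the two-sided z-test to samples from a skewed exponential distribution whose true mean equals µ₀; the false rejection rate comes out reasonably close to the nominal 0.05.

    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(2)

    # Illustrative values (assumptions for this sketch)
    mu0 = 1.0      # true mean of an exponential(1) variable, so the null hypothesis is true
    sigma = 1.0    # its population standard deviation (treated as known for the z-test)
    n = 50
    num_trials = 20_000

    rejections = 0
    for _ in range(num_trials):
        x = rng.exponential(scale=1.0, size=n)        # skewed, clearly non-normal data
        z = (x.mean() - mu0) / (sigma / np.sqrt(n))
        p = 2 * norm.sf(abs(z))                       # two-sided p-value
        if p < 0.05:
            rejections += 1

    print("false rejection rate:", rejections / num_trials)   # usually not far from 0.05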
1. This refers to a simple random sample of a random variable; see the page More Precise Definition of Simple Random Sample for more information.
2. The distribution of the test statistic, when considering all possible suitably random samples of the same size, is called a sampling distribution. For additional discussion of sampling distributions, see Overview of Frequentist Confidence Intervals and Frequentist Hypothesis Tests, p-values, and Type I Error. Those two pages and this one are best read as a unit.
Last modified May 10, 2012