COMMON MISTAKES IN USING STATISTICS: Spotting and Avoiding Them
Misinterpretations and misuses of p-values
Recall from the page Frequentist Hypothesis Tests and p-values:
p-value = the probability of obtaining a test statistic at least as extreme as the one from the data at hand, assuming:
- the model assumptions for the inference procedure used are all true, and
- the null hypothesis is true, and
- the random variable is the same (including the same population), and
- the sample size is the same.
Notice that this is a conditional probability: the probability that something happens, given that various other conditions hold. One common misunderstanding is to neglect some or all of the conditions. [1]
Example: Researcher 1 conducts a clinical trial to test a drug for a certain medical condition on 30 patients, all of whom have that condition. The patients are randomly assigned to either the drug or a look-alike placebo (15 each). Neither patients nor medical personnel know which patients receive the drug and which receive the placebo. Treatment is exactly the same for both groups, except for whether the drug or placebo is used. The hypothesis test has null hypothesis "proportion improving on the drug is the same as proportion improving on the placebo" and alternative hypothesis "proportion improving on the drug is greater than proportion improving on the placebo." The resulting p-value is p = 0.15.
Researcher 2 does another clinical trial on the same drug, with the same placebo, and everything else the same except that 200 patients are randomized to the treatments, with 100 in each group. The same hypothesis test is conducted with the new data, and the resulting p-value is p = 0.03.
Are these results contradictory? No. Since the sample sizes are different, the p-values are not comparable, even though everything else is the same. (In fact, when the drug really is better than the placebo, a larger sample size typically results in a smaller p-value; see the discussion of power.)
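The effect of sample size on the p-value can be illustrated with a minimal sketch. The counts below are hypothetical and are not the data from the two trials in the example; they simply give both trials the same observed improvement rates (60% on the drug, 40% on the placebo). The function name and the choice of a pooled two-proportion z-test are also assumptions made for illustration. The normal approximation is rough with only 15 patients per group, where an exact test would usually be preferred.

```python
import math

def one_sided_p_value(improved_drug, improved_placebo, n_per_group):
    """Approximate one-sided p-value for a pooled two-proportion z-test.

    H0: proportion improving is the same on drug and placebo.
    Ha: proportion improving on the drug is greater.
    Assumes equal group sizes and uses the normal approximation.
    """
    p_drug = improved_drug / n_per_group
    p_placebo = improved_placebo / n_per_group
    # Pooled estimate of the common proportion under the null hypothesis.
    pooled = (improved_drug + improved_placebo) / (2 * n_per_group)
    std_err = math.sqrt(pooled * (1 - pooled) * (2 / n_per_group))
    z = (p_drug - p_placebo) / std_err
    # Upper-tail probability of the standard normal distribution at z.
    return 0.5 * math.erfc(z / math.sqrt(2))

# Hypothetical counts: same observed proportions, different sample sizes.
p_small = one_sided_p_value(9, 6, 15)      # 15 patients per group
p_large = one_sided_p_value(60, 40, 100)   # 100 patients per group

print(f"n = 15 per group:  p = {p_small:.4f}")
print(f"n = 100 per group: p = {p_large:.4f}")
```

Under these assumed counts, the smaller trial gives a p-value of about 0.14 and the larger one about 0.002, even though the observed proportions are identical. The two p-values answer questions posed under different conditions, so they should not be compared directly.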
Another common misunderstanding of p-values is the belief that the p-value is "the probability that the null hypothesis is true". The basic assumption of frequentist hypothesis testing is that the null hypothesis is either true (in which case the probability that it is true is 1) or false (in which case the probability that it is true is 0). In this framework, the p-value is a probability about the data, calculated assuming the null hypothesis is true; it is not a probability about the null hypothesis. [2]
1. Neglecting the condition that the populations are the same leads to extrapolating the results beyond the population studied, which is one form of over-interpretation.
2. In the Bayesian perspective, it makes sense to consider "the probability that the null hypothesis is true" as having values other than 0 or 1. In that perspective, we consider "states of nature"; in different states of nature, the null hypothesis may have different probabilities of being true. The goal is then to determine the probability that the null hypothesis is true, given the data. This is the reverse of the conditional probability considered in frequentist inference (the probability of the data given that the null hypothesis is true).
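The difference between the two conditional probabilities in note 2 can be made concrete with a small sketch of Bayes' rule. All numbers here are hypothetical: the "data" is reduced to the single event "the test came out significant", and the prior probability of the null hypothesis and the probability of a significant result when the null hypothesis is false are assumed values chosen only for illustration.

```python
def prob_null_given_significant(prior_null, p_sig_given_null, p_sig_given_alt):
    """Bayes' rule: P(null true | significant result).

    prior_null:        assumed prior probability that the null is true
    p_sig_given_null:  P(significant | null true), the significance level
    p_sig_given_alt:   P(significant | null false), assumed for illustration
    """
    joint_null = p_sig_given_null * prior_null
    joint_alt = p_sig_given_alt * (1 - prior_null)
    return joint_null / (joint_null + joint_alt)

# P(significant | null true) is 0.05 in both cases; only the prior changes.
even_prior = prob_null_given_significant(0.5, 0.05, 0.5)
skeptical_prior = prob_null_given_significant(0.9, 0.05, 0.5)

print(f"P(null | significant), prior 0.5: {even_prior:.3f}")
print(f"P(null | significant), prior 0.9: {skeptical_prior:.3f}")
```

With these assumed inputs, the probability of a significant result given a true null hypothesis is 0.05 in both cases, yet the probability that the null hypothesis is true given a significant result is about 0.09 under one prior and about 0.47 under the other. The two conditional probabilities are different quantities, and the second cannot be read off the first.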