COMMON MISTEAKS MISTAKES IN USING STATISTICS: Spotting and Avoiding Them
Pseudoreplication
The term pseudoreplication
was coined by Hurlbert to refer to "the use of inferential statistics
to test for treatment effects with data from experiments where either
treatments are not replicated (though samples may be) or replicates are
not statistically independent." [1] The context of his paper
was ecological field experiments, but pseudoreplication can occur in
other contexts as well.
Here, replication [2] refers to having more than one experimental (or observational)
unit with the same treatment. Each unit with the same treatment is called a replicate.
Heffner et al. [3] distinguish a pseudoreplicate from a true replicate, which they
characterize as "the smallest experimental unit to which a treatment is
independently applied."
Most models for statistical inference require true replication. True replication
permits the estimation of variability within a treatment. Without estimating
variability within treatments, it is impossible to do statistical inference.
Consider, for example, comparing two drugs by trying drug A on person 1 and drug B
on person 2. Drugs typically have different effects in different people, so this
simple experiment gives us no information about whether an observed difference
would generalize to people other than the two involved. But if we try each drug on
several people, then we can obtain some information about the variability of
each drug's effect, and use statistical inference to gain some information on
whether or not one drug might be more effective than the other on
average.
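The drug comparison above can be sketched in a few lines of Python. The function name and the response values are hypothetical, chosen only for illustration. The sketch assumes a simple two-group comparison with unequal-variance standard errors. It shows that with one person per drug there is no way to attach a standard error to the observed difference, while with several people per drug there is.

```python
import math
import statistics

def diff_with_standard_error(responses_a, responses_b):
    """Return (difference in mean response, standard error of the difference).

    The standard error is None when either group has fewer than two units,
    because the within-treatment variance cannot be estimated in that case.
    """
    diff = statistics.mean(responses_a) - statistics.mean(responses_b)
    if len(responses_a) < 2 or len(responses_b) < 2:
        return diff, None
    se = math.sqrt(
        statistics.variance(responses_a) / len(responses_a)
        + statistics.variance(responses_b) / len(responses_b)
    )
    return diff, se

# Hypothetical responses: one person per drug (no replication).
one_each = diff_with_standard_error([12.0], [9.0])

# Hypothetical responses: four people per drug (true replication).
several = diff_with_standard_error([12.0, 7.0, 10.0, 15.0],
                                   [9.0, 8.0, 13.0, 6.0])

print(one_each)  # a difference, but no standard error
print(several)   # a difference together with a standard error
```

In the unreplicated case the code can report only the raw difference; any attempt to judge whether that difference is larger than person-to-person variation has nothing to work with.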
True replicates are often confused with repeated measures or with
pseudoreplicates. The following illustrate some of the ways this can
occur.
Examples:
1. Suppose a blood-pressure-lowering drug is administered to a patient,
then the patient's blood pressure is measured twice. This is a repeated measure, not a
replication. It can give information about the uncertainty in the
measurement process, but not about the variability in the effect of the
drug. On the other hand, if the drug were administered to two patients,
and each patient's blood pressure were measured once, we could say the
treatment had been replicated, and the replication might give some
information about the variability in the effect of the drug.
2. A researcher is studying the effect on plant growth of different
concentrations of CO2 in the air. He needs to grow the plants in a growth
chamber so that the CO2 concentration can be controlled. He has access to
only two growth chambers, but each one will hold five plants. However, since
the five plants in each chamber share whatever conditions are in that chamber
besides the CO2 concentration, and in fact may also influence each other, they
are not independent replicates but are pseudoreplicates. The growth chambers
are the experimental units; the treatments are applied to the growth
chambers, not to the plants individually.
3. Two fifth-grade math curricula are being studied. Two schools have
agreed to participate in the study. One is randomly chosen to use
curriculum A, the other to use curriculum B. At the end of the school
year, the fifth-grade students in each school are tested and the
results are used to do a statistical analysis comparing the two
curricula. There is no true replication in this study; the
students are pseudoreplicates. The schools are the experimental units;
they, not the students, are randomly assigned to treatment. Within each
school, the test results (and the learning) of the students in the
experiment are not independent; they are influenced by the teacher and
other school-specific factors (e.g., previous teachers and learning,
socioeconomic background of the school, etc.).
Consequences of doing statistical inference using pseudoreplicates rather than true replicates
Variability will probably be underestimated. This will result in:
- Confidence intervals that are too narrow.
- An inflated probability of a Type I error (falsely rejecting
a true null hypothesis).
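A small simulation can illustrate the inflated Type I error rate. The sketch below is built on assumptions that are not part of the text above: the setup mimics Example 2 (two chambers, five plants each), there is no true treatment effect, each chamber contributes a shared normal random effect, and the analyst wrongly runs a pooled two-sample t-test that treats the ten plants as independent. The function name, standard deviations, and simulation size are hypothetical.

```python
import math
import random
import statistics

PLANTS_PER_CHAMBER = 5
# Two-sided 5% critical value of Student's t with 8 degrees of freedom
# (2 chambers x 5 plants - 2). It is tied to PLANTS_PER_CHAMBER = 5.
T_CRIT_DF8 = 2.306

def naive_rejection_rate(chamber_sd, plant_sd, n_sims=2000, seed=0):
    """Estimate how often the naive plant-level t-test rejects the null
    hypothesis when the treatment truly has no effect."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        groups = []
        for _chamber in range(2):
            # Everything the plants in one chamber share (assumed normal).
            chamber_effect = rng.gauss(0.0, chamber_sd)
            groups.append([chamber_effect + rng.gauss(0.0, plant_sd)
                           for _ in range(PLANTS_PER_CHAMBER)])
        a, b = groups
        # Pooled variance uses only plant-to-plant variation within chambers.
        pooled_var = (statistics.variance(a) + statistics.variance(b)) / 2
        se = math.sqrt(pooled_var * 2 / PLANTS_PER_CHAMBER)
        t = (statistics.mean(a) - statistics.mean(b)) / se
        if abs(t) > T_CRIT_DF8:
            rejections += 1
    return rejections / n_sims

rate_shared_conditions = naive_rejection_rate(chamber_sd=1.0, plant_sd=1.0)
rate_independent_plants = naive_rejection_rate(chamber_sd=0.0, plant_sd=1.0)
print(rate_shared_conditions, rate_independent_plants)
```

Under these assumptions, the rejection rate should stay near the nominal 5% only when the chamber effect is absent. Once plants share chamber conditions, the within-chamber variance understates the real variability between units, and the test rejects far more often than 5% of the time.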
What to do about pseudoreplication
1. Avoid it if at all possible.
Key in doing this is to carefully determine what the
experimental/observational units are; then be
sure that each treatment is randomly applied to more than one
experimental/observational unit. For example, in comparing
curricula (Example 3 above), if ten schools participated in the
experiment and five were randomly assigned to each treatment (i.e.,
curriculum), then each treatment would have five replications; this
would give some information about the variability of the effect of the
different curricula.
2. If it is not possible to avoid pseudoreplication, then:
a. Do whatever is possible to minimize lack of independence in the pseudoreplicates.
For example, in the study of the effect of CO2 on plant growth, the researcher
rearranged the plants in each growth chamber each day to mitigate
effects of location in the chamber.
b. Be careful in analyzing and reporting results. Be open about the
limitations of the study; be careful not to over-interpret results. For
example, in Example 2, the researcher could calculate what might be
called "pseudo-confidence intervals" that would not be "true"
confidence intervals, but which could be interpreted as giving a lower
bound on the margin of error in the estimate of the quantity being
estimated.
c. Consider
the study as preliminary (for example, for giving insight into how to
plan a better study), or as one study that needs to be combined with
many others to give more informative results.
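As a rough sketch of the ten-school design described in item 1, one simple way to respect the experimental units is to reduce each school to a single summary (here, its mean score) and compare the two curricula using the school means only. The choice of a pooled two-sample t statistic is an assumption made for illustration, not a method prescribed above; the function name, school labels, and scores are all hypothetical.

```python
import math
import statistics

def unit_level_t(scores_by_school, curriculum_of):
    """Pooled two-sample t statistic computed on school means.

    Returns (t statistic, degrees of freedom). The degrees of freedom
    depend on the number of schools, not the number of students.
    """
    means = {"A": [], "B": []}
    for school, scores in scores_by_school.items():
        means[curriculum_of[school]].append(statistics.mean(scores))
    a, b = means["A"], means["B"]
    df = len(a) + len(b) - 2
    pooled_var = ((len(a) - 1) * statistics.variance(a)
                  + (len(b) - 1) * statistics.variance(b)) / df
    se = math.sqrt(pooled_var * (1 / len(a) + 1 / len(b)))
    return (statistics.mean(a) - statistics.mean(b)) / se, df

# Hypothetical test scores: ten schools, five per curriculum.
scores_by_school = {
    "s1": [68, 72], "s2": [70, 78], "s3": [71, 73], "s4": [75, 77], "s5": [60, 76],
    "s6": [65, 67], "s7": [66, 74], "s8": [68, 68], "s9": [70, 74], "s10": [60, 68],
}
curriculum_of = {f"s{i}": ("A" if i <= 5 else "B") for i in range(1, 11)}

t_stat, df = unit_level_t(scores_by_school, curriculum_of)
print(t_stat, df)
```

With five schools per curriculum the test has 8 degrees of freedom, however many students each school contributes. That is the price of analyzing at the level at which the treatment was actually randomized.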
Comments
- Note that in Example 2, there is no way to distinguish between
effect of treatment and effect of growth chamber; thus the two factors
(treatment and growth chamber) are confounded.
Similarly, in Example 3, treatment and school are confounded.
- Example 3 may also be seen as applying the two treatments to two
different populations (students in one school and students in the other school).
- Observational studies are particularly prone to pseudoreplication.
- Regression can sometimes account for lack of replication, provided the
values of the explanatory variables are close enough to each other. The rough
idea is that the responses for nearby values of the explanatory variables can give
some estimate of the variability. However, having replicates is better.
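The regression comment can be illustrated with a minimal sketch. It assumes a straight-line model with constant error variance is appropriate, which the text above does not guarantee; the function name and data values are hypothetical. Each value of x appears only once, so there are no replicates, yet the scatter of the responses around the fitted line still yields a variance estimate.

```python
import statistics

def residual_variance(x, y):
    """Fit y = intercept + slope * x by least squares and return
    (slope, intercept, residual variance on n - 2 degrees of freedom)."""
    n = len(x)
    x_bar, y_bar = statistics.mean(x), statistics.mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return slope, intercept, sse / (n - 2)

# Hypothetical data: five distinct x values, one response each.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.0]

slope, intercept, s2 = residual_variance(x, y)
print(slope, intercept, s2)
```

The variance estimate here is only as trustworthy as the assumed model: if the true relationship is not a straight line, lack of fit is absorbed into the estimate, which is one reason replicates are preferable.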
Notes:
1. S. H. Hurlbert (1984) Pseudoreplication and the design of ecological
field experiments, Ecological Monographs 54(2) pp. 187-211 (quote from the abstract).
2. There are other uses of the word replication -- for example,
repeating an entire experiment is also called replication; each
repetition of the experiment is called a replicate. This meaning is
related to the one given above: If each treatment in an experiment has
the same number r of replicates (in the sense given above), then the
experiment can be considered as r replicates (in the second sense) of
an experiment where each treatment is applied to only one experimental
unit.
3. Heffner, Butler, and Reilly (1996) Pseudoreplication revisited, Ecology
77(8) pp. 2558-2562 (quote from p. 2558).