### COMMON MISTAKES IN USING STATISTICS: Spotting and Avoiding Them

Pseudoreplication

The term pseudoreplication was coined by Hurlbert to refer to "the use of inferential statistics to test for treatment effects with data from experiments where either treatments are not replicated (though samples may be) or replicates are not statistically independent." [1] The context of his paper was ecological field experiments, but pseudoreplication can occur in other contexts as well.

Here, replication [2] refers to having more than one experimental (or observational) unit with the same treatment. Each unit with the same treatment is called a replicate.

Heffner et al. [3] distinguish a pseudoreplicate from a true replicate, which they characterize as "the smallest experimental unit to which a treatment is independently applied."

Most models for statistical inference require true replication. True replication permits the estimation of variability within a treatment. Without estimating variability within treatments, it is impossible to do statistical inference. Consider, for example, comparing two drugs by trying drug A on person 1 and drug B on person 2.  Drugs typically have different effects in different people. So this simple experiment will give us no information about generalizing to people other than the two involved. But if we try each drug on several people, then we can obtain some information about the variability of each drug, and use statistical inference to gain some information on whether or not one drug might be more effective than the other on average.
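The two-drug comparison above can be sketched as a small simulation. This is only an illustration under assumed numbers: the mean effects (10 and 13), the person-to-person standard deviation (4), the group size (8), and the helper `simulate_responses` are all hypothetical and not taken from any real study.

```python
import random
import statistics


def simulate_responses(mean_effect, person_sd, n_people, rng):
    """One response per person: the drug's average effect plus person-to-person variation."""
    return [rng.gauss(mean_effect, person_sd) for _ in range(n_people)]


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# One person per drug: a single number per treatment, so no spread can be computed.
drug_a_single = simulate_responses(10.0, 4.0, 1, rng)
try:
    statistics.variance(drug_a_single)
    single_unit_has_variance = True
except statistics.StatisticsError:
    # The sample variance is undefined for fewer than two data points.
    single_unit_has_variance = False

# Several people per drug: within-treatment variability becomes estimable.
drug_a = simulate_responses(10.0, 4.0, 8, rng)
drug_b = simulate_responses(13.0, 4.0, 8, rng)
var_a = statistics.variance(drug_a)
var_b = statistics.variance(drug_b)

# Estimated difference in average effect and its standard error.
diff = statistics.mean(drug_b) - statistics.mean(drug_a)
se_diff = (var_a / len(drug_a) + var_b / len(drug_b)) ** 0.5

print("variance estimable with one person per drug:", single_unit_has_variance)
print("estimated difference:", round(diff, 2), "standard error:", round(se_diff, 2))
```

With one person per drug there is nothing to compute a standard error from; with several people per drug, the standard error gives a basis for judging whether the observed difference is larger than person-to-person variation alone would produce.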

True replicates are often confused with repeated measures or with pseudoreplicates. The following illustrate some of the ways this can occur.

Examples:
1. Suppose a blood-pressure lowering drug is administered to a patient, then the patient's blood pressure is measured twice. This is a repeated measure, not a replication. It can give information about the uncertainty in the measurement process, but not about the variability in the effect of the drug. On the other hand, if the drug were administered to two patients, and each patient's blood pressure was measured once, we can say the treatment has been replicated, and the replication may give some information about the variability in the effect of the drug.

2. A researcher is studying the effect on plant growth of different concentrations of CO2 in the air. He needs to grow the plants in a growth chamber so that the CO2 concentration can be controlled. He has access to only two growth chambers, but each one will hold five plants. However, since the five plants in each chamber share whatever conditions are in that chamber besides the CO2 concentration, and in fact may also influence each other, they are not independent replicates but are pseudoreplicates. The growth chambers are the experimental units; the treatments are applied to the growth chambers, not to the plants individually.

3. Two fifth-grade math curricula are being studied. Two schools have agreed to participate in the study. One is randomly chosen to use curriculum A, the other to use curriculum B. At the end of the school year, the fifth-grade students in each school are tested and the results are used to do a statistical analysis comparing the two curricula. There is no true replication in this study; the students are pseudoreplicates. The schools are the experimental units; they, not the students, are randomly assigned to a treatment. Within each school, the test results (and the learning) of the students in the experiment are not independent; they are influenced by the teacher and other school-specific factors (e.g., previous teachers and learning, socioeconomic background of the school, etc.).
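The distinction in Example 1 between repeated measures and true replicates can be illustrated with a short simulation. All numbers here are assumptions chosen for illustration (a mean effect of -10 mmHg, a between-patient standard deviation of 8, a measurement-noise standard deviation of 2), and 200 readings are used instead of two only so that the spreads are estimated stably.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed for reproducibility

MEAN_EFFECT = -10.0    # assumed average change in blood pressure (mmHg)
PATIENT_SD = 8.0       # assumed variation of the drug's true effect across patients
MEASUREMENT_SD = 2.0   # assumed noise in a single blood-pressure reading


def reading(true_effect):
    """One measured value: the patient's true effect plus measurement noise."""
    return true_effect + rng.gauss(0.0, MEASUREMENT_SD)


# Repeated measures: one patient, measured many times.
one_patient_effect = rng.gauss(MEAN_EFFECT, PATIENT_SD)
repeated = [reading(one_patient_effect) for _ in range(200)]

# True replication: many patients, each measured once.
replicated = [reading(rng.gauss(MEAN_EFFECT, PATIENT_SD)) for _ in range(200)]

sd_repeated = statistics.stdev(repeated)      # reflects measurement noise only
sd_replicated = statistics.stdev(replicated)  # reflects patient variation plus noise

print("spread of repeated measures on one patient:", round(sd_repeated, 2))
print("spread across replicate patients:", round(sd_replicated, 2))
```

The spread of the repeated measures tracks only the measurement process, while the spread across patients also reflects how much the drug's effect varies from person to person, which is the variability needed for inference about the drug.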

Consequences of doing statistical inference using pseudoreplicates rather than true replicates

Variability will probably be underestimated. This will result in:
• Confidence intervals that are too narrow.
• An inflated probability of a Type I error (falsely rejecting a true null hypothesis).
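A minimal simulation sketch of the inflated Type I error, loosely modeled on the growth-chamber setting of Example 2. The assumptions are: no treatment effect at all, a chamber-level random effect shared by the five plants in a chamber, and a pooled two-sample t test that wrongly treats the ten plants as independent replicates. The standard deviations and the function names `pooled_t` and `false_positive_rate` are hypothetical.

```python
import random
import statistics

# Two-sided 5% critical value of Student's t with 8 degrees of freedom (5 + 5 - 2).
T_CRIT_DF8 = 2.306


def pooled_t(x, y):
    """Pooled-variance two-sample t statistic."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * statistics.variance(x)
                  + (ny - 1) * statistics.variance(y)) / (nx + ny - 2)
    return (statistics.mean(x) - statistics.mean(y)) / (pooled_var * (1 / nx + 1 / ny)) ** 0.5


def false_positive_rate(chamber_sd, plant_sd, n_sims=4000, seed=2):
    """Share of simulated null experiments in which the plant-level t test rejects at 5%."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        chambers = []
        for _chamber in range(2):
            # Chamber effect shared by all five plants; it is unrelated to treatment.
            chamber_effect = rng.gauss(0.0, chamber_sd)
            chambers.append([chamber_effect + rng.gauss(0.0, plant_sd) for _ in range(5)])
        if abs(pooled_t(chambers[0], chambers[1])) > T_CRIT_DF8:
            rejections += 1
    return rejections / n_sims


# No chamber effect: the plants really are independent, so the test behaves as advertised.
rate_independent = false_positive_rate(chamber_sd=0.0, plant_sd=1.0)
# Chamber effect present: the plants are pseudoreplicates.
rate_pseudo = false_positive_rate(chamber_sd=1.0, plant_sd=1.0)

print("rejection rate, independent plants:", rate_independent)
print("rejection rate, pseudoreplicated plants:", rate_pseudo)
```

Under these assumptions the first rate should sit near the nominal 5%, while the second should be well above it, because the within-chamber spread of the plants understates the variability between chambers.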

Suggestions for dealing with pseudoreplication

1. Avoid it if at all possible.

Key in doing this is to carefully determine what the experimental/observational units are; then be sure that each treatment is randomly applied to more than one experimental/observational unit. For example, in comparing curricula (Example 3 above), if ten schools participated in the experiment and five were randomly assigned to each treatment (i.e., curriculum), then each treatment would have five replications; this would give some information about the variability of the effect of the different curricula.
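The ten-school design can be sketched in the same way. One simple way to respect the experimental unit, used here as an assumption rather than as the only valid analysis, is to reduce each school to its mean score and compare the five school means per curriculum. The school and student standard deviations, the 30 students per school, and the function names are hypothetical.

```python
import random
import statistics

# Two-sided 5% critical value of Student's t with 8 degrees of freedom (5 + 5 schools - 2).
T_CRIT_DF8 = 2.306


def pooled_t(x, y):
    """Pooled-variance two-sample t statistic."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * statistics.variance(x)
                  + (ny - 1) * statistics.variance(y)) / (nx + ny - 2)
    return (statistics.mean(x) - statistics.mean(y)) / (pooled_var * (1 / nx + 1 / ny)) ** 0.5


def simulate_school_means(n_schools, n_students, school_sd, student_sd, rng):
    """Mean test score for each school, with no curriculum effect built in."""
    means = []
    for _ in range(n_schools):
        school_effect = rng.gauss(0.0, school_sd)  # shared by every student in the school
        scores = [school_effect + rng.gauss(0.0, student_sd) for _ in range(n_students)]
        means.append(statistics.mean(scores))
    return means


def false_positive_rate_by_school(n_sims=2000, seed=3):
    """Share of null experiments in which the school-level t test rejects at 5%."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        curriculum_a = simulate_school_means(5, 30, 1.0, 1.0, rng)
        curriculum_b = simulate_school_means(5, 30, 1.0, 1.0, rng)
        if abs(pooled_t(curriculum_a, curriculum_b)) > T_CRIT_DF8:
            rejections += 1
    return rejections / n_sims


rate_school_level = false_positive_rate_by_school()
print("rejection rate using school means as replicates:", rate_school_level)
```

Because the schools, not the students, are the units being compared, the rejection rate under no curriculum effect should stay close to the nominal 5% even though students within a school are correlated.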

2.  If it is not possible to avoid pseudoreplication, then:

a. Do whatever is possible to minimize lack of independence in the pseudoreplicates. For example, in the study of the effect of CO2 on plant growth, the researcher rearranged the plants in each growth chamber each day to mitigate effects of location in the chamber.

b. Be careful in analyzing and reporting results. Be open about the limitations of the study; be careful not to over-interpret results. For example, in Example 2, the researcher could calculate what might be called "pseudo-confidence intervals" that would not be "true" confidence intervals, but which could be interpreted as giving a lower bound on the margin of error in the estimate of the quantity being estimated.

c. Consider the study as preliminary (for example, for giving insight into how to plan a better study), or as one study that needs to be combined with many others to give more informative results.

Comments:
• Note that in Example 2, there is no way to distinguish between effect of treatment and effect of growth chamber; thus the two factors (treatment and growth chamber) are confounded. Similarly, in Example 3, treatment and school are confounded.
• Example 3 may also be seen as applying the two treatments to two different populations (students in one school and students in the other school).
• Observational studies are particularly prone to pseudoreplication.
• Regression can sometimes compensate for a lack of replication, provided the observations have values of the explanatory variables that are close enough to each other. The rough idea is that the responses for nearby values of the explanatory variables can give some estimate of the variability. However, having replicates is better.

Notes:
1. S. H. Hurlbert (1984) Pseudoreplication and the design of ecological field experiments, Ecological Monographs 54(2), pp. 187-211. (The quote is from the abstract.)
2. There are other uses of the word replication -- for example, repeating an entire experiment is also called replication; each repetition of the experiment is called a replicate. This meaning is related to the one given above: If each treatment in an experiment has the same number r of replicates (in the sense given above), then the experiment can be considered as r replicates (in the second sense) of an experiment where each treatment is applied to only one experimental unit.
3. Heffner, Butler, and Reilly (1996) Pseudoreplication Revisited, Ecology 77(8), pp. 2558-2562. (The quote is from p. 2558.)