COMMON MISTAKES IN USING STATISTICS: Spotting and Avoiding Them
Using an Inappropriate Method of Analysis
"... all models are limited by the validity of the assumptions on which
they ride."
David Collier, Jasjeet S. Sekhon, and Philip B. Stark, Preface (p. xi)
to David A. Freedman, Statistical Models and Causal Inference: A
Dialogue with the Social Sciences.
"Assumptions behind models are rarely articulated, let alone defended.
The problem is exacerbated because journals tend to favor a mild degree
of novelty in statistical procedures. Modeling, the search for
significance, the preference for novelty, and the lack of interest in
assumptions -- these norms are likely to generate a flood of
nonreproducible results."
David Freedman, Chance (2008), vol. 21, no. 1, p. 60
Each frequentist [1] inference technique (hypothesis test or confidence
interval) involves model assumptions. Different techniques have
different model assumptions. The validity of the technique depends (to
varying extents) on whether or not the model assumptions are true for
the context of the data being analyzed.
Many techniques are robust to departures from at least some model
assumptions. This means that if the particular assumption is not too
far from true, then the technique is still approximately valid. [2]
Thus, when using a statistical technique, it is important to ask:
- What are the model assumptions for that technique?
- Is the technique robust to some departures from the model
assumptions?
- What reason is there to believe that the model assumptions (or
something close enough, if the technique is robust) are true for the
situation being studied?
Neglecting these questions is a common mistake in using statistics.
Sometimes researchers check only some of the assumptions, perhaps
missing some of the most important ones.
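The idea of robustness described above can be illustrated with a small
simulation. The sketch below is only an illustration under stated
assumptions, not a method taken from this page: it uses the pooled
two-sample t-test with 15 observations per group, a hard-coded 5%
critical value for 28 degrees of freedom, and an exponential
distribution as the "non-normal" case. The function names are
hypothetical. Both groups are drawn from the same distribution, so any
rejection is a false positive.

```python
import math
import random

T_CRIT_DF28 = 2.048  # two-sided 5% critical value of Student's t, 28 df


def mean_and_variance(values):
    """Sample mean and unbiased sample variance."""
    m = sum(values) / len(values)
    return m, sum((v - m) ** 2 for v in values) / (len(values) - 1)


def pooled_t(x, y):
    """Two-sample t statistic using the pooled (equal-variance) estimate."""
    nx, ny = len(x), len(y)
    mx, vx = mean_and_variance(x)
    my, vy = mean_and_variance(y)
    pooled_var = ((nx - 1) * vx + (ny - 1) * vy) / (nx + ny - 2)
    return (mx - my) / math.sqrt(pooled_var * (1 / nx + 1 / ny))


def rejection_rate(draw, n=15, reps=4000, seed=0):
    """Share of simulated null datasets where |t| exceeds the 5% cutoff."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        x = [draw(rng) for _ in range(n)]
        y = [draw(rng) for _ in range(n)]
        if abs(pooled_t(x, y)) > T_CRIT_DF28:
            hits += 1
    return hits / reps


# Normality holds: the test should reject about 5% of the time.
normal_rate = rejection_rate(lambda rng: rng.gauss(0.0, 1.0))
# Normality is violated (skewed data), but the groups are still
# independent and identically distributed.
skewed_rate = rejection_rate(lambda rng: rng.expovariate(1.0))

print(f"normal data: false positive rate {normal_rate:.3f}")
print(f"skewed data: false positive rate {skewed_rate:.3f}")
```

If the two rates come out close to each other and close to 0.05, that
is what robustness to this particular departure looks like in this
particular setting. It says nothing about other sample sizes, other
distributions, or other assumptions.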
Unfortunately, the model assumptions vary from technique to technique,
so there are few if any general rules. One general rule of thumb,
however, is:
Techniques are least likely to be robust to departures from assumptions
of independence. [3, 4]
Note: Assumptions of independence are often phrased in terms of "random
sample" or "random assignment", so requirements stated in those terms
are very important.
"The independence assumption is fragile. ... Even modest violations of
independence can introduce substantial biases into conventional
procedures."
David A. Freedman, Statistical Models and Causal Inference: A
Dialogue with the Social Sciences, p. 31
"The independence assumption ... is a dangerous assumption in practice!"
Bradley Efron, Large Scale Inference, p. 26
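A rough simulation can show how fragile the independence assumption is.
The sketch below is an illustration under assumptions that do not come
from the sources quoted above: the dependence is modeled as a
first-order autoregressive series with lag-1 correlation 0.5, the test
is a one-sample t-test of "mean = 0" with 20 observations, and the 5%
critical value for 19 degrees of freedom is hard-coded. The function
names are hypothetical. The true mean is 0 in both cases, so every
rejection is a false positive.

```python
import math
import random

T_CRIT_DF19 = 2.093  # two-sided 5% critical value of Student's t, 19 df


def one_sample_t(values):
    """t statistic for H0: mean = 0, computed as if values were independent."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean / math.sqrt(var / n)


def ar1_series(rng, n, rho):
    """n observations with mean 0 and lag-1 correlation rho (rho = 0: independent)."""
    # Start from the stationary distribution so every observation has the same variance.
    x = rng.gauss(0.0, 1.0 / math.sqrt(1.0 - rho ** 2))
    series = [x]
    for _ in range(n - 1):
        x = rho * x + rng.gauss(0.0, 1.0)
        series.append(x)
    return series


def false_positive_rate(rho, n=20, reps=4000, seed=1):
    """Share of simulated series (true mean 0) where the t-test rejects at 5%."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        if abs(one_sample_t(ar1_series(rng, n, rho))) > T_CRIT_DF19:
            hits += 1
    return hits / reps


independent_rate = false_positive_rate(rho=0.0)
dependent_rate = false_positive_rate(rho=0.5)

print(f"independent observations: {independent_rate:.3f}")
print(f"correlated observations:  {dependent_rate:.3f}")
```

With independent observations the rate should sit near the nominal
0.05. With positively correlated observations the same test rejects far
more often, because the usual standard error understates how much the
sample mean actually varies.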
How do I know whether or not model assumptions are satisfied?
Unfortunately, there are no one-size-fits-all methods, but here are
some rough guidelines that can help sometimes:
1. When selecting samples or dividing into treatment groups, be very
careful in randomizing according to the requirements of the method of
analysis to be used.
2. Sometimes (not too often) model assumptions can be justified
plausibly by well-established [5] facts, mathematical theorems, or
theory that is well-supported by sound empirical evidence.
3. Sometimes a rough idea of whether or not model assumptions might fit
can be obtained by either plotting the data or plotting residuals
obtained from a tentative use of the model.
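The third guideline can be sketched in a few lines. The example below
is a minimal illustration with made-up data, not a general diagnostic:
the data follow a curved relationship (y = x^2 plus noise, an
assumption chosen for the example), a straight line is fitted as the
tentative model, and the residuals are shown as a crude text plot so
that no plotting library is needed.

```python
import random

rng = random.Random(3)

# Hypothetical data: the true relationship is curved (y = x^2 plus noise),
# but the tentative model is a straight line.
xs = [float(i) for i in range(11)]
ys = [x ** 2 + rng.gauss(0.0, 1.0) for x in xs]

# Ordinary least-squares fit of y = intercept + slope * x.
x_bar = sum(xs) / len(xs)
y_bar = sum(ys) / len(ys)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
sxx = sum((x - x_bar) ** 2 for x in xs)
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Residual = observed value minus value predicted by the tentative model.
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# Crude text "plot": one row per x, bar length roughly |residual|.
for x, r in zip(xs, residuals):
    sign = "+" if r >= 0 else "-"
    print(f"x={x:4.1f}  residual={r:+7.2f}  {sign * max(1, round(abs(r)))}")
```

If the straight-line model fit, the signs would be scattered with no
pattern. Here the residuals are positive at both ends and negative in
the middle, which is the kind of systematic pattern that suggests the
linearity assumption does not hold for these data.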
Specific Situations Where Mistakes Involving Model Assumptions Are Often Made
For More Discussion of Inappropriate Methods of Analysis
- Freedman, David A. (2010), Statistical Models and Causal Inference: A
Dialogue with the Social Sciences, ed. by David Collier, Jasjeet S.
Sekhon, and Philip B. Stark, Cambridge University Press.
I heartily recommend this. (Many of the articles in this book are also
available in preprint form at http://www.stat.berkeley.edu/~census/)
- Harris, A. H. S., R. Reeder, and J. K. Hyun (2009), Common
statistical and research design problems in manuscripts submitted to
high-impact psychiatry journals: What editors and reviewers want
authors to know, Journal of Psychiatric Research, vol. 43, no. 15,
pp. 1231-1234.
1. Bayesian statistical techniques also involve assumptions; this web
site focuses mostly on frequentist techniques.
2. The Rice Virtual Lab in Statistics' Robustness Simulation can be
used to demonstrate the effect of some violations of model assumptions
on the two-sample t-test.
3. However, there is some robustness to some types of departures from
independence. One is that, for large enough populations, sampling
without replacement is good enough, even though "independent"
technically means sampling with replacement; see "More Precise
Definition of Simple Random Sample."
4. For more discussion of the independence assumption and possible
effects of violations of it, see the Freedman (2010) reference above,
especially chapters 1 - 3 and 19.
5. Here, "well established" means well established by sound empirical
evidence and/or sound mathematical reasoning. This is not the same as
"well-accepted," since sometimes things may be well-accepted without
sound evidence or reasoning.
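The remark in note 3 about sampling without replacement can be checked
numerically. The sketch below is an illustration under assumptions
chosen for the example: the population is simply the integers 0 to
N - 1, the sample size is 20, and the function names are hypothetical.
It compares how much the sample mean varies under the two sampling
schemes. Standard sampling theory gives the ratio of the two variances
as (N - n)/(N - 1), which is close to 1 when the population size N is
much larger than the sample size n.

```python
import random


def variance(values):
    """Unbiased sample variance."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)


def variance_ratio(pop_size, n=20, reps=4000, seed=2):
    """Var(sample mean without replacement) / Var(sample mean with replacement)."""
    rng = random.Random(seed)
    population = range(pop_size)  # hypothetical population: 0, 1, ..., pop_size - 1
    means_without = [sum(rng.sample(population, n)) / n for _ in range(reps)]
    means_with = [sum(rng.choices(population, k=n)) / n for _ in range(reps)]
    return variance(means_without) / variance(means_with)


large_pop_ratio = variance_ratio(pop_size=10_000)  # theory: 9980/9999, about 1
small_pop_ratio = variance_ratio(pop_size=25)      # theory: 5/24, about 0.21

print(f"population of 10000: variance ratio {large_pop_ratio:.2f}")
print(f"population of 25:    variance ratio {small_pop_ratio:.2f}")
```

A ratio near 1 means that treating the draws as independent costs very
little. A ratio well below 1, as in the small population, means the two
sampling schemes behave quite differently and the distinction matters.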
Last updated August 28, 2012