COMMON MISTEAKS MISTAKES IN USING STATISTICS: Spotting and Avoiding Them
Resources
Here are some of the resources that have been used in creating this
website, plus others that are worth reading or consulting. Also see
references in the footnotes of individual pages on this site.
Agresti, Alan (2010). Analysis of Ordinal Categorical Data, Wiley.
American Statistical Association, Ethical Guidelines for
Statistical Practice,
http://www.amstat.org/committees/ethics/index.html.
Some of the items
mentioned in this website (e.g. cautions regarding multiple inference)
are considered matters of ethical practice.
Biemer, P. and L. Lyberg (2003), Introduction to Survey Quality, Wiley.
An introduction to sources of errors in surveys.
Berk, Richard (2004). Regression Analysis: A Constructive Critique, Sage.
Preview available on Google Books; you can
read the preface (by Jan De Leeuw) online at http://escholarship.org/uc/item/8db1942z.
The title aptly describes the spirit of the book.
Bethlehem, Jelke (2009). Applied Survey Methods: A Statistical Perspective, Wiley.
Includes an overview of the survey process; questionnaire design; sampling
designs; sources of error; nonresponse; online surveys; guidelines on
use of graphs. Unfortunately, the coverage of confidence intervals is
weak, and there is no discussion of multiple inference.
F. Bretz, T. Hothorn, P. Westfall (2010). Multiple Comparisons Using R, CRC
Press.
A concise yet quite comprehensive
account of multiple testing, covering a variety of methodologies, with
a unifying theme of maximum statistics. Includes descriptions of
software implementations available in the R package.
K. P. Burnham and D. R. Anderson (2002), Model Selection and
Multimodel Inference: A Practical Information-Theoretic Approach,
2nd ed., Springer.
A thorough discussion of Akaike's Information
Criterion and related methods, plus methods of taking model-selection
uncertainty into account when estimating parameters. Definitely
recommended, especially if you are working with observational data.
Chance News, http://chance.dartmouth.edu/chancewiki/index.php/Main_Page
Quoting from the home page:
"Chance News reviews current issues in the news that use probability or
statistical concepts. It uses Wikipedia software to allow readers to
add articles or change existing articles using the edit option." Most
entries include discussion questions for use in class.
Cook, R. Dennis and Sanford Weisberg (1999). Applied Regression
Including Computing and Graphics, Wiley.
Stronger on model checking,
diagnostics, and cautions about common misapplications than most
regression textbooks. Also serves as a user's manual for the regression
software arc, which has
user-friendly features for transforming toward multivariate normality
and for various regression diagnostic techniques.
(Unfortunately,
the Unix and Macintosh versions of arc are no longer well supported.) I
used it for a number of years as a textbook. You can find my lecture
notes at http://www.ma.utexas.edu/users/mks/384Gfa08/384G08home.html.
(However, I think that I might do some things differently were I to
teach the course again -- e.g., place more emphasis on cautions
regarding multiple inference and importance of model validation.)
Cressie, Noel and Christopher K. Wikle (2011), Statistics for Spatio-Temporal Data, Wiley.
Probably the premier reference for
analyzing spatial and temporal data. Chapter 1 is an
easy-to-read discussion of the importance of the subject and the
problems it may involve; Chapter 2 requires considerable
background in mathematics and mathematical statistics.
Dean, Angela and Daniel Voss (1999). Design and Analysis of Experiments,
Springer.
Doshi, P., M. Jones and T. Jefferson (2012). Rethinking
credible evidence synthesis, British
Medical Journal 344, Article Number: d7898 DOI: 10.1136/bmj.d7898
Points out how published reports of
clinical trials may omit important information that is in the clinical
trial reports.
S. Dudoit and M. J. van der Laan
(2008), Multiple Testing Procedures
with Applications to Genomics, Springer
Edgington, Eugene S. (1995), Randomization Tests, Marcel Dekker
B. Efron (2010), Large-Scale
Inference: Empirical Bayes Methods for Estimation, Testing, and
Prediction, Cambridge.
Probably the most up-to-date
reference on methods for dealing with multiple inference, especially
for large data sets. Also includes a good summary of older developments.
Freedman, David A. (2005). Statistical Models: Theory and Practice.
Cambridge University Press
A text for a second course in
statistics, focusing mainly on applications to the social and health
sciences and on regression and related topics; full of cautions.
Freedman, David A. (2010), ed. by David Collier, Jasjeet S. Sekhon, and
Philip B. Stark, Statistical Models
and Causal Inference: A Dialogue with the Social Sciences,
Cambridge
Freedman passed away in 2008, but several of his writings were collected
posthumously in this book. Definitely worth reading.
Also, Philip Stark maintains a website at http://www.stat.berkeley.edu/~census/,
where many of Freedman's preprints and other notes may be
downloaded. "On Types of
Scientific Enquiry" and "Oasis or Mirage?" are particularly recommended.
Gigerenzer, Gerd et al (2007). "Helping doctors and patients make
sense of health statistics," Psychological
Science in the Public Interest,
vol. 8, no. 2, pp. 53 - 96.
Download from http://www.psychologicalscience.org/journals/index.cfm?journal=pspi&content=pspi/8_2
Discusses a number of
confusions that affect medical care. Also discusses ways to
explain the topics that can help improve understanding. A
somewhat shortened variation has appeared as "Knowing your chances:
What health stats really mean," Scientific
American Mind, April/May/June
2009, pp. 44 - 51.
Good, Phillip I. and James W. Hardin (2006), Common Errors in
Statistics (and How to Avoid Them),
Wiley (Third edition 2009)
Recommended reading. A notable
quote (p. ix): "...access to statistical software will no more make one
a statistician, than access to a chainsaw will make one
a lumberjack. Allowing these tools to do our thinking for us is a sure
recipe for disaster -- just ask any emergency room physician." Includes
discussion of formulating hypotheses, experimental design, choice
of estimator and test statistic, model assumptions, strengths and
limitations of various statistical procedures, reporting results,
interpreting results, graphics, model selection, validation. Extensive
references for further reading. One weakness is lack of discussion of
assumptions of statistical procedures.
Good, P. (2005) Introduction to
Statistics Through Resampling Methods and Microsoft Office Excel.
Wiley
Harris, A. H. S., R. Reeder and J. K. Hyun (2009), Common
statistical and research design problems in manuscripts submitted to
high-impact psychiatry journals: What editors and reviewers want
authors to know, Journal of Psychiatric Research, vol. 43, no. 15,
pp. 1231 - 1234
Discussion of common serious
statistical and design problems in manuscripts submitted to major
psychiatry journals, based on a survey of editors and reviewers of
those journals. Intended to help researchers and authors improve the
quality of research and manuscripts submitted to journals, and to
forestall the waste of time and resources that occurs when papers are
rejected because of poor quality and then resubmitted to other journals
in the hope that another journal will accept them.
Hochberg, Y. and Tamhane, A. (1987) Multiple Comparison
Procedures, Wiley
Ioannidis JPA (2005) Why Most Published Research Findings Are False.
PLoS Med 2(8): e124. doi:10.1371/journal.pmed.0020124, available at http://www.plosmedicine.org/article/info:doi/10.1371/journal.pmed.0020124
Popular press accounts include:
David H. Freedman, Lies, Damned Lies, and Medical Science, The Atlantic, November 2010, http://www.theatlantic.com/magazine/archive/2010/11/lies-damned-lies-and-medical-science/8269/1/
Jonah Lehrer, The Truth Wears Off: Is there something wrong with
the scientific method?, The New Yorker, December 13, 2010, http://www.newyorker.com/reporting/2010/12/13/101213fa_fact_lehrer
Ioannidis, John P. A., An Epidemic of False Claims, Scientific American, June, 2011,
http://www.scientificamerican.com/article.cfm?id=an-epidemic-of-false-claims
Koenker, Roger (2005). Quantile Regression, Econometric
Society Monographs, Cambridge.
Liu, Wei (2011) Simultaneous Inference in Regression,
CRC Press. Liu also has Matlab® programs for calculating the
confidence bands available from his website. (Click
on the link to the book.)
Marshall, E. (2011). Unseen world of
clinical trials emerges from US database, Science 333:145.
An interview with the director of
ClinicalTrials.gov, pointing out some potential problems with the
design and analysis of clinical trials.
Meng, Xiao-Li (2009) "Desired and Feared -- What Do We Do Now and
Over the Next 50 Years?", The American Statistician, vol.
63, no. 3, pp. 202 - 210. Download
pdf from Andrew Gelman's website.
A discussion, by the chair of the
Harvard Statistics Department, of some of the challenges and
opportunities facing the profession. Sections 6 - 8 (pp. 205 - 208) are
particularly relevant to the topic of this web site.
Moore, David S., together with various co-authors, has written
various introductory statistics texts (e.g., The Basic Practice of Statistics,
and Introduction to the
Practice of Statistics, with George P. McCabe), published by
Freeman, that are
among the best for pointing out many of the common
errors in using statistics.
Moore, Thomas (2010), Using baboon “mothering”
behavior to teach permutation tests, Cause Webinar, http://www.causeweb.org/webinar/teaching/2010-09/.
Video and PowerPoint slides. A gentle introduction to permutation
tests.
Rice Virtual Lab in Statistics, Simulations/Demonstrations, http://onlinestatbook.com/stat_sim/index.html.
Several simulations that can help
illustrate various concepts and
potential pitfalls in using statistics.
Robbins, N. (2004), Creating
More Effective Graphs, Wiley
Many examples of poor graphs and better
alternatives.
Ryan, Thomas P. (2009), Modern Regression Methods, Wiley.
A good resource for those teaching, using or
interpreting regression. Points out many common misunderstandings,
misapplications, and misinterpretations. An extensive chapter on
diagnostics and remedial measures. Discussion of many controversies.
Extensive references are included.
Seber, George A. F. and Mohammad M. Salehi (2013), Adaptive Sampling Designs: Inference for
Sparse and Clustered Populations, Springer
S. Senn and S. Julious (2009), Measurements in clinical trials: A
neglected issue for statisticians? Statistics in Medicine 28: 3189-3209
Discussion of some statistical
issues involved in choosing predictor and outcome variables.
Strasak, A. M. et al (2007a). Statistical errors in medical
research - a review of common pitfalls, Swiss Medical Weekly
2007; 137, 44 - 49, available at http://www.smw.ch/for-readers/archive/pdf-1999-2010/2007/2007-01-27/
Discussion (as well as
presentation in table form) of 47 common statistical pitfalls in
medical research. Although aimed at medical researchers, the article
can serve as a guideline for researchers in many other fields. See also
the companion article by Young.
Strasak, A. M. et al (2007b), The Use of Statistics in Medical
Research, The American Statistician,
February 1, 2007, 61(1): 47-55
A survey of articles in the 2004
volumes of The New England
Journal of Medicine and Nature
Medicine examining use of statistics and errors in using or
reporting statistical techniques.
Utts, Jessica (2005) Seeing Through Statistics,
Brooks/Cole (Thomson)
An introduction to statistics
aimed at the consumer rather than the producer. Each chapter starts
with several "thought questions." Includes sections on reading a news
report of a study and on wording of questions; several "Cautions",
"Warnings" and "Difficulties and Disasters" sections; and
lots of case studies.
Van Belle, Gerald (2008). Statistical Rules of Thumb, 2nd ed.,
Wiley.
Lots of suggestions that can help
forestall mistakes, but not all-inclusive. Includes a section
on Evidence Based Medicine.
Wainer, Howard (1997) Visual Revelations, Copernicus
(Springer-Verlag).
Chapter 1 ("How to Display Data
Badly," pp. 11 - 46) gives many examples of poor graphical displays, as
well as better alternatives. Chapters 8 - 10 (pp. 87 - 102) point out
shortcomings of three commonly used graphical formats (pie charts,
double Y-axis graphs, and tabular presentations), with suggestions on
improvements, alternatives, and/or when to use and when not to use
these formats. The whole book has lots of interesting examples.
Wainer, Howard (2009) Picturing the Uncertain World,
Princeton University Press
Full of case studies focusing on
how the way data are presented can influence what we see or don't see.
P. H. Westfall and S. S. Young (1993), Resampling-based Multiple Testing:
Examples and Methods for p-Value Adjustment, Wiley
Uses an "adjusted p-value"
approach to multiple testing, based on resampling methods.
Woloshin, Steven, Schwartz, Lisa, and Welch, H. Gilbert
(2008). Know Your Chances, University of
California Press.
A primer on health risks at a very
basic level of quantitative literacy.
Young, James (2007), Statistical errors in medical research - a
chronic disease? Swiss Medical Weekly 2007; 137, 41 - 43,
available at http://www.smw.ch/for-readers/archive/pdf-1999-2010/2007/2007-01-27/
A commentary and elaboration on
Strasak et al (2007a) above. (Young is Statistical Advisor for the
Swiss Medical Weekly.)
Last updated June 4, 2013