Common Mistakes in Using Statistics - Spotting Them and Avoiding
Them
2016 Summer Statistics Institute Course, University of Texas at
Austin
May 23 - 26, 2016
Teasers
I invite you to look at some of these before class
to pique your interest. (If you have some time, you might also try
looking at some of the External Links further down on this page.)
Vox
Science and Health article by Julia Belluz (including an interview with
Ben Goldacre) about some of the problems contributing to the
unfortunate approval of Paxil for children.
Online articles on the American
Statistical Association's March, 2016 Statement on Statistical
Significance and p-Values:
Nature
Science Daily
FiveThirtyEight
Science News
Retraction Watch
(The ASA press
release is here;
the actual statement here.)
Response to a critic of the
website PubPeer
And if you're up for a 20-minute
irreverently humorous video, see Last Week
Tonight with John Oliver: Scientific Studies
Course Notes
Please note:
- Files are in pdf format.
- Most students will want to download the slides and either print them to
take notes on in class, or follow along in class on their
laptop.
- The appendices contain
additional material and references
for the corresponding day's slides.
They are available for your
reference later according to your own needs.
- Copies of course materials will not
be handed out in class.
- Computers for individual use will not be available in the classroom
for this particular course.
- If you need a different print size or would prefer a .doc file to
take notes on, please email me so
I can email you .doc files to adjust
to your needs. (Please note: Past experience is that .doc files often
acquire
changes in formatting or symbols when posted on the web; this might
also happen with email attachments on some platforms.)
Additional Appendices
Suggestions for
Readers of
Research
Suggestions for
Researchers
Suggestions
for Teachers
Suggestions
for Reviewers,
Editors, and IRB Members
External Links
Please note: Some of these
links use Java applets, which your computer might block (depending on
the version of Java you have and your security settings).
Zachary
David's Graph of Probability of First Marriage Disruption by Duration
of Marriage
Empirical
Probability Example
Matthew Hankins' slideshare How
Does Health Psychology Measure Up?
Rice
Virtual Lab in Statistics Sampling Distribution Simulation
ArtofStat shinyapp
of sampling distribution of the mean
How
Not to be Misled by the Jobs Report
Includes two simulations showing how sampling
variability can tempt people to see patterns that aren't there.
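The point about sampling variability can be tried out with a minimal sketch like the one below. It is not the code behind the linked simulations; the constant "true" growth figure and the sampling standard deviation are hypothetical values chosen only for illustration. The true monthly job growth never changes here, yet the reported estimates often show runs of consecutive declines that look like a trend.

```python
import random

# Hypothetical values for illustration only (not taken from the linked article).
TRUE_GROWTH = 150_000   # constant true monthly job growth
SAMPLING_SD = 55_000    # standard deviation of the monthly survey estimate


def reported_year(rng, months=12):
    """Simulate one year of reported estimates around a constant true value."""
    return [round(rng.gauss(TRUE_GROWTH, SAMPLING_SD)) for _ in range(months)]


def longest_run_of_declines(series):
    """Length of the longest streak of month-over-month decreases."""
    longest = current = 0
    for prev, cur in zip(series, series[1:]):
        current = current + 1 if cur < prev else 0
        longest = max(longest, current)
    return longest


rng = random.Random(1)  # fixed seed so the run is repeatable
for year in range(1, 4):
    estimates = reported_year(rng)
    print(f"year {year}: longest run of declines = "
          f"{longest_run_of_declines(estimates)} months")
    print("  estimates (thousands):", [e // 1000 for e in estimates])
```

Every apparent "slowdown" in the output is produced by noise alone, since the underlying growth is fixed.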
Rossman-Chance
Confidence Interval Simulation
Try settings: Means, Normal, t, with defaults for
the rest of the settings.
Click "sample" several times, watching how the CI
changes.
Set "intervals" to 20 to see 20 CIs at once. Notice
the Running Total.
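For readers who cannot run the applet, the following is a rough offline sketch of the same exercise, not the applet's own code. The population mean, standard deviation, and sample size are hypothetical choices (the applet's defaults are not reproduced here), and the t critical value is hard-coded for 9 degrees of freedom, so it is valid only for samples of size 10.

```python
import random
from statistics import mean, stdev

# Hypothetical settings for illustration; not the applet's defaults.
MU, SIGMA = 50.0, 10.0   # true population mean and standard deviation
N = 10                   # sample size
T_CRIT = 2.262           # 0.975 quantile of the t distribution, 9 df (N = 10 only)


def t_interval(sample, t_crit=T_CRIT):
    """95% t confidence interval for the mean of one sample of size 10."""
    m = mean(sample)
    half_width = t_crit * stdev(sample) / len(sample) ** 0.5
    return m - half_width, m + half_width


def simulate(rng, n_intervals=20):
    """Draw n_intervals normal samples and return one CI per sample."""
    return [
        t_interval([rng.gauss(MU, SIGMA) for _ in range(N)])
        for _ in range(n_intervals)
    ]


rng = random.Random(0)  # fixed seed so the run is repeatable
covered_total = drawn_total = 0
for batch in range(1, 6):
    intervals = simulate(rng)
    covered = sum(lo <= MU <= hi for lo, hi in intervals)
    covered_total += covered
    drawn_total += len(intervals)
    print(f"batch {batch}: {covered}/20 intervals contain the true mean; "
          f"running total {covered_total}/{drawn_total}")
```

Each batch plays the role of 20 intervals drawn at once. Individual intervals move around from sample to sample, while the running proportion that contains the true mean should settle near 95%.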
ArtofStat
Confidence Interval Simulation
Wise
Confidence Interval Simulation
Matthew
Hankins' Still Not Significant List
Rice
Virtual Lab in Statistics Robustness Simulation
ArtofStat Error and Power
Simulation
Jerry Dallal's
Simulation of Multiple Testing
This simulates the results of 100
independent hypothesis tests, each at 0.05 significance level. Click
the "test/clear" button to see the results of one set of 100
tests (that is, for one sample of data). Click the button two more
times (first to clear and then to do another simulation) to see the
results of another set of 100 tests (i.e., for another sample of data).
Notice as you continue to do this that i) which tests give type I
errors (i.e., are statistically significant at the 0.05 level) varies
from sample to sample, and ii) which samples give type I errors for a
given test varies from test to test. (To see the latter point, it may
help to focus just on the first column.)
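The same idea can be sketched offline. The code below is not Dallal's simulation. It assumes each of the 100 tests is a two-sided z-test (known standard deviation of 1) on its own freshly generated sample of 30 standard normal values, so every null hypothesis is true and every rejection is a type I error. The sample size and the choice of test are assumptions made for illustration.

```python
import random
from statistics import NormalDist

ALPHA = 0.05        # significance level for each test
N_TESTS = 100       # tests per set, as in the description above
SAMPLE_SIZE = 30    # hypothetical sample size for each test


def z_test_p_value(sample, mu0=0.0, sigma=1.0):
    """Two-sided p-value for H0: mean == mu0, with known sigma."""
    n = len(sample)
    z = (sum(sample) / n - mu0) / (sigma / n ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def run_one_set(rng, n_tests=N_TESTS, n=SAMPLE_SIZE):
    """Run n_tests tests on true nulls; True marks a type I error."""
    results = []
    for _ in range(n_tests):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        results.append(z_test_p_value(sample) < ALPHA)
    return results


rng = random.Random(2016)  # fixed seed so the run is repeatable
sets = [run_one_set(rng) for _ in range(5)]
for i, rejected in enumerate(sets, start=1):
    which = [j for j, r in enumerate(rejected, start=1) if r]
    print(f"set {i}: {len(which)} type I errors, at tests {which}")
```

On average about 5 of the 100 tests in each set should come out "significant", but which ones do changes from set to set, which is the behavior the "test/clear" button displays.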
Jelly Beans (A Folly of Multiple
Testing and Data Snooping)
More Jerry Dallal Simulations: More Jelly Beans;
Cellphones and Cancer;
Coffee and ...
Crowdsourced
Research
Spurious Correlations
Distrust
Your Data: Jacob Harris on Six Ways to Make Mistakes with Data
A case study illustrating six common mistakes
(including "sloppy proxies") in
analyzing data.
NIH
funds training in behavioral intervention to slow progression of cancer
by improving the immune system
Both the blog post by James Coyne and many of the comments
provide examples of several questionable practices.
Negative
Consequences of Dichotomizing Continuous Predictor Variables
(applet demo)
Websites for Future Reference
Cross Validated "a
question and answer site for people interested in statistics, machine
learning, data analysis, data mining, and data visualization. It's 100%
free." You can search to see if there is already discussion of a
question
you have about statistics, ask a question, or contribute answers or
suggestions to other people's questions. (A community of Stack
Exchange)
PubPeer: The online journal club You
can search for comments on a publication, provide feedback, or start a
conversation.
Retraction Watch "Tracking
retractions as a window into the scientific process."
Website: Common Mistakes in Using Statistics
Content similar to that of the course notes,
but with embedded links and more information. (However, it needs some
updates!)
Blog: Musings on
Using and Misusing Statistics
A companion to the preceding
website Common Mistakes in Using Statistics. It contains updates to
that site and occasional comments on other things related to statistics
that come to my attention. It may be of interest to the following
categories of people:
Teachers of statistics (especially those, such as
myself, who come from backgrounds other than statistics)
Undergraduates and early graduate students in
statistics
Users of statistics (especially people who read
research using statistics)
See especially the series of eight "Beyond the Buzz" posts (June 24 -
August 26, 2014) discussing two of the articles in the May, 2014
special issue of the journal Social
Psychology devoted to registered reports. These posts show how
registered replications can exemplify poor practices and thus do not
alone solve the problem of possibly misleading findings.
Last updated May 23, 2016