Common Mistakes in Using Statistics - Spotting Them and Avoiding Them

2016 Summer Statistics Institute Course, University of Texas at Austin

May 23 - 26, 2016

Teasers

    I invite you to look at some of these before class to pique your interest. (If you have some time, you might also try looking at some of the External Links further down on this page.)

        Vox Science and Health article by Julia Belluz (including an interview with Ben Goldacre) about some of the problems contributing to the unfortunate approval of Paxil for children.

        Online articles on the American Statistical Association's March, 2016 Statement on Statistical Significance and p-Values:

            Nature

            Science Daily

            FiveThirtyEight

            Science News

            Retraction Watch

            (The ASA press release is here; the actual statement here.)

        Response to a critic of the website PubPeer

        And if you're up for a 20-minute irreverently humorous video, see Last Week Tonight with John Oliver: Scientific Studies

Course Notes

Please Note:

Files are in pdf format.
Most students will want to download the slides and either print them to take notes on in class, or follow along in class on their laptop.
The appendices contain additional material and references for the corresponding day's slides. They are available for your reference later according to your own needs.
Copies of course materials will not be handed out in class.
Computers for individual use will not be available in the classroom for this particular course.
If you need a different print size or would prefer a .doc file to take notes on, please email me so I can email you .doc files to adjust to your needs. (Please note: Past experience is that .doc files often acquire changes in formatting or symbols when posted on the web; this might also happen with email attachments on some platforms.)

    Day               Slides (2 per sheet)                  Appendices

    1 (Mon May 23)    Slides Day 1                          Appendix Day 1

    2 (Tues May 24)   Slides Day 2 part 1 (pp. 1 - 32)      (No appendix for Day 2; references in slides)
                      Slides Day 2 part 2 (p. 33)
                      Slides Day 2 part 3 (pp. 34 - 70)
                      (Be sure to download all three files for Day 2)

    3 (Wed May 25)    Slides Day 3                          Appendix Day 3

    4 (Th May 26)     Slides Day 4                          Appendix Day 4

Additional Appendices

Suggestions for Readers of Research

Suggestions for Researchers

Suggestions for Teachers

Suggestions for Reviewers, Editors, and IRB Members

External Links

Please note: Some of these links use Java applets, which your computer might block (depending on the version of Java you have and your security settings).

Zachary David's Graph of Probability of First Marriage Disruption by Duration of Marriage

Empirical Probability Example

Matthew Hankins' slideshare How Does Health Psychology Measure Up?

Rice Virtual Lab in Statistics Sampling Distribution Simulation

ArtofStat shinyapp of sampling distribution of the mean

How Not to be Misled by the Jobs Report
    Includes two simulations showing how sampling variability can tempt people to see patterns that aren't there.

Rossman-Chance Confidence Interval Simulation
    Try settings: Means, Normal, t, with defaults for the rest of the settings.
    Click "sample" several times, watching how the CI changes.
    Set "intervals" to 20 to see 20 CIs at once. Notice the Running Total.
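
    For readers who cannot run the applet, the same idea can be sketched in a few lines of Python. This is only an illustrative sketch, not the applet's own code: the sample size of 10, the standard normal population, and the function names are all assumptions made for the example, and the t critical value is hard-coded for that sample size.

```python
import random
import statistics

SAMPLE_SIZE = 10          # assumed sample size for this sketch
T_CRIT_95_DF9 = 2.262     # two-sided 95% critical value of Student's t, 9 degrees of freedom


def t_interval(sample, t_crit=T_CRIT_95_DF9):
    """Return (lower, upper) limits of a t confidence interval for the mean."""
    mean = statistics.fmean(sample)
    half_width = t_crit * statistics.stdev(sample) / len(sample) ** 0.5
    return mean - half_width, mean + half_width


def count_covering(n_intervals=20, mu=0.0, sigma=1.0, rng=None):
    """Draw n_intervals normal samples; count how many intervals contain the true mean."""
    if rng is None:
        rng = random.Random()
    covered = 0
    for _ in range(n_intervals):
        sample = [rng.gauss(mu, sigma) for _ in range(SAMPLE_SIZE)]
        lower, upper = t_interval(sample)
        if lower <= mu <= upper:
            covered += 1
    return covered


if __name__ == "__main__":
    rng = random.Random()
    for batch in range(1, 6):
        print(f"Batch {batch}: {count_covering(rng=rng)} of 20 intervals contain the true mean")
```

    Each run gives different intervals, and in any one batch of 20 the number that capture the true mean varies; only over many batches does the proportion settle near 95%.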

ArtofStat Confidence Interval Simulation

Wise Confidence Interval Simulation

Matthew Hankins' Still Not Significant List

Rice Virtual Lab in Statistics Robustness Simulation

ArtofStat Error and Power Simulation

Jerry Dallal's Simulation of Multiple Testing

This simulates the results of 100 independent hypothesis tests, each at 0.05 significance level. Click the "test/clear" button to see the results of one set of 100 tests (that is, for one sample of data). Click the button two more times (first to clear and then to do another simulation) to see the results of another set of 100 tests (i.e., for another sample of data). Notice as you continue to do this that i) which tests give type I errors (i.e., are statistically significant at the 0.05 level) varies from sample to sample, and ii) which samples give type I errors for a given test varies from test to test. (To see the latter point, it may help to focus just on the first column.)
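
A rough Python sketch of the same kind of experiment is given here for readers who want to try it without the applet. It is not Jerry Dallal's code. It assumes that every null hypothesis is true and that each test has a continuous test statistic, so that each p-value is uniformly distributed between 0 and 1; the function name and defaults are made up for the illustration.

```python
import random


def simulate_tests(n_tests=100, alpha=0.05, rng=None):
    """Return the indices of the tests that come out 'significant' when every null is true."""
    if rng is None:
        rng = random.Random()
    # Assumption: under a true null hypothesis the p-value is uniform on [0, 1).
    p_values = [rng.random() for _ in range(n_tests)]
    return [i for i, p in enumerate(p_values) if p < alpha]


if __name__ == "__main__":
    rng = random.Random()
    for sample in range(1, 4):
        hits = simulate_tests(rng=rng)
        print(f"Sample {sample}: {len(hits)} type I errors, at tests {hits}")
```

On average about 5 of the 100 tests are "significant" in each run, but both the count and the particular tests involved change from run to run.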

Jelly Beans (A Folly of Multiple Testing and Data Snooping)

More Jerry Dallal Simulations: More Jelly Beans, Cellphones and Cancer, Coffee and ...

Crowdsourced Research

Spurious Correlations

Distrust Your Data: Jacob Harris on Six Ways to Make Mistakes with Data
A case study illustrating six common mistakes (including "sloppy proxies") in analyzing data.

NIH funds training in behavioral intervention to slow progression of cancer by improving the immune system
Both the blog post by James Coyne and many of the comments provide examples of several questionable practices.

Negative Consequences of Dichotomizing Continuous Predictor Variables (applet demo)

Websites for Future Reference

Cross Validated: "a question and answer site for people interested in statistics, machine learning, data analysis, data mining, and data visualization. It's 100% free." (A community of Stack Exchange.) You can search to see if there is already discussion of a question you have about statistics, ask a question, or contribute answers or suggestions to other people's questions.

PubPeer: The online journal club. You can search for comments on a publication, provide feedback, or start a conversation.

Retraction Watch "Tracking retractions as a window into the scientific process."

Website on Common Mistakes in Using Statistics

Content similar to the content of the course notes, but includes embedded links and more information. (However, needs some updates!)

Blog: Musings on Using and Misusing Statistics

A companion to the preceding website Common Mistakes in Using Statistics. It contains updates to that site and occasional comments on other things related to statistics that come to my attention. It may be of interest to the following categories of people:

    Teachers of statistics (especially those, such as myself, who come from backgrounds other than statistics)
    Undergraduates and early graduate students in statistics
    Users of statistics (especially people who read research using statistics)

See especially the series of eight "Beyond the Buzz" posts (June 24 - August 26, 2014) discussing two of the articles in the May, 2014 special issue of the journal Social Psychology devoted to registered reports. These posts show how registered replications can exemplify poor practices and thus do not alone solve the problem of possibly misleading findings.

Last updated May 23, 2016