Common Mistakes in Using Statistics - Spotting Them and Avoiding Them

2016 Summer Statistics Institute Course, University of Texas at Austin

May 23 - 26, 2016 

Teasers

    I invite you to look at some of these before class to pique your interest. (If you have some time, you might also try looking at some of the External Links further down on this page.)

        Vox Science and Health article by Julia Belluz (including an interview with Ben Goldacre) about some of the problems contributing to the unfortunate approval of Paxil for children.  

        Online articles on the American Statistical Association's March, 2016 Statement on Statistical Significance and p-Values:

            Nature

            Science Daily

            FiveThirtyEight         
           
            Science News

            Retraction Watch
       
            (The ASA press release is here; the actual statement here.)

        Response to a critic of the website PubPeer
   
        And if you're up for a 20-minute irreverently humorous video, see Last Week Tonight with John Oliver: Scientific Studies

Course Notes 

Please Note:

    Day                 Slides (2 per sheet)                   Appendices

    1 (Mon May 23)      Slides Day 1                           Appendix Day 1

    2 (Tues May 24)     Slides Day 2 part 1 (pp. 1 - 32)       (No appendix for Day 2; references in slides)
                        Slides Day 2 part 2 (p. 33)
                        Slides Day 2 part 3 (pp. 34 - 70)
                        (Be sure to download all three files for Day 2)

    3 (Wed May 25)      Slides Day 3                           Appendix Day 3

    4 (Th May 26)       Slides Day 4                           Appendix Day 4

Additional Appendices

    Suggestions for Readers of Research

    Suggestions for Researchers

    Suggestions for Teachers

    Suggestions for Reviewers, Editors, and IRB Members

External Links

Please note: Some of these links use Java applets, which your computer might block (depending on the version of Java you have and your security settings).

Zachary David's Graph of Probability of First Marriage Disruption by Duration of Marriage
 
Empirical Probability Example

Matthew Hankins' slideshare How Does Health Psychology Measure Up?

Rice Virtual Lab in Statistics Sampling Distribution Simulation

ArtofStat shinyapp of sampling distribution of the mean

How Not to be Misled by the Jobs Report
    Includes two simulations showing how sampling variability can tempt people to see patterns that aren't there.

Rossman-Chance Confidence Interval Simulation
    Try settings: Means, Normal, t, with defaults for the rest of the settings.
    Click "sample" several times, watching how the CI changes. 
    Set "intervals" to 20 to see 20 CIs at once. Notice the Running Total.
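
    For readers whose browser blocks the applet, the same idea can be sketched in a few lines of Python. This is only an illustrative sketch, not the applet's own code: the sample size of 25, the standard normal population, the function names, and the hard-coded t critical value (2.0639, the 97.5th percentile of t with 24 degrees of freedom) are all assumptions chosen for the example.

```python
import random
import statistics

# Assumption: samples of size 25, so the t interval uses 24 degrees of freedom.
# 2.0639 is the 97.5th percentile of the t distribution with 24 df.
T_CRIT_DF24 = 2.0639


def t_interval(sample, t_crit=T_CRIT_DF24):
    """Return a 95% t confidence interval (low, high) for the population mean.

    The default t_crit is only correct for samples of size 25.
    """
    n = len(sample)
    mean = statistics.fmean(sample)
    half_width = t_crit * statistics.stdev(sample) / n ** 0.5
    return mean - half_width, mean + half_width


def coverage(rng, n_intervals, n=25, mu=0.0, sigma=1.0):
    """Fraction of simulated intervals that contain the true mean mu."""
    hits = 0
    for _ in range(n_intervals):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        low, high = t_interval(sample)
        if low <= mu <= high:
            hits += 1
    return hits / n_intervals


if __name__ == "__main__":
    rng = random.Random(2016)
    # Like setting "intervals" to 20: each batch of 20 captures the mean a
    # different number of times, while the long-run fraction settles near 0.95.
    for batch in range(1, 4):
        print(f"batch {batch}: {coverage(rng, 20):.2f} of 20 intervals cover mu")
    print(f"long run: {coverage(rng, 5000):.3f}")
```

    Any single batch of 20 intervals may miss the true mean zero, one, or several times; only the running total over many intervals approaches the nominal 95%.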
   
ArtofStat Confidence Interval Simulation

Wise Confidence Interval Simulation

Matthew Hankins' Still Not Significant List

Rice Virtual Lab in Statistics Robustness Simulation

ArtofStat Error and Power Simulation

Jerry Dallal's Simulation of Multiple Testing
This simulates the results of 100 independent hypothesis tests, each at the 0.05 significance level. Click the "test/clear" button to see the results of one set of 100 tests (that is, for one sample of data). Click the button two more times (first to clear, then to run another simulation) to see the results of another set of 100 tests (that is, for another sample of data). As you continue, notice that i) which tests give type I errors (i.e., are statistically significant at the 0.05 level) varies from sample to sample, and ii) which samples give type I errors for a given test varies from test to test. (To see the latter point, it may help to focus on just the first column.)
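
A rough Python analogue of this kind of simulation is sketched below. It is not Dallal's code. It assumes every null hypothesis is true, so that each p-value can be drawn directly from a Uniform(0, 1) distribution, and the function names are made up for the illustration.

```python
import random

ALPHA = 0.05    # significance level for each test
N_TESTS = 100   # number of independent tests in one set


def run_one_set(rng, n_tests=N_TESTS, alpha=ALPHA):
    """Return the indices of the tests that come out significant in one set.

    Assumption: every null hypothesis is true, so each p-value is
    Uniform(0, 1) and every significant result is a type I error.
    """
    p_values = [rng.random() for _ in range(n_tests)]
    return [i for i, p in enumerate(p_values) if p < alpha]


def average_false_positives(rng, n_sets=2000):
    """Average number of type I errors per set of tests, over many sets."""
    total = sum(len(run_one_set(rng)) for _ in range(n_sets))
    return total / n_sets


if __name__ == "__main__":
    rng = random.Random(2016)
    # Each pass plays the role of one click of the "test/clear" button.
    for click in range(1, 4):
        hits = run_one_set(rng)
        print(f"set {click}: {len(hits)} significant; tests {hits}")
    print("average per set:", average_false_positives(rng))
```

Which test numbers are flagged changes from set to set, even though the long-run average stays close to 100 x 0.05 = 5 false positives per set.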

Jelly Beans (A Folly of Multiple Testing and Data Snooping)

More Jerry Dallal Simulations: More Jelly Beans    Cellphones and Cancer    Coffee and ...

Crowdsourced Research

Spurious Correlations

Distrust Your Data: Jacob Harris on Six Ways to Make Mistakes with Data
    A case study illustrating six common mistakes (including "sloppy proxies") in analyzing data.

NIH funds training in behavioral intervention to slow progression of cancer by improving the immune system
    Both the blog post by James Coyne and many of the comments provide examples of several questionable practices.

Negative Consequences of Dichotomizing Continuous Predictor Variables  (applet demo)


Websites for Future Reference


Cross Validated (a community of Stack Exchange): "a question and answer site for people interested in statistics, machine learning, data analysis, data mining, and data visualization. It's 100% free". You can search to see whether a question you have about statistics has already been discussed, ask a question, or contribute answers or suggestions to other people's questions.

PubPeer: The online journal club. You can search for comments on a publication, provide feedback, or start a conversation.

Retraction Watch "Tracking retractions as a window into the scientific process."

Website on Common Mistakes in Using Statistics

    Content similar to the content of the course notes, but includes embedded links and more information. (However, needs some updates!)

Blog: Musings on Using and Misusing Statistics
A companion to the preceding website Common Mistakes in Using Statistics. It contains updates to that site and occasional comments on other things related to statistics that come to my attention. It may  be of interest to the following categories of people:

    Teachers of statistics (especially those, such as myself, who come from backgrounds other than statistics)
    Undergraduates and early graduate students in statistics
    Users of statistics (especially people who read research using statistics)

See especially the series of eight "Beyond the Buzz" posts (June 24 - August 26, 2014) discussing two of the articles in the May 2014 special issue of the journal Social Psychology devoted to registered reports. These posts show how registered replications can exemplify poor practices and thus do not by themselves solve the problem of possibly misleading findings.

Last updated May 23, 2016