Mistakes in Thinking About Causation

Confusing correlation and causation

Any statistics text worth its salt will caution the reader not to confuse correlation with causation. Yet the mistake is very common. As a refresher, here's an example I often give my classes:

Consider elementary school students' shoe sizes and scores on a standard reading exam. They are correlated, but saying that larger shoe size causes higher reading scores is as absurd as saying that high reading scores cause larger shoe size.

In this example, there is a clear lurking variable, namely, age. As the child gets older, both their shoe size and reading ability increase.
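The role of the lurking variable can be illustrated with a small simulation. This is only a sketch under invented assumptions: the age range, the coefficients, and the noise levels are made up, and the helper names `pearson` and `residuals` are hypothetical. In the simulated data, age drives both shoe size and reading score, and neither of the two affects the other.

```python
import random

def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def residuals(ys, xs):
    """Residuals from the least-squares line of ys on xs,
    i.e. what is left of ys after removing the linear effect of xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return [y - my - slope * (x - mx) for x, y in zip(xs, ys)]

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000

# Assumed (made-up) data-generating process: age is the only common cause.
age = [rng.uniform(6, 11) for _ in range(n)]
shoe = [0.8 * a + rng.gauss(0, 0.7) for a in age]    # depends on age only
reading = [10 * a + rng.gauss(0, 8) for a in age]    # depends on age only

r_raw = pearson(shoe, reading)
# Correlation after the linear effect of age is removed from both variables
r_given_age = pearson(residuals(shoe, age), residuals(reading, age))

print(f"corr(shoe, reading)               = {r_raw:.2f}")
print(f"corr(shoe, reading), age removed  = {r_given_age:.2f}")
```

With these assumptions the first correlation should be strongly positive and the second close to zero: once age is accounted for, shoe size tells us essentially nothing more about reading score.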

Elaborating on this situation:

If you agree that increasing age (for elementary school children) causes increasing foot size, and therefore increasing shoe size, then you expect a correlation between age and shoe size. Correlation is symmetric, so shoe size and age are correlated. But it would be absurd to say that shoe size causes age.

In other words, even when there is a causal relationship, the causality typically only goes one way. (Of course, it could go both ways, as in a feedback loop.)

One situation where people slip into confusing correlation and causality is in regression. For example, one might regress college GPA on SAT scores, obtaining a positive coefficient β of SAT score in the regression equation. Consider the following two statements:

A. An increase of one point in SAT scores causes, on average, an increase of β points in college GPA.
B. For every increase of one point in SAT scores, the increase in average college GPA is β points.

Statement B is correct (assuming, of course, that the regression has been carried out correctly). Statement A is incorrect: the regression equation gives no information about causality. Indeed, there is likely a lurking variable (or probably a bunch of lurking variables) that affects both GPA and SAT score; SAT score is considered to be a (perhaps crude) measure¹ of this lurking variable.
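The difference between the two statements can be sketched in a simulation where we control the causal structure ourselves. Everything here is an assumption made for illustration: a single lurking variable called `prep`, the numerical coefficients, and the helper names `simulate` and `ols_slope` are all hypothetical. GPA is generated from `prep` alone, so SAT score has no causal effect on GPA by construction.

```python
import random

def ols_slope(xs, ys):
    """Least-squares slope from regressing ys on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))

def simulate(n=5000, sat_boost=0.0, seed=1):
    """Generate (sat, gpa) lists from a made-up model.

    `prep` is a hypothetical lurking variable that drives both outcomes.
    `sat_boost` mimics an intervention that raises every SAT score
    without touching `prep`. GPA never uses the SAT score.
    """
    rng = random.Random(seed)  # same seed gives the same students
    sat, gpa = [], []
    for _ in range(n):
        prep = rng.gauss(0, 1)
        sat.append(1000 + 150 * prep + rng.gauss(0, 100) + sat_boost)
        gpa.append(3.0 + 0.4 * prep + rng.gauss(0, 0.3))
    return sat, gpa

sat, gpa = simulate()
beta = ols_slope(sat, gpa)  # positive, as Statement B describes

# Intervene: raise every SAT score by 50 points for the same students.
sat_boosted, gpa_after = simulate(sat_boost=50)
actual_change = sum(gpa_after) / len(gpa_after) - sum(gpa) / len(gpa)
change_if_A_were_true = 50 * beta

print(f"beta (GPA points per SAT point)      = {beta:.5f}")
print(f"GPA change Statement A would predict = {change_if_A_were_true:.3f}")
print(f"GPA change actually observed         = {actual_change:.3f}")
```

The slope is positive, so Statement B holds in the simulated data, yet raising SAT scores leaves average GPA unchanged, because in this model the association comes entirely from the lurking variable.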

Interpreting causality deterministically when the evidence is statistical

"Data from designed experiments, when analyzed appropriately, allow stronger (almost) causative inferences, which incubate further scientific inspiration and hypothesis generation, and so forth, through the cycle. In the right hands, and with a component of luck, this cycle leads to great breakthroughs."

Noel Cressie and Christopher K. Wikle, Statistics for Spatio-Temporal Data, Wiley, 2011, p. 9

After pointing out problems such as confusing correlation and causation, most statistics textbooks include a statement such as:

"The only legitimate way to try to establish a causal connection statistically is through the use of randomized experiments." ²

Unfortunately, such discussions usually come early in the book, and are not revisited for elaboration later after statistical inference has been discussed. When a well-designed, carefully analyzed experiment (or, better yet, series of experiments) has established good evidence of causality, there is still room for misinterpretation, since usually the analysis is in terms of a summary statistic such as an average. When this is the case, the results do not give evidence of deterministic causation -- that is, they do not prove that "If this is done, then this will be the result in all cases." Instead, what they say is, "If this is done, under these circumstances, then on average this will be the result."

Thus, for example, it is rare that an experiment will support an assertion such as "If you take this medication, your blood pressure will go down" or "If you do this type of exercise this frequently you will not have a heart attack." All that can be concluded are statements such as, "On average, people who take this medication have a decrease in blood pressure" or "Fewer people who do this type of exercise this frequently have heart attacks than people who don't."³ And, as noted in the quote from Cressie and Wikle above, even this requires careful design of the experiment and appropriate statistical analysis.
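To see how far an average effect can be from a deterministic one, here is a minimal sketch of a hypothetical randomized experiment. The numbers are invented: blood pressure changes are assumed to be normally distributed with the same standard deviation in both groups, and the treatment is assumed to lower the mean change by 5 units. Under those assumptions, the probability that a randomly chosen treated person does better than a randomly chosen control is Φ(|difference in means| / (σ√2)), where Φ is the standard normal distribution function.

```python
import random
from statistics import NormalDist

rng = random.Random(2)  # fixed seed so the run is repeatable
n = 20000
sigma = 10.0    # assumed common standard deviation of the change
effect = -5.0   # assumed mean change under treatment (negative = lower pressure)

# Change in blood pressure for each person in the two randomized groups
control = [rng.gauss(0.0, sigma) for _ in range(n)]
treated = [rng.gauss(effect, sigma) for _ in range(n)]

mean_diff = sum(treated) / n - sum(control) / n

# Pair each treated person with an independently drawn control and count
# how often the treated person had the larger decrease.
p_better = sum(t < c for t, c in zip(treated, control)) / n

# Value implied by the normal, equal-variance assumption
p_theory = NormalDist().cdf(abs(effect) / (sigma * 2 ** 0.5))

print(f"difference in mean change   = {mean_diff:.2f}")
print(f"P(treated better), simulated = {p_better:.3f}")
print(f"P(treated better), formula   = {p_theory:.3f}")
```

In this made-up setting the treatment clearly lowers blood pressure on average, but a treated person does better than a control person only somewhat more often than a coin flip would suggest, which is far from "your blood pressure will go down."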

1. The discussion in the linked page is framed in terms of outcome variables, but the considerations apply to predictor variables, such as SAT score, as well.
2. Utts, Jessica (2005) Seeing Through Statistics, Brooks/Cole (Thomson), p. 211. Use of this quote here is not intended as a criticism of this text; the quote is extracted from the context of a very good two-page discussion on establishing causation.
3. What would be of more interest than a difference in means would be the probability that assignment to treatment gives better outcome than assignment to no treatment. This is discussed in Richard H. Browne, The t-Test p Value and Its Relationship to the Effect Size and P(X>Y), The American Statistician, February 1, 2010, 64(1), 30 - 33.

Last updated Sept. 25, 2011