Consider elementary school
students' shoe sizes and scores on a standard reading exam. They are
correlated, but saying that larger shoe size causes higher reading
scores is as absurd as saying that high reading scores cause larger
shoe size.

In this example, there is a clear lurking variable, namely, age. As
the child gets older, both their shoe size and reading ability increase.

Elaborating on this situation:

If you agree that increasing age
(for elementary school children) causes increasing foot size, and
therefore increasing shoe size, then you expect a correlation between
age and shoe size. Correlation is symmetric, so shoe size and age are
correlated. But it would be absurd to say that shoe size causes age.
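The shoe-size example can be sketched as a small simulation. Everything numeric below (the age range, the slopes, the noise levels) is a made-up assumption chosen only for illustration, not data from real students. In this toy model age drives both shoe size and reading score, and neither of those two variables affects the other.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def residuals(y, x):
    """What is left of y after removing a least-squares line in x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return [b - my - slope * (a - mx) for a, b in zip(x, y)]


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical data-generating model: age is the only common cause.
age = [rng.uniform(6, 11) for _ in range(5000)]
shoe = [0.8 * a + rng.gauss(0, 1) for a in age]
reading = [10 * a + rng.gauss(0, 10) for a in age]

# Shoe size and reading score are clearly correlated overall ...
r_raw = pearson(shoe, reading)

# ... but once the linear effect of age is removed from both,
# the remaining (partial) correlation is close to zero.
r_partial = pearson(residuals(shoe, age), residuals(reading, age))

# Correlation is symmetric: the order of the arguments does not matter.
r_age_shoe = pearson(age, shoe)
r_shoe_age = pearson(shoe, age)

print(f"corr(shoe, reading)             = {r_raw:.3f}")
print(f"partial corr, adjusting for age = {r_partial:.3f}")
print(f"corr(age, shoe) = {r_age_shoe:.3f}, corr(shoe, age) = {r_shoe_age:.3f}")
```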

In other words, even when there is a causal relationship, the
causality typically only goes one way. (Of course, it could go both
ways, as in a feedback loop.)

One situation where people slip into confusing correlation and causality is in regression. For example, one might regress college GPA on SAT scores, obtaining a positive coefficient β of SAT score in the regression equation. Consider the following two statements:

- An increase of one point in SAT scores causes, on average, an increase of β points in college GPA.
- For every increase of one point in SAT scores, the increase in average college GPA is β points.
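A toy simulation may help show how these two statements can come apart. The model below is purely an assumption made for illustration: a hypothetical latent variable `prep` drives both SAT score and GPA, and SAT score itself has no causal effect on GPA. It is not a claim about how real SAT scores and GPAs are related. The function `simulate` and its `sat_boost` parameter are likewise invented for this sketch.

```python
import random


def ols_slope(x, y):
    """Least-squares slope from regressing y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return (sum((a - mx) * (b - my) for a, b in zip(x, y))
            / sum((a - mx) ** 2 for a in x))


def simulate(sat_boost, n=5000, seed=1):
    """Hypothetical model: `prep` causes both SAT and GPA; SAT does not cause GPA.

    `sat_boost` mimics an intervention that raises every SAT score
    without changing `prep`.
    """
    rng = random.Random(seed)
    sat, gpa = [], []
    for _ in range(n):
        prep = rng.gauss(0, 1)
        sat.append(1000 + 150 * prep + rng.gauss(0, 50) + sat_boost)
        gpa.append(3.0 + 0.4 * prep + rng.gauss(0, 0.3))
    return sat, gpa


sat, gpa = simulate(sat_boost=0)
beta = ols_slope(sat, gpa)  # positive: higher SAT goes with higher average GPA

# Intervene: raise every SAT score by 100 points, same students (same seed).
_, gpa_boosted = simulate(sat_boost=100)

gain_if_causal = 100 * beta  # what the causal reading of beta would predict
actual_gain = sum(gpa_boosted) / len(gpa_boosted) - sum(gpa) / len(gpa)

print(f"beta                      = {beta:.5f} GPA points per SAT point")
print(f"gain predicted if causal  = {gain_if_causal:.3f}")
print(f"actual gain in this model = {actual_gain:.3f}")
```

In this constructed model the descriptive statement holds (average GPA is higher among students with higher SAT scores), while the causal statement fails (raising SAT scores changes nothing). The regression output alone cannot tell the two situations apart.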

"Data
from designed experiments, when analyzed appropriately, allow stronger
(almost) causative inferences, which incubate further scientific
inspiration and hypothesis generation, and so forth, through the cycle.
In the right hands, and with a component of luck, this cycle leads to
great breakthroughs."

Noel Cressie and Christopher K. Wikle,
Statistics for Spatio-Temporal Data,
Wiley, 2011, p. 9

After pointing out problems such as confusing correlation and causation, most statistics textbooks include a statement such as:

"The only legitimate way to try to
establish a causal connection
statistically is through the use of
randomized experiments." ^{2}

Unfortunately, such discussions usually come early in the book and are not revisited after statistical inference has been discussed. When a well-designed, carefully analyzed experiment (or, better yet, series of experiments) has established good evidence of causality, there is still room for misinterpretation, since the analysis is usually in terms of a summary statistic such as an average. When this is the case, the results do not provide evidence of deterministic causation; that is, they do not prove that "If this is done, then this will be the result in all cases." Instead, what they say is, "If this is done, under these circumstances, then on average this will be the result."

Thus, for example, it is rare that an experiment will support an assertion such as "If you take this medication, your blood pressure will go down" or "If you do this type of exercise this frequently you will not have a heart attack." All that can be concluded are statements such as, "On average, people who take this medication have a decrease in blood pressure" or "Fewer people who do this type of exercise this frequently have heart attacks than people who don't."
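The gap between an average effect and an individual guarantee can be sketched with a simulated randomized experiment. The effect size, spread, and sample size below are invented assumptions, not results from any real trial. Besides the difference in means, the sketch estimates the kind of probability mentioned in footnote 3: how often a randomly chosen treated subject does better than a randomly chosen control subject.

```python
import random
from bisect import bisect_right

rng = random.Random(2)  # fixed seed so the sketch is reproducible
n = 2000                # hypothetical number of subjects per arm

# Change in blood pressure (mmHg) over the study; negative means a decrease.
# Assumed model: the medication shifts the average change by -5, with a lot
# of person-to-person variation in both arms.
control = [rng.gauss(0.0, 8.0) for _ in range(n)]
treated = [rng.gauss(-5.0, 8.0) for _ in range(n)]

# Summary statistic an analysis would typically report.
mean_diff = sum(treated) / n - sum(control) / n

# Fraction of treated subjects whose blood pressure nevertheless went up.
frac_up = sum(t > 0 for t in treated) / n

# Estimate P(treated change < control change) over all treated/control pairs.
# Sorting the control arm lets us count, for each treated value t, how many
# control values are larger than t without looping over every pair.
control_sorted = sorted(control)
better_pairs = sum(n - bisect_right(control_sorted, t) for t in treated)
p_better = better_pairs / (n * n)

print(f"difference in mean change (treated - control): {mean_diff:.2f} mmHg")
print(f"treated subjects whose blood pressure rose:    {frac_up:.1%}")
print(f"P(treated subject does better than control):   {p_better:.2f}")
```

Under these assumptions the average decrease is clear, yet a sizable minority of treated subjects see their blood pressure rise, so "on average" is the strongest statement the summary supports.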

1. The discussion in the linked page is framed in terms of outcome variables, but the considerations apply to predictor variables, such as SAT score, as well.

2. Utts, Jessica (2005) Seeing Through Statistics, Brooks/Cole (Thomson), p. 211. Use of this quote here is not intended as a criticism of this text; the quote is extracted from the context of a very good two-page discussion on establishing causation.

3. What would be of more interest than a difference in means is the probability that assignment to treatment gives a better outcome than assignment to no treatment. This is discussed in Richard H. Browne, The t-Test p Value and Its Relationship to the Effect Size and P(X>Y), The American Statistician, February 1, 2010, 64(1), 30-33.

Last updated Sept. 25, 2011