Data snooping can be done professionally and ethically, or misleadingly and unethically, or misleadingly out of ignorance. Misleading data snooping done out of ignorance is a common error in using statistics. The problems with data snooping are essentially the problems of multiple inference.

One way in which researchers unintentionally obtain misleading results by data snooping is in failing to account for all of the data snooping they engage in. In particular, in accounting for Type I error when data snooping, you need to count not just the actual hypothesis tests performed, but also all comparisons looked at when deciding which post hoc (i.e., not pre-planned) hypothesis tests to try.

Example:
A group of researchers plans to compare three dosages of a drug in a clinical trial. There is no pre-planned intent to compare effects broken down by sex, but the sex of the subjects is recorded. The researchers have decided to have an overall Type I error rate of 0.05, allowing 0.03 for the pre-planned inferences and 0.02 for any data snooping they might decide to do. The pre-planned comparison shows no statistically significant difference between the three dosages when the data are not broken down by sex. However, since the researchers have recorded the sex of the patients, they decide to look at the outcomes broken down by combination of sex and dosage. They notice that the results for women in the high-dosage group look much better than the results for the men in the low-dosage group, and perform a hypothesis test to check that out. In accounting for Type I error, the researchers need to take the number of data-snooping inferences performed as 15, not one. The reason is that they have looked at fifteen comparisons: there are 3×2 = 6 dosage-by-sex combinations, and hence (6×5)/2 = 15 pairs of dosage-by-sex combinations. Thus the significance level for the post hoc test should be not 0.02 but 0.02/15 (see Note 1).

For some discussions of multiple inference and data snooping with a humorous slant, see:

Seife, Charles, The Mind-Reading Salmon: The True Meaning of Statistical Significance, Scientific American, August 21, 2011

XKCD, Significant

Suggestions for data snooping professionally and ethically

I. Educate yourself on the limitations of statistical inference: model assumptions, the problems of Type I and Type II errors, power, and multiple inference, including the "hidden comparisons" that may be involved in data snooping (as in the above example).

II. Plan your study to take into account the problems involving model assumptions, Type I and II errors, power, and multiple inference. Some specifics to consider:

a. If you will be gathering data, decide before gathering the data:

- What questions you are trying to answer.
- How you will gather the data, and the inference procedures you intend to use to help answer your questions. These need to be planned together, to maximize the chances that the data will fit the model assumptions of the inference procedures.
- Whether or not you will engage in data snooping.
- The Type I error rate (or the false discovery rate) and power that would be appropriate (considering the consequences of Type I and Type II errors in the situation you are studying). Be sure to allow some portion of Type I error for any data snooping you think you might do.

Then do a power analysis to see what sample size is needed to meet these criteria.

- Take into account any relevant considerations such as intent-to-treat analysis or how you will deal with missing data.

- If the sample size needed is too large for your resources, you will need to either obtain additional resources or scale back the aims of your study.
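
As a rough illustration of the power-analysis step, the sketch below uses a normal approximation for a two-sided comparison of two group means with equal group sizes. The function name, the 80% power target, and the standardized effect size of 0.5 are hypothetical planning values, not recommendations; a real study (for example, one comparing three dosages) would need a calculation matched to its actual design and inference procedure.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(alpha, power, effect_size):
    """Approximate sample size per group for a two-sided, two-sample
    z-test of means. effect_size is the standardized difference
    (difference in means divided by the common standard deviation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value of the test
    z_beta = z.inv_cdf(power)           # quantile matching the target power
    return ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)


# Hypothetical planning values: 80% power, standardized effect of 0.5.
full = n_per_group(alpha=0.05, power=0.80, effect_size=0.5)

# Reserving part of the Type I error for data snooping leaves a smaller
# alpha for the pre-planned test, which raises the required sample size.
reduced = n_per_group(alpha=0.03, power=0.80, effect_size=0.5)

print(full, reduced)
```

Running the calculation both ways before gathering data shows what the portion of Type I error set aside for data snooping costs in sample size.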

b. If you plan to use existing data, you will need to go through a process similar to that in (a) before looking at the data:

- Decide what questions you are trying to answer.
- Find out how the data were gathered.
- Decide on inference procedures that i) will address your questions of interest and ii) have model assumptions compatible with how the data were collected. If this turns out to be impossible, the data are not suitable.
- Decide whether or not you will engage in data snooping.
- Decide the Type I error rate (or false discovery rate) and power that would be appropriate (considering the consequences of Type I and Type II errors in the situation you are studying). Remember to allow some portion of Type I error for data snooping, if you are likely to engage in any.

- Take into account any relevant considerations such as intent-to-treat analysis or how you will deal with missing data.

- If the sample size needed is larger than the available data set, you will need to either scale back the aims of your study, or find or create another larger data set.

c. If you divide your data randomly into a discovery set (for data snooping) and a confirmatory set (for testing the hypotheses generated from the discovery set):

- Be careful to do the randomization in a manner that preserves the structure of the data. For example, if you have students nested in schools nested in school districts, you need to preserve the nesting: if a particular student is assigned to one group (discovery or confirmatory), then the student's school and school district need to be assigned to the same group.
- Using a Type I error rate or false discovery rate may not be obligatory in the discovery phase, but may be practical to help you keep the number of hypotheses you generate down to a level that you will be able to test (with a reasonable bound on Type I error rate or false discovery rate, and a reasonable power) in the confirmatory phase.
- A preliminary consideration of Type I errors and power should be done to help you make sure that your confirmatory data set is large enough. Be sure to then give further thought to consequences of Type I and II errors for the hypotheses you generate with the discovery data set, and set an overall Type I error rate (or false discovery rate) for the confirmatory stage.

III. Report your results carefully, aiming for honesty and transparency:

- State clearly the questions you set out to study.
- State your methods, and your reasons for choosing those methods. (e.g., why you chose the inference procedures you used; why you chose the Type I error rate and power that you used)
- Give details of how your data were collected.
- State clearly what (if anything) was data snooping, and how you accounted for it in the overall Type I error rate or false discovery rate.
- Include a "limitations" section, pointing out any limitations and uncertainties in the analysis (e.g., if power was not large enough to detect a practically significant difference; any uncertainty in whether model assumptions were satisfied; if there was possible confounding; if missing data created additional uncertainty, etc.)
- Be careful not to inflate or over-interpret conclusions, either in the abstract or in the results or conclusions sections.

Notes:

1. This is assuming a Bonferroni procedure. If another multiple inference procedure is available, it might give an effective individual significance level somewhat higher than 0.02/15.
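
To give a sense of how much "somewhat higher" can mean, the sketch below compares the Bonferroni level with the Šidák level, which is one alternative procedure chosen here purely for illustration (it is not named in the example). The Šidák level is exact when the tests are independent; the fifteen pairwise comparisons in the example share groups and so are not independent, which means the number shows the size of the difference rather than a recommended threshold.

```python
alpha = 0.02  # Type I error reserved for data snooping in the example
m = 15        # number of comparisons looked at

# Bonferroni: divide the error rate evenly among the comparisons.
bonferroni = alpha / m

# Sidak: the per-test level that solves 1 - (1 - level)**m == alpha,
# which holds exactly for independent tests (an assumption here).
sidak = 1 - (1 - alpha) ** (1 / m)

print(f"Bonferroni: {bonferroni:.6f}  Sidak: {sidak:.6f}")
```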

This page last revised 11/3/2011