Inadequate Attention to Missing Data

Dealing with Missing Data

Many methods have been proposed for dealing with missing data¹, but these typically make assumptions that may be difficult or impossible to verify. Michael Daniels and Joseph Hogan summarize some of the problems as follows:

"When data are incomplete, inference about parameters of interest cannot be carried out without the benefit of subjecive assumptions about the distribution of missing responses. They are subjective because data cannot be used to critique them. Some of these assumptions are used with such regularity that we forget they are being made; for example, when commercial software such as SAS or Stat is used to analyze incomplete longitudinal data using a random effects model, the missing at random (MAR) assumption is being used; when the Kaplan-Meier estimator is used to summarize a survival curve from censored event times, non-informative censoring is being assumed. Neither assumption can be formally checked, so the validity of inferences relies on subjective judgment."²

Dealing with missing data thus remains a genuine problem in statistics. At least two lines of active research address it:

Daniels and Hogan³ propose using Bayesian methods, incorporating relevant information available as a prior distribution.

Researchers are using databases of medical information to test different ways of dealing with missing data and, more generally, observational data.⁴
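
One simple way to see how much a conclusion depends on an unverifiable assumption is a sensitivity analysis. The sketch below is a basic delta adjustment, not the Bayesian approach of Daniels and Hogan; the data values, the function name `delta_adjusted_mean`, and the range of `delta` are all hypothetical. It assumes that the missing responses average `delta` units above the observed mean and reports the resulting overall mean for several values of `delta`.

```python
def delta_adjusted_mean(observed, n_missing, delta):
    """Overall mean if the missing values average (observed mean + delta)."""
    n_obs = len(observed)
    obs_mean = sum(observed) / n_obs
    # The assumption that cannot be checked from the observed data alone:
    assumed_missing_mean = obs_mean + delta
    total = n_obs * obs_mean + n_missing * assumed_missing_mean
    return total / (n_obs + n_missing)


observed = [4.0, 5.0, 6.0, 5.0]  # hypothetical observed responses
n_missing = 4                    # hypothetical number of missing responses

for delta in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(delta, delta_adjusted_mean(observed, n_missing, delta))
```

Setting `delta` to 0 reproduces the mean of the observed values alone. How far the estimate moves over a plausible range of `delta` indicates how strongly the conclusion rests on subjective judgment about the missing responses.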

Footnotes:

1. See C. K. Enders and A. C. Gottschall (2011). The Impact of Missing Data on the Ethical Quality of a Research Study, Chapter 14 in A. T. Panter and S. K. Sterba, Handbook of Ethics in Quantitative Methodology, Routledge, for discussion of some such methods.

2. M. Daniels and J. Hogan (2008). Missing Data in Longitudinal Studies: Strategies for Bayesian Modeling and Sensitivity Analysis, Chapman and Hall/CRC, pp. xvii - xviii.

3. See Note 2.

4. See, for example, D. Madigan and P. Ryan (2011), What can we really learn from observational studies? Epidemiology vol 22, pp. 629 - 631, available at http://scholar.google.com/scholar?hl=en&as_sdt=0,44&cluster=5198548572751487812

Last updated February 4, 2013

COMMON MISTAKES IN USING STATISTICS: Spotting and Avoiding Them
