Multiple imputation is better than imputing single values

From the multiple imputation book.

The core idea here:

His solution was simple and brilliant: create multiple imputations that reflect the uncertainty of the missing data.

Also:

Drawing imputations from a distribution, instead of estimating the “best” value, was a drastic departure from everything that had been done before. Rubin’s original proposal did not include formulae for calculating combined estimates, but instead stressed the study of variation because of uncertainty in the imputed values. The idea was rooted in the Bayesian framework for inference, quite different from the dominant randomization-based framework in survey statistics.

So if we posit a distribution of values for the missing data, we can draw several imputed datasets from it, run the same analysis on each, and end up with a distribution of results that reflects the uncertainty in the imputed values.

Very Bayesian indeed!
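The impute-analyze-pool workflow can be sketched in a few lines. This is a toy illustration, not the book's method: the "imputation model" here is just resampling observed values with added noise (a crude stand-in for a proper posterior draw), and the analysis is estimating a mean. The pooling step at the end is Rubin's rules: total variance = average within-imputation variance plus (1 + 1/m) times the between-imputation variance.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: a normal sample with ~20% of values missing.
y = rng.normal(loc=10.0, scale=2.0, size=100)
y[rng.random(100) < 0.2] = np.nan
observed = y[~np.isnan(y)]
n_missing = int(np.isnan(y).sum())

m = 20  # number of imputations
estimates, variances = [], []
for _ in range(m):
    # Draw imputations from a distribution (resampled observed values
    # plus noise) instead of filling in one "best" value.
    draws = rng.choice(observed, size=n_missing) + rng.normal(
        0.0, observed.std(), n_missing
    )
    completed = y.copy()
    completed[np.isnan(completed)] = draws
    # Run the analysis of interest on each completed dataset.
    estimates.append(completed.mean())
    variances.append(completed.var(ddof=1) / len(completed))

# Pool with Rubin's rules.
q_bar = np.mean(estimates)           # pooled estimate
u_bar = np.mean(variances)           # within-imputation variance
b = np.var(estimates, ddof=1)        # between-imputation variance
total_var = u_bar + (1 + 1 / m) * b  # exceeds u_bar: extra uncertainty
```

The key point is visible in the last line: `total_var` is strictly larger than the naive within-imputation variance, because the spread of estimates *across* imputations captures the uncertainty due to the missing data.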


Also, taken from Schafer (1999):

...Rubin recommends that imputations be created through Bayesian arguments: specify a parametric model for the complete data (and, if necessary, a model for the mechanism by which data become missing), apply a prior distribution to the unknown model parameters, and simulate $m$ independent draws from the conditional distribution of $Y_{mis}$ given $Y_{obs}$ by Bayes' theorem.
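For a concrete instance of Schafer's recipe, here is a sketch for the simplest case: complete data modeled as i.i.d. normal with the standard noninformative prior $p(\mu, \sigma^2) \propto 1/\sigma^2$ (my choice of model and prior for illustration, not Schafer's example). Each imputation first draws the parameters from their posterior given $Y_{obs}$, then draws $Y_{mis}$ from the model at those parameters, so the imputed values carry parameter uncertainty as well as sampling noise.

```python
import numpy as np

def bayesian_impute(y_obs, n_mis, rng):
    """One posterior draw of n_mis missing values under a normal model
    with the noninformative prior p(mu, sigma^2) proportional to 1/sigma^2."""
    n = len(y_obs)
    ybar = y_obs.mean()
    s2 = y_obs.var(ddof=1)
    # Posterior draw of sigma^2: (n - 1) * s^2 / chi^2_{n-1}.
    sigma2 = (n - 1) * s2 / rng.chisquare(n - 1)
    # Posterior draw of mu given sigma^2: N(ybar, sigma^2 / n).
    mu = rng.normal(ybar, np.sqrt(sigma2 / n))
    # Finally, draw the missing values from the model: N(mu, sigma^2).
    return rng.normal(mu, np.sqrt(sigma2), size=n_mis)

rng = np.random.default_rng(1)
y_obs = rng.normal(5.0, 1.0, size=50)
imputations = [bayesian_impute(y_obs, n_mis=10, rng=rng) for _ in range(5)]
```

Because $\mu$ and $\sigma^2$ are re-drawn for every imputation, the imputed sets differ from one another by more than sampling noise alone, which is exactly the "variation because of uncertainty in the imputed values" that Rubin stressed.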