Chapter 6 Procedures when we have missing data

Attrition may occur because the researcher cannot obtain the outcome data, because the researcher loses track of the subjects, because the subjects refuse to cooperate, or for many other reasons. Attrition can be random, random conditional on covariates, confined to a subgroup, or follow some other pattern (Gerber and Green, 2012, 220-230).

6.1 Missing Independent of Potential Outcomes (MIPO)

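If we suspect that our data is missing independent of potential outcomes, this type of attrition can be treated as random: whether a subject's outcome is missing is unrelated to what that outcome would have been under treatment or under control. Therefore, we can estimate the ATE directly from the subjects whose outcomes we observe without concern for bias, although the smaller sample makes the estimate less precise.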
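As a minimal sketch of what estimating the ATE directly means under MIPO, the function below takes the difference in means among subjects whose outcomes were recorded. The function name `ate_observed`, the use of `None` to mark a missing outcome, and the data are all hypothetical and chosen only for illustration.

```python
def ate_observed(outcomes, treated):
    """Difference in means using only subjects with a recorded outcome.

    outcomes: list of numbers, with None marking a missing outcome.
    treated:  list of bools, True for subjects assigned to treatment.
    """
    t = [y for y, d in zip(outcomes, treated) if d and y is not None]
    c = [y for y, d in zip(outcomes, treated) if not d and y is not None]
    return sum(t) / len(t) - sum(c) / len(c)


# Hypothetical data: one outcome is missing in each experimental group.
outcomes = [5.0, None, 7.0, 3.0, 2.0, None]
treated = [True, True, True, False, False, False]

ate_hat = ate_observed(outcomes, treated)
print(ate_hat)
```

The missing subjects are simply dropped. This is only justified when the MIPO assumption is plausible.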

6.2 Missing Independent of Potential Outcomes Given X (MIPO|X)

If we suspect that our data is missing independent of potential outcomes given X, this type of attrition can be treated as random conditional on X, a pre-treatment covariate. This conditionality means that within each subgroup defined by a value of X, our missing data is random. We can obtain an unbiased estimate by estimating the treatment effect within each subgroup and then taking the weighted average across subgroups, weighting each subgroup by its share of all subjects.

If there is missing data within a subgroup, we can use inverse probability weighting to obtain the average outcome in each experimental group. Within each subgroup, we compute the ratio of treated subjects without missing data to all treated subjects, and we weight the outcome recorded for each treated subject without missing data by the inverse of that ratio. We do the same for the control group. We can then subtract the weighted average of the control group from the weighted average of the treated group to obtain the ATE.
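The weighting can be sketched as follows, assuming a single categorical covariate X and `None` marking a missing outcome. The function name `ipw_mean`, the subgroup labels, and the data are hypothetical. Each recorded outcome is weighted by the inverse of the share of subjects with recorded outcomes in its subgroup, separately for each experimental group.

```python
from collections import defaultdict


def ipw_mean(outcomes, strata):
    """Inverse-probability-weighted mean outcome for one experimental group.

    outcomes: list of numbers, with None marking a missing outcome.
    strata:   list of subgroup labels (values of the covariate X).
    A subgroup with no recorded outcomes at all contributes nothing,
    so the weighting cannot recover it.
    """
    total = defaultdict(int)
    observed = defaultdict(int)
    for y, x in zip(outcomes, strata):
        total[x] += 1
        if y is not None:
            observed[x] += 1

    numerator = 0.0
    denominator = 0.0
    for y, x in zip(outcomes, strata):
        if y is None:
            continue
        weight = total[x] / observed[x]  # inverse of the observed share
        numerator += weight * y
        denominator += weight
    return numerator / denominator


# Hypothetical data with two subgroups, "a" and "b".
treated_y = [4.0, None, 8.0, 10.0]
treated_x = ["a", "a", "b", "b"]
control_y = [2.0, 2.0, 6.0, None]
control_x = ["a", "a", "b", "b"]

ate_ipw = ipw_mean(treated_y, treated_x) - ipw_mean(control_y, control_x)
print(ate_ipw)
```

In this example the recorded treated outcome in subgroup "a" receives a weight of 2 because only one of the two treated subjects in that subgroup has a recorded outcome, so it stands in for the subject whose outcome is missing.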

6.3 Bounds

If we are unsure about whether our missing data is random, we may place bounds on the treatment effect by filling in the missing data with extremely high or extremely low outcomes and estimating the ATE after filling in the missing data. We first determine the range of possible outcomes for all subjects. To estimate the upper bound on the ATE, we fill in the missing data in the treatment group with the highest value in the range and the missing data in the control group with the lowest value. To estimate the lower bound on the ATE, we do the reverse: we fill in the missing data in the treatment group with the lowest value and the missing data in the control group with the highest value.

We now know that, given the assumed range of outcomes, the true ATE lies between the lower and upper bounds. However, the greater the rate of attrition, the larger the difference between the bounds and the less informative the bounds will be.
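A minimal sketch of the bounds calculation follows. It uses the usual extreme value construction: for the upper bound, missing treated outcomes are set to the maximum and missing control outcomes to the minimum, and the lower bound reverses this. The function name `extreme_value_bounds`, the outcome range of 0 to 10, and the data are hypothetical.

```python
def extreme_value_bounds(outcomes, treated, y_min, y_max):
    """Return (lower, upper) bounds on the ATE under attrition.

    outcomes: list of numbers, with None marking a missing outcome.
    treated:  list of bools, True for subjects assigned to treatment.
    y_min, y_max: assumed lowest and highest possible outcome values.
    """
    def mean_with_fill(arm, fill):
        # Mean outcome in one group after replacing missing values with `fill`.
        ys = [fill if y is None else y
              for y, d in zip(outcomes, treated) if d == arm]
        return sum(ys) / len(ys)

    upper = mean_with_fill(True, y_max) - mean_with_fill(False, y_min)
    lower = mean_with_fill(True, y_min) - mean_with_fill(False, y_max)
    return lower, upper


# Hypothetical data: one of three outcomes is missing in each group.
outcomes = [5.0, None, 7.0, 3.0, 2.0, None]
treated = [True, True, True, False, False, False]

lower, upper = extreme_value_bounds(outcomes, treated, y_min=0.0, y_max=10.0)
print(lower, upper)
```

With a third of the outcomes missing in each group, the bounds here run from -1 to roughly 5.67, which shows how quickly the bounds widen as attrition increases.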

6.4 Sensitivity Analyses