Machine Learning Methods for Bias Correction and Precision Optimization Using Covariate Adjustment

Date:


Talk Summary:

In randomized trials, adjusting for pre-specified baseline covariates yields efficiency gains in the estimated treatment effect. With complete data, the effect estimate remains unbiased even when the adjustment model is misspecified. When outcome data are missing, however, a misspecified adjustment model can lead to biased treatment effect estimates. This paper investigates the use of machine learning (ML) for the adjustment model and addresses two questions. For complete data, we investigate whether ML improves efficiency relative to a misspecified adjustment model. Here we find that the improvement is directly related to the proportion of variation explained by the baseline covariates under the correct model. For missing data, we examine whether ML can improve efficiency while avoiding the bias attributable to model misspecification. Similar findings hold in this setting, with the degree of bias correction depending on the missing data mechanism. The methods and findings are illustrated in simulation studies and an application to a randomized trial, and can provide additional guidance on the appropriate use of covariate adjustment in randomized trials.
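The complete-data claims above can be illustrated with a small simulation. The sketch below is not the authors' method or simulation design: the data-generating model (an outcome that depends on the square of a single covariate), the sample size, the true effect `tau`, and all function names are assumptions chosen for illustration. A regression on the squared covariate stands in for a flexible learner that recovers the correct adjustment model, while a regression that is linear in the covariate plays the role of the misspecified adjustment model.

```python
import numpy as np


def simulate_trial(rng, n=500, tau=1.0):
    """Simulate one 1:1 randomized trial (assumed, illustrative model)."""
    x = rng.normal(size=n)                 # baseline covariate
    a = rng.integers(0, 2, size=n)         # randomized treatment indicator
    y = tau * a + 2.0 * x**2 + rng.normal(size=n)  # outcome is nonlinear in x
    return x, a, y


def unadjusted(a, y):
    """Difference in mean outcomes between the two arms."""
    return y[a == 1].mean() - y[a == 0].mean()


def ols_adjusted(a, y, features):
    """Treatment coefficient from least squares of y on [1, a, features]."""
    design = np.column_stack([np.ones_like(y), a, features])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]


def run(n_reps=300, seed=0):
    """Return (mean, variance) of each estimator across simulated trials."""
    rng = np.random.default_rng(seed)
    est = {"unadjusted": [], "linear": [], "quadratic": []}
    for _ in range(n_reps):
        x, a, y = simulate_trial(rng)
        est["unadjusted"].append(unadjusted(a, y))
        est["linear"].append(ols_adjusted(a, y, x))        # misspecified model
        est["quadratic"].append(ols_adjusted(a, y, x**2))  # correct functional form
    return {k: (float(np.mean(v)), float(np.var(v))) for k, v in est.items()}


results = run()
for name, (mean, var) in results.items():
    print(f"{name:>10}: mean estimate = {mean:.3f}, variance = {var:.4f}")
```

Under these assumptions, all three estimators should center near the true effect, which is the complete-data unbiasedness noted above, while only the adjustment that captures the true covariate-outcome relationship should reduce the variance noticeably. The missing-data setting is not covered by this sketch.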


Keywords: machine learning, bias correction, covariate adjustment, precision optimization, model specification, RCTs, missing data.