The influence of covariate measurement error on treatment effect estimates and numeric balance diagnostics following several common methods of propensity score matching: A simulation study
Creative Commons License
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License.
Date of Graduation
Doctor of Philosophy (PhD)
Department of Graduate Psychology
Sonia J. Horst
In applied intervention studies, researchers frequently aim to make inferences about the impact of a treatment program on participants. However, applied researchers often face threats to the internal validity of their studies (i.e., the extent to which changes in participants’ outcomes can be attributed to the intervention). When researchers are unable to randomly assign study participants to treatment conditions, changes in the intervention outcome might be confounded with systematic differences in participants’ baseline characteristics. Propensity score matching is one technique that allows researchers to account for threats to the internal validity of a study. Specifically, using propensity score matching methods, researchers construct a qualitatively similar comparison group based on participants’ characteristics at baseline (i.e., covariates).
In addition to threats to the internal validity of a study, measurement error is a reality with which many applied researchers must contend. However, research on how measurement error in covariate scores affects the quality of matches and the accuracy of treatment effect estimates is sparse in the propensity score matching literature. Consequently, the purpose of the current study was to evaluate how different levels and types of measurement error affected the quality of propensity score matched groups and the accuracy of treatment effect estimates.
A simulation study was conducted to manipulate both the levels of measurement error (e.g., 10% versus 60% unreliability) and the types of measurement error (e.g., treatment and comparison group scores measured with the same level of reliability versus different levels of reliability). Four common propensity score matching methods were then used to create comparison groups: nearest neighbor matching, nearest neighbor matching with a 0.2 caliper, optimal matching, and Mahalanobis distance matching. Numeric balance diagnostics and the accuracy of treatment effect estimates were then evaluated. When unreliable covariates were included in the propensity score model, the final matched groups appeared balanced on those covariates. However, propensity score matching was not able to appropriately account for the full influence of the covariates on treatment effect estimates. That is, as the level of measurement error increased, the estimated treatment effect also increased, so that the estimate exceeded the simulated treatment effect.
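The two manipulations described above can be illustrated with a minimal sketch. It is not the study's simulation code. It assumes a classical test theory error model (observed score = true score + independent random error), treats "unreliability" as one minus reliability, and uses greedy one-to-one matching without replacement. The function names add_measurement_error and greedy_caliper_match are hypothetical, and the units of the caliper (raw propensity scores here) are an assumption, since the abstract does not state them.

```python
import random
import statistics


def add_measurement_error(true_scores, reliability, rng):
    """Return observed scores X = T + E with Var(T) / Var(X) near `reliability`.

    Assumes classical test theory: reliability = Var(T) / (Var(T) + Var(E)).
    """
    if not 0 < reliability <= 1:
        raise ValueError("reliability must be in (0, 1]")
    true_var = statistics.pvariance(true_scores)
    # Solve the reliability definition for the error standard deviation.
    error_sd = (true_var * (1 - reliability) / reliability) ** 0.5
    return [t + rng.gauss(0, error_sd) for t in true_scores]


def greedy_caliper_match(treated, control, caliper):
    """Greedy 1:1 nearest neighbor matching without replacement.

    `treated` and `control` map unit id -> distance score (for example, an
    estimated propensity score). Returns {treated_id: control_id}. A treated
    unit whose nearest available control is farther than `caliper` stays
    unmatched.
    """
    available = dict(control)
    pairs = {}
    for t_id, t_score in treated.items():
        if not available:
            break
        # Nearest remaining control on the distance score.
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs[t_id] = c_id
            del available[c_id]
    return pairs


# Hypothetical illustration: a covariate with 60% unreliability
# (reliability = 0.40 under the assumption stated above).
rng = random.Random(0)
true_scores = [rng.gauss(0, 1) for _ in range(5000)]
observed = add_measurement_error(true_scores, reliability=0.40, rng=rng)
empirical_reliability = (
    statistics.pvariance(true_scores) / statistics.pvariance(observed)
)

# Hypothetical scores: t2 has no control within the caliper and is dropped.
pairs = greedy_caliper_match(
    treated={"t1": 0.30, "t2": 0.90},
    control={"c1": 0.32, "c2": 0.35, "c3": 0.60},
    caliper=0.1,
)
```

In this sketch, giving the treatment and comparison groups the same or different reliability values amounts to calling add_measurement_error separately for each group. Optimal matching and Mahalanobis distance matching are not shown.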
Harris, Heather D., "The influence of covariate measurement error on treatment effect estimates and numeric balance diagnostics following several common methods of propensity score matching: A simulation study" (2018). Dissertations, 2014-2019. 173.