Jessica N. Jacovidis
Creative Commons License
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License.
Date of Award
Doctor of Philosophy (PhD)
Department of Graduate Psychology
Christine E. DeMars
Allison J. Ames
S. Jeanne Horst
Dena A. Pastor
In education, researchers and evaluators are interested in assessing the impact of programs or interventions. Unfortunately, most education programs do not lend themselves to random assignment; participants generally self-select into programs. Lack of random assignment limits the claims that researchers can make about the impact of a program because individuals who self-select into it may be qualitatively different from individuals who do not. Propensity score matching allows researchers to mimic random assignment by creating a matched comparison group that is similar to the treatment group on researcher-identified variables.
Researchers can choose among a number of matching methods when employing propensity score matching. These methods differ in the distance measure, the matching algorithm, and the rules used to select comparison group members. Thus, the purpose of this study was to examine common matching techniques to determine how they differed in the quantity and quality of matches and whether the results of subsequent group comparisons (e.g., significance test results, estimated effect sizes) varied across the techniques. Differences across effect size, treatment group sample size, comparison-to-treatment ratio, and analysis technique were also examined.
To empirically investigate the performance of common matching methods under known and systematically manipulated conditions, data were simulated to reflect values found in higher education, based on a recent study by Jacovidis and her colleagues (in press). The results indicated that the choice of matching method dictates both the quality and quantity of the matches obtained and the resulting outcome analyses (e.g., statistical significance tests and estimated effect sizes). Although nearest neighbor matching with calipers produced better quality matches than the other matching methods, it also resulted in the loss of treatment group members. If treatment group members are excluded from the matched groups, the representativeness of the treatment group could be compromised. In that case, the researcher may want to select a matching method that does not result in a loss of treatment group members. It is up to the researcher to decide how best to balance the quality and quantity of matches, while recognizing that this decision can affect the accuracy of the outcome analyses.
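The trade-off between match quality and the loss of treatment group members can be illustrated with a minimal sketch. It is not the procedure used in the dissertation: it assumes greedy 1:1 nearest neighbor matching without replacement on propensity scores that have already been estimated, and the function name, unit IDs, scores, and caliper width of 0.1 are all hypothetical values chosen for illustration.

```python
from typing import Dict, Optional


def nearest_neighbor_match(
    treated: Dict[str, float],
    comparison: Dict[str, float],
    caliper: Optional[float] = None,
) -> Dict[str, str]:
    """Greedy 1:1 nearest neighbor matching without replacement.

    `treated` and `comparison` map unit IDs to propensity scores.
    Returns {treated_id: comparison_id}. A treated unit whose nearest
    available comparison unit lies outside the caliper is left unmatched.
    """
    available = dict(comparison)  # comparison units not yet used
    matches: Dict[str, str] = {}
    for t_id, t_score in treated.items():
        if not available:
            break  # no comparison units left to match
        # Closest remaining comparison unit on the propensity score.
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        if caliper is not None and abs(available[c_id] - t_score) > caliper:
            continue  # nearest candidate is too far away: drop this treated unit
        matches[t_id] = c_id
        del available[c_id]  # matching without replacement
    return matches


# Hypothetical propensity scores, for illustration only.
treated = {"t1": 0.30, "t2": 0.55, "t3": 0.90}
comparison = {"c1": 0.28, "c2": 0.52, "c3": 0.60, "c4": 0.35}

no_caliper = nearest_neighbor_match(treated, comparison)
with_caliper = nearest_neighbor_match(treated, comparison, caliper=0.1)

print(no_caliper)    # all three treated units receive a match
print(with_caliper)  # t3 has no comparison unit within 0.1 and is dropped
```

Without a caliper, every treated unit is retained, but t3 is paired with a comparison unit whose score differs by 0.30. With the caliper, that poor match is rejected and t3 is excluded, so the matched pairs are closer but the matched sample no longer contains the full treatment group.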
Jacovidis, Jessica N., "Evaluating the performance of propensity score matching methods: A simulation study" (2017). Dissertations. 149.