Creative Commons License
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License.
Date of Award
Degree Name
Doctor of Philosophy (PhD)
Department of Graduate Psychology
Advisor
Christine E. DeMars
Over the past decade, educational policy trends have shifted toward examining students’ growth from kindergarten through twelfth grade (K-12). One way states can track students’ growth is with a vertical scale. Presently, every state that uses a vertical scale bases it on a unidimensional IRT model. These models make the strong but implausible assumption that a single construct is measured, in the same way, across grades. Additionally, research has found that variations of psychometric methods within the same model can result in different vertical scales. The purpose of this study was to examine the impact on the resulting vertical scales of three IRT models (a unidimensional model, U3PL; a bifactor model with grade-specific subfactors, BG-M3PL; and a bifactor model with content-specific subfactors, BC-M3PL), three calibration methods (separate, hybrid, and concurrent), and two scoring methods (EAP pattern scoring and EAP summed scoring, EAPSS). Empirical data from a state’s assessment program were used to create vertical scales for Mathematics and Reading for Grades 3-8. Several important results were found. First, the U3PL model always resulted in the worst model-data fit; the BC-M3PL fit the data best in Mathematics, and the BG-M3PL fit the data best in Reading. Second, calibration methods led to minor differences in the resulting vertical scale. Third, examinee proficiency estimates based on the primary factor of each model were generally highly correlated (.97+) across all conditions. Fourth, meaningful classification differences were observed across models, calibration methods, and scoring methods. Overall, I concluded that none of the models was viable for developing operational vertical scales. Multidimensional models are promising for addressing the current limitations of unidimensional models for vertical scaling, but more research is needed to identify the correct model specification within and across grades.
Implications for these results are discussed within the context of research, operational practice, and educational policy.
Recommended Citation
Koepfler, James, "Examining the Bifactor IRT Model for Vertical Scaling in K-12 Assessment" (2012). Dissertations. 69.