Applying a multiple comparison control to IRT item-fit testing
Publication Date
2020
Document Type
Article
Abstract
We used simulation techniques to assess the item-level and familywise Type I error control and power of an IRT item-fit statistic, the S-χ². Previous research indicated that the S-χ² has good Type I error control and decent power, but no previous research examined familywise Type I error control. We varied percentage of misfitting items, sample size, and test length, and computed familywise Type I error with no correction, a Bonferroni correction, and a Benjamini-Hochberg correction. The S-χ² controlled item-level and familywise Type I errors when corrections were applied to conditions with no misfitting items. In the presence of misfitting items, the S-χ² exhibited inflated item-level and familywise false hit rates in many conditions, even with familywise Type I error corrections. Lastly, power was low and negatively impacted when either of the familywise Type I error corrections was applied. We suggest using the S-χ² with no familywise Type I error control in conjunction with other methods of assessing item fit (e.g., visual analysis).
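To make the three decision rules named in the abstract concrete, the following is a minimal sketch of how item-level p-values might be flagged with no correction, a Bonferroni correction, and the Benjamini-Hochberg step-up procedure (which is usually described as targeting the false discovery rate). It is not the authors' simulation code: the function names, the alpha level of .05, and the six p-values are all hypothetical and chosen only for illustration.

```python
from typing import List


def uncorrected_reject(p_values: List[float], alpha: float = 0.05) -> List[bool]:
    """Flag each item whose p-value is at or below alpha (no correction)."""
    return [p <= alpha for p in p_values]


def bonferroni_reject(p_values: List[float], alpha: float = 0.05) -> List[bool]:
    """Flag each item whose p-value is at or below alpha divided by the number of tests."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg_reject(p_values: List[float], alpha: float = 0.05) -> List[bool]:
    """Benjamini-Hochberg step-up: find the largest rank k with
    p_(k) <= (k / m) * alpha, then flag the k smallest p-values."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # item indices, ascending p
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank
    reject = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            reject[idx] = True
    return reject


# Hypothetical S-chi-square p-values for a six-item test (illustration only).
item_p_values = [0.001, 0.012, 0.020, 0.040, 0.300, 0.650]

flags_none = uncorrected_reject(item_p_values)
flags_bonf = bonferroni_reject(item_p_values)
flags_bh = benjamini_hochberg_reject(item_p_values)

print("uncorrected:", sum(flags_none), "items flagged")
print("Bonferroni: ", sum(flags_bonf), "items flagged")
print("BH:         ", sum(flags_bh), "items flagged")
```

With these made-up values, the uncorrected rule flags four items, Benjamini-Hochberg flags three, and Bonferroni flags one. This ordering illustrates the trade-off the abstract describes: stricter control over false hits leaves fewer items flagged, which lowers power to detect misfit.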
Creative Commons License
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License.
Recommended Citation
Sauder, D. C. & DeMars, C. E. (2020). Applying a multiple comparison control to IRT item-fit testing. Applied Measurement in Education, 33, 362-377. https://doi.org/10.1080/08957347.2020.1789138