Wei, X., & Haertel, E. (2011). The effect of ignoring classroom-level variance in estimating the generalizability of school mean scores. Educational Measurement: Issues and Practice, 30(1), 13–22.
Contemporary educational accountability systems, including state-level systems prescribed under No Child Left Behind, as well as those envisioned under the “Race to the Top” comprehensive assessment competition, rely on school-level summaries of student test scores. The precision of these score summaries is almost always evaluated using models that ignore the classroom-level clustering of students within schools. This paper reports balanced and unbalanced generalizability analyses investigating the consequences of ignoring variation at the level of classrooms within schools when analyzing the reliability of such school-level accountability measures. Results show that the reliability of school means cannot be determined accurately when classroom-level effects are ignored. Failure to take between-classroom variance into account biases generalizability (G) coefficient estimates downward and standard errors (SEs) upward if classroom-level effects are regarded as fixed, and biases G-coefficient estimates upward and SEs downward if they are regarded as random. These biases become more severe as the difference between the school-level intraclass correlation (ICC) and the class-level ICC increases. School-accountability systems should be designed so that classroom (or teacher) level variation can be taken into consideration when quantifying the precision of school rankings, and statistical models for school mean score reliability should incorporate this information.
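The direction of the two biases described in the abstract can be illustrated with a minimal sketch. It is not the paper's own analysis. It assumes a balanced design (students nested in classes nested in schools) and textbook expected mean squares for that design. The function name `g_coefficients`, its parameters, and the variance components in the usage lines are hypothetical and chosen only for illustration. A further assumption is that, in the fixed case, classroom effects count as part of the school's universe score rather than as error.

```python
def g_coefficients(var_school, var_class, var_student, n_classes, n_students):
    """Expected G coefficients for school means in a balanced nested design.

    Illustrative assumptions (not taken from the paper): every school has
    n_classes classes of n_students students, and the variance components
    are known rather than estimated.
    """
    n_total = n_classes * n_students  # students per school

    # Expected between-school mean square in the balanced nested ANOVA.
    ems_between = var_student + n_students * var_class + n_total * var_school

    # Expected pooled within-school mean square when classes are ignored:
    # part of the classroom variance leaks into the "student" error term.
    ems_within_naive = var_student + var_class * n_students * (n_classes - 1) / (n_total - 1)

    # Model that ignores classrooms (students nested directly in schools).
    g_naive = 1 - ems_within_naive / ems_between

    # Classes random: classroom variance is error in the school mean.
    g_random = n_total * var_school / ems_between

    # Classes fixed (assumption): classroom effects belong to the universe
    # score, so only student-level variance is error.
    g_fixed = 1 - var_student / ems_between

    return {"naive": g_naive, "random": g_random, "fixed": g_fixed}


# Hypothetical variance components, for illustration only.
result = g_coefficients(var_school=0.10, var_class=0.10, var_student=0.80,
                        n_classes=4, n_students=25)
print(result)
```

Under these assumptions the coefficient from the model that ignores classrooms falls between the other two: above the random-classes value and below the fixed-classes value, which matches the directions of bias stated in the abstract. When `var_class` is zero, all three coefficients coincide.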