SRI Authors: Jennifer Smith, Harry Bratt, Colleen Richey, Andreas Tsiartas, Nonye M. Alozie
J. Smith, H. Bratt, C. Richey, N. Basiou, E. Shriberg, A. Tsiartas, C. D’Angelo and N. Alozie, “Spoken Interaction modeling for automatic assessment of collaborative learning,” in Proc. Speech Prosody 2016, pp. 277-281.
Collaborative learning is a key skill for student success, but simultaneous monitoring of multiple small groups is untenable for teachers. This study investigates whether automatic audio-based monitoring of interactions can predict collaboration quality. Data consist of hand-labeled 30-second segments from audio recordings of students as they collaborated on solving math problems. Two types of features were explored: speech activity features, which were computed at the group level; and prosodic features (pitch, energy, durational, and voice quality patterns), which were computed at the speaker level. For both feature types, normalized and unnormalized versions were investigated; the latter facilitate real-time processing applications. Results using boosting classifiers, evaluated by F-measure and accuracy, reveal that (1) both speech activity and prosody features predict quality far better than a majority-class chance baseline; (2) speech activity features are the better predictors overall, but class performance using prosody shows potential synergies; and (3) it may not be necessary to session-normalize features by speaker. These novel results have implications for educational settings, where the approach could support teachers in monitoring group dynamics, diagnosing issues, and developing pedagogical intervention plans.
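The abstract does not say how features were session-normalized or how the majority-class baseline was computed. As a rough illustration only, the sketch below assumes that session normalization means a per-speaker z-score over one session's feature values, and that the baseline always predicts the most frequent label. The function names, the pitch values, and the label strings are all hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict
from statistics import mean, pstdev


def session_normalize(values, speakers):
    """Z-score each value against all values from the same speaker.

    Assumption: per-speaker z-scoring stands in for the paper's
    session normalization, whose exact form the abstract does not give.
    """
    by_speaker = defaultdict(list)
    for value, speaker in zip(values, speakers):
        by_speaker[speaker].append(value)
    stats = {s: (mean(vs), pstdev(vs)) for s, vs in by_speaker.items()}

    normalized = []
    for value, speaker in zip(values, speakers):
        mu, sigma = stats[speaker]
        # A speaker whose feature never varies gets 0.0 instead of a
        # division by zero.
        normalized.append((value - mu) / sigma if sigma > 0 else 0.0)
    return normalized


def majority_class_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent label."""
    most_common_count = Counter(labels).most_common(1)[0][1]
    return most_common_count / len(labels)


# Hypothetical data: one pitch value per segment, with speaker IDs.
pitch = [100.0, 120.0, 200.0, 240.0]
speaker_ids = ["a", "a", "b", "b"]
print(session_normalize(pitch, speaker_ids))

# Hypothetical segment labels, used only to show the baseline.
print(majority_class_accuracy(["good", "good", "poor", "good"]))
```

Under these assumptions, normalization removes each speaker's own pitch level and range, so values become comparable across speakers. It also needs the whole session before any value can be scaled, which is consistent with the abstract's remark that unnormalized features facilitate real-time processing.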