SRI Authors: Andreas Tsiartas, Jennifer Smith, Harry Bratt, Colleen Richey, Nonye M. Alozie
N. Bassiou, A. Tsiartas, J. Smith, H. Bratt, C. Richey, E. Shriberg, C. D’Angelo, and N. Alozie, “Privacy-preserving speech analytics for automatic assessment of student collaboration,” in Proceedings of INTERSPEECH 2016, 17th Annual Conference of the International Speech Communication Association, San Francisco, California, USA, September 8-12, 2016, pp. 888-892.
This work investigates whether nonlexical information from speech can automatically predict the quality of small-group collaboration. Audio was collected from students as they worked in groups of three to solve math problems. Education experts hand-annotated 30-second time windows for collaboration quality. Speech activity features, computed at the group level, and spectral, temporal, and prosodic features, extracted at the speaker level, were explored. Feature fusion was also performed after transforming the latter features from the speaker level to the group level. Machine learning experiments using Support Vector Machines and Random Forests show that feature fusion yields the best classification performance: on a 4-class prediction task, the unweighted average F1 measure ranges between 40% and 50%, well above chance (12%). Speech activity features alone are also strong predictors of collaboration quality, achieving an F1 measure between 35% and 43%. Spectral, temporal, and prosodic features alone achieve the lowest classification performance, though still above chance, and contribute substantially on top of the speech activity features, as the fusion results confirm. These novel findings suggest that the approach under study is promising for monitoring group dynamics and attractive in many collaboration settings where privacy is desired.
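The pipeline described above can be sketched in broad strokes: pool speaker-level features to the group level, concatenate them with group-level speech activity features, train a classifier, and score with the unweighted (macro) average F1. This is a minimal illustration on synthetic data; the feature names, dimensions, and mean-pooling transform are assumptions, not the paper's exact method.

```python
# Hedged sketch of feature fusion + classification for 4-class
# collaboration-quality prediction. All data here is synthetic; the
# feature sets and the speaker-to-group transform (mean pooling) are
# illustrative assumptions, not the authors' exact configuration.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

n_windows = 400  # 30-second analysis windows
# Group-level speech activity features (e.g. overlap/silence statistics).
group_feats = rng.normal(size=(n_windows, 8))
# Speaker-level spectral/temporal/prosodic features for 3 speakers per
# group, transformed to the group level by mean pooling (one plausible
# choice of speaker-to-group transform).
speaker_feats = rng.normal(size=(n_windows, 3, 12))
pooled = speaker_feats.mean(axis=1)

# Feature-level fusion: concatenate group-level and pooled features.
fused = np.hstack([group_feats, pooled])
labels = rng.integers(0, 4, size=n_windows)  # 4 collaboration-quality classes

X_tr, X_te, y_tr, y_te = train_test_split(fused, labels, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)

# "Unweighted average F1" corresponds to macro-averaged F1 in scikit-learn.
macro_f1 = f1_score(y_te, clf.predict(X_te), average="macro")
print(round(macro_f1, 3))
```

With real labels the paper reports macro F1 of 40-50% for the fused features against a 12% chance level; on the random labels above the score hovers near chance for a 4-class task.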