back icon
close icon

Capture phrases in quotes for more specific queries (e.g. "rocket ship" or "Fred Lynn")

Article  September 1, 2016

The SRI speech-based collaborative learning corpus

SRI Authors Colleen Richey, Nonye M. Alozie, Harry Bratt



C. Richey, C. D’Angelo, N. Alozie, H. Bratt and E. Shriberg, “The SRI speech-based collaborative learning corpus,” in Proc. INTERSPEECH 2016, pp. 1550-1554, September 2016.


We introduce the SRI speech-based collaborative learning corpus, a novel collection designed for the investigation and measurement of how students collaborate together in small groups. This is a multi-speaker corpus containing high-quality audio recordings of middle school students working in groups of three to solve mathematical problems. Each student was recorded via a head-mounted noise-cancelling microphone. Each group was also recorded via a stereo microphone placed nearby. A total of 80 sessions were collected with the participation of 134 students. The average duration of a session was 20 minutes. All students spoke English; for some students, English was a second language. Sessions have been annotated with time stamps to indicate which mathematical problem the students were solving and which student was speaking. Sessions have also been hand annotated with common indicators of collaboration for each speaker (e.g., inviting others to contribute, planning) and the overall collaboration quality for each problem. The corpus will be useful to education researchers interested in collaborative learning and to speech researchers interested in children’s speech, speech analytics, and speech diarization. The corpus, both audio and annotation, will be made available to researchers.

How can we help?

Once you hit send…

We’ll match your inquiry to the person who can best help you. Expect a response within 48 hours.

Our Privacy Policy