Can Prosody Aid the Automatic Processing of Multi-Party Meetings? Evidence from Predicting Punctuation, Disfluencies, and Overlapping Speech

Citation

Shriberg, E., Stolcke, A., & Baron, D. (2001). Can Prosody Aid the Automatic Processing of Multi-Party Meetings? Evidence from Predicting Punctuation, Disfluencies, and Overlapping Speech. In ISCA Tutorial and Research Workshop (ITRW) on Prosody in Speech Recognition and Understanding.

Abstract

We investigate whether probabilistic modeling of prosody can aid various automatic labeling tasks essential for processing of multi-party meetings. Task 1, automatic punctuation, seeks to classify sentence boundaries and disfluencies. Task 2, jump-in points, predicts locations within foreground speech at which background speakers start talking. Task 3, jump-in words, examines characteristics of the speech they use to do so. Data are from the ICSI Meeting Recorder corpus. To infer inherent cues, analyses are based on close-talking microphone signals and recognizer forced alignments. As a generous baseline for word-level cues, we compare prosodic models to those of a language model given the true words. Results for Task 1 show that prosody reduces classification error by 10% relative over the cheating language model; furthermore, when this task is run in “online” mode, the prosodic model degrades less than does the language model. For Task 2, the language model provides no information, while the prosodic model reduces entropy by 13% over chance. For Task 3, a prosodic model reduces entropy by 25% over chance. Analyses also show interesting prosodic patterns, which differ over tasks. Task 1 uses cues similar to those for Switchboard (but not Broadcast News) data. Task 2 predicts jump-in points that look prosodically like sentence boundaries but that are not actually such boundaries. And Task 3 shows that speakers “raise” their voice when starting during another’s talk, compared to starting during silence. These results provide evidence that prosodic modeling can be of use for the automatic processing of meetings. Further results and implications for future automatic meeting processing systems are discussed.
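
The abstract reports results as a relative reduction in classification error and as a reduction in entropy "over chance", without defining either measure. The sketch below shows one common reading of both, under stated assumptions: "chance" is taken to be the empirical class prior, and model entropy is taken to be the mean negative log probability assigned to the true label. The function names and the toy labels and posteriors are hypothetical illustrations, not values or code from the paper.

```python
import math
from collections import Counter


def prior_entropy(labels):
    """Entropy in bits of the empirical class distribution (assumed 'chance' model)."""
    counts = Counter(labels)
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def model_cross_entropy(labels, posteriors):
    """Mean negative log2 probability the model assigns to each true label."""
    total = sum(math.log2(p[y]) for y, p in zip(labels, posteriors))
    return -total / len(labels)


def relative_entropy_reduction(labels, posteriors):
    """Fraction of the chance entropy removed by the model's posteriors."""
    h_chance = prior_entropy(labels)
    h_model = model_cross_entropy(labels, posteriors)
    return (h_chance - h_model) / h_chance


def relative_error_reduction(baseline_error, model_error):
    """Relative (not absolute) reduction in error rate over a baseline."""
    return (baseline_error - model_error) / baseline_error


if __name__ == "__main__":
    # Hypothetical toy data: two classes with equal priors.
    labels = ["jump_in", "none", "jump_in", "none"]
    # Hypothetical posteriors that put 0.75 on the true class each time.
    posteriors = [
        {"jump_in": 0.75, "none": 0.25},
        {"jump_in": 0.25, "none": 0.75},
        {"jump_in": 0.75, "none": 0.25},
        {"jump_in": 0.25, "none": 0.75},
    ]
    print(relative_entropy_reduction(labels, posteriors))
    # Hypothetical error rates, chosen only to illustrate the arithmetic.
    print(relative_error_reduction(0.20, 0.18))
```

Under this reading, a model whose posteriors equal the class prior scores a reduction of zero, and a relative error reduction compares error rates as a ratio rather than as a difference in percentage points.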

