Model Adaptation for Sentence Segmentation from Speech

Citation

S. Cuendet, D. Hakkani-Tur and G. Tur, “Model adaptation for sentence segmentation from speech,” 2006 IEEE Spoken Language Technology Workshop, 2006, pp. 102-105, doi: 10.1109/SLT.2006.326827.

Abstract

This paper analyzes various methods for adapting sentence segmentation models trained on conversational telephone speech (CTS) to meeting-style conversations. A sentence segmentation model trained on a large amount of CTS data is used to improve performance when varying amounts of meeting data are available. We test sentence segmentation performance under both reference and speech-to-text (STT) conditions on the ICSI MRDA Meeting Corpus, using the Switchboard CTS Corpus as the out-of-domain data. Results show that the adapted classification model significantly improves sentence segmentation performance over a model trained on in-domain data only, regardless of the amount of in-domain data used: relative error reductions of 17.5% and 8.4% with only 1,000 and 3,000 in-domain sentences, respectively, and of 3.7% with all in-domain data of 80,000 words.
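
The results above are reported as relative error reductions. As a minimal sketch of how such a figure is usually computed, assuming the standard definition (baseline error minus adapted error, divided by baseline error), the function below returns the reduction in percent. The function name and the error rates in the example are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def relative_error_reduction(baseline_error: float, adapted_error: float) -> float:
    """Return the relative error reduction of the adapted model, in percent.

    Assumes the common definition:
        100 * (baseline_error - adapted_error) / baseline_error
    Both arguments must be expressed in the same unit (e.g. percent error).
    """
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return 100.0 * (baseline_error - adapted_error) / baseline_error


if __name__ == "__main__":
    # Hypothetical error rates for illustration only (not from the paper):
    # an in-domain-only model at 50.0 and an adapted model at 45.0.
    reduction = relative_error_reduction(baseline_error=50.0, adapted_error=45.0)
    print(f"{reduction:.1f}% relative error reduction")
```

A relative reduction depends on the baseline it is measured against, so the same absolute gain yields a smaller relative figure when the baseline error is higher.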

