The SRI/OGI 2006 Spoken Term Detection System

Citation

Vergyri, D., Shafran, I., Stolcke, A., Gadde, V. R. R., Akbacak, M., Roark, B., & Wang, W. (2007, August). The SRI/OGI 2006 spoken term detection system. In Interspeech (pp. 2393-2396).

Abstract

This paper describes the system developed jointly at SRI and OGI for participation in the 2006 NIST Spoken Term Detection (STD) evaluation. We participated in the three genres of the English track: Broadcast News (BN), Conversational Telephone Speech (CTS), and Conference Meetings (MTG). The system consists of two phases. First, audio indexing, an offline phase, converts the input speech waveform into a searchable index. Second, term retrieval, possibly an online phase, returns a ranked list of occurrences for each search term. We used a word-based indexing approach, obtained with SRI’s large vocabulary Speech-to-Text (STT) system.
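
The two phases described above can be illustrated with a minimal sketch. It is not the SRI/OGI implementation: the abstract does not specify the index format or the scoring, so the `Hit` record, the `build_index` and `search` helpers, the `max_gap` parameter, and the product-of-confidences score for multi-word terms are all assumptions made for illustration. The input is assumed to be time-stamped word hypotheses with confidences, such as an STT system might produce.

```python
from collections import defaultdict
from typing import NamedTuple


class Hit(NamedTuple):
    audio_id: str
    start: float  # seconds
    end: float    # seconds
    score: float  # assumed recognizer confidence in [0, 1]


def build_index(hypotheses):
    """Offline phase: map each recognized word to its time-stamped hits."""
    index = defaultdict(list)
    for word, audio_id, start, end, score in hypotheses:
        index[word.lower()].append(Hit(audio_id, start, end, score))
    return index


def search(index, term, max_gap=0.5):
    """Retrieval phase: occurrences of a (possibly multi-word) term, best first."""
    words = term.lower().split()
    if not words:
        return []
    # Start from hits for the first word, then extend with each following word.
    partial = list(index.get(words[0], []))
    for word in words[1:]:
        extended = []
        for prev in partial:
            for nxt in index.get(word, []):
                gap = nxt.start - prev.end
                if nxt.audio_id == prev.audio_id and 0.0 <= gap <= max_gap:
                    # Assumption: combine word confidences by multiplication.
                    extended.append(
                        Hit(prev.audio_id, prev.start, nxt.end, prev.score * nxt.score)
                    )
        partial = extended
    return sorted(partial, key=lambda h: h.score, reverse=True)


# Hypothetical recognizer output: (word, audio_id, start, end, confidence).
hyps = [
    ("spoken", "bn_01", 1.0, 1.4, 0.9),
    ("term", "bn_01", 1.45, 1.8, 0.8),
    ("term", "cts_07", 3.0, 3.3, 0.95),
    ("spoken", "cts_07", 10.0, 10.4, 0.7),
]
index = build_index(hyps)
phrase_hits = search(index, "spoken term")
single_hits = search(index, "term")
missing_hits = search(index, "detection")
```

In this sketch a query word that never appears in the recognizer output, such as "detection" here, yields no hits at all. That is the limitation a purely word-based index faces with out-of-vocabulary terms.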

Apart from describing the submitted system and its performance on the NIST evaluation metric, we study the tradeoffs between performance and system design. We examine performance versus indexing speed, effectiveness of different index ranking schemes on the NIST score, and the utility of approaches to deal with out-of-vocabulary (OOV) terms.

Index Terms: spoken term detection, audio indexing

