Author: SRI International

  • A Study of Multilingual Speech Recognition

    This paper describes our work in developing multilingual (Swedish and English) speech recognition systems in the ATIS domain. The acoustic component of the multilingual systems is realized through sharing Gaussian codebooks across Swedish and English allophones.

  • Multimodal Interfaces for Internet

    In this paper, we present a Java-enabled application with a multimodal (pen and voice) interface over the web. Our implementation approach was to add Java to the set of languages accepted by the Open Agent Architecture (OAA), a framework for rapidly prototyping complex applications, and particularly suited to those with multimodal interfaces.

  • Using Differential Constraints to Reconstruct Complex Surfaces from Stereo

    Stereo reconstruction algorithms often fail to properly deal with complex surfaces, because there is not enough image information. We propose to guide the reconstruction process using a priori information about the differential geometry of the object surfaces.

  • Model Transformation for Robust Speaker Recognition from Telephone Data

    In the context of automatic speaker recognition, we propose a model transformation technique that renders speaker models more robust to acoustic mismatches and to data scarcity by appropriately increasing their variances.

  • Handset-Dependent Background Models for Robust Text-Independent Speaker Recognition

    This paper studies the effects of handset distortion on telephone-based speaker recognition performance. Results on the 1996 NIST Speaker Recognition Evaluation corpus show that using handset-matched background models reduces false acceptances (at a 10% miss rate) by more than 60% over previously reported (handset-independent) approaches.

  • Neural-Network Based Measures of Confidence for Word Recognition

    This paper proposes a probabilistic framework to define and evaluate confidence measures for word recognition. We describe a novel method to combine different knowledge sources and estimate the confidence in a word hypothesis via a neural network.

  • HTTP://WWW.SPEECH.SRI.COM/DEMOS/ATIS.HTML

    This paper presents a speech-enabled WWW demonstration based on the Air Travel Information System (ATIS) domain. SRI’s speech recognition technology and natural language understanding are fully integrated in a Java application using the DECIPHER(TM) speech recognition system and the Open Agent Architecture(TM).

  • Acoustic Modeling for the SRI Hub4 Partitioned Evaluation Continuous Speech Recognition System

    We describe the development of the SRI system evaluated in the 1996 DARPA continuous speech recognition (CSR) Hub4 partitioned evaluation (PE). The task for the Hub4 evaluation was to recognize speech from broadcast television and radio shows.

  • Hub4 Language Modeling Using Domain Interpolation and Data Clustering

    In SRI’s language modeling experiments for the Hub4 domain, three basic approaches were pursued: interpolating multiple models estimated from Hub4 and non-Hub4 training data, adapting the language model (LM) to the focus conditions, and adapting the LM to different topic types.

  • Active And Supportive Computer-Mediated Resources For Student-To-Student Conversation

    We provide quantitative data suggesting that seventh-grade students who used PIE learned some of the basic principles of probability. Two case studies illustrate how communication supported by computer-mediated representations contributed to this success.

  • Secondary-Postsecondary Linkages: The Missing Link In School-To-Work Initiatives

  • Multimodal User Interfaces in the Open Agent Architecture

    The design and development of the Open Agent Architecture (OAA) system has focused on providing access to agent-based applications through intelligent, cooperative, distributed, and multimodal agent-based user interfaces. The current multimodal interface supports a mix of spoken language, handwriting, and gesture, and is adaptable to the user’s preferences, resources, and environment.