Speech & natural language publications
-
A Lognormal Tied Mixture Model of Pitch for Prosody-Based Speaker Recognition
In this work, we develop a statistical model of pitch that allows unbiased estimation of pitch statistics from pitch tracks which are subject to doubling and/or halving.
-
Structure and Performance of a Dependency Language Model
We present a maximum entropy language model that incorporates both syntax and semantics via a dependency grammar.
-
A Study of Multilingual Speech Recognition
This paper describes our work in developing multilingual (Swedish and English) speech recognition systems in the ATIS domain. The acoustic component of the multilingual systems is realized through sharing Gaussian…
-
Neural-Network Based Measures of Confidence for Word Recognition
This paper proposes a probabilistic framework to define and evaluate confidence measures for word recognition. We describe a novel method to combine different knowledge sources and estimate the confidence in…
-
Handset-Dependent Background Models for Robust Text-Independent Speaker Recognition
This paper studies the effects of handset distortion on telephone-based speaker recognition performance. Results on the 1996 NIST Speaker Recognition Evaluation corpus show that using handset-matched background models reduces false…
-
Automatic Pronunciation Scoring for Language Instruction
In this paper we show that we can significantly improve HMM-based scores by using average phone segment posterior probabilities. Correlation between machine and human scores went up from r=0.50…
-
Model Transformation for Robust Speaker Recognition from Telephone Data
In the context of automatic speaker recognition, we propose a model transformation technique that renders speaker models more robust to acoustic mismatches and to data scarcity by appropriately increasing their…
-
HTTP://WWW.SPEECH.SRI.COM/DEMOS/ATIS.HTML
This paper presents a speech-enabled WWW demonstration based on the Air Travel Information System (ATIS) domain. SRI’s speech recognition technology and natural language understanding are fully integrated in a Java…
-
Acoustic Modeling for the SRI Hub4 Partitioned Evaluation Continuous Speech Recognition System
We describe the development of the SRI system evaluated in the 1996 DARPA continuous speech recognition (CSR) Hub4 partitioned evaluation (PE). The task for the Hub4 evaluation was to recognition…
-
Hub4 Language Modeling Using Domain Interpolation and Data Clustering
In SRI's language modeling experiments for the Hub4 domain, three basic approaches were pursued: interpolating multiple models estimated from Hub4 and non-Hub4 training data, adapting the language model (LM) to…
-
A Speaker Identification Agent
This paper describes a prototype application which combines speaker identification technology and an agent architecture to provide user-definable monitors for incoming voicemail messages. Through a Web-distributable Java user interface, the…
-
Word Predictability After Hesitations: A Corpus-based Study
We ask whether lexical hesitations in spontaneous speech tend to precede words that are difficult to predict. We define predictability in terms of both transition probability and entropy, in the…