Search results for: “stolcke”

August 1, 2007

fMPE-MAP: Improved Discriminative Adaptation for Modeling New Domains

This paper introduces a new adaptation approach, fMPE-MAP, an extension of the original fMPE (feature minimum phone error) algorithm with an enhanced ability to port Gaussian models and fMPE transforms to a new domain.

August 1, 2007

Integrating MAP, Marginals, and Unsupervised Language Model Adaptation

We investigate the integration of various language model adaptation approaches for a cross-genre adaptation task, with the aim of improving Mandarin ASR system performance on a recently introduced genre, broadcast conversation (BC).

August 1, 2007

Duration and Pronunciation Conditioned Lexical Modeling for Speaker Verification

We propose a method to improve the performance of lexical models for speaker recognition using acoustic-prosodic information. More specifically, the lexical model is trained using duration- and pronunciation-conditioned word N-grams, simultaneously modeling lexical information along with its acoustic and prosodic characteristics.

April 1, 2007

Speech Recognition as Feature Extraction for Speaker Recognition

We present specific techniques and results from SRI’s NIST speaker recognition evaluation system.

April 1, 2007

Combining Discriminative Feature, Transform, and Model Training for Large Vocabulary Speech Recognition

This paper uses a state-of-the-art Mandarin recognition system as a platform to study the interaction of three discriminative techniques: feature, transform, and model training. Experiments in the broadcast news and broadcast conversation domains show that the contribution of each technique is nonredundant.

April 1, 2007

Unsupervised Language Model Adaptation for Meeting Recognition

We present an application of unsupervised language model (LM) adaptation to meeting recognition, in a scenario where sequences of multiparty meetings on related topics are to be recognized, but no prior in-domain data for LM training is available.

April 1, 2007

Analysis of Morph-Based Speech Recognition and the Modeling of Out-of-Vocabulary Words Across Languages

We analyze subword-based language models (LMs) in large-vocabulary continuous speech recognition across four “morphologically rich” languages: Finnish, Estonian, Turkish, and Egyptian Colloquial Arabic.

April 1, 2007

NAP and WCCN: Comparison of Approaches Using MLLR-SVM Speaker Verification System

We compare two recently proposed techniques, within-class covariance normalization (WCCN) [1] and nuisance attribute projection (NAP) [2], for intersession variability compensation in speaker verification.

April 1, 2007

Noise Robust Speaker Identification for Spontaneous Arabic Speech

We present an approach that integrates multiple components and models for improved speaker identification in spontaneous Arabic speech in adverse acoustic conditions.

January 1, 2007

The ICSI-SRI Spring 2006 Meeting Recognition System

We describe the development of the ICSI-SRI speech recognition system for the NIST Spring 2006 Meeting Rich Transcription (RT-06S) evaluation. We highlight improvements to the delay-and-sum algorithm, the nearfield segmenter, language models, posterior-based features, and HMM adaptation methods, as well as adaptation to a small amount of new lecture data.

October 1, 2006

Morphology-based Language Modeling for Conversational Arabic Speech Recognition

In this paper, we investigate improvements in Arabic language modeling by developing various morphology-based language models. We present four different approaches to morphology-based language modeling, including a novel technique called factored language models.

October 1, 2006

A Study in Machine Learning from Imbalanced Data for Sentence Boundary Detection in Speech

We have constructed a hidden Markov model (HMM) system to detect sentence boundaries that uses both prosodic and textual information. Since there are more non-sentence boundaries than sentence boundaries in the data, the prosody model, which is implemented as a decision tree classifier, must be constructed to effectively learn from the imbalanced data distribution.