Search results for: “stolcke”
-
fMPE-MAP: Improved Discriminative Adaptation for Modeling New Domains
This paper introduces a new adaptation approach, fMPE-MAP, an extension of the original fMPE (feature minimum phone error) algorithm with an enhanced ability to port Gaussian models and fMPE transforms to a new domain.
-
Integrating MAP, Marginals, and Unsupervised Language Model Adaptation
We investigate the integration of various language model adaptation approaches for a cross-genre adaptation task to improve Mandarin ASR system performance on a recently introduced genre, broadcast conversation (BC).
-
Duration and Pronunciation Conditioned Lexical Modeling for Speaker Verification
We propose a method to improve the performance of lexical models for speaker recognition using acoustic-prosodic information. More specifically, the lexical model is trained using duration- and pronunciation-conditioned word N-grams, simultaneously modeling lexical information along with the words' acoustic and prosodic characteristics.
-
Speech Recognition as Feature Extraction for Speaker Recognition
We present specific techniques and results from SRI’s NIST speaker recognition evaluation system.
-
Combining Discriminative Feature, Transform, and Model Training for Large Vocabulary Speech Recognition
This paper uses a state-of-the-art Mandarin recognition system as a platform to study the interaction of three techniques: discriminative feature, transform, and model training. Experiments in the broadcast news and broadcast conversation domains show that the contribution of each technique is nonredundant.
-
Unsupervised Language Model Adaptation for Meeting Recognition
We present an application of unsupervised language model (LM) adaptation to meeting recognition, in a scenario where sequences of multiparty meetings on related topics are to be recognized, but no prior in-domain data for LM training is available.
-
Analysis of Morph-Based Speech Recognition and the Modeling of Out-of-Vocabulary Words Across Languages
We analyze subword-based language models (LMs) in large-vocabulary continuous speech recognition across four “morphologically rich” languages: Finnish, Estonian, Turkish, and Egyptian Colloquial Arabic.
-
NAP and WCCN: Comparison of Approaches Using MLLR-SVM Speaker Verification System
We compare two recently proposed techniques, within-class covariance normalization (WCCN) [1] and nuisance attribute projection (NAP) [2], for intersession variability compensation in speaker verification.
-
Noise Robust Speaker Identification for Spontaneous Arabic Speech
We present an approach that integrates multiple components and models for improved speaker identification in spontaneous Arabic speech in adverse acoustic conditions.
-
The ICSI-SRI Spring 2006 Meeting Recognition System
We describe the development of the ICSI-SRI speech recognition system for the NIST Spring 2006 Meeting Rich Transcription (RT-06S) evaluation, highlighting improvements to the delay-and-sum algorithm, the nearfield segmenter, language models, posterior-based features, and HMM adaptation methods, as well as adaptation to a small amount of new lecture data.
-
Morphology-based Language Modeling for Conversational Arabic Speech Recognition
In this paper, we investigate improvements in Arabic language modeling by developing various morphology-based language models. We present four different approaches to morphology-based language modeling, including a novel technique called factored language models.
-
A Study in Machine Learning from Imbalanced Data for Sentence Boundary Detection in Speech
We have constructed a hidden Markov model (HMM) system to detect sentence boundaries that uses both prosodic and textual information. Since there are more non-sentence boundaries than sentence boundaries in the data, the prosody model, which is implemented as a decision tree classifier, must be constructed to effectively learn from the imbalanced data distribution.