Search results for: “stolcke”
-
Making the most from multiple microphones in meeting recognition
In this paper we investigate how these two approaches compare for state-of-the-art recognition systems applied to meeting data from the two most recent NIST Rich Transcription evaluations. Our results show that beamforming is the superior approach, giving more accurate results while being inherently less computationally demanding.
-
Bird species recognition combining acoustic and sequence modeling
The goal of this work was to explore modeling techniques to improve bird species classification from audio samples.
-
Language-independent constrained cepstral features for speaker recognition
We develop language-independent (LI) versions of constraints and compare results to parallel language-dependent (LD) versions for English data on the NIST 2008 interview task.
-
The CALO meeting assistant system
This paper presents the CALO-MA architecture and its speech recognition and understanding components.
-
Improving language recognition with multilingual phone recognition and speaker adaptation transforms
We investigate a variety of methods for improving language recognition accuracy based on techniques from speech recognition and, in some cases, borrowed from speaker recognition.
-
Acoustic front-end optimization for bird species recognition
The goal of this work was to explore the optimization of the feature extraction module (front-end) parameters to improve bird species recognition.
-
Leveraging speaker diarization for meeting recognition from distant microphones
We investigate using state-of-the-art speaker diarization output for speech recognition purposes.
-
Feature-based and channel-based analyses of intrinsic variability in speaker verification
In this paper we explore the use of other speaker verification systems on the telephone channel data and compare against the GMM baseline. We found the GMM system to be one of the more robust across all conditions.
-
Development of the 2008 SRI Mandarin Speech-To-Text System for Broadcast News and Conversation
We describe the recent progress in SRI’s Mandarin speech-to-text system developed for the 2008 evaluation in the DARPA GALE program. A data-driven lexicon expansion technique and language model adaptation methods contribute to the improvement in recognition performance.
-
Multifactor Adaptation for Mandarin Broadcast News and Conversation Speech Recognition
We explore the integration of multiple factors such as genre and speaker gender for acoustic model adaptation tasks to improve Mandarin ASR system performance on broadcast news and broadcast conversation audio.
-
Data-Driven Lexicon Expansion for Mandarin Broadcast News and Conversation Speech Recognition
We present a data-driven framework for expanding the lexicon to improve Mandarin broadcast news and conversation speech recognition.
-
Improving Robustness of MLLR Adaptation with Speaker-Clustered Regression Class Trees
We introduce a strategy for modeling speaker variability in speaker adaptation based on maximum likelihood linear regression (MLLR).