Publications
-
Does Active Learning Help Automatic Dialog Act Tagging in Meeting Data?
We ask whether active learning with lexical cues can help for this task and this domain. To better address this question, we explore active learning for two different types of…
-
Pushing the Envelope — Aside
Despite successes, there are still significant limitations to speech recognition performance. For this reason, authors have proposed methods that incorporate different (and larger) analysis windows, which are described in this…
-
Comparing HMM, Maximum Entropy, and Conditional Random Fields for Disfluency Detection
We compare a generative hidden Markov model (HMM)-based approach and two conditional models — a maximum entropy (Maxent) model and a conditional random field (CRF) — for detecting disfluencies in…
-
Distinguishing Deceptive from Non-Deceptive Speech
We present results from a study seeking to distinguish deceptive from non-deceptive speech using machine learning techniques on features extracted from a large corpus of deceptive and non-deceptive speech. We…
-
Improved Discriminative Training Using Phone Lattices
We present an efficient discriminative training procedure utilizing phone lattices. Different approaches to expediting lattice generation, statistics collection, and convergence were studied.
-
Speech Translation for Low-Resource Languages: The Case of Pashto
We present a number of challenges and solutions that have arisen in the development of a speech translation system for American English and Pashto, highlighting those specific to a very…
-
Generation of fast interpreters for Huffman compressed bytecode
Our approach uses canonical Huffman codes to generate compact opcodes with custom-sized operand fields and with a virtual machine that directly executes this compact code. In effect, this automatically creates…
-
Leveraging Speaker-dependent Variation of Adaptation
This work introduces an automatic procedure for determining the size of regression class trees for individual speakers using an ensemble of speaker-level features to control the number of transformations, if…
-
Class-dependent Score Combination for Speaker Recognition
In this work, we present a class-based score combination technique that relies on clustering of both the target models and the test utterances in a vector space defined by…
-
Using MLP Features in SRI’s Conversational Speech Recognition System
We describe the development of a speech recognition system for conversational telephone speech (CTS) that incorporates acoustic features estimated by multilayer perceptrons (MLP). The acoustic features are based on frame-level…
-
MLLR Transforms as Features in Speaker Recognition
We explore the use of adaptation transforms employed in speech recognition systems as features for speaker recognition. This approach is attractive because, unlike standard frame-based cepstral speaker recognition models, it…
-
Robust Feature Compensation in Nonstationary and Multiple Noise Environments
We extend the POF algorithm to allow a more accurate way to select noisy-to-clean feature mappings, by allowing different combinations of speech and noise to have combination-specific mappings selected depending…