Speech & natural language publications

January 1, 2006

Further Progress in Meeting Recognition: The ICSI-SRI Spring 2005 Speech-to-Text Evaluation System

We describe the development of our speech recognition system for the National Institute of Standards and Technology (NIST) Spring 2005 Meeting Rich Transcription (RT-05S) evaluation, highlighting improvements made since last…

November 1, 2005

Combining Feature Sets with Support Vector Machines: Application to Speaker Recognition

In this paper, we describe a general technique for optimizing the relative weights of feature sets in a support vector machine (SVM) and show how it can be applied to…

November 1, 2005

Multirate ASR Models for Phone-Class Dependent N-Best List Rescoring

In this work, we describe a technique to augment a recognizer that uses this compromise with information from multiple-rate spectral models that emphasize either better time or better frequency resolution…

November 1, 2005

A* Based Joint Segmentation and Classification of Dialog Acts in Multiparty Meetings

We investigate the use of the A* algorithm for joint segmentation and classification of dialog acts (DAs) of the ICSI Meeting Corpus. The proposed method is evaluated on both traditional…

November 1, 2005

Four Weightings and a Fusion: A Cepstral-SVM System for Speaker Recognition

A new speaker recognition system is described that uses Mel-frequency cepstral features. This system is a combination of four support vector machines (SVMs). All the SVM systems use polynomial features…

November 1, 2005

Incorporating Tandem / HATs MLP Features into SRI’s Conversational Speech Recognition System

We describe the development of a speech recognition system for conversational telephone speech (CTS) that incorporates acoustic features estimated by multilayer perceptrons (MLPs). The acoustic features are based on frame-level…

September 1, 2005

Meeting Structure Annotation: Data and Tools

By John Niekrasz

We present a set of annotations of hierarchical topic segmentations and action item sub-dialogues collected over 65 meetings from the ICSI and ISL meeting corpora, designed to support automatic meeting…

September 1, 2005

MLLR Transforms as Features in Speaker Recognition

We explore the use of adaptation transforms employed in speech recognition systems as features for speaker recognition. This approach is attractive because, unlike standard frame-based cepstral speaker recognition models, it…

September 1, 2005

Speech Translation for Low-Resource Languages: The Case of Pashto

By Kristin Precoda, Dimitra Vergyri, Andreas Kathol

We present a number of challenges and solutions that have arisen in the development of a speech translation system for American English and Pashto, highlighting those specific to a very…

September 1, 2005

Robust Feature Compensation in Nonstationary and Multiple Noise Environments

By Martin Graciarena, Horacio Franco, Victor Abrash

We extend the POF algorithm to allow a more accurate way to select noisy-to-clean feature mappings, by allowing different combinations of speech and noise to have combination-specific mappings selected depending…

September 1, 2005

Distinguishing Deceptive from Non-Deceptive Speech

By Andreas Kathol, Martin Graciarena

We present results from a study seeking to distinguish deceptive from non-deceptive speech using machine learning techniques on features extracted from a large corpus of deceptive and non-deceptive speech. We…

September 1, 2005

Comparing HMM, Maximum Entropy, and Conditional Random Fields for Disfluency Detection

We compare a generative hidden Markov model (HMM)-based approach and two conditional models — a maximum entropy (Maxent) model and a conditional random field (CRF) — for detecting disfluencies in…
