Author: SRI International

November 1, 2005

Multirate ASR Models for Phone-Class Dependent N-Best List Rescoring

In this work, we describe a technique that improves a recognizer built on a single compromise between time and frequency resolution by augmenting it with information from multiple-rate spectral models, each emphasizing either better time or better frequency resolution.
October 1, 2005

VERL: An ontology framework for representing and annotating video events

This article describes the findings of a recent workshop series that has produced an ontology framework for representing video events, called Video Event Representation Language (VERL), and a companion annotation framework, called Video Event Markup Language (VEML). One of the key concepts in this work is the modeling of events as composable, whereby complex events are…
October 1, 2005

Singapore Tablet PC Program Study: Executive Summary and Final Report, Volume 1, Technical Findings
October 1, 2005

Singapore Tablet PC Program Study: Executive Summary and Final Report, Volume 2, Technical Appendices
October 1, 2005

Task Management under Change and Uncertainty: Constraint Solving Experience with the CALO Project

We outline the challenges and opportunities presented by constraint solving in the presence of change and uncertainty, embodied in CALO’s personalized time management and task reasoning and execution systems.
October 1, 2005

Mapping The Distribution Of Expertise And Resources In A School: Investigating The Potential Of Using Social Network Analysis In Evaluation

This paper describes results of a study investigating the potential of using social network analysis to evaluate the capacity of a school to undertake a schoolwide educational reform.
September 1, 2005

MLLR Transforms as Features in Speaker Recognition

We explore the use of adaptation transforms employed in speech recognition systems as features for speaker recognition. This approach is attractive because, unlike standard frame-based cepstral speaker recognition models, it normalizes for the choice of spoken words in text-independent speaker verification.
September 1, 2005

Generation of fast interpreters for Huffman compressed bytecode

Our approach uses canonical Huffman codes to generate compact opcodes with custom-sized operand fields and with a virtual machine that directly executes this compact code. In effect, this automatically creates both an instruction set for a customized virtual machine and an implementation of that machine.
September 1, 2005

Leveraging Speaker-dependent Variation of Adaptation

This work introduces an automatic procedure for determining the size of regression class trees for individual speakers using an ensemble of speaker-level features to control the number of transformations, if any, that should be estimated by maximum likelihood linear regression.
September 1, 2005

Using MLP Features in SRI’s Conversational Speech Recognition System

We describe the development of a speech recognition system for conversational telephone speech (CTS) that incorporates acoustic features estimated by multilayer perceptrons (MLPs). The acoustic features are based on frame-level phone posterior probabilities, obtained by merging two different MLP estimators: one based on PLP-Tandem features and the other on hidden activation TRAPs (HATs) features.
September 1, 2005

Does Active Learning Help Automatic Dialog Act Tagging in Meeting Data?

We ask whether active learning with lexical cues can help with this task in this domain. To better address this question, we explore active learning for two different types of DA models: hidden Markov models (HMMs) and maximum entropy (maxent) models.
September 1, 2005

Pushing the Envelope — Aside

Despite successes, there are still significant limitations to speech recognition performance. For this reason, authors have proposed methods that incorporate different (and larger) analysis windows, which are described in this article.