Author: Victor Abrash

May 1, 2015

Classification of Lexical Stress Using Spectral and Prosodic Features for Computer-assisted Language Learning Systems

We present a system for detection of lexical stress in English words spoken by English learners. This system was designed to be part of the EduSpeak® computer-assisted language learning (CALL) software.
May 1, 2014

Lexical Stress Classification for Language Learning Using Spectral and Segmental Features

We present a system for detecting lexical stress in English words spoken by English learners. The system uses both spectral and segmental features to detect three levels of stress for each syllable in a word.
December 1, 2011

SRILM at sixteen: Update and outlook

We review developments in the SRI Language Modeling Toolkit (SRILM) since 2002, when a previous paper on SRILM was published.
July 1, 2010

EduSpeak®: A Speech Recognition and Pronunciation Scoring Toolkit for Computer-Aided Language Learning Applications

SRI International’s EduSpeak® system is an SDK that enables developers of interactive language education software to use state-of-the-art speech recognition and pronunciation scoring technology.
September 1, 2005

Robust Feature Compensation in Nonstationary and Multiple Noise Environments

We extend the POF algorithm to select noisy-to-clean feature mappings more accurately, allowing different combinations of speech and noise to have combination-specific mappings chosen according to the observation.
September 1, 2003

Development of Phrase Translation Systems for Handheld Computers: from Concept to Field

We describe the development and conceptual evolution of handheld spoken phrase translation systems, beginning with an initial unidirectional system for translation of English phrases, and later extending to a limited bidirectional phrase translation system.
March 1, 2002

DynaSpeak: SRI’s Scalable Speech Recognizer for Embedded and Mobile Systems

We introduce SRI’s new speech recognition engine, DynaSpeak(TM), which is characterized by its scalability and flexibility, high recognition accuracy, memory and speed efficiency, adaptation capability, efficient grammar optimization, support for natural language parsing functionality, and operation based on integer arithmetic.
August 1, 2000

The SRI EduSpeak(TM) System: Recognition and Pronunciation Scoring for Language Learning

The EduSpeak(TM) system is a software development toolkit that enables developers of interactive language education software to use state-of-the-art speech recognition and pronunciation scoring technology.
September 1, 1997

Mixture Input Transformations for Adaptation of Hybrid Connectionist Speech Recognizers

In this paper, we propose a new algorithm to train mixtures of transformation networks (MTNs) in the hybrid connectionist recognition framework. We apply the new algorithm to nonnative speaker adaptation, and present recognition results for the 1994 WSJ Spoke 3 development set.
September 1, 1995

Connectionist Speaker Normalization and Adaptation

We explore supervised speaker adaptation and normalization in the MLP component of a hybrid hidden Markov model/multilayer perceptron version of SRI’s DECIPHER™ speech recognition system. Our approach combines both adaptation and normalization in a single, consistent manner, works with limited adaptation data, and is text-independent.
January 1, 1994

Incorporating linguistic features in a hybrid HMM/MLP speech recognizer

We propose two schemes for incorporating distinctive speech features (sonorant, fricative, nasal, vocalic, and voiced) into the MLP component of our system. We show a small improvement in recognition performance on a 160-word speaker-independent continuous-speech Japanese conference room reservation database.
January 1, 1993

Modeling Consistency in a Speaker Independent Continuous Speech Recognition System

In this paper we discuss a Gender Dependent Neural Network (GDNN) which can be tuned for each gender, while sharing most of the speaker independent parameters. We use a classification network to help generate gender-dependent phonetic probabilities for a statistical (HMM) recognition system.