Senior Computer Scientist, Speech Technology and Research Laboratory
Victor Abrash is a senior computer scientist in SRI International’s Speech Technology and Research (STAR) Laboratory. Since joining SRI in 1990, his interests have included speech recognition, speech endpointing, algorithm and software development, neural networks for speech recognition, and the use of speech technology for language learning.
He is the principal software architect and developer behind the EduSpeak® Speech Recognition Toolkit, a major contributor to the Decipher and DynaSpeak® speech recognition engines, and a leader in efforts to develop and deploy SRI’s speech recognition technology for commercial and educational use.
Abrash has co-authored more than 20 papers in the areas of speech recognition and neural networks, and holds patents for speech endpointing and the application of speech recognition for reading education.
He holds B.S. degrees in electrical engineering and physics from MIT and an MPhil in computer speech and language processing from the University of Cambridge.
Classification of Lexical Stress Using Spectral and Prosodic Features for Computer-assisted Language Learning Systems
We present a system for detecting lexical stress in English words spoken by English learners, designed to be part of the EduSpeak® computer-assisted language learning (CALL) software. The system uses both spectral and segmental features to detect three levels of stress for each syllable in a word.
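The abstract does not specify the classifier, so the following is only a minimal illustrative sketch of three-level stress labeling from per-syllable prosodic cues (duration, energy, pitch); the feature values, the `classify_stress` function, and the simple ranking rule are all assumptions for illustration, not the paper's method.

```python
import numpy as np

# Hypothetical per-syllable cues: duration (s), mean energy (dB), mean pitch (Hz).
def normalize(values):
    """Z-score a feature across the syllables of one word."""
    v = np.asarray(values, dtype=float)
    std = v.std()
    return (v - v.mean()) / std if std > 0 else np.zeros_like(v)

def classify_stress(durations, energies, pitches):
    """Assign one of three stress levels (2 = primary, 1 = secondary,
    0 = unstressed) to each syllable from a combined prosodic score."""
    score = normalize(durations) + normalize(energies) + normalize(pitches)
    order = np.argsort(score)[::-1]          # strongest syllable first
    labels = np.zeros(len(score), dtype=int)
    labels[order[0]] = 2                     # primary stress
    if len(score) > 1 and score[order[1]] > 0:
        labels[order[1]] = 1                 # secondary stress
    return labels.tolist()

# Invented values shaped like "photograph": a long, loud first syllable.
print(classify_stress([0.22, 0.10, 0.16], [70, 58, 63], [180, 140, 150]))
# → [2, 0, 0]
```

A real system would replace the hand-built score with a trained classifier over spectral and segmental features, but the sketch shows the shape of the task: one three-way label per syllable, decided relative to the other syllables in the word.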
We review developments in the SRI Language Modeling Toolkit (SRILM) since 2002, when a previous paper on SRILM was published.
EduSpeak®: A Speech Recognition and Pronunciation Scoring Toolkit for Computer-Aided Language Learning Applications
SRI International’s EduSpeak® system is an SDK that enables developers of interactive language education software to use state-of-the-art speech recognition and pronunciation scoring technology.
We extend the POF algorithm to select noisy-to-clean feature mappings more accurately, allowing different combinations of speech and noise to have combination-specific mappings selected depending on the observation.
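As a rough illustration of the idea of observation-dependent noisy-to-clean mappings, here is a minimal piecewise-linear sketch: the noisy feature space is partitioned into regions (nearest centroid, for simplicity), and each region carries its own affine correction. The class name, the hard nearest-centroid selection, and all numeric values are assumptions for the example, not the POF algorithm as published.

```python
import numpy as np

class PiecewiseLinearMapper:
    """Region-specific affine noisy-to-clean corrections, with the
    region (and hence the mapping) chosen from the observation."""

    def __init__(self, centroids, weights, biases):
        self.centroids = np.asarray(centroids, dtype=float)  # (R, D)
        self.weights = np.asarray(weights, dtype=float)      # (R, D, D)
        self.biases = np.asarray(biases, dtype=float)        # (R, D)

    def map(self, noisy):
        """Pick the region nearest the noisy observation and apply
        that region's affine correction."""
        noisy = np.asarray(noisy, dtype=float)
        r = np.argmin(np.linalg.norm(self.centroids - noisy, axis=1))
        return self.weights[r] @ noisy + self.biases[r]

# Two made-up regions in a 2-D feature space.
mapper = PiecewiseLinearMapper(
    centroids=[[0.0, 0.0], [5.0, 5.0]],
    weights=[np.eye(2), 0.8 * np.eye(2)],
    biases=[[0.1, 0.1], [-0.5, -0.5]],
)
print(mapper.map([4.0, 6.0]))  # falls in the second region
```

The extension described in the abstract replaces a single global partition with combination-specific mappings for different speech-and-noise combinations; the sketch only shows the underlying select-then-correct structure.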
We describe the development and conceptual evolution of handheld spoken phrase translation systems, beginning with an initial unidirectional system for translating English phrases and later extending it to a limited bidirectional phrase translation system.