Publications Search | SRI International

Publications Search

This paper assesses the role of robust acoustic features in spoken term detection (also known as keyword spotting, KWS) under heavily degraded channel and noise-corrupted conditions.

May, 2014
In Proceedings

We address the problem of subselecting a large set of acoustic data to train automatic speech recognition (ASR) systems.

May, 2014
In Proceedings

We present a system for detecting lexical stress in English words spoken by English learners.

May, 2014
In Proceedings

Recently, a new version of i-vector modelling has been proposed for noise-robust speaker recognition, where the nonlinear function that relates clean and noisy cepstral coefficients is approximated by a first-order vector Taylor series (VTS). In this paper, it is proposed to substitute the first...

May, 2014
In Proceedings
By Yik-Cheung Tam, Yun Lei, Jing Zheng, Wen Wang

Detecting automatic speech recognition (ASR) errors can play an important role in building effective human-computer spoken dialogue systems, as recognition errors can hinder accurate system understanding of user intents.

May, 2014
In Proceedings

We propose a novel framework for speaker recognition in which extraction of sufficient statistics for the state-of-the-art i-vector model is driven by a deep neural network (DNN) trained for automatic speech recognition (ASR).

May, 2014
In Proceedings

The authors propose a novel staged hybrid model for emotion detection in speech.

May, 2014
In Proceedings

In the context of computer-aided language learning, automatic detection of specific phone mispronunciations by nonnative speakers can be used to provide detailed feedback about specific pronunciation problems.

May, 2014
In Proceedings

Studies have shown that the performance of state-of-the-art automatic speech recognition (ASR) systems deteriorates significantly with increased noise levels and channel degradations, when compared to human speech recognition capability. Traditionally, noise-robust acoustic features are deployed to...

May, 2014
In Proceedings

Current speech-input systems typically use a nonspeech threshold for end-of-utterance detection. While usually sufficient for short utterances, the approach can cut speakers off during pauses in more complex utterances. We elicit personal-assistant speech (reminders, calendar entries, messaging,...

May, 2014
In Proceedings
