Publications Search | SRI International

Recently, a new version of the iVector modelling has been proposed for noise robust speaker recognition, where the nonlinear function that relates clean and noisy cepstral coefficients is approximated by a first order vector Taylor series (VTS). In this paper, it is proposed to substitute the first...

May, 2014
In Proceedings
By Yik-Cheung Tam, Yun Lei, Jing Zheng, Wen Wang

Detecting automatic speech recognition (ASR) errors can play an important role in effective human-computer spoken dialogue systems, as recognition errors can hinder accurate system understanding of user intents.

May, 2014
In Proceedings

We propose a novel framework for speaker recognition in which extraction of sufficient statistics for the state-of-the-art i-vector model is driven by a deep neural network (DNN) trained for automatic speech recognition (ASR).

May, 2014
In Proceedings

The authors propose a novel staged hybrid model for emotion detection in speech.

May, 2014
In Proceedings

In the context of computer-aided language learning, automatic detection of specific phone mispronunciations by nonnative speakers can be used to provide detailed feedback about specific pronunciation problems.

May, 2014
In Proceedings

Studies have shown that the performance of state-of-the-art automatic speech recognition (ASR) systems deteriorates significantly with increased noise levels and channel degradations, when compared to human speech recognition capability. Traditionally, noise-robust acoustic features are deployed to...

May, 2014
In Proceedings

Current speech-input systems typically use a nonspeech threshold for end-of-utterance detection. While usually sufficient for short utterances, the approach can cut speakers off during pauses in more complex utterances. We elicit personal-assistant speech (reminders, calendar entries, messaging,...

May, 2014
In Proceedings

Reverberation in speech degrades the performance of speech recognition systems, leading to higher word error rates.

May, 2014
In Proceedings

This paper presents a deep neural network (DNN) to extract articulatory information from the speech signal and explores different ways to use such information in a continuous speech recognition task.

May, 2014
In Proceedings

Though sparse features have produced significant gains over traditional dense features in statistical machine translation, careful feature selection and feature engineering are necessary to avoid overfitting in optimizations.

May, 2014
In Proceedings
