Speech & natural language publications

March 1, 2017

Toward human-assisted lexical unit discovery without text resources

ByAndreas Kathol, Dimitra Vergyri, Harry Bratt

This work addresses lexical unit discovery for languages without (usable) written resources.

Publications, Speech & natural language publications
March 1, 2017

Joint modeling of articulatory and acoustic spaces for continuous speech recognition tasks

ByDimitra Vergyri, Horacio Franco

This paper investigates using deep neural networks (DNN) and convolutional neural networks (CNNs) for mapping speech data into its corresponding articulatory space.

Publications, Speech & natural language publications
February 1, 2017

Multi-microphone speech recognition integrating beamforming, robust feature extraction, and advanced DNN/RNN backend

This paper gives an in-depth presentation of the multi-microphone speech recognition system we submitted to the 3rd CHiME speech separation and recognition challenge and its extension.

Publications, Speech & natural language publications
November 1, 2016

Conversational In-Vehicle Dialog Systems: The past, present, and future

We review research and development activities for in-vehicle dialog systems, examine findings, discuss key challenges, and share our visions for voice-enabled interaction and intelligent assistance for smart vehicles over the…

Publications, Speech & natural language publications
September 1, 2016

Coping with Unseen Data Conditions: Investigating Neural Net Architectures, Robust Features, and Information Fusion for Robust Speech Recognition

ByHoracio Franco

This work investigates the performance of traditional deep neural networks under varying acoustic conditions and evaluates their performance with speech recorded under realistic background conditions that are mismatched with respect…

Publications, Speech & natural language publications
September 1, 2016

Minimizing Annotation Effort for Adaptation of Speech-Activity Detection Systems

ByMartin Graciarena

This paper focuses on the problem of selecting the best-possible subset of available audio data given a budgeted time for annotation.

Publications, Speech & natural language publications
September 1, 2016

The SRI CLEO Speaker-State Corpus

ByAndreas Kathol, Massimiliano de Zambotti

We introduce the SRI CLEO (Conversational Language about Everyday Objects) Speaker-State Corpus of speech, video, and biosignals.

Biomedical sciences publications, Publications, Speech & natural language publications
September 1, 2016

On the Issue of Calibration in DNN-Based Speaker Recognition Systems

ByAaron Lawson, Mitchell McLaren

This article is concerned with the issue of calibration in the context of Deep Neural Network (DNN) based approaches to speaker recognition. We propose a hybrid alignment framework, which stems…

Publications, Speech & natural language publications
September 1, 2016

The SRI System for the NIST OpenSAD 2015 Speech Activity Detection Evaluation

ByMartin Graciarena

In this paper, we present the SRI system submission to the NIST OpenSAD 2015 speech activity detection (SAD) evaluation. We present results on three different development databases that we created…

Publications, Speech & natural language publications
September 1, 2016

Automatic Speech Transcription for Low-Resource Languages — The Case of Yoloxóchitl Mixtec (Mexico)

ByAndreas Kathol

In the present study, we focus exclusively on progress in developing speech recognition for the language of interest, Yoloxóchitl Mixtec (YM), an Oto-Manguean language spoken by fewer than 5000 speakers…

Publications, Speech & natural language publications
September 1, 2016

The 2016 Speakers in the Wild Speaker Recognition Evaluation

ByMitchell McLaren, Aaron Lawson

This article provides details of the SITW speaker recognition challenge and analysis of evaluation results. We provide an analysis of some of the top performing systems submitted during the evaluation and…

Publications, Speech & natural language publications
September 1, 2016

Fusion Strategies for Robust Speech Recognition and Keyword Spotting for Channel- and Noise-Degraded Speech

ByDimitra Vergyri

Current state-of-the-art automatic speech recognition systems are sensitive to changing acoustic conditions, which can cause significant performance degradation.

Publications, Speech & natural language publications