SRI International
-
Automatic Speech Transcription for Low-Resource Languages — The Case of Yoloxóchitl Mixtec (Mexico)
In the present study, we focus exclusively on progress in developing speech recognition for the language of interest, Yoloxóchitl Mixtec (YM), an Oto-Manguean language spoken by fewer than 5000 speakers on the Pacific coast of Guerrero, Mexico.
-
Issue Brief: Early Warning Systems
Drawing on a nationally representative sample of more than 2,000 U.S. public high schools, the brief describes early warning systems as a dropout prevention strategy.
-
Privacy-preserving speech analytics for automatic assessment of student collaboration
This work investigates whether nonlexical information from speech can automatically predict the quality of small-group collaborations. Audio was collected from students as they collaborated in groups of three to solve math problems.
-
Coping with Unseen Data Conditions: Investigating Neural Net Architectures, Robust Features, and Information Fusion for Robust Speech Recognition
This work investigates how traditional deep neural networks perform under varying acoustic conditions, evaluating them on speech recorded under realistic background conditions that are mismatched with the training data.
-
On the Issue of Calibration in DNN-Based Speaker Recognition Systems
This article addresses calibration in deep neural network (DNN)-based approaches to speaker recognition. We propose a hybrid alignment framework, which stems from our previous work in DNN senone alignment, that uses the bottleneck features only for the alignment of features during statistics calculation.
-
The 2016 Speakers in the Wild Speaker Recognition Evaluation
This article describes the SITW speaker recognition challenge and analyzes the evaluation results. We examine some of the top-performing systems submitted during the evaluation and outline future research directions.
-
The SRI CLEO Speaker-State Corpus
We introduce the SRI CLEO (Conversational Language about Everyday Objects) Speaker-State Corpus of speech, video, and biosignals.
-
Unsupervised Learning of Acoustic Units Using Autoencoders and Kohonen Nets
This work investigates learning acoustic units in an unsupervised manner from real-world speech data by using a cascade of an autoencoder and a Kohonen net.
-
Intelligent Coaching Systems in Higher-Order Applications: Lessons from Automated Content Creation Bottlenecks
This presentation describes two projects for interactive training that developed prototypes for automated content creation plus a third project that illustrates a suite of learning object libraries to support engineering instruction.
-
Fusion Strategies for Robust Speech Recognition and Keyword Spotting for Channel- and Noise-Degraded Speech
Current state-of-the-art automatic speech recognition systems are sensitive to changing acoustic conditions, which can cause significant performance degradation.
-
Minimizing Annotation Effort for Adaptation of Speech-Activity Detection Systems
This paper focuses on the problem of selecting the best-possible subset of available audio data given a budgeted time for annotation.
-
The SRI speech-based collaborative learning corpus
We introduce the SRI speech-based collaborative learning corpus, a novel collection designed for investigating and measuring how students collaborate in small groups.