Conference Paper April 1, 2015

Cross-corpus depression prediction from speech

Research on detecting depression from speech has advanced in recent years, but most work has focused on the analysis of one corpus at a time. Given that clinical corpora are...

Conference Paper March 1, 2015

The SRI BioFrustration corpus: Audio, video and physiological signals for continuous user modeling

We describe the SRI BioFrustration Corpus, an in-progress corpus of time-aligned audio, video, and autonomic nervous system signals recorded while users interact with a dialog system to make returns of...

Conference Paper March 1, 2015

Enhanced end-of-turn detection for speech to a personal assistant

Speech to personal assistants (e.g., reminders, calendar entries, messaging, voice search) is often uttered under cognitive load, causing nonfinal pausing that can result in premature recognition cut-offs. Prior research suggests...

Conference Paper November 1, 2014

The SRI AVEC-2014 Evaluation System

Though depression is a common mental health problem with significant impact on human society, it often goes undetected. We explore a diverse set of features based only on spoken audio...

Conference Paper May 1, 2014

Automatic Characterization of Speaking Styles in Educational Videos

Recent studies have shown the importance of using online videos along with textual material in educational instruction, especially for better content retention and improved concept understanding. A key question is...

Conference Paper May 1, 2014

Computationally-Efficient Endpointing Features for Natural Spoken Interaction with Personal-Assistant Systems

Current speech-input systems typically use a nonspeech threshold for end-of-utterance detection. While usually sufficient for short utterances, the approach can cut speakers off during pauses in more complex utterances. We...

Conference Paper January 1, 2012

Speaker recognition with region-constrained MLLR transforms

It has been shown that standard cepstral speaker recognition models can be enhanced by region-constrained models, where features are extracted only from certain speech regions defined by linguistic or...

Conference Paper August 1, 2011

Constrained cepstral speaker recognition using matched UBM and JFA training

We study constrained speaker recognition systems, or systems that model standard cepstral features that fall within particular types of speech regions. A question in modeling such systems is whether to...

Conference Paper May 1, 2011

Recent progress in prosodic speaker verification

We describe recent progress in the field of prosodic modeling for speaker verification. In a previous paper, we proposed a technique for modeling syllable-based prosodic features that uses a multinomial...

Conference Paper May 1, 2011

Language-independent constrained cepstral features for speaker recognition

Constrained cepstral systems, which select frames to match various linguistic “constraints” in enrollment and test, have shown significant improvements for speaker verification performance. Past work, however, relied on word recognition,...

Conference Paper May 1, 2011

Bird species recognition combining acoustic and sequence modeling

The goal of this work was to explore modeling techniques to improve bird species classification from audio samples. We first developed an unsupervised approach to obtain approximate note models from...

Article September 1, 2010

A Corpus Analysis of Patterns of Age-Related Change in Conversational Speech

