September 1, 2009

Feature-based and channel-based analyses of intrinsic variability in speaker verification

Citation

M. Graciarena, T. Bocklet, E. Shriberg, A. Stolcke, and S. Kajarekar, “Feature-based and channel-based analyses of intrinsic variability in speaker verification,” in Proc. 10th Annual Conference of the International Speech Communication Association 2009 (INTERSPEECH 2009), pp. 2015–2018.

Abstract

We explore how intrinsic variations (those associated with the speaker rather than the recording environment) affect text-independent speaker verification performance. In a previous paper we introduced the SRI-FRTIV corpus and provided speaker verification results using a Gaussian mixture model (GMM) system on telephone-channel speech. In this paper we explore the use of other speaker verification systems on the telephone-channel data and compare them against the GMM baseline. We found the GMM system to be one of the more robust across all conditions. Systems relying on recognition hypotheses degraded significantly in low vocal effort conditions. We also explore the use of the GMM system on several other channels. In furtive conditions, we found improved performance on table-top microphones compared to the telephone channel, with gradual degradation as the distance from the microphone to the speaker increases. Distant microphones therefore compound the degradation in speaker verification performance caused by intrinsic variability.

Index Terms: speaker recognition, vocal effort, speaking style, intrinsic variation, furtive speech, interview speech, read speech, oration

