M. Graciarena, T. Bocklet, E. Shriberg, A. Stolcke, and S. Kajarekar, “Feature-based and channel-based analyses of intrinsic variability in speaker verification,” in Proc. 10th Annual Conference of the International Speech Communication Association 2009 (INTERSPEECH 2009), pp. 2015–2018.
We explore how intrinsic variations (those associated with the speaker rather than the recording environment) affect text-independent speaker verification performance. In a previous paper we introduced the SRI-FRTIV corpus and reported speaker verification results for a Gaussian mixture model (GMM) system on telephone-channel speech. In this paper we evaluate other speaker verification systems on the telephone-channel data and compare them against the GMM baseline. We found the GMM system to be among the more robust across all conditions, whereas systems relying on recognition hypotheses degraded significantly in low vocal effort conditions. We also evaluate the GMM system on several other channels. Compared to the telephone channel, table-top microphones gave improved performance in furtive conditions, and performance degraded gradually as the distance from the microphone to the speaker increased. Distant microphones therefore compound the degradation in speaker verification performance caused by intrinsic variability.
Index Terms: speaker recognition, vocal effort, speaking style, intrinsic variation, furtive speech, interview speech, read speech, oration