Recent progress in prosodic speaker verification

Citation

M. Kockmann, L. Ferrer, L. Burget, E. Shriberg and J. Cernocky, “Recent progress in prosodic speaker verification,”in Proc. 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2011), pp.4556–4559.

Abstract

We describe recent progress in the field of prosodic modeling for speaker verification.  In a previous paper, we proposed a technique for modeling syllable-based prosodic features that uses a multinomial subspace model for feature extraction and within-class covariance normalization or linear discriminant analysis for session variability compensation.  In this paper, we show that performance can be significantly improved with the use of probabilistic linear discriminant analysis (PLDA) for session variability compensation.  This system does not require score normalization. We report an equal error rate below 7% on a NIST 2008 task. To our knowledge, this is the best reported result to date for a prosodic system for speaker recognition.  Fusion of this system with a state-of-the-art acoustic baseline system yields 10% relative improvement in the new detection cost function (DCF) as defined by NIST.

Index Terms— Prosodic speaker verification, SNERFs, MSM, iVector, PLDA


Read more from SRI

  • A photo of Mary Wagner

    Recognizing the life and work of Mary Wagner 

    A cherished SRI colleague and globally respected leader in education research, Mary Wagner leaves behind an extraordinary legacy of groundbreaking work supporting children and youth with disabilities and their families.

  • Testing XRGo in a robotics laboratory

    Robots in the cleanroom

    A global health leader is exploring how SRI’s robotic telemanipulation technology can enhance pharmaceutical manufacturing.

  • SRI research aims to make generative AI more trustworthy

    Researchers have developed a new framework that reduces generative AI hallucinations by up to 32%.