Factor analysis back ends for MLLR transforms in speaker recognition


N. Scheffer, Y. Lei, and L. Ferrer, “Factor analysis back ends for MLLR transforms in speaker recognition,” in Proc. Interspeech, 2011, pp. 257–260.


The purpose of this work is to show how recent developments in cepstral-based speaker recognition systems can be applied to Maximum Likelihood Linear Regression (MLLR) transforms. Speaker recognition systems based on MLLR transforms have been shown to be highly beneficial in combination with standard systems, but most advances in speaker modeling techniques have been implemented for cepstral features. We show how these advances, which are based on Factor Analysis and include eigenchannel and ivector modeling, can be easily employed to achieve very high accuracy. We show that they outperform the current state-of-the-art MLLR-SVM system that SRI submitted to the NIST SRE 2010 evaluation. The advantages of leveraging the new approaches are manifold: the ability to process a large amount of data, the ability to work in a reduced-dimensional space, the ability to import any advances made for cepstral systems to the MLLR features, and the potential for system combination at the ivector level.

Index Terms: speaker verification, MLLR, factor analysis
