N. Scheffer and R. Vogt, “On the use of speaker superfactors for speaker recognition,” in Proc. 2010 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 4410–4413.
We propose a new method to characterize a speaker within the Joint Factor Analysis (JFA) framework. Scoring within the JFA framework can be costly, and a method was recently proposed to produce an accurate score quickly. However, that method is nonsymmetric and performs poorly without score normalization. We propose a new JFA scoring method that is both symmetric and efficient. Just as the means of Gaussians can be concatenated to form a supervector, we use several estimates of speaker factors from the eigenvoice space to build a supervector of factors, which we call superfactors. We motivate the use of such factors in the current JFA model through comparison with a Tied Factor Analysis model. We show that this method substantially improves the performance of a system that uses only the standard speaker factors to produce scores, and usually outperforms the baseline system. […]
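The concatenation idea can be illustrated with a minimal sketch. The abstract does not say how the several speaker-factor estimates are obtained or how superfactors are compared, so everything below is an assumption made for illustration: the estimates are hypothetical placeholder values, the helper names `build_superfactor` and `cosine_score` are invented here, and cosine similarity merely stands in for a symmetric score. It is not the scoring method of the paper.

```python
import math

def build_superfactor(factor_estimates):
    """Concatenate several speaker-factor estimates into one flat vector.

    Mirrors how Gaussian means are stacked into a supervector; the
    estimates themselves are hypothetical inputs here.
    """
    return [value for estimate in factor_estimates for value in estimate]

def cosine_score(a, b):
    """Cosine similarity, used only as a stand-in symmetric score (assumption)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical data: three estimates of a two-dimensional speaker-factor
# vector for an enrollment utterance and for a test utterance.
enroll_estimates = [[0.9, -0.2], [1.1, -0.1], [0.8, -0.3]]
test_estimates = [[1.0, -0.2], [0.7, 0.1], [0.9, -0.4]]

sf_enroll = build_superfactor(enroll_estimates)  # length 3 * 2 = 6
sf_test = build_superfactor(test_estimates)

score = cosine_score(sf_enroll, sf_test)
```

Swapping the two arguments of `cosine_score` leaves the score unchanged, which is the sense of "symmetric" illustrated here.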