Impact of Prior Channel Information for Speaker Identification

Citation

Vaquero, C., Scheffer, N., Karajekar, S. (2009). Impact of Prior Channel Information for Speaker Identification. In: Tistarelli, M., Nixon, M.S. (eds) Advances in Biometrics. ICB 2009. Lecture Notes in Computer Science, vol 5558. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01793-3_46

Abstract

Joint factor analysis (JFA) has been very successful in speaker recognition but its success depends on the choice of development data. In this work, we apply JFA to a very diverse set of recording conditions and conversation modes in NIST 2008 SRE, showing that having channel matched development data will give improvements of about 50% in terms of Equal Error Rate against a Maximum a Posteriori (MAP) system, while not having it will not give significant improvement. To provide robustness to the system, we estimate eigenchannels in two ways. First, we estimate the eigenchannels separately for each condition and stack them. Second, we pool all the relevant development data and obtain a single estimate. Both techniques show good performance, but the former leads to lower performance when working with low-dimension channel subspaces, due to the correlation between those subspaces.

Keywords: Gaussian Mixture Model, Equal Error Rate, Speaker Recognition, Development Data, Female Speaker


Read more from SRI