A. Stolcke and S. Kajarekar, “Recognizing Arabic speakers with English phones,” in Proc. Odyssey: The Speaker and Language Recognition Workshop, 2008.
We investigate whether phone recognition models trained on large English databases can be used for speaker recognition in another language. Such cross-language use of recognition models is an attractive option when a speaker recognition system is to be ported to a new language for which the necessary data resources are unavailable, while retaining some of the advantages of phone modeling and ASR-based feature extraction. We compare the performance of such systems to a baseline cepstral GMM system (which is inherently language independent) and to a phone-recognition-based system trained exclusively on Arabic data. Our results indicate that cross-language models are highly competitive and, at least in our case, have a performance advantage over both within-language training and the language-independent baseline. We also examine the effect of coverage of colloquial Arabic dialects in the training data.