SRI Authors: Mitchell McLaren, Aaron Lawson
This paper proposes the adaptive Gaussian backend (AGB), a novel approach to robust language identification (LID). In this approach, a given test sample is compared to language-specific training data in order to dynamically select data for a trial-specific language model. Discriminative AGB additionally weights the training data to maximize discrimination against the test segment. Evaluated on heavily degraded speech data, discriminative AGB provides relative improvements in equal error rate (EER) of up to 45% over the widely adopted Gaussian backend (GB) approach to LID and up to 38% over the neural network (NN) approach. Discriminative AGB also significantly outperforms those techniques at shorter test durations, while demonstrating robustness to limited training resources and to mismatch between training and testing speech duration. The efficacy of AGB is validated on clean speech data from the National Institute of Standards and Technology (NIST) Language Recognition Evaluation (LRE) 2009, on which it was found to provide improvements over the GB and NN approaches.
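
The data-selection idea described above can be illustrated with a minimal sketch. The abstract does not specify the features, the similarity measure, the number of selected samples, or the covariance model, so everything here is an assumption made for illustration: fixed-length vectors per utterance, cosine similarity for selection, a hypothetical parameter `k`, and a covariance shared across languages with a small ridge term. The function name `agb_scores` is likewise hypothetical, and the discriminative weighting step is not shown.

```python
import numpy as np

def agb_scores(test_vec, train_by_lang, k=50, ridge=1e-3):
    """Illustrative trial-specific Gaussian scoring (assumed details, see lead-in).

    test_vec:      1-D array, one fixed-length vector for the test segment.
    train_by_lang: dict mapping language name -> 2-D array (samples x dim).
    Returns a dict of per-language scores (log-likelihoods up to a constant).
    """
    test_unit = test_vec / np.linalg.norm(test_vec)

    # Step 1: per language, keep the k training vectors most similar to the test.
    selected = {}
    for lang, X in train_by_lang.items():
        X_unit = X / np.linalg.norm(X, axis=1, keepdims=True)
        sims = X_unit @ test_unit            # cosine similarity (assumption)
        top = np.argsort(sims)[-k:]          # indices of the k largest similarities
        selected[lang] = X[top]

    # Step 2: fit trial-specific Gaussians: one mean per language and a
    # covariance pooled over the selected data (assumption).
    means = {lang: S.mean(axis=0) for lang, S in selected.items()}
    centered = np.vstack([S - means[lang] for lang, S in selected.items()])
    dim = test_vec.shape[0]
    cov = centered.T @ centered / len(centered) + ridge * np.eye(dim)
    prec = np.linalg.inv(cov)

    # Step 3: score the test vector; with a shared covariance the
    # log-likelihood reduces to a negative Mahalanobis distance.
    scores = {}
    for lang, mu in means.items():
        d = test_vec - mu
        scores[lang] = float(-0.5 * d @ prec @ d)
    return scores
```

Because the models are rebuilt for every trial, the cost of scoring grows with the size of the training set, which is the trade-off against a conventional GB whose language models are estimated once.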