Data-Driven Lexicon Expansion for Mandarin Broadcast News and Conversation Speech Recognition


X. Lei, W. Wang and A. Stolcke, “Data-driven lexicon expansion for Mandarin broadcast news and conversation speech recognition,” 2009 IEEE International Conference on Acoustics, Speech and Signal Processing, 2009, pp. 4329-4332, doi: 10.1109/ICASSP.2009.4960587.


We present a data-driven framework for expanding the lexicon to improve Mandarin broadcast news and conversation speech recognition. The lexicon expansion includes the generation of pronunciation variants for frequent words and vocabulary augmentation with new words and phrases derived from the training data. To learn multiple pronunciations, we first generate all possible pronunciation candidates for a word from its character pronunciation network. The top pronunciation variants are then selected from forced alignment statistics. To augment the acoustic vocabulary, we propose an efficient algorithm that derives new words based on N-gram statistics. Experiments show that a dictionary expanded in this manner yields significant improvements on a Mandarin broadcast speech recognition task.
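
The two expansion steps summarized above can be illustrated with a minimal sketch. It is not the authors' implementation. The toy pinyin table, the function names, the `top_n` and `min_share` thresholds, and the use of simple bigram counts as the N-gram statistic are all assumptions made for illustration only.

```python
from collections import Counter
from itertools import product

# Hypothetical per-character pronunciations (toy data, not from the paper).
CHAR_PRONS = {
    "银": ["yin2"],
    "行": ["hang2", "xing2"],
}


def candidate_pronunciations(word, char_prons):
    """Enumerate every path through the word's character pronunciation network."""
    per_char = [char_prons[ch] for ch in word]
    return [" ".join(path) for path in product(*per_char)]


def select_top_variants(candidates, alignment_counts, top_n=2, min_share=0.1):
    """Keep the most frequent candidates according to forced-alignment counts.

    alignment_counts maps a pronunciation string to how often forced alignment
    chose it; top_n and min_share are assumed pruning thresholds.
    """
    total = sum(alignment_counts.get(c, 0) for c in candidates)
    if total == 0:
        # No alignment evidence: fall back to the first candidate.
        return candidates[:1]
    ranked = sorted(candidates, key=lambda c: alignment_counts.get(c, 0), reverse=True)
    return [c for c in ranked[:top_n] if alignment_counts.get(c, 0) / total >= min_share]


def derive_new_words(sentences, min_count=2):
    """Propose adjacent word pairs seen at least min_count times as new entries."""
    bigrams = Counter()
    for words in sentences:
        bigrams.update(zip(words, words[1:]))
    return sorted("".join(pair) for pair, n in bigrams.items() if n >= min_count)
```

For example, `candidate_pronunciations("银行", CHAR_PRONS)` enumerates both readings of the second character, and `select_top_variants` then prunes any reading that the (here invented) alignment counts rarely support. A real system would obtain those counts from a recognizer's forced alignments over the acoustic training data.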
