G. Tur and A. Stolcke, “Unsupervised Language Model Adaptation for Meeting Recognition,” 2007 IEEE International Conference on Acoustics, Speech and Signal Processing – ICASSP ’07, 2007, pp. IV-173–IV-176, doi: 10.1109/ICASSP.2007.367191.
We present an application of unsupervised language model (LM) adaptation to meeting recognition, in a scenario where sequences of multiparty meetings on related topics are to be recognized, but no prior in-domain data for LM training is available. The recognizer LMs are adapted according to the recognition output on temporally preceding meetings, in either speaker-dependent or speaker-independent mode. Model adaptation is carried out by interpolating the $n$-gram probabilities of a large generic LM with those of a small LM estimated from the adaptation data, with the interpolation chosen to minimize perplexity on the automatic transcripts of a separate, previously recognized meeting set. The adapted LMs yield a relative reduction in word error of about 5-9% compared to the baseline. This improvement is about half of what can be achieved with supervised adaptation, i.e., using human-generated speech transcripts.
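
The interpolation step described above can be illustrated with a minimal sketch. It assumes a two-way linear interpolation, $p(w \mid h) = \lambda\, p_{\text{adapt}}(w \mid h) + (1 - \lambda)\, p_{\text{generic}}(w \mid h)$, and it assumes the weight $\lambda$ is fitted with expectation-maximization (EM) on held-out text. The abstract does not state which optimizer is used, so EM is an assumption here, and all function names and probability values are hypothetical.

```python
import math


def perplexity(probs):
    """Perplexity of a token sequence given its per-token probabilities."""
    log_sum = sum(math.log(p) for p in probs)
    return math.exp(-log_sum / len(probs))


def interpolate(p_generic, p_adapt, lam):
    """Linearly interpolate per-token probabilities; lam weights the adaptation LM."""
    return [lam * a + (1.0 - lam) * g for g, a in zip(p_generic, p_adapt)]


def estimate_weight(p_generic, p_adapt, iterations=50, lam=0.5):
    """Fit the interpolation weight by EM (an assumed optimizer).

    Each iteration computes the posterior probability that a held-out token
    was generated by the adaptation LM, then sets lam to the mean posterior.
    Every EM step is guaranteed not to increase held-out perplexity.
    """
    for _ in range(iterations):
        posteriors = [
            lam * a / (lam * a + (1.0 - lam) * g)
            for g, a in zip(p_generic, p_adapt)
        ]
        lam = sum(posteriors) / len(posteriors)
    return lam


# Hypothetical probabilities that each LM assigns to the same five tokens of a
# held-out automatic transcript (made-up values, for illustration only).
p_generic = [0.010, 0.002, 0.050, 0.001, 0.020]
p_adapt = [0.005, 0.030, 0.040, 0.020, 0.001]

lam = estimate_weight(p_generic, p_adapt)
mixed = interpolate(p_generic, p_adapt, lam)

print(f"lambda = {lam:.3f}")
print(f"generic LM perplexity:    {perplexity(p_generic):.1f}")
print(f"adaptation LM perplexity: {perplexity(p_adapt):.1f}")
print(f"interpolated perplexity:  {perplexity(mixed):.1f}")
```

In this toy setting the interpolated model has lower held-out perplexity than either component alone. In the unsupervised scenario of the abstract, the held-out text would be recognizer output rather than a human transcript, so the fitted weight is only as reliable as those automatic transcripts.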