The Development of SRI’s 1997 Broadcast News Transcription System


Sankar, A., Weng, F., Rivlin, Z. E., Stolcke, A., & Gadde, R. R. (1998, February). The development of SRI’s 1997 Broadcast News transcription system. In Proceedings DARPA Broadcast News Transcription and Understanding Workshop (pp. 91-96).


This paper describes SRI’s 1997 broadcast news transcription system, used for the 1997 DARPA H4 evaluations. Our system had several novel components. These include automatic segmentation of entire broadcast shows, word-internal and cross-word acoustic models robustly estimated with a new Gaussian Merging-Splitting (GMS) algorithm, the use of trigram language models (LMs) in lattices rather than for rescoring N-best lists, and an LM pruning algorithm that allows efficient representation of high-order (such as 4- or 5-gram) LMs. We briefly describe these features and give comparative experimental results. We achieved an 18.7% relative improvement in performance on our 1996 H4 partitioned evaluation (PE) development test set compared with our 1996 H4 PE evaluation system.
