SRILM – An Extensible Language Modeling Toolkit

Citation

Stolcke, A. (2002). SRILM – an extensible language modeling toolkit. In Seventh International Conference on Spoken Language Processing.

Abstract

SRILM is a collection of C++ libraries, executable programs, and helper scripts designed to allow both production of and experimentation with statistical language models for speech recognition and other applications. SRILM is freely available for noncommercial purposes. The toolkit supports creation and evaluation of a variety of language model types based on N-gram statistics, as well as several related tasks, such as statistical tagging and manipulation of N-best lists and word lattices. This paper summarizes the functionality of the toolkit and discusses its design and implementation, highlighting ease of rapid prototyping, reusability, and combinability of tools.

