SRILM is a collection of C++ libraries, executable programs, and helper scripts designed to allow both production of and experimentation with statistical language models for speech recognition and other applications. SRILM is freely available for noncommercial purposes. The toolkit supports creation and evaluation of a variety of language model types based on N-gram statistics, as well as several related tasks, such as statistical tagging and manipulation of N-best lists and word lattices. This paper summarizes the functionality of the toolkit and discusses its design and implementation, highlighting ease of rapid prototyping, reusability, and combinability of tools.
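To make "language models based on N-gram statistics" concrete, here is a minimal Python sketch of N-gram counting and unsmoothed maximum-likelihood estimation. It is not SRILM code and does not use SRILM's API. The function names (`ngram_counts`, `history_counts`, `mle_prob`), the toy corpus, and the `<s>`/`</s>` padding convention are assumptions made for this example. Practical models also apply smoothing, which is omitted here.

```python
from collections import Counter

def ngram_counts(sentences, order):
    """Count n-grams of length `order`; each sentence is padded with <s> and </s>."""
    counts = Counter()
    for sentence in sentences:
        tokens = ["<s>"] * (order - 1) + sentence.split() + ["</s>"]
        for i in range(len(tokens) - order + 1):
            counts[tuple(tokens[i:i + order])] += 1
    return counts

def history_counts(ngrams):
    """Sum n-gram counts over the final word to get counts of each history."""
    histories = Counter()
    for gram, count in ngrams.items():
        histories[gram[:-1]] += count
    return histories

def mle_prob(word, history, ngrams, histories):
    """Unsmoothed maximum-likelihood estimate of P(word | history)."""
    history = tuple(history)
    denom = histories[history]
    return ngrams[history + (word,)] / denom if denom else 0.0

# Hypothetical toy corpus, one sentence per string.
corpus = ["the cat sat", "the cat ran", "the dog sat"]
bigrams = ngram_counts(corpus, 2)
histories = history_counts(bigrams)
p_cat_given_the = mle_prob("cat", ["the"], bigrams, histories)
```

In this toy corpus "the" occurs three times and is followed by "cat" twice, so the bigram estimate for "cat" given "the" is 2/3.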
Cite as: Stolcke, A. (2002) SRILM - an extensible language modeling toolkit. Proc. 7th International Conference on Spoken Language Processing (ICSLP 2002), 901-904, doi: 10.21437/ICSLP.2002-303
@inproceedings{stolcke02_icslp,
  author={Andreas Stolcke},
  title={{SRILM - an extensible language modeling toolkit}},
  year=2002,
  booktitle={Proc. 7th International Conference on Spoken Language Processing (ICSLP 2002)},
  pages={901--904},
  doi={10.21437/ICSLP.2002-303}
}