[SRILM User List] About the -prune option

Mon Oct 29 03:09:55 PDT 2012

Hi, I need to obtain a small LM for ASR decoding by pruning a large LM.
The original large LM contains about 1.6 billion n-grams, and the small
one should contain about 30 million n-grams. The -prune option in SRILM
can do this. However, I would like to know whether pruning once gives the
same result as pruning in several passes. For example, there are two ways
to carry out this pruning task.

1) Choose a suitable threshold and prune once to get the target LM:
     ngram -lm LM_Large -prune 1e-9 -order 5 -write-lm LM_Small

2) Choose several increasing thresholds and prune gradually to get the
   target LM:
     ngram -lm LM_Large -prune 1e-10 -order 5 -write-lm LM_Small1
     ... ...
     ngram -lm LM_Small1 -prune 1e-9 -order 5 -write-lm LM_Small

Are there any differences between the two approaches above? Does the
second method yield a pruned LM with lower perplexity?
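
One way to check this empirically would be to build both models and
compare their sizes and perplexities on held-out text. The following is
only a sketch under assumptions: heldout.txt, LM_OneShot, LM_Step1 and
LM_TwoStep are hypothetical file names, the gradual approach is reduced to
two passes for illustration, and count_ngrams is a helper written here
(not part of SRILM) that expects an uncompressed ARPA-format file.

```shell
#!/bin/sh
# Helper (not part of SRILM): total n-gram count taken from the ARPA
# header, i.e. the sum of the "ngram N=COUNT" lines that follow \data\.
# Assumes an uncompressed ARPA text file.
count_ngrams() {
    awk -F= '
        /^\\data\\/                { in_header = 1; next }
        in_header && /^ngram /     { total += $2 }
        in_header && /^\\1-grams:/ { exit }
        END                        { print total + 0 }
    ' "$1"
}

# Run the comparison only if SRILM and the (hypothetical) inputs exist.
if command -v ngram >/dev/null 2>&1 && [ -f LM_Large ] && [ -f heldout.txt ]; then
    # Approach 1: prune once with the final threshold.
    ngram -lm LM_Large -order 5 -prune 1e-9 -write-lm LM_OneShot

    # Approach 2: prune with a smaller threshold first, then the final one.
    ngram -lm LM_Large -order 5 -prune 1e-10 -write-lm LM_Step1
    ngram -lm LM_Step1 -order 5 -prune 1e-9 -write-lm LM_TwoStep

    # Report size and held-out perplexity for each result.
    for lm in LM_OneShot LM_TwoStep; do
        echo "$lm: $(count_ngrams "$lm") n-grams"
        ngram -lm "$lm" -order 5 -ppl heldout.txt
    done
fi
```

If the two models end up with different n-gram counts, the perplexities
are not directly comparable, so the thresholds may need adjusting until
both models are close to the 30 million target.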

Thanks!

Meng CHEN