[SRILM User List] Compacting language models
Luis Uebel
lfu20 at hotmail.com
Sun Feb 13 13:05:51 PST 2011
I am using SRILM to build some reverse language models, and they are quite big.
Stats:
training data: 1.1G words, 88M sentences
The vocabulary was limited to 39k words (wordlist.txt) by:
ngram-count -memuse -order 3 -interpolate -kndiscount -unk -vocab ../lang-data/wordlist.txt -limit-vocab -text ../lang-data/${training}-${reverse}.xml -lm ${training}-reverse-lm${trigram}
Are there other options to reduce the LM size (the trigrams are 1.7G) without losing too much performance?
Thanks,
Luis