[SRILM User List] Configuration for best language models

Luis Uebel lfu20 at hotmail.com
Tue Aug 9 14:59:59 PDT 2011


I am building trigram language models for use with HTK.
Which SRILM configuration produces the best language models?
My current configuration is:
$SRILM/ngram-count -memuse -order ${trigram} -interpolate -kndiscount -unk -vocab $wordlist -limit-vocab -text ${training} -lm ${train}-lm
${trigram}


The command line is shown above; I am using -kndiscount.
Is there a better discounting method, or are there other parameters, that would produce better language models with SRILM?
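
One way to approach this empirically is to train one model per discounting method and compare their perplexity on held-out text. The following is only a minimal sketch under assumptions: the file names (wordlist.txt, train.txt, heldout.txt) and the output names (lm-kndiscount.arpa and so on) are hypothetical, and the held-out file is assumed to be text that was not used for training. The script only prints the ngram-count and ngram commands so they can be inspected first; pipe its output to sh to actually run them.

```shell
#!/bin/sh
# Hypothetical file names; replace them with your own paths.
order=3
wordlist=wordlist.txt
training=train.txt
heldout=heldout.txt

# Print the ngram-count command for one discounting method.
# $1 is a discount flag such as -kndiscount, -ukndiscount or -wbdiscount.
lm_cmd() {
    printf '%s\n' "ngram-count -order $order -interpolate $1 -unk -vocab $wordlist -limit-vocab -text $training -lm lm$1.arpa"
}

# Print the ngram command that reports perplexity on the held-out text.
ppl_cmd() {
    printf '%s\n' "ngram -order $order -unk -lm lm$1.arpa -ppl $heldout"
}

# One train/evaluate pair per discounting method to compare.
for d in -kndiscount -ukndiscount -wbdiscount; do
    lm_cmd "$d"
    ppl_cmd "$d"
done
```

The model with the lowest held-out perplexity is usually the reasonable choice, although perplexity does not always track recognition accuracy.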

Training data:
Number of words (unique): 38k
Size: 93 MB
Number of lines: 550,656
Number of words (total): 17,166,049 (17M)

Thanks.


Luis


 		 	   		  

