command line for make-big-lm
Andreas Stolcke
stolcke at speech.sri.com
Sun Jun 1 19:01:45 PDT 2008
In message <E1A7F2EA-516F-4B03-AF0E-36B4EAC28BAF at gmail.com> you wrote:
> I'm studying training-scripts to estimate a big LM for modified Kneser-
> Ney. Will this do the job:
>
> make-big-lm -name my-kn-model -read my.counts.gz -max-per-file 10000000 -kndiscount 5
> -- is -kndiscount all that's needed to trigger KN estimation? And is the
> number the maximum order N, i.e. we don't need to repeat it from 1
> up to N, like -kndiscount 1, -kndiscount 2, ...?
Not quite: -kndiscount is a switch that takes no argument, and the
order is set separately with -order. Use
-kndiscount -order 5
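Putting the correction together with the options from the question, the full command line would look roughly like the sketch below. The file names and the -max-per-file value are the ones from the question; the ORDER, MAX_PER_FILE, CMD and RUN variables are only illustrative. The thread does not show an option for writing the resulting model, so none is added here. The command is only printed unless RUN=1 is set.

```shell
# Sketch of the corrected make-big-lm invocation discussed above.
# ORDER, MAX_PER_FILE, CMD and RUN are illustrative variable names,
# not part of SRILM.
ORDER=5                 # maximum N-gram order, passed via -order
MAX_PER_FILE=10000000   # value from the question (also the default)

# -kndiscount is given without a number; -order carries the N.
CMD="make-big-lm -name my-kn-model -read my.counts.gz -max-per-file $MAX_PER_FILE -kndiscount -order $ORDER"

# Show the command for inspection.
echo "$CMD"

# Execute only when explicitly requested, e.g. RUN=1 sh this-script.sh
if [ "${RUN:-0}" = "1" ]; then
    $CMD
fi
```

Keeping the order in one variable makes it clear that a single -kndiscount covers all orders up to N.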
> -- also, how do I estimate -max-per-file for 16 GB RAM and 5-grams?
It really depends on your data, so it's hard to predict.
10000000 is actually the default, and 16 GB is quite a bit of memory,
so you should have no problem.
Andreas