auxiliary scripts and make-big-lm

Andreas Stolcke stolcke at speech.sri.com
Sun Jun 1 19:07:15 PDT 2008


In message <3B9C2620-3BF0-4043-984C-2105FD5CE32D at gmail.com> you wrote:
> What are the typical situations in which the training scripts are
> useful?
> 
> E.g., there's get-gt-counts, which produces a few small files out of
> my huge 5-gram count file.  There are also make-gt-discounts,
> make-kn-counts, and make-kn-discounts.  Are these mostly called by
> make-big-lm, or do they have their own uses?  ngram-count also has the
> -kn set of options to read the counts -- when is it useful to save/read
> them with it?

The ngram-count -kn options are used to separate the discount estimation
process from the LM building proper.  They are used by make-big-lm to 
reduce the maximum amount of memory needed.
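
As a rough illustration of that separation, the sketch below builds (but does
not run) the two ngram-count command lines for a two-pass setup.  It assumes,
based on my reading of the ngram-count man page, that -knN FILE saves the
parameters when no -lm is given and reads them back when -lm is given; please
verify this against your SRILM version.  The file names and the helper
function names are hypothetical.  Note that make-big-lm performs the
estimation step with the helper scripts rather than exactly this way.

```python
import shlex
from typing import List


def kn_param_flags(order: int, prefix: str) -> List[str]:
    """Return -kn1 FILE ... -knN FILE, one parameter file per N-gram order.

    The PREFIX.knN naming is a hypothetical convention for this sketch.
    """
    flags: List[str] = []
    for n in range(1, order + 1):
        flags += [f"-kn{n}", f"{prefix}.kn{n}"]
    return flags


def estimate_discounts_cmd(counts: str, order: int, prefix: str) -> List[str]:
    """Pass 1: no -lm, so (by assumption) the KN parameters are computed and saved."""
    return (["ngram-count", "-order", str(order), "-read", counts, "-kndiscount"]
            + kn_param_flags(order, prefix))


def build_lm_cmd(counts: str, order: int, prefix: str, lm: str) -> List[str]:
    """Pass 2: -lm is given, so (by assumption) the saved parameters are read back."""
    return (["ngram-count", "-order", str(order), "-read", counts, "-kndiscount"]
            + kn_param_flags(order, prefix)
            + ["-lm", lm])


if __name__ == "__main__":
    # Print the commands instead of executing them.
    print(shlex.join(estimate_discounts_cmd("counts.gz", 5, "biglm")))
    print(shlex.join(build_lm_cmd("counts.gz", 5, "biglm", "big.lm")))
```

The point of the split is that the second pass does not have to re-estimate
the discounts from the counts it is reading.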

> 
> I'd really like to try a few huge models with make-big-lm.  Is it by
> itself sufficient for model estimation, calling the auxiliary scripts
> on its own?

Yes, so you don't really have to know what these scripts do.  Some of them
are useful by themselves, or they could be used as instructional tools.
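
For reference, a minimal sketch of a single make-big-lm invocation, again
only constructing the command line.  The -name, -read and -lm options follow
the synopsis in the training-scripts man page as I recall it; the remaining
options are passed through to ngram-count.  The file names and the function
name are hypothetical, so check the man page before relying on this.

```python
import shlex
from typing import List


def make_big_lm_cmd(counts: str, name: str, lm: str, order: int) -> List[str]:
    """Build a make-big-lm command line (not executed here).

    -name sets the prefix for the intermediate files the script writes;
    -order, -kndiscount and -interpolate are ngram-count options that
    make-big-lm is assumed to pass through.
    """
    return ["make-big-lm",
            "-name", name,
            "-read", counts,
            "-order", str(order),
            "-kndiscount", "-interpolate",
            "-lm", lm]


if __name__ == "__main__":
    print(shlex.join(make_big_lm_cmd("counts.gz", "biglm", "big.lm", 5)))
```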

Andreas 



