[SRILM User List] Why does 'ngram -factored' need the countfile

Gregor Donaj gregor.donaj at uni-mb.si
Thu Oct 11 07:59:23 PDT 2012


Hi,

I'm trying to rescore factored hypotheses using ngram with the 
-factored option. I noticed that the program requires the countfile 
named in the FLM definition file to be present, and that it also seems 
to load it into memory. The same happens with fngram. Why is this so?

To calculate probabilities and perplexities I only need the actual 
language model file, not the counts, so this is a bit annoying: my 
countfiles are sometimes larger than my RAM.

I kind of "solved" the problem by creating an empty countfile. I tested 
this on a small example and saw that it calculates the rescored 
probabilities fine. Is there any way to tell ngram not to look for the 
countfile? I guess that would be a better solution than just giving the 
program a dummy countfile that doesn't correspond to the language model 
file.
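
For reference, the workaround amounts to something like the sketch below. 
It is only an illustration: the helper name and the "dummy.count" path are 
made up, and the real path has to match whatever the FLM definition file 
names as the countfile. The helper refuses to touch a file that already 
exists, so it cannot wipe out real counts by accident.

```python
from pathlib import Path


def ensure_placeholder_countfile(path):
    """Create an empty file at `path` unless something already exists there.

    Returns True if a new empty file was created, False if the path
    already existed (in which case it is left untouched).
    """
    target = Path(path)
    if target.exists():
        # Never overwrite an existing (possibly real) countfile.
        return False
    # Create missing parent directories, then the empty file itself.
    target.parent.mkdir(parents=True, exist_ok=True)
    target.touch()
    return True


if __name__ == "__main__":
    # "dummy.count" is a hypothetical name; use the one from the FLM file.
    created = ensure_placeholder_countfile("dummy.count")
    print("created placeholder" if created else "file already present")
```

I only checked this approach on a small example, so I can't say whether 
an empty countfile is safe in every configuration.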

Thanks


-- 
Gregor Donaj, univ. dipl. inž. el., univ. dipl. mat.

Laboratorij za digitalno procesiranje signalov
Fakulteta za elektrotehniko, računalništvo in informatiko
Smetanova ulica 17, 2000 Maribor
Tel.: 02/220 72 05
E-mail: gregor.donaj at uni-mb.si

Digital Signal Processing Laboratory
Faculty of Electrical Engineering and Computer Science
Smetanova ulica 17, 2000 Maribor, Slovenia
Tel.: +386 2 220 72 05


