[SRILM User List] Why does 'ngram -factored' need the countfile
Gregor Donaj
gregor.donaj at uni-mb.si
Thu Oct 11 07:59:23 PDT 2012
Hi,
I'm trying to rescore factored hypotheses using ngram with the
-factored option. I noticed that the program requires the countfile
named in the FLM definition file to be present, and that it also seems
to load it into memory. The same happens with fngram. Why is this so?
Computing probabilities and perplexities should only need the actual
language model file, not the counts, so this is a bit annoying: my
countfiles are sometimes larger than my RAM.
I kind of "solved" the problem by creating an empty countfile. I tested
this on a small example and saw that it calculates the rescored
probabilities fine. Is there any way to tell ngram not to look for the
countfile? I guess that would be a better solution than just giving the
program a dummy countfile that doesn't correspond to the language model
file.
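
For reference, the empty-countfile workaround can be sketched roughly as below. This is only an illustration in Python, not part of SRILM: the helper name make_placeholder_count_file is made up, and the path you pass must be whatever countfile name your FLM definition file lists. Writing an empty gzip stream for names ending in .gz is my own assumption; I only tested the workaround with a plain empty file.

```python
import gzip
from pathlib import Path


def make_placeholder_count_file(path):
    """Create an empty count file at `path` unless a file already exists there.

    Hypothetical helper for illustration; `path` should be the countfile
    name given in the FLM definition file.
    """
    target = Path(path)
    if target.exists():
        # Never overwrite real counts that are already on disk.
        return target
    target.parent.mkdir(parents=True, exist_ok=True)
    if target.suffix == ".gz":
        # Assumption: a compressed name should hold a valid, empty gzip stream
        # rather than a zero-byte file.
        with gzip.open(target, "wb"):
            pass
    else:
        # Plain zero-byte file, as in the workaround described above.
        target.touch()
    return target
```

Calling make_placeholder_count_file("dummy.count") (a made-up name) before running ngram would put the dummy file in place. The helper deliberately leaves an existing file untouched, so it cannot wipe out real counts.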
Thanks
--
Gregor Donaj, univ. dipl. inž. el., univ. dipl. mat.
Laboratorij za digitalno procesiranje signalov
Fakulteta za elektrotehniko, računalništvo in informatiko
Smetanova ulica 17, 2000 Maribor
Tel.: 02/220 72 05
E-mail: gregor.donaj at uni-mb.si
Digital Signal Processing Laboratory
Faculty of Electrical Engineering and Computer Science
Smetanova ulica 17, 2000 Maribor, Slovenia
Tel.: +386 2 220 72 05