[SRILM User List] arpa header number of 4g too big for int

Juan Pino jmp84 at cam.ac.uk
Thu Sep 19 04:13:10 PDT 2013


Hello,

I am running this command with version 1.7.0 (the purpose is to fix the
format of my input lm):

srilm1.7.0/bin/i686-m64/ngram -debug 1 -order 4 -lm MY_LM_IN_ARPA_FORMAT \
    -write-lm MY_OUTPUT_LM

I get this error:

line 6: ngram number -1840328771 out of range

This is because I have this header in my input lm:
ngram 4=2454638525

So the number of 4-grams exceeds the maximum signed 32-bit int (2147483647).
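
For reference, the negative value in the error message is what falls out when 2454638525 is squeezed into a signed 32-bit integer. A minimal sketch of that arithmetic, assuming two's-complement wraparound (the helper name wrap_to_int32 is made up for illustration and is not part of SRILM):

```cpp
#include <cstdint>

// Hypothetical helper (not part of SRILM): reduce a value modulo 2^32 and
// map it into the signed 32-bit range, mimicking two's-complement wraparound.
std::int64_t wrap_to_int32(std::int64_t value) {
    const std::int64_t two32 = 4294967296LL;  // 2^32
    std::int64_t r = value % two32;
    if (r < 0) r += two32;                    // bring into [0, 2^32)
    if (r > INT32_MAX) r -= two32;            // upper half maps to negatives
    return r;
}
```

Calling wrap_to_int32(2454638525LL) gives -1840328771, the number reported at line 6 of the input.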

I've fixed it by replacing
int nNgrams;
by
long nNgrams;
at line 497 in lm/src/NgramLM.cc and by replacing
} else if (sscanf(line, "ngram %d=%d", &thisOrder, &nNgrams) == 2) {
by
} else if (sscanf(line, "ngram %d=%ld", &thisOrder, &nNgrams) == 2) {
at line 515 in lm/src/NgramLM.cc
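
One caveat with the change above: long is 64 bits on typical 64-bit Linux builds but only 32 bits on some platforms (for example 64-bit Windows), whereas long long is at least 64 bits everywhere. A standalone sketch of the same header parse using long long and %lld follows. This is not the actual SRILM code; the function name parse_ngram_header and its signature are assumptions made for illustration.

```cpp
#include <cstdio>

// Hypothetical stand-in for the header parsing in NgramLM.cc (not SRILM code).
// Parses a line of the form "ngram <order>=<count>" into a 64-bit count.
// Returns true only if both fields were read and the count is non-negative.
bool parse_ngram_header(const char *line, unsigned &order, long long &count) {
    unsigned parsedOrder = 0;
    long long parsedCount = 0;
    // %lld matches long long, which is guaranteed to hold at least 64 bits.
    if (std::sscanf(line, "ngram %u=%lld", &parsedOrder, &parsedCount) != 2) {
        return false;
    }
    if (parsedCount < 0) {
        return false;
    }
    order = parsedOrder;
    count = parsedCount;
    return true;
}
```

With the header line from my model, "ngram 4=2454638525", this yields order 4 and count 2454638525 without wrapping. Any code that later stores or compares the count would need a matching 64-bit type as well.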

Are there other places in the code that I should change? Is there a better
solution for my problem?

Thanks very much,

Juan