[SRILM User List] arpa header number of 4g too big for int

Juan Pino jmp84 at cam.ac.uk
Thu Sep 19 14:42:41 PDT 2013


Thanks very much, this works!
I have attached the patch against 1.7.0; it's almost the same.

Best,

Juan


On Thu, Sep 19, 2013 at 9:27 PM, Andreas Stolcke
<stolcke at icsi.berkeley.edu> wrote:

>  The attached patch should fix it.  Note this still doesn't support
> vocabularies larger than 2^32, but the number of higher-order ngrams can
> now be 2^64.
>
> Thanks for reporting this problem!
>
> Andreas
>
>
>
> On 9/19/2013 4:13 AM, Juan Pino wrote:
>
> Hello,
>
>  I am running this command with version 1.7.0 (the purpose is to fix the
> format of my input lm):
>
>  srilm1.7.0/bin/i686-m64/ngram -debug 1 -order 4 -lm MY_LM_IN_ARPA_FORMAT
> -write-lm MY_OUTPUT_LM
>
>  I get this error:
>
>  line 6: ngram number -1840328771 out of range
>
>  This is because I have this header in my input lm:
> ngram 4=2454638525
>
>  So the number of 4-grams exceeds the maximum signed 32-bit int
> (2^31 - 1 = 2147483647).
>
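For reference, the negative number in the error message is consistent with the header count wrapping around in a signed 32-bit int: 2454638525 - 2^32 = -1840328771. A minimal C++ sketch of that arithmetic follows; `wrappedInt32Value` is a hypothetical helper written for illustration, not part of SRILM, and it assumes two's-complement wraparound.

```cpp
#include <cstdint>

// Hypothetical helper (not SRILM code): the value a two's-complement
// 32-bit int would hold after `count` wraps around.
std::int64_t wrappedInt32Value(std::uint64_t count)
{
    const std::int64_t two32 = 4294967296LL;  // 2^32
    // Keep only the low 32 bits of the count.
    const std::int64_t low = static_cast<std::int64_t>(count % 4294967296ULL);
    // Values above INT32_MAX are read back as negative numbers.
    return (low > INT32_MAX) ? low - two32 : low;
}
```

Calling `wrappedInt32Value(2454638525ULL)` gives -1840328771, the value reported in the error above.
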
>  I've fixed it by replacing
> int nNgrams;
> by
> long nNgrams;
> at line 497 in lm/src/NgramLM.cc and by replacing
> } else if (sscanf(line, "ngram %d=%d", &thisOrder, &nNgrams) == 2) {
> by
> } else if (sscanf(line, "ngram %d=%ld", &thisOrder, &nNgrams) == 2) {
>  at line 515 in lm/src/NgramLM.cc
>
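One portability caveat with the `long` / `%ld` change: `long` is 64 bits on LP64 platforms such as 64-bit Linux, but only 32 bits on 64-bit Windows, so a type that is guaranteed to be at least 64 bits wide is safer. The sketch below parses the header line with `unsigned long long` and `%llu`; `parseNgramCountLine` is a hypothetical stand-alone helper, not the code in lm/src/NgramLM.cc and not necessarily what the attached patch does.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical helper (not SRILM code): parse an ARPA header line such as
// "ngram 4=2454638525". Returns true and fills `order` and `count` on
// success; leaves both untouched on failure.
bool parseNgramCountLine(const char *line, unsigned &order,
                         unsigned long long &count)
{
    assert(line != nullptr);

    unsigned parsedOrder = 0;
    unsigned long long parsedCount = 0;

    // %llu matches unsigned long long, which is at least 64 bits wide.
    if (std::sscanf(line, "ngram %u=%llu", &parsedOrder, &parsedCount) != 2) {
        return false;
    }
    order = parsedOrder;
    count = parsedCount;
    return true;
}
```

With the header from this thread, `parseNgramCountLine("ngram 4=2454638525", order, count)` returns true and stores 4 and 2454638525.
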
>  Are there other places in the code that I should change? Is there a
> better solution for my problem?
>
>  Thanks very much,
>
>  Juan
>
>
> _______________________________________________
> SRILM-User site list
> SRILM-User at speech.sri.com
> http://www.speech.sri.com/mailman/listinfo/srilm-user
>
>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ngramlm-64bit-1.7.0.patch
Type: application/octet-stream
Size: 3707 bytes
Desc: not available
URL: <http://www.speech.sri.com/pipermail/srilm-user/attachments/20130919/69f9724c/attachment.obj>