[SRILM User List] arpa header number of 4g too big for int
Juan Pino
jmp84 at cam.ac.uk
Thu Sep 19 14:42:41 PDT 2013
Thanks very much, this works!
I have attached the patch wrt 1.7.0, it's almost the same.
Best,
Juan
On Thu, Sep 19, 2013 at 9:27 PM, Andreas Stolcke
<stolcke at icsi.berkeley.edu> wrote:
> The attached patch should fix it. Note this still doesn't support
> vocabularies larger than 2^32, but the number of higher-order ngrams can
> now be 2^64.
>
> Thanks for reporting this problem!
>
> Andreas
>
>
>
> On 9/19/2013 4:13 AM, Juan Pino wrote:
>
> Hello,
>
> I am running this command with version 1.7.0 (the purpose is to fix the
> format of my input lm):
>
> srilm1.7.0/bin/i686-m64/ngram -debug 1 -order 4 -lm MY_LM_IN_ARPA_FORMAT
> -write-lm MY_OUTPUT_LM
>
> I get this error:
>
> line 6: ngram number -1840328771 out of range
>
> This is because I have this header in my input lm:
> ngram 4=2454638525
>
> So the number of 4grams is bigger than the maximum 32-bit int.
>
> I've fixed it by replacing
> int nNgrams;
> by
> long nNgrams;
> at line 497 in lm/src/NgramLM.cc and by replacing
> } else if (sscanf(line, "ngram %d=%d", &thisOrder, &nNgrams) == 2) {
> by
> } else if (sscanf(line, "ngram %d=%ld", &thisOrder, &nNgrams) == 2) {
> at line 515 in lm/src/NgramLM.cc
>
> Are there other places in the code that I should change? Is there a
> better solution for my problem?
>
> Thanks very much,
>
> Juan
>
>
>
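For reference, here is a minimal sketch of why the header count 2454638525 shows up as -1840328771, and of one way to read it into a 64-bit variable. It is not taken from the attached patch. The helper names parseNgramHeader and wrapToInt32 are hypothetical and do not exist in SRILM; wrapToInt32 only mimics keeping the low 32 bits of a value and reading them as a signed int. The sketch uses unsigned long long with %llu rather than long with %ld, on the assumption that the code may also be built on platforms where long is only 32 bits wide.

```cpp
#include <cstdio>

// Hypothetical helper (not part of SRILM): parse an ARPA header line such as
// "ngram 4=2454638525" into an n-gram order and a 64-bit count.
// Returns true only if both fields were read.
bool parseNgramHeader(const char *line, unsigned &order,
                      unsigned long long &count)
{
    return std::sscanf(line, "ngram %u=%llu", &order, &count) == 2;
}

// Hypothetical helper: the value obtained when only the low 32 bits of
// `count` are kept and interpreted as a signed two's-complement integer.
// Done with plain arithmetic to avoid implementation-defined conversions.
long long wrapToInt32(unsigned long long count)
{
    long long low = static_cast<long long>(count & 0xFFFFFFFFULL);
    return low >= 0x80000000LL ? low - 0x100000000LL : low;
}
```

With the header line from the report, parseNgramHeader yields order 4 and count 2454638525, and wrapToInt32(2454638525) gives 2454638525 - 4294967296 = -1840328771, the same number as in the "out of range" message.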
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ngramlm-64bit-1.7.0.patch
Type: application/octet-stream
Size: 3707 bytes
Desc: not available
URL: <http://www.speech.sri.com/pipermail/srilm-user/attachments/20130919/69f9724c/attachment.obj>