format error in kncounts.gz

Alexy Khrabrov deliverable at gmail.com
Thu Jun 5 14:51:58 PDT 2008


Hmm -- I might have omitted the -lm new-model switch, and as a result  
got only the kncounts.gz file.  How would I get a KN model  
out of it, and is that faster than rerunning make-big-lm from scratch?
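
For reference, a rough sketch of how I would check a file and then rerun.
The only facts it relies on are that an ARPA LM starts with a \data\
header line and the make-big-lm usage quoted below.  The file names
(model, counts.gz, model.lm.gz) are placeholders, the helper functions are
my own, and -order 5 -kndiscount is my assumption about the ngram-options
from my earlier command line.  Whether a rerun reuses the existing
intermediate files is not something I know.

```shell
#!/bin/sh
# Sketch only: helper names and file names are hypothetical.

# Succeed if the (possibly gzip-compressed) file looks like an ARPA LM,
# i.e. its first non-empty line is the literal header \data\ .
looks_like_arpa() {
    first=$(gzip -dcf "$1" 2>/dev/null | awk 'NF { print; exit }')
    [ "$first" = '\data\' ]
}

# Print (do not run) the make-big-lm invocation with the -lm switch present.
# Arguments: NAME COUNTS LM.  -order 5 and -kndiscount are assumed options.
rerun_command() {
    echo "make-big-lm -name $1 -read $2 -lm $3 -order 5 -kndiscount"
}
```

Usage would be along the lines of
  looks_like_arpa model.kncounts.gz || rerun_command model counts.gz model.lm.gz
which prints the command to run whenever the file is not an ARPA model.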

Cheers,
Alexy

On Jun 5, 2008, at 8:36 AM, ilya oparin wrote:

> You have probably passed the wrong parameters to make-big-lm or taken  
> the wrong output file.
>
> make-big-lm -name name -read counts -lm new-model [ -trust-totals ]  
> [-max-per-file M ] [ -ngram-filter filter ] [ ngram-options ... ]
>
> Could it be that you took the counts file (from the manual: "The -name  
> parameter is used to name various auxiliary files.  counts contains  
> the raw N-gram counts; it may be (and usually is) a compressed file.")  
> instead of the resulting LM file generated by the script (whose name  
> you put after the -lm option)? Basically, a counts file is used to  
> generate LMs that are subsequently read with "ngram -lm my_LM ...".  
> A counts file is not a language model on its own.
>
>
> best regards,
> Ilya
>
>
> --- On Thu, 5/6/08, Alexy Khrabrov <deliverable at gmail.com> wrote:
>
>> From: Alexy Khrabrov <deliverable at gmail.com>
>> Subject: Re: format error in kncounts.gz
>> To: ioparin at yahoo.co.uk
>> Cc: "srilm-user" <srilm-user at speech.sri.com>
>> Date: Thursday, 5 June, 2008, 6:46 PM
>> Hmm -- I've run make-big-lm, and got a few small files, a .kndir, and
>> that kncounts.gz -- which looks just like counts and is a few
>> gigabytes, so I thought that's my model.  I've posted my command line
>> earlier when figuring out exactly the way to get a Kneser-Ney
>> model...  The kncounts.gz looks just like a counts file.
>>
>> The counts I fed to make-big-lm with -read are the ones I got with
>> make/merge-batch-counts -order 5 for 5-grams.  Should I have done
>> anything extra before or after?
>>
>> Cheers,
>> Alexy
>>
>> On Jun 5, 2008, at 5:46 AM, ilya oparin wrote:
>>
>>> That usually means you're loading something other than an LM in the
>>> ARPA format. Have you visually checked your model.kncounts.gz?



