[SRILM User List] Right way to build LM
ismail.indonesia at gmail.com
Tue Apr 29 23:53:39 PDT 2014
Right, thanks Andreas.
It's getting clearer to me now.
On 04/30/2014 01:39 PM, Andreas Stolcke wrote:
> On 4/28/2014 7:38 PM, Ismail Rusli wrote:
>> Thanks for the answer, Andreas.
>> As I read in the paper by
>> Chen and Goodman (1999), they used held-out data
>> to optimize the parameters of the language model. How do I
>> do this in SRILM? Does SRILM optimize the parameters
>> when I use -kndiscount?
> SRILM just uses the closed-form formulas for estimating the discounts from the
> count-of-counts, i.e., equations (26) in the Chen & Goodman technical
> report.
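For reference, those closed-form discount estimates (Chen & Goodman's equations (26), computed from the count-of-counts n_r, the number of n-grams occurring exactly r times) are:

```latex
% Modified Kneser-Ney discounts estimated from count-of-counts n_r
Y      = \frac{n_1}{n_1 + 2 n_2}
D_1    = 1 - 2Y \, \frac{n_2}{n_1}
D_2    = 2 - 3Y \, \frac{n_3}{n_2}
D_{3+} = 3 - 4Y \, \frac{n_4}{n_3}
```

Each n-gram order gets its own set of D_1, D_2, D_{3+}, which is why ngram-count takes a separate -kn1/-kn2/-kn3 file per order.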
>> I tried -kn to save the
>> parameters in a file and included this file
>> when building the LM, but it turned out
>> my perplexity got bigger.
> You can save the discounting parameters using:
> 1) ngram-count -read COUNTS -kndiscount -kn1 K1 -kn2 K2 -kn3 K3
> (no -lm argument!)
> Then you can read them back in for LM estimation using
> 2) ngram-count -read COUNTS -kndiscount -kn1 K1 -kn2 K2 -kn3 K3 -lm LM
> and the result will be identical to running command 2 without the
> -kn1/2/3 options.
> Now, if you want you can manipulate the discounting parameters before
> invoking command 2.
> For example, you could perform a search over the D1, D2, D3 parameters
> optimizing perplexity on a held-out set, just like C&G did. But you
> have to implement that search yourself by writing some wrapper scripts.
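A minimal sketch of such a wrapper in Python, assuming ngram-count and ngram are on the PATH, COUNTS and heldout.txt exist, and a small illustrative grid of candidate discounts; the discount-file layout is an assumption, so compare it against a file actually written by command 1 on your system:

```python
# Hypothetical wrapper: grid-search the per-order KN discounts to
# minimize held-out perplexity, as suggested in the post above.
import itertools
import re
import subprocess

def parse_ppl(ngram_output: str) -> float:
    """Extract the 'ppl=' value from `ngram -ppl` output, which looks like:
    '0 zeroprobs, logprob= -2500.5 ppl= 123.45 ppl1= 150.2'"""
    m = re.search(r"\bppl=\s*([0-9.]+)", ngram_output)
    if m is None:
        raise ValueError("no ppl= field found in ngram output")
    return float(m.group(1))

def heldout_ppl(d1: float, d2: float, d3: float) -> float:
    """Build an LM with fixed discounts and score the held-out set."""
    for order, d in ((1, d1), (2, d2), (3, d3)):
        with open(f"kn{order}.params", "w") as f:
            # ASSUMED file layout; inspect a file produced by
            # `ngram-count -read COUNTS -kndiscount -kn1 ...` (no -lm)
            # and mirror it exactly.
            f.write(f"mincount 1\ndiscount {d}\n")
    subprocess.run(
        ["ngram-count", "-read", "COUNTS", "-kndiscount",
         "-kn1", "kn1.params", "-kn2", "kn2.params", "-kn3", "kn3.params",
         "-lm", "LM"],
        check=True)
    out = subprocess.run(
        ["ngram", "-lm", "LM", "-ppl", "heldout.txt"],
        check=True, capture_output=True, text=True)
    return parse_ppl(out.stdout)

def grid_search():
    """Return the (d1, d2, d3) triple with the lowest held-out perplexity."""
    grid = [0.3, 0.5, 0.7, 0.9]  # illustrative candidate values
    return min(itertools.product(grid, grid, grid),
               key=lambda ds: heldout_ppl(*ds))
```

A coordinate-wise search (optimizing one order's discount at a time) converges faster than the full product grid if the triple grid is too slow.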
> Also consider the interpolated version of KN smoothing. Just add the
> ngram-count -interpolate option; it usually gives slightly better results.
>> And just one more,
>> do you have a link to good tutorial in using
>> class-based models with SRILM?
> There is a basic tutorial at
> http://ssli.ee.washington.edu/ssli/people/sarahs/srilm.html .
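To complement the tutorial, here is a hedged sketch of the usual class-based workflow with SRILM's ngram-class and the replace-words-with-classes script; the file names and the class count of 100 are illustrative assumptions:

```shell
# 1) induce 100 word classes from the training text (illustrative count)
ngram-class -text train.txt -numclasses 100 -classes classes.txt

# 2) rewrite the training data with class tokens
replace-words-with-classes classes=classes.txt train.txt > train.classes

# 3) train an n-gram LM over the class tokens
ngram-count -text train.classes -kndiscount -interpolate -lm class.lm

# 4) evaluate, letting ngram expand classes back to word probabilities
ngram -lm class.lm -classes classes.txt -ppl heldout.txt
```

Class LMs are usually most useful interpolated with a word-based LM rather than on their own.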