[SRILM User List] If I use -kndiscount for order 2, do I get an incorrect unigram model?

Andreas Stolcke stolcke at icsi.berkeley.edu
Tue May 28 09:41:10 PDT 2013


On 5/28/2013 12:42 AM, 贺天行 wrote:
> Hello,
> When I use order 2 kndiscount, I get a unigram model and a bigram model.
> When I use order 1 kndiscount, I also get a unigram model.
> But these two unigram models are different. I read
> http://www.speech.sri.com/projects/srilm/manpages/ngram-discount.7.html
> and it seems that this has to do with an implementation issue. What I
> want to ask is: is the unigram I get from the order 2 kndiscount incorrect?
>
> If I use order 1 Katz discounting and order 2 Katz discounting, the
> two unigram models are the same, so I think I need to treat the kndiscount
> result with caution.

It is one of the distinguishing features of KN discounting that the 
lower-order (backoff) distributions are estimated differently from the 
highest-order distribution: they are based on the number of distinct 
contexts a word follows rather than on its raw frequency.
You are not supposed to use the unigram distribution of a KN-smoothed 
bigram model by itself.

So what you're seeing is completely expected and correct. For a 
detailed explanation, see the Chen and Goodman paper:
<http://www.speech.sri.com/projects/srilm/manpages/pdfs/chen-goodman-tr-10-98.pdf>

Andreas
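
To make the difference concrete, here is a minimal Python sketch of the idea, not SRILM's actual code. It compares a unigram distribution estimated from raw token counts (what an order 1 model starts from) with one estimated from continuation counts, i.e. the number of distinct words that precede each word (what the unigram level of a KN bigram model starts from). The function names and the toy corpus are made up for illustration, and discounting, smoothing of unseen words, and sentence-boundary handling are all left out.

```python
from collections import Counter

def unigram_from_raw_counts(tokens):
    """Relative-frequency unigram estimate: c(w) / total tokens."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def unigram_from_continuation_counts(tokens):
    """KN-style lower-order estimate before discounting:
    (number of distinct words preceding w) / (number of distinct bigram types)."""
    bigram_types = set(zip(tokens, tokens[1:]))
    # Each distinct bigram (v, w) contributes exactly one left context to w.
    left_contexts = Counter(w for _, w in bigram_types)
    total_types = len(bigram_types)
    return {w: c / total_types for w, c in left_contexts.items()}

# Toy corpus: "francisco" is frequent but only ever follows "san".
tokens = "san francisco , new york , san francisco , san diego , san francisco".split()

ml = unigram_from_raw_counts(tokens)
kn = unigram_from_continuation_counts(tokens)

for w in sorted(ml):
    print(f"{w:10s} raw={ml[w]:.3f} continuation={kn.get(w, 0.0):.3f}")
```

In this toy corpus "francisco" is three times as frequent as "york", yet both get the same continuation-based estimate because each follows only one distinct word. That is why the unigram section of an order 2 KN model differs from an order 1 model and only makes sense as a backoff distribution.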
