[SRILM User List] Why does -addsmooth still have discounting effects?
贺天行
cloudygooseg at gmail.com
Mon May 27 00:19:11 PDT 2013
The manual says:
-addsmooth D
    Smooth by adding D to each N-gram count. This is usually a poor
    smoothing method, included mainly for instructional purposes.
    p(a_z) = (c(a_z) + D) / (c(a_) + D n(*))
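To make the formula above concrete, here is a minimal Python sketch of it. The toy corpus, the function name add_smooth_prob, and the reading of n(*) as the vocabulary size are my own assumptions for illustration; this is not SRILM's code and not my test_htx.dat. It only shows that with D = 0 the formula reduces to the plain relative frequency c(a_z) / c(a_).

```python
from collections import Counter

# Hypothetical toy corpus, not the poster's test_htx.dat.
sentences = [["a", "b", "c"], ["a", "b", "d"]]

# c(a_z): trigram counts (sentence boundary markers left out for brevity).
trigram_counts = Counter()
for words in sentences:
    for i in range(len(words) - 2):
        trigram_counts[tuple(words[i:i + 3])] += 1

# c(a_): total count of trigrams that share the two-word context a_.
context_counts = Counter()
for ngram, count in trigram_counts.items():
    context_counts[ngram[:-1]] += count

# n(*): assumed here to be the vocabulary size.
vocab_size = len({w for words in sentences for w in words})


def add_smooth_prob(ngram, d):
    """p(a_z) = (c(a_z) + D) / (c(a_) + D * n(*)).

    Raises ZeroDivisionError when d == 0 and the context was never seen.
    """
    numerator = trigram_counts[ngram] + d
    denominator = context_counts[ngram[:-1]] + d * vocab_size
    return numerator / denominator


p_d0 = add_smooth_prob(("a", "b", "c"), 0)  # 1 / 2, the relative frequency
p_d1 = add_smooth_prob(("a", "b", "c"), 1)  # (1 + 1) / (2 + 4)
```

Read this way, D = 0 should leave every observed trigram with a nonzero probability, which is why the output below surprised me.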
My command is:
ngram-count -write allcnt -order 3 -debug 2 -text test_htx.dat -addsmooth 0 -lm lmtest
The debug output was:
test_htx.dat: line 3: 2 sentences, 6 words, 0 OOVs
0 zeroprobs, logprob= 0 ppl= 1 ppl1= 1
using AddSmooth for 1-grams
using AddSmooth for 2-grams
using AddSmooth for 3-grams
discarded 1 2-gram contexts containing pseudo-events
discarded 2 3-gram contexts containing pseudo-events
discarded 6 3-gram probs discounted to zero
writing 6 1-grams
writing 8 2-grams
writing 0 3-grams
So discounting is still being applied: 6 trigram probabilities were
discounted to zero and no 3-grams were written. Why does -addsmooth 0
still discount?
Thanks a lot!