Adding-One smoothing

Luis Fernando D'Haro lfdharo at die.upm.es
Thu Oct 18 10:27:37 PDT 2007


 
> Add-delta smoothing is implemented in the latest version of SRILM.
> Try downloading the 1.5.4 (beta) version.  The options are 
> 
> 	-addsmooth d
> 	-addsmooth1 d
> 	-addsmooth2 d
> 	etc.
> 
> where d is the constant to add to each count.

Thank you, Professor, for this new release and for your quick answer. I will test it.
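For my own reference, this is what I understand additive (add-delta) smoothing to compute for a single n-gram order; the snippet below is only my sketch with made-up names, not SRILM's actual implementation:

    def add_delta_prob(count_hw, count_h, vocab_size, delta=1.0):
        # Additive smoothing: P(w | h) = (c(h, w) + d) / (c(h) + d * V)
        # delta = 1 gives add-one (Laplace); smaller deltas are usually
        # better for sparse n-gram counts.
        return (count_hw + delta) / (count_h + delta * vocab_size)

    # Example: c(h, w) = 3, c(h) = 100, V = 1002, d = 0.5
    # -> (3 + 0.5) / (100 + 0.5 * 1002) = 3.5 / 601.0 ~ 0.0058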

> I'm not sure exactly what method you are asking about, but deleted
> interpolation is implemented as the smoothing method used by the
> ngram-count -count-lm option.  ngram -count-lm is used to evaluate such
> an LM.  

Currently, the software (SW) we have implements something like this:

P(w|h) = lambda_trig * P_3(w|h)
         + (1 - lambda_trig) * [lambda_big * P_2(w|h)
         + (1 - lambda_big) * [lambda_unig * P_1(w)
         + (1 - lambda_unig) * P_zerogram]]

In all cases, the component probabilities are calculated using the add-delta smoothing technique.
 
It is important to mention that in this equation there are single global lambda_trig, lambda_big, and lambda_unig values (i.e. this is like having just one bin, not as proposed by Jelinek, where there is a different lambda for different bins).
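To make the scheme above concrete, here is a rough Python sketch of it (my own illustration, not our actual code; the lambda and delta values and the count-dictionary layout are made up for the example):

    def add_delta(numer_count, denom_count, vocab_size, delta=0.5):
        # Add-delta estimate for one component distribution.
        return (numer_count + delta) / (denom_count + delta * vocab_size)

    def interpolated_prob(w, h, counts3, counts2, counts1, vocab_size,
                          lam_tri=0.6, lam_bi=0.7, lam_uni=0.8, delta=0.5):
        # h = (w_{i-2}, w_{i-1}); countsN maps N-gram tuples to counts.
        p3 = add_delta(counts3.get((h[0], h[1], w), 0),
                       counts2.get((h[0], h[1]), 0), vocab_size, delta)
        p2 = add_delta(counts2.get((h[1], w), 0),
                       counts1.get((h[1],), 0), vocab_size, delta)
        p1 = add_delta(counts1.get((w,), 0),
                       sum(counts1.values()), vocab_size, delta)
        p0 = 1.0 / vocab_size  # zerogram (uniform) probability
        # Single global lambdas, i.e. one bin for all histories.
        return (lam_tri * p3
                + (1 - lam_tri) * (lam_bi * p2
                + (1 - lam_bi) * (lam_uni * p1
                + (1 - lam_uni) * p0)))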

Previously, I had tried -count-lm with the following configuration file:

order 3 
vocabsize 1002 
totalcount 74883 
mixweights 0 
0.5 0.5 0.5 
countmodulus 1 
counts train.counts

and after applying the EM algorithm I obtained the following values:

order 3
mixweights 0
 0.932452 0.894774 0.994639
countmodulus 1
vocabsize 1002
totalcount 74883
counts train.counts

but my perplexity (PPL) results were not as good as those obtained with the SW we have.
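(By PPL I mean the usual test-set perplexity; a minimal sketch of how I compute it, assuming natural-log word probabilities:)

    import math

    def perplexity(word_log_probs):
        # PPL = exp( -(1/N) * sum of log P(w_i | h_i) ) over the N test words.
        return math.exp(-sum(word_log_probs) / len(word_log_probs))

    # e.g. perplexity([math.log(p) for p in word_probs])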
 
Is there something wrong with the configuration file, or is the problem related to using Good-Turing instead of add-delta?

Thanks in advance,


Luis Fernando


