Adding-One smoothing

Thu Oct 18 09:35:01 PDT 2007

In message <20071017085610.GC26232 at die.upm.es> you wrote:
> Hello everyone:
> 
> I just want to ask whether the SRILM toolkit allows the creation of an LM
> using Lidstone's smoothing technique (i.e. adding-one or adding-delta). I
> want to compare the results obtained with a proprietary SW that works with
> this smoothing and the SRILM. I know that this technique is not the best one,
> but unfortunately we have a small corpus (around 5K sentences) and, at the
> moment, the performance of the other techniques has not been really good when
> compared with Lidstone's (at least using this SW).

Add-delta smoothing is implemented in the latest version of SRILM.
Try downloading the 1.5.4 (beta) version.  The ngram-count options are 

	-addsmooth d
	-addsmooth1 d
	-addsmooth2 d
	etc.

where d is the constant to add to each count.
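
For readers who want to see what adding a constant to each count does to
the estimates, here is a minimal Python sketch of add-delta (Lidstone)
smoothing for bigrams. It is not SRILM code and makes no claim about how
SRILM implements the options above; the function name add_delta_bigram and
the choice of vocabulary (every word seen in training plus </s>) are
assumptions made for illustration only.

```python
from collections import Counter

def add_delta_bigram(sentences, delta=1.0):
    """Build add-delta smoothed bigram estimates from whitespace-tokenized text.

    Illustrative sketch only: P(w | h) = (c(h, w) + delta) / (c(h) + delta * V),
    where V is the number of distinct predictable words (here: all words seen
    in training plus the end-of-sentence tag).
    """
    bigrams = Counter()   # c(h, w)
    history = Counter()   # c(h), counted as a history
    vocab = set()
    for sent in sentences:
        tokens = ["<s>"] + sent.split() + ["</s>"]
        vocab.update(tokens[1:])          # <s> is never predicted
        for h, w in zip(tokens, tokens[1:]):
            bigrams[(h, w)] += 1
            history[h] += 1
    size = len(vocab)

    def prob(w, h):
        # Unseen bigrams and unseen histories get nonzero mass from delta.
        return (bigrams[(h, w)] + delta) / (history[h] + delta * size)

    return prob, vocab
```

With delta = 1 this is add-one smoothing; smaller values move less
probability mass to unseen events, which tends to matter on a small corpus.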

> 
> BTW: In our SW we use deleted interpolation; I know that SRILM only accepts
> backoff models. In a previous email on the users' list, I saw an explanation
> of how to use it, but it was not totally clear to me. Could you (Prof.
> Stolcke) expand a little more on the example you wrote? Or could anyone with
> experience with it explain it to me again?

I'm not sure exactly what method you are asking about, but deleted
interpolation is implemented as the smoothing method used by the
ngram-count -count-lm option.  ngram -count-lm is used to evaluate such
an LM.  Read the ngram man page to find a description of the file format.
You prepare a descriptor file for -count-lm, estimate the interpolation
weights with ngram-count, and then give the resulting file to ngram.
An example of all this is in $SRILM/test/tests/ngram-count-lm/run-test .
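
To make the idea concrete, here is a rough Python sketch of deleted
interpolation for a bigram model: maximum-likelihood bigram, unigram and
uniform estimates are mixed with weights fitted by EM on held-out data.
This is only an illustration under simplifying assumptions (a single set
of weights, whitespace tokenization, unseen histories falling back to the
unigram estimate); it is not the SRILM count-LM code or file format, and
all function names are made up for the example.

```python
from collections import Counter

def train(sentences):
    """Collect unigram, bigram and history counts from training sentences."""
    uni, bi, hist = Counter(), Counter(), Counter()
    for sent in sentences:
        tokens = ["<s>"] + sent.split() + ["</s>"]
        for h, w in zip(tokens, tokens[1:]):
            uni[w] += 1
            bi[(h, w)] += 1
            hist[h] += 1
    return uni, bi, hist

def components(h, w, uni, bi, hist):
    """ML bigram, ML unigram and uniform estimates of P(w | h)."""
    p_uni = uni[w] / sum(uni.values())
    # Assumption: an unseen history falls back to the unigram estimate.
    p_bi = bi[(h, w)] / hist[h] if hist[h] else p_uni
    return [p_bi, p_uni, 1.0 / len(uni)]

def estimate_lambdas(heldout, uni, bi, hist, iters=20):
    """Fit the three mixture weights by EM on held-out sentences."""
    events = []
    for sent in heldout:
        tokens = ["<s>"] + sent.split() + ["</s>"]
        events.extend(zip(tokens, tokens[1:]))
    lambdas = [1.0 / 3] * 3
    for _ in range(iters):
        expected = [0.0] * 3
        for h, w in events:
            parts = [l * p for l, p in zip(lambdas, components(h, w, uni, bi, hist))]
            total = sum(parts)            # > 0 because the uniform term is > 0
            for i, part in enumerate(parts):
                expected[i] += part / total
        norm = sum(expected)
        lambdas = [e / norm for e in expected]
    return lambdas

def interp_prob(h, w, lambdas, uni, bi, hist):
    """Interpolated estimate of P(w | h)."""
    return sum(l * p for l, p in zip(lambdas, components(h, w, uni, bi, hist)))
```

The weights are fitted on data that was not used for the counts, which is
what keeps them from collapsing onto the highest-order estimate.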

Andreas

> 
> Thanks in advance.
> 
> Sincerely,
> 
> 
> Luis Fernando D'Haro