Adding-One smoothing
Andreas Stolcke
stolcke at speech.sri.com
Thu Oct 18 09:35:01 PDT 2007
In message <20071017085610.GC26232 at die.upm.es> you wrote:
> Hello everyone:
>
> I just want to ask if the SRILM toolkit allows the creation of an LM using
> Lidstone's smoothing technique (i.e. adding-one or adding-delta). I want to
> compare the results obtained with a proprietary SW that works with this
> smoothing and the SRILM. I know that this technique is not the best one, but
> unfortunately we have a small corpus (around 5K sentences) and, at the moment,
> the performance of the other techniques has not been really good when compared
> with Lidstone's (at least using this SW).
Add-delta smoothing is implemented in the latest version of SRILM.
Try downloading the 1.5.4 (beta) version. The options are
-addsmooth d
-addsmooth1 d
-addsmooth2 d
etc.
where d is the constant to add to each count.
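
For reference, the add-delta (Lidstone) estimate itself can be sketched in a few lines of Python. This is only an illustration of the formula P(w|h) = (c(h,w) + d) / (c(h) + d*V) for bigrams, with V the vocabulary size; the function name and the toy data are made up for the example, and it is not meant to reproduce exactly how SRILM handles sentence boundaries or unknown words.

```python
from collections import Counter

def add_delta_bigram_model(tokens, delta=1.0):
    """Return (prob, vocab) for an add-delta smoothed bigram model.

    Illustrative helper only, not part of SRILM.
    """
    vocab = sorted(set(tokens))
    vocab_size = len(vocab)
    # c(h, w): how often word w follows history h
    bigram_counts = Counter(zip(tokens, tokens[1:]))
    # c(h): how often h occurs as a history (every token but the last)
    history_counts = Counter(tokens[:-1])

    def prob(word, history):
        # Add delta to every bigram count and delta * V to the history count
        numerator = bigram_counts[(history, word)] + delta
        denominator = history_counts[history] + delta * vocab_size
        return numerator / denominator

    return prob, vocab

# Toy corpus, assumed purely for illustration
tokens = "a b a b a c".split()
prob, vocab = add_delta_bigram_model(tokens, delta=1.0)
p_b_given_a = prob("b", "a")
total_given_a = sum(prob(w, "a") for w in vocab)
```

With delta = 1 this is add-one smoothing. A history that was never observed gets a uniform distribution over the vocabulary, which is one reason the method tends to behave poorly for higher-order N-grams.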
>
> BTW: In our SW we use deleted interpolation; I know that SRILM only accepts
> backoff models. In a previous email on the user's list, I saw an explanation
> about how to use it, but it was not totally clear to me. Could you (Prof.
> Stolcke) expand a little more on the example you wrote? Or could anyone with
> experience with that explain it to me again?
I'm not sure exactly what method you are asking about, but deleted
interpolation is implemented as the smoothing method used by the
ngram-count -count-lm option. ngram -count-lm is used to evaluate such
an LM. Read the ngram man page to find a description of the file format.
You prepare a descriptor file for -count-lm, estimate the interpolation
weights with ngram-count, and then give the resulting file to ngram.
An example of all this is in $SRILM/test/tests/ngram-count-lm/run-test .
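
To make the idea concrete, here is a much simplified Python sketch of interpolation with a weight estimated on held-out data. It assumes a single mixture weight between maximum-likelihood bigram and unigram estimates, fitted by EM; the function names and toy data are invented for the example, and it does not reproduce the -count-lm implementation or its file format.

```python
from collections import Counter

def train_counts(tokens):
    """Collect unigram, bigram and history counts from training tokens."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    histories = Counter(tokens[:-1])
    return unigrams, bigrams, histories, len(tokens)

def ml_probs(word, history, counts):
    """Maximum-likelihood unigram and bigram estimates (no smoothing)."""
    unigrams, bigrams, histories, total = counts
    p_uni = unigrams[word] / total
    if histories[history]:
        p_bi = bigrams[(history, word)] / histories[history]
    else:
        p_bi = 0.0
    return p_uni, p_bi

def estimate_lambda(heldout, counts, iterations=20):
    """Fit one interpolation weight by EM on held-out bigrams."""
    lam = 0.5
    pairs = list(zip(heldout, heldout[1:]))
    for _ in range(iterations):
        posterior_sum = 0.0
        used = 0
        for history, word in pairs:
            p_uni, p_bi = ml_probs(word, history, counts)
            mix = lam * p_bi + (1.0 - lam) * p_uni
            if mix > 0.0:
                # E-step: share of the mixture explained by the bigram term
                posterior_sum += lam * p_bi / mix
                used += 1
        if used:
            # M-step: new weight is the average posterior
            lam = posterior_sum / used
    return lam

def interp_prob(word, history, counts, lam):
    """Interpolated probability lam * P_bigram + (1 - lam) * P_unigram."""
    p_uni, p_bi = ml_probs(word, history, counts)
    return lam * p_bi + (1.0 - lam) * p_uni

# Toy data, assumed purely for illustration
train = "a b a b a c a b".split()
heldout = "a b a c b a".split()
counts = train_counts(train)
lam = estimate_lambda(heldout, counts)
total_given_a = sum(interp_prob(w, "a", counts, lam) for w in set(train))
```

The point of fitting the weight on data not used for the counts is that the maximum-likelihood bigram estimates would otherwise always look best, and the weight would be pushed toward the highest-order model.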
Andreas
>
> Thanks in advance.
>
> Sincerely,
>
>
> Luis Fernando D'Haro