[SRILM User List] Interpolation of Unigrams

Sat Dec 15 23:34:37 PST 2012

On 12/15/2012 6:48 AM, Mohammed Mediani wrote:
> Hi,
> Are the unigrams always interpolated with 0-gram (probability of any 
> word from the vocab)?
> I got the same probabilities for unigrams with and without 
> -interpolate (both with -kndiscount). Is it meant to be this way?
> Many thanks for your help.
> Mohammed
The KN discounting strategy for unigrams only interpolates with the 
zero-gram (uniform) estimate if the -interpolate flag is given.
This is just a special case of the interpolation happening at all 
N-gram levels.

However, there is an independent step whereby unallocated unigram 
probability mass is filled in by adding a uniform probability increment 
to all words in the vocabulary. When this happens you see a message like:

warning: distributing 0.0659302 left-over probability mass over all 26573 words

This happens for unigrams only, and regardless of what discounting 
method is in effect, because otherwise that probability mass would be 
"lost" and the model would be deficient.
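
As a rough illustration of that fill-in step (a simplified sketch, not 
SRILM's actual code; the function name and the toy probabilities are 
made up for the example), the left-over mass is split evenly and added 
to every word in the vocabulary:

```python
def distribute_leftover(unigram_probs):
    """Add an equal share of the unallocated mass to every vocabulary word.

    `unigram_probs` maps each word in the vocabulary to its (possibly
    deficient) unigram probability. Hypothetical helper for illustration.
    """
    leftover = 1.0 - sum(unigram_probs.values())
    if leftover <= 0.0:
        # Nothing to redistribute: the distribution already sums to 1.
        return dict(unigram_probs)
    increment = leftover / len(unigram_probs)
    return {word: p + increment for word, p in unigram_probs.items()}


# Toy deficient distribution: 0.1 of the mass is unallocated.
probs = {"a": 0.5, "b": 0.3, "c": 0.1, "d": 0.0}
filled = distribute_leftover(probs)
```

In this toy case each of the four words receives 0.1 / 4 of extra 
probability, so the result sums to 1 again.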

It so happens that the effect of both strategies is the same when it 
comes to unigrams, and that explains your observation.
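
To see why the two coincide, here is a small sketch under simplifying 
assumptions: a single absolute discount, every observed count at least 
as large as the discount, and made-up counts and function names. It is 
not how SRILM is implemented. With interpolation, the mass freed by 
discounting is multiplied by the uniform estimate 1/V; without it, the 
same freed mass is left over and then spread uniformly over the V 
vocabulary words.

```python
import math


def interpolated_unigrams(counts, vocab, discount):
    """Discounted unigram estimate interpolated with the uniform zero-gram."""
    total = sum(counts.values())
    # Mass freed by subtracting the discount once per observed word type.
    freed = discount * len(counts) / total
    uniform = 1.0 / len(vocab)
    return {w: max(counts.get(w, 0) - discount, 0.0) / total + freed * uniform
            for w in vocab}


def discounted_then_filled(counts, vocab, discount):
    """Discounted unigram estimate, then uniform fill-in of left-over mass."""
    total = sum(counts.values())
    probs = {w: max(counts.get(w, 0) - discount, 0.0) / total for w in vocab}
    leftover = 1.0 - sum(probs.values())
    increment = leftover / len(vocab)
    return {w: p + increment for w, p in probs.items()}


counts = {"a": 5, "b": 3, "c": 2}   # toy counts, assumed for illustration
vocab = ["a", "b", "c", "d"]        # "d" is in the vocabulary but unseen
with_interp = interpolated_unigrams(counts, vocab, discount=0.75)
without_interp = discounted_then_filled(counts, vocab, discount=0.75)
same = all(math.isclose(with_interp[w], without_interp[w]) for w in vocab)
```

Both routes add the same uniform increment (freed mass divided by V) to 
every word, which is why the unigram probabilities match.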

Andreas