[SRILM User List] Adding n-grams to an existing LM

Joris Pelemans Joris.Pelemans at esat.kuleuven.be
Sat Nov 2 06:16:16 PDT 2013


On 11/02/13 02:07, Andreas Stolcke wrote:
> On 11/2/2013 8:00 AM, Joris Pelemans wrote:
>> but I get a lot of errors of the type "BOW numerator for context is 
>> ... < 0" and "BOW denominator for context is ... <= 0".
>
> The BOW for a given context is computed as (1 - sum of all 
> higher-order probabilities in that context) divided by (1 - sum of 
> the lower-order (backoff) probabilities for those same ngrams).  So, 
> if you're adding ngrams to a context, those sums can exceed 1, and 
> you end up with negative numerators and/or denominators.
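
A minimal Python sketch of the computation described above may make the failure mode concrete. The function name, the dictionaries, and the probabilities are all made up for illustration; this is not SRILM's code, only the arithmetic as stated in the quoted paragraph.

```python
def backoff_weight(higher_probs, lower_probs):
    """Return (numerator, denominator, bow) for a single context.

    higher_probs: word -> p(word | full context), explicit n-grams only
    lower_probs:  word -> p(word | shortened context), the backoff estimates
    Both dictionaries are hypothetical stand-ins for what is read from the LM.
    """
    numerator = 1.0 - sum(higher_probs.values())
    # Only the words that have an explicit higher-order n-gram are summed here.
    denominator = 1.0 - sum(lower_probs[w] for w in higher_probs)
    # The weight is only meaningful when both parts are valid.
    bow = numerator / denominator if numerator >= 0 and denominator > 0 else None
    return numerator, denominator, bow


lower = {"c": 0.25, "e": 0.25, "d": 0.125}

# Original context: the explicit probabilities sum to 0.75, so all is well.
print(backoff_weight({"c": 0.5, "e": 0.25}, lower))

# Synonym "d" added with a full copy of p(c | context): the sum is now 1.25.
print(backoff_weight({"c": 0.5, "e": 0.25, "d": 0.5}, lower))
```

In the second call the numerator is negative, which is the situation the "BOW numerator ... < 0" message reports.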
I can see how that happens for the numerators, but aren't the backoff 
weights recomputed, and shouldn't that prevent the denominators from 
ending up negative? What if I remove all the backoff weights and then 
renormalize? I'm just asking out of interest; I got rid of all the 
denominator complaints (see below).
>> What do these errors mean, can I ignore them or is there a better way 
>> to renormalize my new LMs?
>
> I think you should split the existing ngram probabilities among all 
> the synonyms, when the synonym occurs in the final position of the 
> ngram.  That would not add anything to the sums of probabilities 
> involved in the BOW computation.
That did take care of most of the errors. Only a handful of numerator 
complaints are left, but I guess that might be due to bad scripting on 
my part. I find it strange, though, that the complaints I get concern 
n-grams that aren't in the LM at all. The following is the first 
complaint that I get:

BOW numerator for context "negentig Hills" is -0.0120325 < 0

But if I grep the LM (before and after renormalization) for "negentig 
Hills", it gives me nothing. If there are no 3-grams with this context, 
how can 1 - (sum of all higher-order probabilities with this context) be 
negative?

> For example, if you have p(c | a b) = x and c and d are synonyms, you set
>
> p(c | a b) = x/2
> p(d | a b) = x/2
OK, that makes sense. And just to be complete (in case others might want 
to know), if I want to map d onto c with a certainty of say 0.1, then I 
just do:

p(c | a b) = 0.9*x
p(d | a b) = 0.1*x
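
Since ARPA-format LMs store log10 probabilities, the split has to be done in the probability domain and converted back. A small Python sketch of that step, assuming a hypothetical helper name and a made-up input value (nothing here is an SRILM function):

```python
import math


def split_ngram(logprob, weight_new):
    """Split one n-gram's log10 probability between a word and its synonym.

    logprob:    log10 p(c | a b) as stored in the ARPA file
    weight_new: share of the mass given to the new word d (0.1 above)
    Returns (log10 p(c | a b), log10 p(d | a b)) after the split.
    """
    if not 0.0 < weight_new < 1.0:
        raise ValueError("weight_new must be strictly between 0 and 1")
    x = 10.0 ** logprob
    keep = math.log10((1.0 - weight_new) * x)
    new = math.log10(weight_new * x)
    return keep, new


# Hypothetical entry: log10 p(c | a b) = -1.0, i.e. x = 0.1
log_c, log_d = split_ngram(-1.0, 0.1)
print(log_c, log_d)
```

The two resulting probabilities add up to the original x, so the sum of higher-order probabilities in the context "a b" is unchanged.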

> If, however, the synonyms occur in the context portion of the ngram, 
> you can just copy the parameter (as you have been doing).
>
> p(e | a c) = p(e | a d)

And this stays the same for the 0.1 example?
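
For the context case, a rough Python sketch of the copying step could look like the following. The in-memory representation (a dict from word tuples to log10 probabilities) and the function name are assumptions made for illustration, and the sketch simply replaces every occurrence of the old word in the context.

```python
def copy_context_ngrams(ngrams, old, new):
    """Add copies of n-grams whose context contains `old`, using `new` instead.

    ngrams: dict mapping a tuple of words (context..., predicted word)
            to a log10 probability; a hypothetical in-memory view of the LM.
    N-grams where `old` is only the final (predicted) word are not copied;
    those need the probability split discussed earlier.
    """
    result = dict(ngrams)
    for words, logprob in ngrams.items():
        context, last = words[:-1], words[-1]
        if old in context:
            new_context = tuple(new if w == old else w for w in context)
            # Keep an entry that already exists rather than overwriting it.
            result.setdefault(new_context + (last,), logprob)
    return result


lm = {("a", "c", "e"): -0.5, ("a", "b", "c"): -1.0}
print(copy_context_ngrams(lm, "c", "d"))
```

Here p(e | a d) receives the same value as p(e | a c), while ("a", "b", "c") is left alone because c is in the final position.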

Thanks already!

Joris

