[SRILM User List] Backoff question

Mon May 30 16:56:34 PDT 2011

In message <4DAF2364.6060906 at tmit.bme.hu> you wrote:
> 
> Hi Andreas,
> 
> I have a question about backoff weights in SRILM.
> I know they are weights, not probabilities,
> but sometimes they become extremely large (e.g., log(BO)=6), and the 
> converted WFST language model then behaves in an unusual way.
> 
> I've made a dummy corpus to demonstrate my problem.
> The corpus is in text_ab_1000.txt, the resulting counts and ARPA LM are in 
> text_ab_1000.out and text_ab_1000.out.arpa,
> and the problematic part of the resulting WFST is in text_ab_1000.jpg.
> 
> I have only two symbols, "a" and "b". Since both the (b|aaa) and (a|aaa) 
> 4-grams exist, the (aaa) backoff weight should be unnecessary,
> but when I build the WFST, there is a backoff link from (a_a_a) to (a_a).
> As a result, I can get from (a_a_a) to (a_a_b) in two ways: b+eps or 
> BO+b+eps.
> The first route has the weight -2.10037 = log(p(b|aaa)).
> The second route has the weight 3.66358+(-2.1038) = log(BO(aaa)*p(b|aa)) 
> = 1.56, which is abnormally high.
> 
> My question is whether the backoff weight should have a lower value, or 
> whether the WFST network is built incorrectly.
> 
> Thanks for your advice,
> Tibor Fegyó
> 
> 

Tibor,

there was a problem in the BOW computation when the probabilities from the lower-order distribution
add up to almost 1.  This makes the BOW denominator close to zero, which produces anomalous BOW
values.
You can apply the appended patch.  It also enables debugging output for the BOW computation
with -debug 3 or higher.
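
To illustrate the effect, here is a minimal standalone sketch, not the SRILM code itself.  It assumes
the usual backoff formulation, where the numerator is the probability mass left over by the explicit
N-grams in a context and the denominator is the mass left over by the same words in the lower-order
distribution.  The function name logBackoffWeight, the constant kProbEpsilon and its value, and the
guardNearZero switch are all made up for this example.

```cpp
#include <cmath>
#include <numeric>
#include <vector>

// Assumed tolerance for "almost zero"; a stand-in for SRILM's Prob_Epsilon,
// whose actual value is not shown in this thread.
constexpr double kProbEpsilon = 3e-6;

// Returns log10 of the backoff weight for one context.
//   higher: p(w | context) for every word w with an explicit N-gram
//   lower:  p(w | shortened context) for the same words
// With guardNearZero set, a near-zero denominator is treated like an exact
// zero, in the spirit of the patched test: the BOW becomes 1 (log10 = 0).
// The rescaling of the explicit probabilities that the real code performs
// in that case is omitted here, as is handling of non-positive values.
double logBackoffWeight(const std::vector<double>& higher,
                        const std::vector<double>& lower,
                        bool guardNearZero)
{
    const double numerator =
        1.0 - std::accumulate(higher.begin(), higher.end(), 0.0);
    const double denominator =
        1.0 - std::accumulate(lower.begin(), lower.end(), 0.0);

    if (guardNearZero &&
        denominator < kProbEpsilon && numerator > kProbEpsilon) {
        return 0.0;  // no usable backoff mass left: BOW = 1
    }
    return std::log10(numerator) - std::log10(denominator);
}
```

With made-up inputs such as higher = {0.992, 0.0079} and lower = {0.992, 0.00799999}, the
denominator is about 1e-8, so the unguarded call returns a log BOW of roughly 4, while the guarded
call returns 0.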

See if this fixes your problem.

Andreas

*** lm/src/NgramLM.cc	28 Sep 2010 20:17:24 -0000	1.121
--- lm/src/NgramLM.cc	30 May 2011 23:46:38 -0000	1.122
***************
*** 2039,2045 ****
  	denominator = 0.0;
      }

!     if (denominator == 0.0 && numerator > Prob_Epsilon) {
  	/* 
  	 * Backoff distribution has no probability left.  To avoid wasting
  	 * probability mass scale the N-gram probabilities to sum to 1.
--- 2039,2045 ----
  	denominator = 0.0;
      }

!     if (denominator < Prob_Epsilon && numerator > Prob_Epsilon) {
  	/* 
  	 * Backoff distribution has no probability left.  To avoid wasting
  	 * probability mass scale the N-gram probabilities to sum to 1.
***************
*** 2055,2060 ****
--- 2055,2061 ----
  	    *prob += scale;
  	}

+ 	denominator = 0.0;
  	numerator = 0.0;
  	return true;
      } else if (numerator < 0.0) {
***************
*** 2118,2124 ****
  	     */
  	    if (order == 0 /*&& numerator > 0.0*/) {
  		distributeProb(numerator, context);
! 	    } else if (numerator == 0.0 && denominator == 0) {
  		node->bow = LogP_One;
  	    } else {
  		node->bow = ProbToLogP(numerator) - ProbToLogP(denominator);
--- 2119,2125 ----
  	     */
  	    if (order == 0 /*&& numerator > 0.0*/) {
  		distributeProb(numerator, context);
! 	    } else if (numerator == 0.0 && denominator == 0.0) {
  		node->bow = LogP_One;
  	    } else {
  		node->bow = ProbToLogP(numerator) - ProbToLogP(denominator);
***************
*** 2130,2135 ****
--- 2131,2144 ----
  	    node->bow = LogP_Zero;
  	    result = false;
  	}
+ 
+ 	if (debug(DEBUG_ESTIMATES)) {
+ 	    dout() << "CONTEXT " << (vocab.use(), context)
+ 		   << " numerator " << numerator
+ 		   << " denominator " << denominator
+ 		   << " BOW " << node->bow
+ 		   << endl;
+ 	}
      }

      return result;