[SRILM User List] gtmin and kndiscount

Fri Dec 14 13:43:51 PST 2012

On 12/14/2012 1:21 PM, Mohammed Mediani wrote:
> Could anybody please tell me how the discounting parameters for 
> modified Kneser-Ney smoothing (D1, D2, D3+) are computed when the 
> gtmin parameter is greater than 1?
> In that case, the corresponding count-of-counts n_i would be zero, and 
> we eventually have to divide by this n_i to get one of the D_i's.
> Many thanks,

The gtmin parameters are applied (i.e., N-grams with frequency below 
the threshold are omitted from the model) AFTER the discounting 
constants are computed, so the gtmin options do not affect the 
computation of D1, D2, and D3+.
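
To illustrate the ordering, here is a minimal sketch in Python. It is not 
SRILM's code: the function names kn_discounts and apply_min_count are 
made up for this example, and the discount formulas are the standard 
Chen & Goodman estimates for modified Kneser-Ney, assumed here rather 
than copied from the SRILM sources. It also ignores the fact that 
lower-order N-grams use modified (type) counts.

```python
from collections import Counter


def kn_discounts(ngram_counts):
    """Estimate (D1, D2, D3+) from the count-of-counts n1..n4.

    Uses the Chen & Goodman formulas (an assumption of this sketch):
        Y   = n1 / (n1 + 2*n2)
        D1  = 1 - 2*Y*n2/n1
        D2  = 2 - 3*Y*n3/n2
        D3+ = 3 - 4*Y*n4/n3
    """
    n = Counter(ngram_counts.values())  # n[k] = number of N-grams seen k times
    n1, n2, n3, n4 = (n[k] for k in (1, 2, 3, 4))
    if 0 in (n1, n2, n3):
        # This is the division-by-zero case raised in the question.
        raise ValueError("n1, n2 and n3 must all be nonzero")
    y = n1 / (n1 + 2 * n2)
    return (1 - 2 * y * n2 / n1,
            2 - 3 * y * n3 / n2,
            3 - 4 * y * n4 / n3)


def apply_min_count(ngram_counts, gtmin):
    """Drop N-grams whose count is below gtmin (hypothetical helper)."""
    return {ng: c for ng, c in ngram_counts.items() if c >= gtmin}


# Toy data: n1 = 4, n2 = 2, n3 = 1, n4 = 1
counts = {"a": 1, "b": 1, "c": 1, "d": 1, "e": 2, "f": 2, "g": 3, "h": 4}

discounts = kn_discounts(counts)     # step 1: uses the full counts
kept = apply_min_count(counts, 2)    # step 2: cutoff applied afterwards
```

Calling kn_discounts(kept) instead would raise the ValueError, because 
n1 is zero once the singletons have been removed. That is the situation 
that arises when the data has already been pruned before estimation.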

You do have a problem when frequency cutoffs have been applied to the 
N-gram data BEFORE SRILM gets to see it.  This is the case, e.g., with 
the Google N-gram data.  In that case, if you use the make-big-lm wrapper 
script, it will attempt to extrapolate the missing low counts-of-counts 
from the higher ones, according to an empirical law that is described in 
Figure 1 / Equation 1 of this paper 
<http://www.speech.sri.com/cgi-bin/run-distill?papers/asru2007-mt-lm.ps.gz>.

Andreas
