[SRILM User List] Cutoff, probabilities, and backoffs

Andreas Stolcke stolcke at icsi.berkeley.edu
Mon Dec 17 14:05:45 PST 2012


On 12/17/2012 1:52 PM, Mohammed Mediani wrote:
> Thank you very much Andreas,
> In fact, I have done everything you just suggested.
> - Modify the counts
> - Compute smoothing parameters (discount constants)
> - Compute the probabilities
> - Remove the rare ngrams according to gtmin
> - Compute the backoffs.
>
> I get the exact numbers for both probabilities and backoffs if no 
> gtmin is specified. But in the presence of cutoffs, I get slightly 
> different numbers (e.g., with gt3min=2 I get slightly different 
> backoffs for 2-grams). I thought I did something wrong, since I still 
> can't get the backoffs right. If no special cases need attention, 
> then I just need to look into it further.
The ngram probabilities should be the same.  The backoff weights MUST be 
different, since you are backing off for more of the ngrams when you 
choose a higher gtmin threshold.

Andreas
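
The point in the reply above can be illustrated with a small sketch. It
assumes the standard Katz-style normalization, where the backoff weight
for a context is the probability mass left over by the explicitly listed
ngrams, divided by the lower-order mass of the words that are not listed.
The function names, the three-word vocabulary, and all numbers are made
up for illustration and are not taken from SRILM or from the thread.
Probabilities are kept in the linear domain rather than as the log10
values found in ARPA files.

```python
def backoff_weight(explicit_probs, lower_order_probs):
    """Katz-style backoff weight for a single context (illustrative only).

    explicit_probs:    p(w | h) for the ngrams kept explicitly in the model
    lower_order_probs: p(w | h') for every word, h' being the shortened context
    """
    # Mass not claimed by the explicitly listed higher-order ngrams.
    left_over = 1.0 - sum(explicit_probs.values())
    # Lower-order mass of the words that will be reached via backoff.
    lower_mass = 1.0 - sum(lower_order_probs[w] for w in explicit_probs)
    return left_over / lower_mass


def prob(word, explicit_probs, lower_order_probs):
    """p(word | h): explicit estimate if listed, otherwise backed off."""
    if word in explicit_probs:
        return explicit_probs[word]
    bow = backoff_weight(explicit_probs, lower_order_probs)
    return bow * lower_order_probs[word]


# Hypothetical lower-order distribution over a three-word vocabulary.
lower = {"a": 0.5, "b": 0.3, "c": 0.2}

# Higher-order ngrams for one context, without and with a count cutoff.
# The cutoff drops the rare ngram "b"; the probability of "a" is unchanged.
no_cutoff = {"a": 0.6, "b": 0.1}
with_cutoff = {"a": 0.6}

bow_no_cutoff = backoff_weight(no_cutoff, lower)
bow_with_cutoff = backoff_weight(with_cutoff, lower)

# Both models must still sum to one over the vocabulary.
total_no_cutoff = sum(prob(w, no_cutoff, lower) for w in lower)
total_with_cutoff = sum(prob(w, with_cutoff, lower) for w in lower)

print("backoff weight without cutoff:", bow_no_cutoff)
print("backoff weight with cutoff:   ", bow_with_cutoff)
print("totals:", total_no_cutoff, total_with_cutoff)
```

In this toy case the surviving ngram keeps its probability, while the
backoff weight of the context changes because the removed ngram is now
reached through the lower-order distribution. The weight is attached to
the context, which is why a cutoff on 3-grams shows up as different
backoffs on the 2-gram entries.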



