[SRILM User List] Cutoff, probabilities, and backoffs
Andreas Stolcke
stolcke at icsi.berkeley.edu
Mon Dec 17 14:05:45 PST 2012
On 12/17/2012 1:52 PM, Mohammed Mediani wrote:
> Thank you very much Andreas,
> In fact, I have done everything you just suggested:
> - Modify the counts
> - Compute smoothing parameters (discount constants)
> - Compute the probabilities
> - Remove the rare ngrams according to gtmin
> - Compute the backoffs.
>
> I get the exact numbers for both probabilities and backoffs if no
> gtmin is specified. But in the presence of cutoffs, I get slightly
> different numbers (e.g. if gt3min=2, I get slightly different backoffs
> for 2-grams). I thought I did something wrong, since I still can't get
> the backoffs right. If no special attention needs to be paid to
> different cases, then I just need to look into it more.
The ngram probabilities should be the same. The backoff weights MUST be
different, since you are backing off for more of the ngrams when you
choose a higher gtmin threshold.
Andreas
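
The point above can be illustrated with a small sketch. It assumes the usual backoff normalization, where the weight for a context is the probability mass left over by the explicitly stored ngrams divided by the lower-order mass of the words that are not stored. The function name `backoff_weight` and the toy probabilities are made up for illustration; this is not SRILM's code, and real ARPA files store log10 values rather than plain probabilities.

```python
def backoff_weight(higher, lower):
    """Backoff weight for one context (hypothetical helper).

    higher: word -> p(word | context) for ngrams kept in the model
    lower:  word -> lower-order probability, for every word in the vocabulary
    """
    # Mass left over after the explicitly stored higher-order ngrams.
    left_over = 1.0 - sum(higher.values())
    # Lower-order mass of the words that will be reached via backoff.
    backed_off = 1.0 - sum(lower[w] for w in higher)
    return left_over / backed_off


# Toy lower-order distribution over a three-word vocabulary (invented numbers).
lower = {"a": 0.5, "b": 0.3, "c": 0.2}

# Without a cutoff, two higher-order ngrams are stored for this context.
kept_all = {"a": 0.6, "b": 0.3}
bow_all = backoff_weight(kept_all, lower)    # about 0.5

# With a cutoff, the rare ngram ending in "b" is dropped.
# The probability of the surviving ngram is unchanged.
kept_cut = {"a": 0.6}
bow_cut = backoff_weight(kept_cut, lower)    # about 0.8

print(bow_all, bow_cut)
```

In this toy case the stored probability for "a" stays at 0.6 in both models, while the backoff weight grows once "b" has to be reached through the lower-order distribution.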