error in discount estimator for order 3

Rebecca Madsen rmadsen at byu.net
Thu Aug 3 14:02:46 PDT 2006


Is there a reason why duplicating my data would give me the following error:

using ModKneserNey for 3-grams
Kneser-Ney smoothing 3-grams
n1 = 0
n2 = 94762
n3 = 0
n4 = 37773
one of required modified KneserNey count-of-counts is zero
error in discount estimator for order 3

I can build a language model with the command line below when I run it
on the original data, but running it on two copies of the data
concatenated together gives me the discount estimator error.

$ /home/tools/srilm/bin/i686/ngram-count -text my_data_doubled.txt \
    -interpolate -kndiscount1 -kndiscount2 -kndiscount3 \
    -lm my_data_doubled.lm
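
One thing that may be relevant: concatenating two copies of a corpus
doubles every n-gram count, so no n-gram can occur exactly once or
exactly three times. That would match the n1 = 0 and n3 = 0 lines in
the output above. The Python sketch below illustrates this on a toy
corpus. It is not SRILM code: the function name count_of_counts, the
toy sentences, and the per-line <s>/</s> padding are assumptions made
for illustration, and it only tallies raw trigram counts.

```python
from collections import Counter

def count_of_counts(sentences, order=3, max_count=4):
    """Return [n1, ..., n_max_count], where nk is the number of distinct
    n-grams of the given order that occur exactly k times.
    Hypothetical helper for illustration, not part of SRILM."""
    ngram_counts = Counter()
    for sentence in sentences:
        # Assumption: each line is one sentence, padded with boundary tags,
        # so n-grams never span two lines.
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for i in range(len(tokens) - order + 1):
            ngram_counts[tuple(tokens[i:i + order])] += 1
    # Tally how many distinct n-grams share each count value.
    freq = Counter(ngram_counts.values())
    return [freq[k] for k in range(1, max_count + 1)]

corpus = ["a b c", "a b d", "a b c"]       # toy data, not the real corpus
single = count_of_counts(corpus)
doubled = count_of_counts(corpus + corpus)  # same effect as concatenating two copies

print("single :", single)   # [2, 2, 1, 0]
print("doubled:", doubled)  # [0, 2, 0, 2]
```

In the doubled case every count is even, so n1 and n3 are both zero
regardless of what the original data looked like.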

Thanks for your help,
Rebecca


