SRILM BOW denominator warning
David Gelbart
gelbart at icsi.berkeley.edu
Mon Jan 14 17:23:42 PST 2008
Hello,
I am trying to build a trigram LM for the OGI Numbers corpus, in which
utterances are spoken strings of numbers such as 'eighty nine eighty
eight'. Since the training data contains no singletons, I am using
Witten-Bell discounting instead of Good-Turing. ngram-count prints "BOW
denominator for context ... is zero" warnings. Does this mean the LM
is broken? Adding "-gt3min 1 -gt2min 1" to the ngram-count options
does not make the warnings go away. Here is the ngram-count output:
$ ngram-count -wbdiscount -text /u/gelbart/tmp/train.trans -order 3 \
-lm /u/gelbart/tmp/numbers-wb.lm
BOW denominator for context "seven" is zero; scaling probabilities to sum to 1
BOW denominator for context "six" is zero; scaling probabilities to sum to 1
BOW denominator for context "four" is zero; scaling probabilities to sum to 1
BOW denominator for context "two" is zero; scaling probabilities to sum to 1
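One possible reading of these warnings, sketched under assumptions rather than taken from the SRILM source: in a backoff LM the weight for a context h is typically the ratio of (1 - sum of p(w|h) over the words seen after h) to (1 - sum of the lower-order p(w) over those same words). If every word in the vocabulary was observed after h, the denominator is zero. The toy vocabulary, counts, and function names below are hypothetical and only illustrate that arithmetic with a Witten-Bell estimate.

```python
def witten_bell(follower_counts):
    """Witten-Bell estimate p(w | h) = c(h, w) / (c(h) + t(h)),
    where t(h) is the number of distinct words seen after h."""
    total = sum(follower_counts.values())
    types = len(follower_counts)
    return {w: c / (total + types) for w, c in follower_counts.items()}


def bow_terms(higher, lower):
    """Return (numerator, denominator) of the backoff weight for one context.

    higher: p(w | h) for the words seen after h
    lower:  lower-order p(w) for the whole vocabulary
    """
    numerator = 1.0 - sum(higher.values())
    denominator = 1.0 - sum(lower[w] for w in higher)
    return numerator, denominator


# Hypothetical three-word vocabulary with unigram probabilities summing to 1.
unigram = {"one": 0.5, "two": 0.25, "three": 0.25}

# Context "two" was followed by every vocabulary word (made-up counts).
num_two, den_two = bow_terms(witten_bell({"one": 2, "two": 1, "three": 1}), unigram)

# Context "one" was followed by only part of the vocabulary (made-up counts).
num_one, den_one = bow_terms(witten_bell({"two": 3, "three": 1}), unigram)

print("two:", num_two, den_two)  # leftover mass, but zero denominator
print("one:", num_one, den_one)  # both terms positive, BOW is well defined
```

In this toy case the context "two" keeps some Witten-Bell mass in reserve but has no unseen word to give it to, which would be consistent with the probabilities being rescaled to sum to 1.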
In the generated language model, the log BOWs are zero for those four
words:
-1.156247 four 0
-1.09725 seven 0
-1.203041 six 0
-1.029482 two 0
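To check a larger model for the same pattern, a small script along these lines could list the unigrams whose log BOW is exactly zero. It is only a sketch: it assumes the usual ARPA layout (log probability, word, optional log BOW, separated by whitespace), the function name is hypothetical, and in the sample below only the four entries quoted above come from the real model; the header count and the "eight" line are made up for illustration.

```python
def zero_bow_unigrams(arpa_lines):
    """Return the unigrams in an ARPA-format LM whose log BOW is exactly 0."""
    in_unigrams = False
    hits = []
    for line in arpa_lines:
        line = line.strip()
        if line.startswith("\\"):
            # Section markers such as \data\, \1-grams:, \2-grams:, \end\
            in_unigrams = (line == "\\1-grams:")
            continue
        if not in_unigrams or not line:
            continue
        fields = line.split()
        # Entries without a third field carry no backoff weight at all.
        if len(fields) == 3 and float(fields[2]) == 0.0:
            hits.append(fields[1])
    return hits


sample = [
    "\\data\\",
    "ngram 1=5",              # made-up count
    "",
    "\\1-grams:",
    "-1.2\teight\t-0.3",      # made-up entry with a nonzero BOW
    "-1.156247\tfour\t0",
    "-1.09725\tseven\t0",
    "-1.203041\tsix\t0",
    "-1.029482\ttwo\t0",
    "",
    "\\end\\",
]

print(zero_bow_unigrams(sample))
```

For a real file, the lines of the model written by -lm can be passed in place of the sample list, for example via open(path).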
Thanks,
David