singleton counts warning
Solen Quiniou
solen.quiniou at irisa.fr
Mon Mar 15 00:36:49 PST 2004
Hi!
I am using SRILM to build a language model over letters, and I get a
warning that I don't understand: "warning: no singleton counts
GT discounting disabled"
So the computed model is wrong, since some back-off weights are positive
(in log-probability)! Do you know what this warning means? I thought it
meant that no counts had been computed for single letters, but they
were, so I can't find an explanation!
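To make clear what I checked, here is a minimal sketch in Python, not
SRILM code. It assumes that "singleton" means an n-gram seen exactly
once in the training data (my assumption, the warning does not say so).
The function name count_of_counts and the toy strings are made up for
illustration.

```python
from collections import Counter

def count_of_counts(tokens, order=1):
    """Map each frequency r to the number of distinct n-grams seen r times."""
    ngrams = Counter(
        tuple(tokens[i:i + order]) for i in range(len(tokens) - order + 1)
    )
    return Counter(ngrams.values())

# Toy letter corpus (hypothetical): every letter occurs at least twice.
repeated = list("abracadabraabracadabra")
coc_repeated = count_of_counts(repeated, order=1)
n1_repeated = coc_repeated.get(1, 0)  # letters seen exactly once

# Shorter toy corpus in which 'c' and 'd' each occur exactly once.
short = list("abracadabra")
n1_short = count_of_counts(short, order=1).get(1, 0)

print("singletons (repeated corpus):", n1_repeated)
print("singletons (short corpus):", n1_short)
```

With a small alphabet and a large corpus, every letter is likely to
occur more than once, so a count like n1_repeated above would be zero
even though counts for single letters do exist.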
I have another question, about the computation of unigram
log-probabilities. When I use the formula log[P(w)] = log[c(w)] -
log[N], where N is the number of word TOKENS in the training corpus, I
don't get exactly the value given by SRILM. Is smoothing applied to
unigrams? If so, how is it done?
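For reference, this is the calculation I mean, as a minimal Python
sketch. It assumes base-10 logarithms and counts only the tokens
themselves, with no smoothing and no sentence-boundary tokens (both are
simplifications on my side). The function name mle_unigram_log10 and
the toy corpus are made up for illustration.

```python
import math
from collections import Counter

def mle_unigram_log10(tokens):
    """Unsmoothed estimate: log10 P(w) = log10 c(w) - log10 N."""
    counts = Counter(tokens)
    n = len(tokens)  # N: number of tokens in the training data
    return {w: math.log10(c) - math.log10(n) for w, c in counts.items()}

# Toy letter corpus (hypothetical): 11 tokens, 'a' occurs 5 times.
tokens = list("abracadabra")
logp = mle_unigram_log10(tokens)

for w in sorted(logp):
    print(w, round(logp[w], 4))

# Without smoothing, the probabilities sum to 1.
total = sum(10 ** lp for lp in logp.values())
```

Any discounting applied to the unigrams would move probability mass
away from these values, which would explain a small mismatch.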
Thank you for answering.
Solen.