cross-entropy with OOV
svp at zuzino.net.ru
Fri Nov 2 03:45:45 PDT 2007
I need to compute cross-entropy in the presence of OOV words...
If we have dict_size different words in the training corpus,
then for the test corpus (per word):
entr2 = entr1 +
entr1 = log2(ppl1)
But in the C++ code (TextStats.cc) I don't know how to get Dict_size_train_corpora,
which is needed to compute this.
Dict_size_train_corpora = number_unigrams_train_corpora
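To make the two quantities above concrete, here is a minimal standalone sketch. It does not use the SRILM API: countWordTypes and entropyFromPerplexity are hypothetical helper names, and splitting on whitespace is an assumption about how the training corpus is tokenized. It only shows counting distinct word types (the unigram count) and the conversion entr1 = log2(ppl1); the OOV correction term itself is not reproduced here.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <unordered_set>

// Hypothetical helper (not part of SRILM): count the distinct
// whitespace-separated tokens in a text stream. Under the assumption of
// whitespace tokenization, this is the number of unigrams in the corpus.
std::size_t countWordTypes(std::istream &in) {
    std::unordered_set<std::string> types;
    std::string token;
    while (in >> token) {
        types.insert(token);  // duplicates are ignored by the set
    }
    return types.size();
}

// Hypothetical helper (not part of SRILM): convert a perplexity into
// per-word entropy in bits, i.e. entr1 = log2(ppl1).
double entropyFromPerplexity(double ppl) {
    assert(ppl > 0.0);  // perplexity must be positive
    return std::log2(ppl);
}
```

For example, feeding a std::ifstream opened on the training text to countWordTypes would give a value to use as Dict_size_train_corpora, as long as the tokenization matches the one used to train the model.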
Thanks in advance!