Naive question about unknown words
Andreas Stolcke
stolcke at speech.sri.com
Tue Oct 11 09:15:49 PDT 2005
In message <434BD6DF.7040405 at healthonnet.org> you wrote:
> Sorry for this naive question:
>
> I create my LM with this command:
> ngram-count -text learningdb.txt -lm GT -unk
>
> I evaluate a sentence with the following command:
> ngram -lm GT -ppl sentence.txt
>
> I obtain coherent results, but I also get the following warning message:
> "warning: non-zero probability for <unk> in closed-vocabulary LM"
>
> Can anyone give me some information about this warning and how to avoid it?
> Of course I need to give a weight for the unknown words.
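You need to specify -unk on the ngram command line as well. Without it,
ngram treats the LM as closed-vocabulary, even though ngram-count -unk
built it with <unk> as a regular word, and that mismatch is what the
warning reports.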
--Andreas
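
For reference, here is a minimal sketch of the two commands with -unk given on both sides. The file names (learningdb.txt, GT, sentence.txt) are simply the ones from the question. The shell variables and the PATH check are assumptions added for illustration, so the snippet only prints the commands when SRILM is not installed.

```shell
#!/bin/sh
# File names taken from the question; adjust to your own data.
TRAIN=learningdb.txt
LM=GT
TEST=sentence.txt

# Train an open-vocabulary LM: -unk keeps <unk> as a regular word.
TRAIN_CMD="ngram-count -text $TRAIN -lm $LM -unk"
# Evaluate with -unk as well, so ngram also reads the LM as open-vocabulary.
EVAL_CMD="ngram -unk -lm $LM -ppl $TEST"

# Illustrative guard (an assumption, not part of SRILM): run the commands
# only if both tools are on the PATH, otherwise just print them.
if command -v ngram-count >/dev/null 2>&1 && command -v ngram >/dev/null 2>&1; then
    $TRAIN_CMD && $EVAL_CMD
else
    echo "$TRAIN_CMD"
    echo "$EVAL_CMD"
fi
```

With -unk present in both commands, the "non-zero probability for <unk> in closed-vocabulary LM" warning should no longer appear, since both tools then agree that the model is open-vocabulary.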