[SRILM User List] ngram-count hangs and other problems
E
otheremailid at aol.com
Wed Oct 9 02:05:55 PDT 2013
Hello,
Please find my files here http://goo.gl/WVMEcw
To keep the file size small, I've only shared the unigram counts. When I run the following command:
ngram-count -order 1 -vocab wordList -read ngramCounts -lm ug.lm
I get the following output:
warning: no singleton counts
GT discounting disabled
BOW numerator for context "" is -126.947 < 0
I understand that the "singleton" warning appears because no n-grams occur exactly once. The "ug.lm" file is still generated.
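That reading of the "singleton" warning can be checked directly against the counts file. Below is a minimal sketch, assuming each line of ngramCounts holds an n-gram followed by whitespace and an integer count as its last field. The function names are hypothetical helpers written for this check and are not part of SRILM.

```python
from collections import Counter


def count_of_counts(lines):
    """Tally how many n-grams carry each count value.

    Assumption: the count is the last whitespace-separated field
    on each line; blank or malformed lines are skipped.
    """
    tally = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # blank line or a line with no count field
        try:
            count = int(fields[-1])
        except ValueError:
            continue  # last field is not an integer count
        tally[count] += 1
    return tally


def count_of_counts_from_file(path):
    """Read a counts file (hypothetical helper) and tally its counts."""
    with open(path, encoding="utf-8") as handle:
        return count_of_counts(handle)


def has_singletons(tally):
    """Return True if at least one n-gram occurs exactly once."""
    return tally[1] > 0
```

Calling count_of_counts_from_file("ngramCounts") and inspecting the entries for counts 1 and 2 shows whether the file really contains no n-grams with a count of one, which is the situation I believe the warning describes.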
Two issues:
If I use the following command, suggested elsewhere on this mailing list to fix the "BOW numerator .." warning, I get more warnings and the original warning is still present:
ngram -lm ug.lm -renorm -write-lm ug_norm.lm
If I use Witten-Bell smoothing to fix the "singleton" warning (as advised in another thread here), ngram-count hangs indefinitely:
ngram-count -order 1 -vocab wordList -read ngramCounts -lm ug.lm -wbdiscount1
How do I debug this issue?