[SRILM User List] ngram-count hangs and other problems

E otheremailid at aol.com
Wed Oct 9 02:05:55 PDT 2013


Hello,


Please find my files here: http://goo.gl/WVMEcw


To keep the file size small, I have shared only the unigram counts. When I run the following command:



ngram-count -order 1 -vocab wordList -read ngramCounts -lm ug.lm


I get the following output:
warning: no singleton counts
GT discounting disabled
BOW numerator for context "" is -126.947 < 0



I understand that the "singleton" warning appears because there are no n-grams that occur exactly once. The "ug.lm" file is still generated.
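
To double-check that reading of the warning, the count-of-counts can be tabulated directly from the counts file. The following is only a minimal sketch, not part of SRILM: the helper name count_of_counts is made up for illustration, and it assumes each line of the file holds an n-gram followed by whitespace and an integer count.

```python
from collections import Counter


def count_of_counts(lines):
    """Map each count value to the number of n-grams that have that count.

    Assumes every line is "<n-gram words> <integer count>"; lines that do
    not fit this shape are skipped.
    """
    histogram = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # blank line or a line with no count field
        try:
            count = int(fields[-1])
        except ValueError:
            continue  # last field is not an integer count
        histogram[count] += 1
    return histogram


if __name__ == "__main__":
    import sys

    # Usage (hypothetical script name): python count_of_counts.py ngramCounts
    with open(sys.argv[1], encoding="utf-8") as handle:
        histogram = count_of_counts(handle)
    for value in sorted(histogram):
        print(value, histogram[value])
```

If no line is printed for the value 1, there are no singletons in the file, which would match the warning.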


Two issues:
If I run the following command, suggested elsewhere on the mailing list as a fix for the "BOW numerator" warning, I get more warnings and the original warning is still present:


ngram -lm ug.lm -renorm -write-lm ug_norm.lm


If I use Witten-Bell smoothing to fix the "singleton" warning (as advised in another thread here), ngram-count hangs indefinitely:


ngram-count -order 1 -vocab wordList -read ngramCounts -lm ug.lm -wbdiscount1


How do I debug this issue?