[SRILM User List] SRI LM toolkit: ngram-count

Andreas Stolcke stolcke at speech.sri.com
Mon Feb 8 09:59:45 PST 2010


On 2/7/2010 9:35 PM, 이일빈 wrote:
> Dear Andreas Stolcke
> Hello. I'm ILBIN LEE, and I develop a speech recognizer at ETRI, Korea.
> While using the ngram-count command of the SRI LM toolkit, I encountered
> the following error message:
> $ ngram-count.exe -order 3 -sort -float-counts -gt2min 1 -gt3min 1
> -vocab vocab.txt -read count.txt -lm lm.txt
> error in discount estimator for order 1
> The count file is an interpolation of two different count files, so it
> has many fractional counts.
> If you could suggest some possible causes, it would help me a lot.
You cannot use Good-Turing discounting with fractional counts. Try
-wbdiscount, -cdiscount, or -addsmooth instead.
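
To illustrate why fractional counts are a problem here, the following is a
minimal sketch in Python of the textbook Good-Turing estimate
r* = (r + 1) * n[r+1] / n[r], where n[r] is the number of n-grams seen exactly
r times. It is not SRILM's actual implementation; the function names and the
toy counts are assumptions made up for the example. With interpolated counts,
almost no n-gram has a count of exactly 1, 2, 3, ..., so the count-of-counts
statistics the estimator depends on come out as zero.

```python
from collections import Counter


def count_of_counts(counts, max_r=3):
    """Return n[r], the number of n-grams whose count is exactly r, for r = 1..max_r."""
    tally = Counter(counts.values())
    return {r: tally.get(r, 0) for r in range(1, max_r + 1)}


def good_turing_adjusted(r, n):
    """Textbook Good-Turing adjusted count r* = (r + 1) * n[r + 1] / n[r]."""
    if n.get(r, 0) == 0:
        # Nothing was seen exactly r times, so the estimate is undefined.
        raise ValueError(f"no n-grams with count exactly {r}")
    return (r + 1) * n.get(r + 1, 0) / n[r]


# Hypothetical toy data: integer counts versus interpolated (fractional) counts.
integer_counts = {"a": 1, "b": 1, "c": 1, "d": 2, "e": 2, "f": 3}
fractional_counts = {"a": 0.3, "b": 0.7, "c": 1.3, "d": 1.7, "e": 2.4, "f": 3.1}
```

Calling good_turing_adjusted(1, count_of_counts(integer_counts)) yields a
usable adjusted count, while the same call on fractional_counts raises
ValueError because every n[r] is zero. Witten-Bell, constant, and additive
smoothing do not rely on these statistics.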

The error message comes from the Good-Turing discount estimator, not from
-float-counts, which you are using correctly and which is required when
processing fractional counts.

Please also read the FAQ section on Smoothing issues before proceeding
further.

Andreas

> Thank you.
> Best regards,
> ILBIN
