[SRILM User List] Question about SRILM and sentence boundary detection
L. Amber Wilcox-O'Hearn
amber.wilcox.ohearn at gmail.com
Tue Feb 14 11:20:13 PST 2012
On Tue, Feb 14, 2012 at 9:41 AM, Andreas Stolcke
<stolcke at icsi.berkeley.edu> wrote:
> On 2/14/2012 4:54 AM, L. Amber Wilcox-O'Hearn wrote:
>> I see. I misunderstood the difference between -ppl and -counts.
>> I did try this and the summary statistics at the end gave the correct
>> sum, but there weren't any statistics output before the escaped lines:
>>> cat testcounts | ngram -lm LM -escape "===" -counts - -unk
>> file -: 0 sentences, 4 words, 0 OOVs
>> 0 zeroprobs, logprob= -9.87606 ppl= 294.452 ppl1= 294.452
>> Did I miss something?
> This is poorly documented. The escape lines trigger output of "sentence
> level" statistics. At the end, you get the "file level" statistics.
> However, to be compatible with -ppl, sentence level stats are only output
> with -debug 1 or higher. So your example will work as long as you also add
> -debug 1.
Ah, perfect. Thank you very much!
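
For reference, a minimal sketch of the corrected invocation: the command from the question with -debug 1 appended, as suggested above. The file names LM and testcounts are the ones used in the thread. The NGRAM_CMD variable and the guard around the call are illustrative additions, assuming a POSIX shell; they are not part of SRILM.

```shell
#!/bin/sh
# Command from the question, with -debug 1 appended so that each
# escape line ("===") triggers output of the statistics accumulated so far.
NGRAM_CMD='ngram -lm LM -escape "===" -counts - -unk -debug 1'

# Illustrative guard (assumption): run only if SRILM's ngram is on PATH
# and the two files named in the thread exist in the current directory.
if command -v ngram >/dev/null 2>&1 && [ -f testcounts ] && [ -f LM ]; then
    cat testcounts | eval "$NGRAM_CMD"
else
    echo "would run: cat testcounts | $NGRAM_CMD"
fi
```

With -debug 1 or higher, the per-block statistics should appear at each escape line, followed by the file-level summary at the end.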