[SRILM User List] Observed omit event

Andreas Stolcke stolcke at icsi.berkeley.edu
Tue Mar 6 12:34:20 PST 2012


The attached source patch fixes the behavior of ngram -hidden-vocab 
so that the vocab file can contain event property specifications, as 
described in the man page. Previously, only the names of the hidden 
event words were read from that file, and all of them were treated as 
default hidden events.
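
For illustration, a minimal sketch of how a hidden-vocab file with property specifications might be used once the patch is applied. The file names are the ones from the question quoted below. The contents of df.defs are an assumption (the thread does not show that file), so this reuses the specification line that was appended to the LM. The guard around the ngram call is only there so the snippet can be run safely when SRILM or the input files are missing.

```shell
# Write a hidden-vocab file: one event per line, followed by its properties.
# ASSUMPTION: df.defs holds the same line that the thread appends to the LM.
cat > df.defs <<'EOF'
<A> -observed -omit
EOF

# Compute perplexity, taking the event properties from df.defs.
# Skipped when ngram is not on PATH or the input files do not exist.
if command -v ngram >/dev/null 2>&1 \
   && [ -f 3-gram.omit.lm ] && [ -f test.txt ] && [ -f wlist ]; then
    ngram -lm 3-gram.omit.lm -ppl test.txt -order 3 -vocab wlist \
          -hidden-vocab df.defs
fi
```

Without the patch, the properties on that line would not be read from the file and <A> would be treated as a default hidden event.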

The patch also fixes a couple of unrelated bugs in HiddenNgram.cc.

Andreas


On 2/27/2012 9:14 AM, Andreas Stolcke wrote:
> On 2/27/2012 5:45 AM, Dmytro Prylipko wrote:
>> Hi,
>>
>> I would like to clarify how to properly evaluate a language model 
>> with an observed hidden event (<A>) that is omitted from the context.
>>
>> I have manually created the counts file, where this event had been 
>> skipped from context, and have built a LM from that.
>> Also, I have added this line to the end of the LM file:
>> <A> -observed -omit
>>
>> My question is whether it is necessary to specify a hidden vocabulary 
>> with -hidden-vocab option.
>> Which command line is correct:
>>
>> ngram -lm 3-gram.omit.lm -ppl test.txt -order 3 -vocab wlist 
>> -hidden-vocab df.defs
>>
>> or just
>>
>> ngram -lm 3-gram.omit.lm -ppl test.txt -order 3 -vocab wlist
>
> If you append the hidden vocab definitions to the LM file, you only 
> need to tell ngram that it IS a hidden event LM that you're reading.
> You can achieve that by adding -hidden-vocab /dev/null.
>
> Andreas
>
>
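
Spelled out, the suggestion in the quoted reply amounts to the second command from the question plus an empty hidden vocabulary. This is a sketch only: the variable names are made up for readability, the file names come from the question, and the guard is an addition so the snippet does nothing when SRILM or the inputs are absent.

```shell
# The LM file already ends with the event definition "<A> -observed -omit".
LM=3-gram.omit.lm
# /dev/null is an empty hidden vocabulary; passing it only signals to ngram
# that the LM being read is a hidden event LM.
HIDDEN_VOCAB=/dev/null

# Skipped when ngram is not on PATH or the input files do not exist.
if command -v ngram >/dev/null 2>&1 \
   && [ -f "$LM" ] && [ -f test.txt ] && [ -f wlist ]; then
    ngram -lm "$LM" -ppl test.txt -order 3 -vocab wlist \
          -hidden-vocab "$HIDDEN_VOCAB"
fi
```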

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: hidden-ngram.patch
URL: <http://www.speech.sri.com/pipermail/srilm-user/attachments/20120306/b86b9713/attachment.ksh>

