[SRILM User List] Observed omit event
Andreas Stolcke
stolcke at icsi.berkeley.edu
Tue Mar 6 12:34:20 PST 2012
The attached source patch will fix the behavior of ngram -hidden-vocab
so that the vocab file can contain event property specifications as
described in the man page. Previously, only the names of the hidden
event words were read from that file, and all of them were treated as
default hidden events.
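
As a minimal sketch of what this enables, assuming the patch is applied: the event definition from the quoted question below could live in the hidden vocabulary file instead of at the end of the LM file. The file names (df.defs, 3-gram.omit.lm, test.txt, wlist) are taken from that question, not from the patch. The guard around ngram is only there so the snippet does nothing harmful where SRILM or the data files are missing.

```shell
# Write the hidden event definition (event word followed by its properties)
# into the hidden vocabulary file named in the question.
printf '%s\n' '<A> -observed -omit' > df.defs

# Run the perplexity evaluation only when ngram and the input files exist.
if command -v ngram >/dev/null 2>&1 && [ -f 3-gram.omit.lm ] && [ -f test.txt ] && [ -f wlist ]; then
    ngram -lm 3-gram.omit.lm -ppl test.txt -order 3 -vocab wlist \
        -hidden-vocab df.defs
fi
```
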
The patch also fixes a couple of unrelated bugs in HiddenNgram.cc .
Andreas
On 2/27/2012 9:14 AM, Andreas Stolcke wrote:
> On 2/27/2012 5:45 AM, Dmytro Prylipko wrote:
>> Hi,
>>
>> I would like to clarify how to properly evaluate a language model
>> with an observed hidden event (<A>) that is omitted from the context.
>>
>> I have manually created the counts file, in which this event was
>> skipped from the context, and have built an LM from it.
>> I have also added this line to the end of the LM file:
>> <A> -observed -omit
>>
>> My question is whether it is necessary to specify a hidden vocabulary
>> with the -hidden-vocab option.
>> Which command line is correct:
>>
>> ngram -lm 3-gram.omit.lm -ppl test.txt -order 3 -vocab wlist
>> -hidden-vocab df.defs
>>
>> or just
>>
>> ngram -lm 3-gram.omit.lm -ppl test.txt -order 3 -vocab wlist
>
> If you append the hidden vocab definitions to the LM file you only
> need to tell ngram that it IS a hidden event LM that you're reading.
> You can achieve that by adding -hidden-vocab /dev/null .
>
> Andreas
>
>
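
The suggestion above can be sketched as follows. This is only an illustration under stated assumptions: the file names come from the original question, the definition line is the one quoted there, and the check that avoids appending it twice is an added convenience rather than part of the advice. The ngram call is guarded so the snippet is a no-op where SRILM or the data files are not present.

```shell
LM=3-gram.omit.lm
EVENT='<A> -observed -omit'

# Append the hidden event definition to the end of the LM file,
# unless an identical line is already there.
grep -qxF -- "$EVENT" "$LM" 2>/dev/null || printf '%s\n' "$EVENT" >> "$LM"

# An empty hidden vocabulary (/dev/null) tells ngram to read the model
# as a hidden event LM; the definitions come from the LM file itself.
if command -v ngram >/dev/null 2>&1 && [ -f test.txt ] && [ -f wlist ]; then
    ngram -lm "$LM" -ppl test.txt -order 3 -vocab wlist \
        -hidden-vocab /dev/null
fi
```
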
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: hidden-ngram.patch
URL: <http://www.speech.sri.com/pipermail/srilm-user/attachments/20120306/b86b9713/attachment.ksh>