[SRILM User List] Does '-omit' work?
Andreas Stolcke
stolcke at icsi.berkeley.edu
Tue Jan 31 01:12:18 PST 2012
On 1/29/2012 7:45 AM, Dmytro Prylipko wrote:
> Dear Andreas,
>
> I found that using the -omit and -observed options does not influence
> the calculation of perplexity.
> I trained a skip-LM for filled pauses as you advised (I generated
> n-grams in which FPs were skipped from the context).
> But when I apply it to the test data, it does not matter which
> combination of options I use in the hidden-vocabulary file:
> <FP> -omit -observed
> <FP> -omit
> <FP> -observed
> or just
> <FP>
This is a bug, more in the documentation than in the code.
The hidden event "options" (-omit, -observed, etc.) are only processed
when they appear in the -lm file following the ngram parameters.
When processing the -hidden-vocab file, on the other hand, only the
names of the hidden events are recorded (as with -vocab), and any
options after the names are ignored.
This should be fixed. But for now, simply append your hidden-event file
to the contents of the -lm file.
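
The workaround above amounts to a plain file concatenation. As a rough
sketch only, assuming the LM is an uncompressed text file and using
made-up file names (skip.lm, hidden-events.txt, skip-with-events.lm)
and a hypothetical helper name, it could look like this:

```python
from pathlib import Path


def append_hidden_events(lm_path, events_path, out_path):
    """Write a copy of the LM file with the hidden-event lines appended.

    Hypothetical helper, not part of SRILM. Assumes the LM is a plain,
    uncompressed text file; a gzipped LM would need unpacking first.
    """
    lm = Path(lm_path).read_bytes()
    events = Path(events_path).read_bytes()
    # Ensure the event definitions start on their own line, after the
    # last line of the N-gram parameters.
    if lm and not lm.endswith(b"\n"):
        lm += b"\n"
    # Ensure the combined file ends with a newline.
    if events and not events.endswith(b"\n"):
        events += b"\n"
    Path(out_path).write_bytes(lm + events)
    return out_path


if __name__ == "__main__":
    # Example file names only; substitute your own.
    append_hidden_events("skip.lm", "hidden-events.txt", "skip-with-events.lm")
```

The combined file would then be passed as the -lm argument in place of
the original LM, so that lines such as "<FP> -omit -observed" are read
together with the ngram parameters.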
Sorry for the confusion in the man page. It kind of says this but in a
very confusing way, and I agree that the -hidden-vocab file should also
interpret the full hidden event specifications.
Andreas