[SRILM User List] classes-format question + followup question

Andreas Stolcke stolcke at icsi.berkeley.edu
Wed Apr 27 11:50:53 PDT 2011


Fabian wrote:
> Hi Andreas,
>
> thank you again for the quick answer. Unfortunately I didn't make myself
> clear: I really want to interpolate one class LM and one word LM,
> where the classes are part-of-speech tags. So the question is, again,
> why is static interpolation not correct/possible?
Although the class LM mechanism in SRILM can handle ngrams over a mix of 
words and classes, empirically it does not work well to merge 
(statically interpolate) models where one is purely word-based and the 
other is a class-based ngram LM.  This is because ngram -mix-lm WITHOUT 
the -bayes 0 option does not just implement the standard interpolation 
of probability estimates; it also merges the ngrams used for the backoff 
computation (this is explained in the 2002 Interspeech paper).  This 
works fine, and usually improves the results, when combining models of 
the same type, but merging a lower-order word ngram with a lower-order 
class-based LM gives weird results because the class expansion is not 
applied at the backoff level when performing the merge.

For this reason, the ngram man page says (see the last sentence):

       -mix-lm file
              Read a second N-gram model for interpolation purposes.  The second and any
              additional interpolated models can also be class N-grams (using the same
              -classes definitions), but are otherwise constrained to be standard N-grams,
              i.e., the options -df, -tagged, -skip, and -hidden-vocab do not apply to them.
              NOTE: Unless -bayes (see below) is specified, -mix-lm triggers a static
              interpolation of the models in memory.  In most cases a more efficient,
              dynamic interpolation is sufficient, requested by -bayes 0.  Also, mixing
              models of different type (e.g., word-based and class-based) will only work
              correctly with dynamic interpolation.
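
To make the distinction concrete, the two invocation styles would look roughly 
like this (a sketch only; file names such as class.lm, word.lm, pos.classes and 
test.txt are placeholders, not anything from this thread):

    # Dynamic interpolation of a class LM and a word LM, as recommended for
    # mixing model types; -lambda is the weight of the first (-lm) model:
    ngram -lm class.lm -classes pos.classes \
          -mix-lm word.lm -bayes 0 -lambda 0.5 \
          -ppl test.txt

    # Static interpolation (no -bayes 0) additionally merges the ngrams used
    # for backoff, which is fine when both models are of the same type:
    ngram -lm word1.lm -mix-lm word2.lm -lambda 0.5 -write-lm merged.lm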

So you might just have to re-engineer your application to accept truly 
interpolated LMs, or, if it's feasible, convert the class LM into a 
word-based LM with ngram -expand-classes BEFORE doing the merging of 
models.  Sorry.
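
In case it helps, the second route would look roughly like this (again a sketch 
with placeholder file names; the expansion order 3 is just an example):

    # 1. Expand the class LM into an (approximately) equivalent word-based LM,
    #    keeping ngrams up to order 3:
    ngram -lm class.lm -classes pos.classes -expand-classes 3 \
          -write-lm class-expanded.lm

    # 2. Both models are now word-based, so static interpolation is safe:
    ngram -lm word.lm -mix-lm class-expanded.lm -lambda 0.5 \
          -write-lm interpolated.lm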

Andreas


>
> > > thank you again for the quick help!
> > > I added the smoothing and the PPL dropped to 720 which is a bit
> > > better, but still above the range ~500 which would "feel" correct.
> > > Anyways,...
> > >
> > You might want to verify that your probabilities are normalized
> > correctly. Try ngram -debug 3 -ppl .
> Well, it seems that the probabilities are not properly normalized -> 
> there are many warnings:
> for example:
> warning: word probs for this context sum to 2.48076 != 1 : ...
>
> >
> > >
> > > ...I have another question:
> > >
> > > why can't I use static interpolation for interpolating one class
> > > LM and one word LM? I use either a class-based LM (from ngram-count) or a
> > > class-based LM with my own tags, together with the word-based LM. The
> > > documentation only says that -mix-lm with static interpolation won't
> > > work correctly?
> > I didn't realize you want to interpolate two class-based LMs. That
> > should work, you just need to keep the class labels distinct, and
> > combine the class definition files into one file.
> > > I want to build interpolated LMs (with -write-lm) to use them in my
> > > ASR; so far I have simply used static interpolation, which seems to
> > > work more or less OK.
> > You should be able to run ngram -mix-lm -write-lm with two class-based LMs
> > but WITHOUT using the -classes option when doing so.
> > If you include -classes, the class definitions will be appended to the LM file.
>
> Fabian
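
For reference, the suggestions quoted above translate into commands roughly like 
the following (a sketch, not a tested recipe; all file names are placeholders):

    # Check whether conditional probabilities are normalized; at -debug 3,
    # ngram -ppl reports contexts whose word probabilities do not sum to 1:
    ngram -lm my.lm -classes my.classes -debug 3 -ppl test.txt

    # Statically merge two class-based LMs: keep the class labels in the two
    # models distinct, combine the class definition files into one, and do NOT
    # pass -classes during the merge (otherwise the definitions would be
    # appended to the written LM):
    cat classes1.defs classes2.defs > all.classes
    ngram -lm classA.lm -mix-lm classB.lm -lambda 0.5 -write-lm merged.lm
    # Use the combined class definitions when evaluating the merged model:
    ngram -lm merged.lm -classes all.classes -ppl test.txt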


