[Q] on mix-lm?

Andreas Stolcke stolcke at speech.sri.com
Thu Oct 3 08:25:58 PDT 2002


Woosung,

I suspect that you are noticing the difference between "static" and
"dynamic" interpolation.  The former is sometimes called N-gram "merging",
while the latter is the commonly used mixture of probabilities.
ngram -bayes 0 -mix-lm performs dynamic interpolation. 
Without the -bayes option you get static interpolation.
This is also explained in the man page:

       -mix-lm file
              Read a second N-gram model for  interpolation  pur-
              poses.   The second and any additional interpolated
              models can also be class N-grams  (using  the  same
              -classes   definitions),  but  are  otherwise  con-
              strained to be standard N-grams, i.e., the  options
              -df, -tagged, -skip, and -hidden-vocab do not apply
              to them.
              NOTE: Unless -bayes (see below) is specified, -mix-
              lm triggers a static interpolation of the models in
              memory.  In most cases a  more  efficient,  dynamic
              interpolation is sufficient, requested by -bayes 0.
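
For comparison with hand-computed numbers, dynamic interpolation can be
sketched as below. This is only an illustration, not SRILM code: the
function name, the probability lists, and the weight are all hypothetical.
It assumes you already have, for each token of the test text, the
conditional probability assigned by each model, and that `lam` is the
weight of the first model. Note that the perplexity reported by ngram also
counts end-of-sentence tokens, so those would need to be in the lists for
the numbers to be comparable.

```python
import math


def mixture_perplexity(probs_a, probs_b, lam):
    """Perplexity of a probability-level (dynamic) mixture of two models.

    probs_a, probs_b: per-token conditional probabilities from each model,
                      aligned on the same token sequence (hypothetical input).
    lam:              weight of the first model; the second gets 1 - lam.
    """
    if len(probs_a) != len(probs_b):
        raise ValueError("both models must score the same tokens")
    if not probs_a:
        raise ValueError("need at least one token")

    log_sum = 0.0
    for pa, pb in zip(probs_a, probs_b):
        # Mix the probabilities token by token, then take the log.
        p = lam * pa + (1.0 - lam) * pb
        log_sum += math.log(p)

    # Perplexity is the exponential of the negative mean log probability.
    return math.exp(-log_sum / len(probs_a))


# Made-up probabilities for a three-token text, purely for illustration.
probs_a = [0.1, 0.02, 0.3]
probs_b = [0.05, 0.08, 0.1]
print(mixture_perplexity(probs_a, probs_b, 0.5))
```

A statically merged model is a single backoff N-gram model, so it is not
guaranteed to reproduce these token-level mixtures exactly; small
perplexity differences between the two methods are therefore expected.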

There is some discussion of the two methods in the paper that just 
appeared in ICSLP
(http://www.speech.sri.com/cgi-bin/run-distill?papers/icslp2002-srilm.ps.gz,
last paragraph of section 3.2).

--Andreas

In message <20021003001518.430ac8d8.woosung at clsp.jhu.edu> you wrote:
> Dear Dr. Stolcke,
> 
> I am doing some experiments using interpolated LMs, and
> I've noticed that mixed LMs give slightly different PPLs
> from what they should be, that is, from PPLs calculated by taking
> weighted sums of the respective models' word probabilities.
> Do you have any documentation or explanation of how 'mix-lm'
> works in your toolkit, or how it differs from the correct way?
> Of course, the best way would be to look at the source code,
> but I am looking for an easier way.
> According to my experiments, mix-lm gives better results when
> the baseline model (before mixing) is good (PPL less than 300), 
> but it gives worse results when it is not good (PPL above 500).
> 
> Thanks in advance,
> -- 
> Woosung Kim



