[Q] on mix-lm?
Andreas Stolcke
stolcke at speech.sri.com
Thu Oct 3 08:25:58 PDT 2002
Woosung,
I suspect that you are noticing the difference between "static" and
"dynamic" interpolation. The former is sometimes called N-gram "merging",
while the latter is the commonly used mixture of probabilities.
ngram -bayes 0 -mix-lm performs dynamic interpolation.
Without the -bayes option you get static interpolation.
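For example, assuming two trained models and a test set (the file
names here are just placeholders):

    # dynamic interpolation: word probabilities are mixed on the fly
    ngram -lm base.lm -mix-lm other.lm -lambda 0.5 -bayes 0 -ppl test.txt

    # static interpolation: merge the models and write out the result
    ngram -lm base.lm -mix-lm other.lm -lambda 0.5 -write-lm merged.lm

(-lambda is the weight of the first model; it defaults to 0.5.)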
This is also explained in the man page:
-mix-lm file
       Read a second N-gram model for interpolation purposes.
       The second and any additional interpolated models can
       also be class N-grams (using the same -classes
       definitions), but are otherwise constrained to be
       standard N-grams, i.e., the options -df, -tagged,
       -skip, and -hidden-vocab do not apply to them.
       NOTE: Unless -bayes (see below) is specified, -mix-lm
       triggers a static interpolation of the models in
       memory. In most cases the more efficient dynamic
       interpolation is sufficient; it is requested by
       -bayes 0.
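In case it helps, here is roughly what the two modes compute.
Dynamic interpolation mixes the word probabilities directly, i.e.,
for history h and mixture weight lambda (the -lambda option),

    P(w | h) = lambda * P1(w | h) + (1 - lambda) * P2(w | h)

and perplexity is computed from these mixed probabilities. Static
interpolation folds the same mixture into a single backoff model,
which can represent the weighted sum exactly only for N-grams that
are explicitly stored; all other probabilities go through the
recomputed backoff weights, so the merged model's PPL can differ
slightly from the weighted-sum PPL.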
There is some discussion of the two methods in the paper that just
appeared in ICSLP
(http://www.speech.sri.com/cgi-bin/run-distill?papers/icslp2002-srilm.ps.gz,
last paragraph of section 3.2).
--Andreas
In message <20021003001518.430ac8d8.woosung at clsp.jhu.edu> you wrote:
> Dear Dr. Stolcke,
>
> I am doing some experiments using interpolated LMs, and
> I've noticed that mixed LMs give slightly different PPLs
> from the PPLs I expect, i.e., PPLs computed by taking the
> weighted sum of the individual models' word probabilities.
> Do you have any documentation or explanation of how 'mix-lm'
> works in your toolkit, or of how it differs from that direct
> computation?
> Of course, the best way would be to look at the source code,
> but I am hoping for an easier one.
> In my experiments, mix-lm gives better results when the
> baseline model (before mixing) is good (PPL below 300),
> but worse results when it is not (PPL above 500).
>
> Thanks in advance,
> --
> Woosung Kim