[SRILM User List] How to interpolate two class-based language models
Andreas Stolcke
stolcke at icsi.berkeley.edu
Fri Apr 6 11:09:38 PDT 2012
On 4/6/2012 9:01 AM, Meng Chen wrote:
> Hi, I have a question about interpolating two class-based language
> models. Suppose I have two class-based language models trained on
> two different corpora.
> Each class-based LM has its own class definition file. For
> example, the class definition file for class-lm1 is lm1.classes, and
> lm2.classes for class-lm2. So my question is, how do I interpolate these
> two different class-based language models? Can you give me the steps,
> preferably with commands?
>
> * Do I need to use the -classes option when interpolating them?
>
You need to merge the class definitions for both LMs, making sure that
there are no name conflicts. If necessary, rename class labels
(CLASS01234 to LM1_CLASS01234, etc.) in both the LM and the class
definition files, then combine the two class definitions into one file,
and then interpolate the models.
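
The renaming and merging steps above could be scripted roughly as follows.
This is only a sketch: the function names are made up for illustration, and
it assumes each class definition line starts with the class label as its
first whitespace-separated field and that class labels appear as ordinary
tokens in the LM file. It does not handle class labels that occur inside
the expansions of other classes.

```python
import re


def prefix_class_definitions(lines, prefix):
    """Prepend `prefix` to the class label (first field) of each definition line.

    Returns the renamed lines and the set of original class labels.
    """
    renamed = []
    labels = set()
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        labels.add(fields[0])
        fields[0] = prefix + fields[0]
        renamed.append(" ".join(fields))
    return renamed, labels


def prefix_lm_tokens(lines, labels, prefix):
    """Rename every token in the LM text that matches one of the class labels.

    Whitespace (tabs, newlines) is left untouched.
    """
    def rename(match):
        token = match.group(0)
        return prefix + token if token in labels else token

    return [re.sub(r"\S+", rename, line) for line in lines]


def merge_class_definitions(defs1, defs2):
    """Concatenate two renamed definition lists, refusing label conflicts."""
    labels1 = {d.split()[0] for d in defs1}
    labels2 = {d.split()[0] for d in defs2}
    clash = labels1 & labels2
    if clash:
        raise ValueError("conflicting class labels: %s" % sorted(clash))
    return defs1 + defs2
```

One would run the first two functions on each LM and its class file with a
different prefix (for example LM1_ and LM2_), then write the merged
definitions to the single file passed to -classes.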
>
> * Do I need to use the -bayes 0 option to interpolate them dynamically?
>
Yes, you want to use something like
ngram -lm LM1 -mix-lm LM2 -lambda L -classes MERGED_CLASS_DEFINITIONS -bayes 0
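
For intuition, the mixture computed by a command like this is, as far as I
understand the ngram options, a plain linear interpolation in which -lambda
is the weight of the main model (-lm) and the -mix-lm model gets the
remainder. The toy function below only illustrates that arithmetic for a
single word probability; it is not SRILM code, and the name is hypothetical.

```python
def interpolate(p_main, p_mix, lam):
    """Linear interpolation of two conditional word probabilities.

    p_main: probability from the main model (-lm)
    p_mix:  probability from the second model (-mix-lm)
    lam:    weight of the main model (-lambda), between 0 and 1
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must be between 0 and 1")
    return lam * p_main + (1.0 - lam) * p_mix
```

With lam = 0.5, two models assigning 0.2 and 0.1 to the same word give a
mixture probability of about 0.15.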
> I am also confused about the expand-classes operation. If I expand the
> class-based language model to a word-based language model, does the
> perplexity change on the same test set?
ngram -expand-classes is an approximation, so you won't get exactly the
same ppl, but something close.
Andreas