[SRILM User List] How to interpolate two class-based language models

Andreas Stolcke stolcke at icsi.berkeley.edu
Fri Apr 6 11:09:38 PDT 2012

On 4/6/2012 9:01 AM, Meng Chen wrote:
> Hi, I have a question about interpolating two class-based language
> models. Suppose I have two class-based language models trained on
> two different corpora.
> Each class-based LM has its own class definition file: lm1.classes
> for class-lm1 and lm2.classes for class-lm2. How do I interpolate
> these two class-based language models? Could you give me the steps,
> preferably with commands?
>   * Do I need to use the -classes option when interpolating them?
You need to merge the class definitions for both LMs, making sure that 
there are no name conflicts.  If necessary, rename class labels (e.g., 
CLASS01234 to LM1_CLASS01234) in both the LM and its class definition 
file.  Then combine the two class definitions into one file and 
interpolate the models.
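
The rename-and-merge step described above can be sketched in Python. This is
only an illustration, not an SRILM tool: the function names are hypothetical,
and it assumes that each class definition line starts with the class label
(followed by an optional probability and the expansion words), that class
labels appear as whitespace-separated tokens in the LM file, and that no
ordinary word is spelled the same as a class label.

```python
import re


def prefix_class_defs(lines, prefix):
    """Prefix the class label (first field) of every class definition line.

    Returns the renamed lines and the set of original labels.
    """
    renamed, labels = [], set()
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        labels.add(fields[0])
        renamed.append(" ".join([prefix + fields[0]] + fields[1:]))
    return renamed, labels


def prefix_lm_tokens(lines, labels, prefix):
    """Rename class labels wherever they occur as whole tokens in LM lines.

    Whitespace (including tabs) is left untouched.
    """
    def rename(match):
        token = match.group(0)
        return prefix + token if token in labels else token

    return [re.sub(r"\S+", rename, line) for line in lines]


def merge_class_defs(defs_a, defs_b):
    """Concatenate two class definition lists, refusing label clashes."""
    labels_a = {d.split()[0] for d in defs_a}
    labels_b = {d.split()[0] for d in defs_b}
    clash = labels_a & labels_b
    if clash:
        raise ValueError("conflicting class labels: %s" % sorted(clash))
    return defs_a + defs_b


if __name__ == "__main__":
    # Toy data; real input would be read from lm1.classes, lm2.classes
    # and the two LM files.
    defs1, labels1 = prefix_class_defs(["CLASS01 0.5 monday"], "LM1_")
    defs2, labels2 = prefix_class_defs(["CLASS01 1.0 paris"], "LM2_")
    for line in merge_class_defs(defs1, defs2):
        print(line)
```

Each LM file would be passed through prefix_lm_tokens with its own label set
and prefix, so that the labels in the LMs match the merged class file.
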
>   * Do I need to use the -bayes 0 option to interpolate them dynamically?
Yes, you want to use something like

     ngram -lm LM1 -mix-lm LM2 -lambda L -classes 
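
For orientation, the arithmetic behind a context-independent mixture can be
sketched as follows. This is a toy illustration and not SRILM code: it assumes
that -lambda is the weight of the main model (the one given with -lm) and that
the second model receives the remaining weight; the function names and the
probabilities in the demo are made up.

```python
import math


def interpolate(p_main, p_mix, lam):
    """Linear mixture: lam weights the main model, 1 - lam the second."""
    return lam * p_main + (1.0 - lam) * p_mix


def perplexity(word_probs):
    """Perplexity of a word sequence from its per-word probabilities."""
    log_sum = sum(math.log(p) for p in word_probs)
    return math.exp(-log_sum / len(word_probs))


if __name__ == "__main__":
    # Hypothetical per-word probabilities from two models on one sentence.
    lm1_probs = [0.20, 0.05, 0.10]
    lm2_probs = [0.10, 0.15, 0.10]
    mixed = [interpolate(a, b, 0.5) for a, b in zip(lm1_probs, lm2_probs)]
    print(perplexity(mixed))
```
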

> I am also confused about the expand-classes operation. If I expand the
> class-based language model into a word-based language model, does the
> perplexity on the same test set change?
ngram -expand-classes is an approximation, so you won't get exactly the 
same ppl, but something close.

