[SRILM User List] lm interpolation

Mon Oct 29 09:15:25 PDT 2012

Hello everyone,

I am trying to interpolate two language models for a domain adaptation experiment. Below are the commands that I used. When I try to compute lambda, I get the error "mismatch in number of samples (60001 != 67708)". I don't know what to fix; please help me.

~/local/tools/srilm/bin/i686/ngram -order 3 -unk -lm ~/local/test1/lm/lm1.lm -ppl ~/local/test1/lm/de-en_corpus1.lowercased.en -debug 2 > ppl1.ppl
~/local/tools/srilm/bin/i686/ngram -order 3 -unk -lm ~/local/test2/lm/lm2.lm -ppl ~/local/test2/lm/de-en_corpus2.lowercased.en -debug 2 > ppl2.ppl
~/local/tools/srilm/bin/i686/compute-best-mix ~/local/test1/ppl1.ppl ~/local/test2/ppl2.ppl

The ppl1.ppl file contains:
"2082 sentences, 57919 words, 0 OOVs
0 zeroprobs, logprob= -100036 ppl= 46.4762 ppl1= 53.3534"
and the ppl2.ppl file contains:
"2091 sentences, 65617 words, 0 OOVs
0 zeroprobs, logprob= -89850.8 ppl= 21.2341 ppl1= 23.4057"
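
One observation that may help with the diagnosis: the two numbers in the error message appear to equal words plus sentences in each of the summaries quoted above. The sketch below only checks that arithmetic. It assumes (this is not confirmed anywhere in this post) that one sample is counted per word plus one end-of-sentence token per sentence; the variable names are made up for illustration.

```shell
# Counts copied from the two perplexity summaries quoted above.
sentences1=2082; words1=57919
sentences2=2091; words2=65617

# Assumption: one sample per word, plus one end-of-sentence token per sentence.
samples1=$((words1 + sentences1))
samples2=$((words2 + sentences2))

echo "ppl1.ppl: $samples1 samples"
echo "ppl2.ppl: $samples2 samples"
```

If that reading is right, the mismatch would come from the two models having been scored on two different texts (corpus1 and corpus2) rather than on one shared held-out text.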

I apologise for asking such a basic question; I have only just started reading about machine translation.

Thank you very much for your time!