[SRILM User List] Inconsistency between mix-lm and compute-best-mix ?
Anoop Deoras
adeoras at jhu.edu
Fri Apr 29 12:47:43 PDT 2011
Hello,
I am trying to interpolate two LMs, and I get inconsistent results
from two different interpolation methods.
My setup is as follows:
I have two LMs, LM1 and LM2, and a text corpus, TEXT.
Step 1: Produce debug files with the ngram tool and the -debug 2 option,
once for LM1 and once for LM2.
Let's call them DEBUG1 and DEBUG2.
ngram -lm LM1 -order 4 -unk -vocab VOCAB -ppl TEXT -debug 2 > DEBUG1
ngram -lm LM2 -order 4 -unk -vocab VOCAB -ppl TEXT -debug 2 > DEBUG2
Step 2: Get the optimal weights using the command:
compute-best-mix DEBUG1 DEBUG2
Let the final best perplexity obtained be denoted as PPL_Step2
Let the weights be LAMBDA, 1-LAMBDA
Thus LAMBDA corresponds to LM1.
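
To make Step 2 concrete, here is a minimal sketch of estimating a two-model mixture weight with EM from per-token probabilities. It assumes the probabilities have already been extracted from DEBUG1 and DEBUG2. The function name `em_mixture_weight` and the probability values are hypothetical, and the code illustrates the general technique rather than the actual compute-best-mix script.

```python
def em_mixture_weight(p1, p2, lam=0.5, max_iters=1000, tol=1e-10):
    """Estimate lam maximizing the likelihood of lam*p1[i] + (1-lam)*p2[i].

    p1, p2: per-token probabilities from model 1 and model 2 (same tokens,
    same order). Returns the weight assigned to model 1.
    """
    for _ in range(max_iters):
        # E-step: posterior probability that each token came from model 1.
        posteriors = [
            lam * a / (lam * a + (1.0 - lam) * b) for a, b in zip(p1, p2)
        ]
        # M-step: the new weight is the average posterior.
        new_lam = sum(posteriors) / len(posteriors)
        converged = abs(new_lam - lam) < tol
        lam = new_lam
        if converged:
            break
    return lam


# Made-up per-token probabilities, for illustration only.
p_lm1 = [0.10, 0.20, 0.05, 0.30]
p_lm2 = [0.20, 0.10, 0.10, 0.05]

best_lam = em_mixture_weight(p_lm1, p_lm2)
print(f"estimated weight for model 1: {best_lam:.4f}")
```

The returned value plays the role of LAMBDA, and one minus it is the weight of the second model.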
Step 3: Combine LM1 and LM2 linearly with the weights found above and
compute the PPL:
ngram -lm LM1 -order 4 -unk -vocab VOCAB -ppl TEXT -mix-lm LM2 -lambda LAMBDA
Let the perplexity obtained be denoted as PPL_Step3
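
For reference, the quantity being compared can be sketched as the perplexity of a word-level linear mixture at a fixed weight, with the weight applied to the main model (LM1 here). The function name `mixture_perplexity` and the probability values are hypothetical. Normalizing by the number of scored tokens is an assumption and may not match how ngram counts tokens, and whether ngram -mix-lm computes exactly this quantity is not established here.

```python
import math


def mixture_perplexity(p_main, p_mix, lam):
    """Perplexity of lam * p_main[i] + (1 - lam) * p_mix[i] over all tokens.

    lam is the weight of the main model; 1 - lam goes to the mixed-in model.
    """
    total_log10 = sum(
        math.log10(lam * a + (1.0 - lam) * b) for a, b in zip(p_main, p_mix)
    )
    # Assumption: normalize by the number of scored tokens.
    return 10 ** (-total_log10 / len(p_main))


# Made-up per-token probabilities, for illustration only.
p_lm1 = [0.10, 0.20, 0.05, 0.30]
p_lm2 = [0.20, 0.10, 0.10, 0.05]

# The same weight gives different results depending on which model it is
# attached to, so the model order has to match the order used in Step 2.
ppl_weight_on_lm1 = mixture_perplexity(p_lm1, p_lm2, 0.7)
ppl_weight_on_lm2 = mixture_perplexity(p_lm2, p_lm1, 0.7)
print(ppl_weight_on_lm1, ppl_weight_on_lm2)
```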
In my setup, PPL_Step3 turns out to be greater than PPL_Step2, and I
don't understand why.
Am I missing something when combining the models?
Any pointers would be useful.
Thanks and Regards
Anoop