Interpolating with -lambda 1.0
Tanel Alumäe
tanel.alumae at aqris.com
Wed Feb 23 00:11:00 PST 2005
Hello,
I'm a bit confused about interpolation.
I want to compute the perplexity of a test text using different
interpolation weights (lambdas). Everything works as expected until I set
lambda to 1.0. Shouldn't I then get the same perplexity as with the base
language model alone? This doesn't seem to be the case:
$ ngram -lm trigram.arpa -ppl <testtxt>
file <testtxt>: 2394 sentences, 29475 words, 1224 OOVs
0 zeroprobs, logprob= -86274.9 ppl= 653.583 ppl1= 1132.06
$ ngram -lm trigram.arpa -ppl <testtxt> -classes <classdefs> -mix-lm class-trigram.arpa -lambda 1.0
file <testtxt>: 2394 sentences, 29475 words, 1224 OOVs
0 zeroprobs, logprob= -85554.4 ppl= 619.144 ppl1= 1067.5
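As a sanity check, the ppl and ppl1 figures in both runs can be recomputed from the reported logprob and counts. The following is a minimal sketch, assuming that ppl counts one end-of-sentence token per sentence, that ppl1 does not, and that OOVs and zeroprobs are excluded from both denominators. The function name srilm_style_ppl is hypothetical and is not part of SRILM.

```python
def srilm_style_ppl(logprob, sentences, words, oovs, zeroprobs=0, count_eos=True):
    """Perplexity from a total log10 probability.

    Assumption: ppl counts one end-of-sentence token per sentence,
    ppl1 does not, and OOVs/zeroprobs are excluded from both.
    """
    tokens = words - oovs - zeroprobs
    if count_eos:
        tokens += sentences
    return 10 ** (-logprob / tokens)


# logprob values copied from the two runs above; the counts are identical.
runs = {
    "trigram only": -86274.9,
    "mixed, lambda 1.0": -85554.4,
}

results = {}
for name, logprob in runs.items():
    ppl = srilm_style_ppl(logprob, 2394, 29475, 1224)
    ppl1 = srilm_style_ppl(logprob, 2394, 29475, 1224, count_eos=False)
    results[name] = (ppl, ppl1)
    print(f"{name}: ppl={ppl:.3f} ppl1={ppl1:.2f}")
```

Since the sentence, word and OOV counts are the same in both runs, the gap in perplexity comes entirely from the different logprob totals, not from how the perplexity is normalized.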
As the output shows, the perplexity is 653.583 with the standalone
trigram and 619.144 when interpolating the trigram with the class
trigram using lambda 1.0. Why are they not equal?
Both the word trigram and the class trigram are closed-vocabulary LMs, in
case it matters.
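The expectation behind the question can be written down as a small sketch, assuming plain linear interpolation in which lambda is the weight of the main model and (1 - lambda) the weight of the mixture model. The probabilities below are made up for illustration, and the helper names are hypothetical. The sketch does not model what the -classes option does to either model.

```python
import math


def interpolate(p_main, p_mix, lam):
    """Linear interpolation: lam weights the main model, (1 - lam) the other."""
    return lam * p_main + (1.0 - lam) * p_mix


def perplexity(probs):
    """Perplexity of a sequence of per-token probabilities."""
    logprob = sum(math.log10(p) for p in probs)
    return 10 ** (-logprob / len(probs))


# Made-up per-token probabilities from two hypothetical models.
main_probs = [0.02, 0.001, 0.15, 0.3]
mix_probs = [0.05, 0.004, 0.10, 0.2]

for lam in (0.0, 0.5, 1.0):
    mixed = [interpolate(a, b, lam) for a, b in zip(main_probs, mix_probs)]
    print(f"lambda={lam}: ppl={perplexity(mixed):.3f}")

# With lam = 1.0 the mixture term vanishes, so the result is the main model.
mixed_at_one = [interpolate(a, b, 1.0) for a, b in zip(main_probs, mix_probs)]
```

Under this assumption, lambda 1.0 reproduces the main model's probabilities exactly, which is why identical perplexities would be expected if nothing else differed between the two runs.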
Regards,
Tanel A.