[SRILM User List] A problem with expanding class-based LMs
Dmytro Prylipko
dmytro.prylipko at ovgu.de
Fri Dec 16 01:47:57 PST 2011
Hi Andreas,
I have a class-based LM which gives the following perplexity on the
test set:
ngram -ppl test.fold3.txt -lm 2-gram.class.dd150.fold3.lm -classes
class.dd150.fold3.defs -order 2 -vocab ../all.wlist
file test.fold3.txt: 1397 sentences, 37403 words, 0 OOVs
427 zeroprobs, logprob= -72617.1 ppl= 78.0551 ppl1= 92.0235
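If I read these numbers correctly, the zeroprob words are excluded from
the word count when the perplexity is computed. A rough check in Python
(the denominators are my assumption, not taken from SRILM code):

logprob   = -72617.1
sentences = 1397
words     = 37403
oovs      = 0
zeroprobs = 427

# my guess: </s> tokens count for ppl, zeroprobs and OOVs are dropped
denom_ppl  = words + sentences - oovs - zeroprobs
denom_ppl1 = words - oovs - zeroprobs

print(10 ** (-logprob / denom_ppl))   # ~78.05, matches ppl
print(10 ** (-logprob / denom_ppl1))  # ~92.02, matches ppl1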
I expanded it and got a word-level model:
ngram -lm 2-gram.class.dd150.fold3.lm -classes class.dd150.fold3.defs
-order 2 -write-lm 2-gram.class.dd150.expanded_exact.fold3.lm
-expand-classes 2 -expand-exact 2 -vocab ../all.wlist
But the new model gives a different result:
ngram -ppl test.fold3.txt -lm 2-gram.class.dd150.expanded_exact.fold3.lm
-order 2 -vocab ../all.wlist
file test.fold3.txt: 1397 sentences, 37403 words, 0 OOVs
0 zeroprobs, logprob= -78108.4 ppl= 103.063 ppl1= 122.544
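If my assumption about the denominator is right, the whole gap comes
from the 427 formerly-zero words: they now enter both the word count and
the logprob sum, but with very small probabilities. The same check for
the expanded model:

logprob    = -78108.4
denom_ppl  = 37403 + 1397   # nothing is excluded any more
denom_ppl1 = 37403

print(10 ** (-logprob / denom_ppl))   # ~103.06, matches ppl
print(10 ** (-logprob / denom_ppl1))  # ~122.5, matches ppl1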
You can see there are no more zeroprobs with the new model, which
affects the perplexity.
Here is the detailed per-word output from both models for one sentence:
Class-based:
<s> gruess gott frau traub </s>
p( gruess | <s> ) = [OOV][2gram] 0.0167159 [ -1.77687 ]
p( gott | gruess ...) = [OOV][1gram][OOV][2gram] 0.658525 [ -0.181428 ]
p( frau | gott ...) = [OOV][1gram][OOV][2gram] 0.119973 [ -0.920917 ]
p( traub | frau ...) = [OOV][OOV] 0 [ -inf ]
p( </s> | traub ...) = [1gram] 0.0377397 [ -1.4232 ]
1 sentences, 4 words, 0 OOVs
1 zeroprobs, logprob= -4.30242 ppl= 11.9016 ppl1= 27.1731
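The per-sentence numbers behave the same way: the -inf word seems to be
dropped from both the sum and the count. A quick check (again, just my
reading of the output):

logprobs = [-1.77687, -0.181428, -0.920917, -1.4232]  # without the -inf word
total = sum(logprobs)                        # ~ -4.30242
print(10 ** (-total / len(logprobs)))        # ~11.90, matches ppl  (3 words + </s>)
print(10 ** (-total / (len(logprobs) - 1)))  # ~27.17, matches ppl1 (3 words)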
And the same sentence with the expanded LM:
<s> gruess gott frau traub </s>
p( gruess | <s> ) = [2gram] 0.0167159 [ -1.77687 ]
p( gott | gruess ...) = [2gram] 0.658525 [ -0.181428 ]
p( frau | gott ...) = [2gram] 0.119973 [ -0.920917 ]
p( traub | frau ...) = [1gram] 3.84699e-14 [ -13.4149 ]
p( </s> | traub ...) = [1gram] 0.0377397 [ -1.4232 ]
1 sentences, 4 words, 0 OOVs
0 zeroprobs, logprob= -17.7173 ppl= 3495.1 ppl1= 26873.5
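Here all five tokens are counted, and p( traub | frau ) = 3.84699e-14
dominates the sum:

logprobs = [-1.77687, -0.181428, -0.920917, -13.4149, -1.4232]
total = sum(logprobs)      # ~ -17.7173
print(10 ** (-total / 5))  # ~3495, matches ppl  (4 words + </s>)
print(10 ** (-total / 4))  # ~2.7e4, matches ppl1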
From my point of view this looks like a numerical issue: such small
probabilities should be treated as zero.
BTW, how can zero probabilities appear there at all? They should be
smoothed, right?
I divided my corpus into 10 folds and performed these steps on all of
them. For 6 folds everything is fine and the perplexities are almost the
same for both models, but for the other 4 folds I have this problem.
I would greatly appreciate any help.
Sincerely yours,
Dmytro Prylipko.