[SRILM User List] A problem with expanding class-based LMs
Andreas Stolcke
stolcke at icsi.berkeley.edu
Wed Dec 21 17:03:51 PST 2011
My guess is that your class definitions contain multiple words per
expansion, such as "GREETING" expanding to "gruess gott". In that
case a bigram expansion of the LM will not have as much predictive power
as the original class bigram LM.
Try using -expand-classes 3 (or even higher).
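For example, adapting the expansion command quoted below (a sketch only — the output filename and the bump to -order 3 are my assumptions, mirroring the original flags):

```shell
# Expand the class bigram LM into a word trigram LM, so multiword class
# expansions (e.g. GREETING -> "gruess gott") keep their left context.
ngram -lm 2-gram.class.dd150.fold3.lm \
      -classes class.dd150.fold3.defs \
      -vocab ../all.wlist \
      -expand-classes 3 -expand-exact 3 \
      -order 3 \
      -write-lm 3-gram.class.dd150.expanded_exact.fold3.lm
```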
Andreas
Dmytro Prylipko wrote:
> Hi Andreas,
>
> I have a class-based LM, which gives a particular perplexity value on
> the test set:
>
> ngram -ppl test.fold3.txt -lm 2-gram.class.dd150.fold3.lm -classes
> class.dd150.fold3.defs -order 2 -vocab ../all.wlist
>
> file test.fold3.txt: 1397 sentences, 37403 words, 0 OOVs
> 427 zeroprobs, logprob= -72617.1 ppl= 78.0551 ppl1= 92.0235
>
> I expanded it and got a word-level model:
>
> ngram -lm 2-gram.class.dd150.fold3.lm -classes class.dd150.fold3.defs
> -order 2 -write-lm 2-gram.class.dd150.expanded_exact.fold3.lm
> -expand-classes 2 -expand-exact 2 -vocab ../all.wlist
>
>
> But the new model gives a different result:
>
> ngram -ppl test.fold3.txt -lm
> 2-gram.class.dd150.expanded_exact.fold3.lm -order 2 -vocab ../all.wlist
>
> file test.fold3.txt: 1397 sentences, 37403 words, 0 OOVs
> 0 zeroprobs, logprob= -78108.4 ppl= 103.063 ppl1= 122.544
>
> You can see there are no more zeroprobs in the new one, which affects
> the perplexity.
>
>
> I can show you detailed output from both models:
>
> Class-based:
>
> <s> gruess gott frau traub </s>
> p( gruess | <s> ) = [OOV][2gram] 0.0167159 [ -1.77687 ]
> p( gott | gruess ...) = [OOV][1gram][OOV][2gram] 0.658525 [
> -0.181428 ]
> p( frau | gott ...) = [OOV][1gram][OOV][2gram] 0.119973 [
> -0.920917 ]
> p( traub | frau ...) = [OOV][OOV] 0 [ -inf ]
> p( </s> | traub ...) = [1gram] 0.0377397 [ -1.4232 ]
> 1 sentences, 4 words, 0 OOVs
> 1 zeroprobs, logprob= -4.30242 ppl= 11.9016 ppl1= 27.1731
>
>
> And the same sentence with expanded LM:
>
> <s> gruess gott frau traub </s>
> p( gruess | <s> ) = [2gram] 0.0167159 [ -1.77687 ]
> p( gott | gruess ...) = [2gram] 0.658525 [ -0.181428 ]
> p( frau | gott ...) = [2gram] 0.119973 [ -0.920917 ]
> p( traub | frau ...) = [1gram] 3.84699e-14 [ -13.4149 ]
> p( </s> | traub ...) = [1gram] 0.0377397 [ -1.4232 ]
> 1 sentences, 4 words, 0 OOVs
> 0 zeroprobs, logprob= -17.7173 ppl= 3495.1 ppl1= 26873.5
>
>
> From my point of view it looks like a computational error; such small
> probabilities should be treated as zero.
> BTW, how can zero probabilities appear there at all? They should be
> smoothed away, right?
>
> I divided my corpus into 10 folds and performed these steps on all of
> them. On 6 folds everything is fine and the perplexities are almost the
> same for both models, but on the other 4 I see this problem.
>
> I would greatly appreciate any help.
>
> Sincerely yours,
> Dmytro Prylipko.
>
> _______________________________________________
> SRILM-User site list
> SRILM-User at speech.sri.com
> http://www.speech.sri.com/mailman/listinfo/srilm-user
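A note for archive readers on the zeroprob arithmetic in the outputs above: ngram -ppl excludes zeroprob (and OOV) tokens from the perplexity denominator, so a zeroprob lowers the reported ppl rather than driving it to infinity. A minimal sketch of that formula (the srilm_ppl helper is hypothetical, not part of SRILM; the numbers are taken from the quoted outputs):

```python
def srilm_ppl(logprob10, sentences, words, oovs, zeroprobs):
    """Perplexity the way ngram -ppl reports it: OOV and zeroprob
    tokens are excluded from the denominator."""
    denom = words + sentences - oovs - zeroprobs
    return 10.0 ** (-logprob10 / denom)

# Class-based model: the 1 zeroprob drops "traub" from the count.
print(srilm_ppl(-4.30242, 1, 4, 0, 1))   # ~11.90, matches ppl= 11.9016

# Expanded model: the tiny smoothed probability (3.8e-14) stays in,
# dragging the average log probability way down.
print(srilm_ppl(-17.7173, 1, 4, 0, 0))   # ~3495, matches ppl= 3495.1
```

This is why the two models' perplexities are not directly comparable when the zeroprob counts differ.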