[SRILM User List] reproduce Penn Treebank KN5 results
Joris Pelemans
Joris.Pelemans at esat.kuleuven.be
Wed Jul 9 07:03:27 PDT 2014
Hi all,
I'm trying to reproduce some reported N-gram perplexity results on the
Penn Treebank with SRILM, but somehow my results always differ by a
large margin. Since I will be interpolating with these models and
comparing the interpolated model with others, I would really prefer to
start on the same level :-).
The data set I'm using is the one that comes with Mikolov's RNNLM
toolkit, which applies the same preprocessing as many LM papers,
including "Empirical Evaluation and Combination of Advanced Language
Modeling Techniques". In that paper, Mikolov et al. report a KN5
perplexity of 141.2. It's not entirely clear (1) whether they ignore OOV
words or simply use the <unk> probability, and (2) whether it's a
back-off or interpolated model, but I assume the latter, as it has
repeatedly been reported to work best. They do report using SRILM and no
count cut-offs.
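One thing I'm unsure about: if I read the ngram-count man page correctly,
SRILM applies default count cut-offs at the higher orders (minimum count 2
for trigrams and up), so "no count cut-offs" might actually require setting
those minimums to 1 explicitly, something along the lines of (output file
name arbitrary):

    ngram-count -order 5 -text data/penn/ptb.train.txt -lm models/ptb.train_5-gram_kn.arpa \
        -kndiscount -interpolate -gt3min 1 -gt4min 1 -gt5min 1

I haven't tried that variant yet, so please correct me if I'm misreading
the defaults.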
I have tried building the same model in many ways:
*regular:*
    ngram-count -order 5 -text data/penn/ptb.train.txt -lm models/ptb.train_5-gram_kn.arpa2 -kndiscount -interpolate

*open vocab:*
    ngram-count -order 5 -text data/penn/ptb.train.txt -lm models/ptb.train_5-gram_kn.arpa3 -kndiscount -interpolate -unk

*no sentence markers:*
    ngram-count -order 5 -text data/penn/ptb.train.txt -lm models/ptb.train_5-gram_kn.arpa4 -kndiscount -interpolate -no-sos -no-eos

*open vocab + no sentence markers:*
    ngram-count -order 5 -text data/penn/ptb.train.txt -lm models/ptb.train_5-gram_kn.arpa5 -kndiscount -interpolate -unk -no-sos -no-eos

*back-off (just in case):*
    ngram-count -order 5 -text data/penn/ptb.train.txt -lm models/ptb.train_5-gram_kn.arpa5 -kndiscount -unk
None of them, however, gives me a perplexity of 141.2:
<jpeleman at spchcl23:~/exp/025> ngram -ppl data/penn/ptb.test.txt -lm models/ptb.train_5-gram_kn.arpa2 -order 5
file data/penn/ptb.test.txt: 3761 sentences, 78669 words, 4794 OOVs
0 zeroprobs, logprob= -172723 ppl= 167.794 ppl1= 217.791

<jpeleman at spchcl23:~/exp/025> ngram -ppl data/penn/ptb.test.txt -lm models/ptb.train_5-gram_kn.arpa3 -order 5 -unk
file data/penn/ptb.test.txt: 3761 sentences, 78669 words, 0 OOVs
0 zeroprobs, logprob= -178859 ppl= 147.852 ppl1= 187.743

<jpeleman at spchcl23:~/exp/025> ngram -ppl data/penn/ptb.test.txt -lm models/ptb.train_5-gram_kn.arpa4 -order 5
file data/penn/ptb.test.txt: 3761 sentences, 78669 words, 4794 OOVs
0 zeroprobs, logprob= -179705 ppl= 206.4 ppl1= 270.74

<jpeleman at spchcl23:~/exp/025> ngram -ppl data/penn/ptb.test.txt -lm models/ptb.train_5-gram_kn.arpa5 -order 5 -unk
file data/penn/ptb.test.txt: 3761 sentences, 78669 words, 0 OOVs
0 zeroprobs, logprob= -186444 ppl= 182.746 ppl1= 234.414

<jpeleman at spchcl23:~/exp/025> ngram -ppl data/penn/ptb.test.txt -lm models/ptb.train_5-gram_kn.arpa5 -order 5 -unk
file data/penn/ptb.test.txt: 3761 sentences, 78669 words, 0 OOVs
0 zeroprobs, logprob= -181381 ppl= 158.645 ppl1= 202.127
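(For reference, my understanding of the ngram output is that
ppl = 10^(-logprob / (sentences + words - OOVs - zeroprobs)) and ppl1 is the
same without the sentence count; e.g. for the first run,
10^(172723 / (3761 + 78669 - 4794)) ≈ 167.8 and 10^(172723 / (78669 - 4794)) ≈ 217.8,
which matches. So in the closed-vocabulary runs the 4794 <unk> tokens are
skipped entirely, while in the -unk runs they are scored with the <unk>
probability, which I suppose means those perplexities aren't directly
comparable.)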
So... what am I missing here? 147.852 is close, but still not quite 141.2.
Joris