[SRILM User List] reproduce Penn Treebank KN5 results
Joris Pelemans
Joris.Pelemans at esat.kuleuven.be
Thu Jul 10 01:43:57 PDT 2014
Hi Siva,
Thanks a lot! With these arguments the perplexity is very close to the
reported 141.2 (though still not exactly the same):
<jpeleman at spchcl23:~/exp/025> ngram-count -order 5 -text
data/penn/ptb.train.txt -lm models/ptb.train_5-gram_kn.arpa7 -kndiscount
-interpolate -unk -gt3min 1 -gt4min 1
<jpeleman at spchcl23:~/exp/025> ngram -ppl data/penn/ptb.test.txt -lm
models/ptb.train_5-gram_kn.arpa7 -order 5 -unk
file data/penn/ptb.test.txt: 3761 sentences, 78669 words, 0 OOVs
0 zeroprobs, logprob= -177278 ppl= *141.464* ppl1= 179.251
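For reference, ngram's ppl and ppl1 can be recomputed from the logprob line above. This is a quick sanity check, assuming SRILM's usual definitions (ppl normalizes over words plus sentence-end tokens, ppl1 over words only); the tiny discrepancy against 141.464 comes from logprob being printed rounded:

```python
import math

# Figures copied from the ngram -ppl output above.
sentences, words, oovs, zeroprobs = 3761, 78669, 0, 0
logprob = -177278.0  # base-10 log probability (rounded in the output)

# ppl counts the </s> tokens (one per sentence); ppl1 does not.
denom = words - oovs - zeroprobs + sentences
ppl = 10 ** (-logprob / denom)
ppl1 = 10 ** (-logprob / (words - oovs - zeroprobs))
print(f"ppl = {ppl:.3f}, ppl1 = {ppl1:.3f}")
```

This reproduces roughly 141.47 and 179.25, matching the reported 141.464 / 179.251 up to rounding of the printed logprob.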
I wonder about the value of experiments that include <unk> in the
perplexity calculation. Doesn't it make the problem considerably easier
(predicting one huge class is not hard - imagine mapping all words to
<unk>) and thus yield misleadingly low perplexities?
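The effect is easy to see with a toy unigram model (a hedged sketch, not SRILM's actual computation): merging several rare words into a single <unk> class pools their probability mass, so each <unk> token in the test set receives a much higher probability and perplexity drops accordingly.

```python
import math

# Toy training corpus: 90 tokens of "the" plus 10 distinct rare words (1 each).
counts_full = {"the": 90}
counts_full.update({f"rare{i}": 1 for i in range(10)})
# Open-vocab variant: the 10 rare words are all mapped to a single <unk>.
counts_unk = {"the": 90, "<unk>": 10}

def ppl(counts, test):
    """Unigram (MLE) perplexity of a test token sequence."""
    n = sum(counts.values())
    logp = sum(math.log(counts[w] / n) for w in test)
    return math.exp(-logp / len(test))

test_full = ["rare0", "rare1"]  # rare words scored as themselves: p = 1/100 each
test_unk = ["<unk>", "<unk>"]   # same tokens mapped to <unk>:     p = 10/100 each
print(ppl(counts_full, test_full))  # 100.0
print(ppl(counts_unk, test_unk))    # 10.0
```

A tenfold drop in perplexity here reflects only the class merge, not any better modeling - which is exactly the concern.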
Joris
On 07/09/14 16:24, Siva Reddy Gangireddy wrote:
> Hi Joris,
>
> Use the count cut-offs like this.
>
> ngram-count -order 5 -text ptb.train.txt -lm templm -kndiscount
> -interpolate -unk -gt3min 1 -gt4min 1
> ngram -ppl ptb.test.txt -lm templm -order 5 -unk
>
> By default, SRILM applies higher count cut-offs (a minimum count of
> 2) to trigrams and above; -gt3min 1 -gt4min 1 turns those off.
>
> ---
> Siva
>
>
>
> On Wed, Jul 9, 2014 at 11:03 PM, Joris Pelemans
> <Joris.Pelemans at esat.kuleuven.be
> <mailto:Joris.Pelemans at esat.kuleuven.be>> wrote:
>
> Hi all,
>
> I'm trying to reproduce some reported N-gram perplexity results on
> the Penn Treebank with SRILM, but somehow my results always differ
> by a large margin. Since I will be interpolating with these models
> and comparing the interpolated model with others, I would really
> prefer to start from the same baseline :-).
>
> The data set I'm using is the one that comes with Mikolov's RNNLM
> toolkit and applies the same data preprocessing as many LM papers,
> including "Empirical Evaluation and Combination of Advanced
> Language Modeling Techniques". In that paper, Mikolov et al.
> report a KN5 perplexity of 141.2. It's not entirely clear (1)
> whether they ignore OOV words or simply use the <unk> probability,
> and (2) whether it's a back-off or an interpolated model, but I
> assume the latter, as it has repeatedly been reported as best.
> They do report using SRILM with no count cut-offs.
>
> I have tried building the same model in many ways:
>
> *regular:* ngram-count -order 5 -text data/penn/ptb.train.txt -lm
> models/ptb.train_5-gram_kn.arpa2 -kndiscount -interpolate
> *open vocab:* ngram-count -order 5 -text data/penn/ptb.train.txt
> -lm models/ptb.train_5-gram_kn.arpa3 -kndiscount -interpolate -unk
> *no sentence markers:* ngram-count -order 5 -text
> data/penn/ptb.train.txt -lm models/ptb.train_5-gram_kn.arpa4
> -kndiscount -interpolate -no-sos -no-eos
> *open vocab + no sentence markers:* ngram-count -order 5 -text
> data/penn/ptb.train.txt -lm models/ptb.train_5-gram_kn.arpa5
> -kndiscount -interpolate -unk -no-sos -no-eos
> *back-off (just in case):* ngram-count -order 5 -text
> data/penn/ptb.train.txt -lm models/ptb.train_5-gram_kn.arpa5
> -kndiscount -unk
>
> None of them, however, gives me a perplexity of 141.2:
>
> <jpeleman at spchcl23:~/exp/025> ngram -ppl data/penn/ptb.test.txt
> -lm models/ptb.train_5-gram_kn.arpa2 -order 5
> file data/penn/ptb.test.txt: 3761 sentences, 78669 words, 4794 OOVs
> 0 zeroprobs, logprob= -172723 ppl= 167.794 ppl1= 217.791
>
> <jpeleman at spchcl23:~/exp/025> ngram -ppl data/penn/ptb.test.txt
> -lm models/ptb.train_5-gram_kn.arpa3 -order 5 -unk
> file data/penn/ptb.test.txt: 3761 sentences, 78669 words, 0 OOVs
> 0 zeroprobs, logprob= -178859 ppl= 147.852 ppl1= 187.743
>
> <jpeleman at spchcl23:~/exp/025> ngram -ppl data/penn/ptb.test.txt
> -lm models/ptb.train_5-gram_kn.arpa4 -order 5
> file data/penn/ptb.test.txt: 3761 sentences, 78669 words, 4794 OOVs
> 0 zeroprobs, logprob= -179705 ppl= 206.4 ppl1= 270.74
>
> <jpeleman at spchcl23:~/exp/025> ngram -ppl data/penn/ptb.test.txt
> -lm models/ptb.train_5-gram_kn.arpa5 -order 5 -unk
> file data/penn/ptb.test.txt: 3761 sentences, 78669 words, 0 OOVs
> 0 zeroprobs, logprob= -186444 ppl= 182.746 ppl1= 234.414
>
> <jpeleman at spchcl23:~/exp/025> ngram -ppl data/penn/ptb.test.txt
> -lm models/ptb.train_5-gram_kn.arpa5 -order 5 -unk
> file data/penn/ptb.test.txt: 3761 sentences, 78669 words, 0 OOVs
> 0 zeroprobs, logprob= -181381 ppl= 158.645 ppl1= 202.127
>
> So... what am I missing here? 147.852 is close, but still not
> quite 141.2.
>
> Joris
>
> _______________________________________________
> SRILM-User site list
> SRILM-User at speech.sri.com <mailto:SRILM-User at speech.sri.com>
> http://www.speech.sri.com/mailman/listinfo/srilm-user