perplexity evaluation

Valsan, Zica valsan at
Wed Dec 4 00:21:31 PST 2002

Thank you for your prompt answer.
I understand that </s> is taken into account, but my question is why
only it, and not the other one (<s>) as well? I have read papers where people
adopt this strategy (counting only one of them), but the reason for doing so
is not clear to me.

Regarding the CMU toolkit, I did not say that it outputs no probabilities
for these context cues, but that it outputs the same very small value for
each of them (-98.999, very close to the values output by the SRILM toolkit).
I think this is effectively "equivalent" to saying that they are not taken
into account in the perplexity computation.


-----Original Message-----
From: Andreas Stolcke [mailto:stolcke at]
Sent: Tuesday, December 3, 2002 17:48
To: Valsan, Zica
Cc: 'srilm-user at'
Subject: Re: perplexity evaluation 

In message <B0793DB946E52942A49C1E8152A1358C8E3781 at> you wrote:
> Hi all,
> I'm a new user of the toolkit and I need a little support in order to
> understand how the perplexity is computed and why it differs from the
> expected value.
> For instance, I have training data in the file train.text, which contains
> only one line:
> <s> a b c </s>
> and a vocabulary (train.vocab) that contains all these words. I want
> to build a unigram-only LM and evaluate it on the same training data,
> without applying any discounting strategy.
> Here are the commands I used:
> ngram-count -order 1 -vocab train.vocab -text train.text -lm
> 0
> ngram -lm -debug 2 -vocab train.vocab -ppl train.text > out.ppl
> So, according to the theory, the expected perplexity is PP=3 if
> the context cues are not taken into account. This is also what one gets
> with the CMU toolkit.
> With this toolkit and the above commands, what I actually get is PP=4.
> Looking inside the generated ARPA model, I can see that </s> has the
> same probability as any of the real words (a, b, c).
> Could anybody explain why this is so? Did I make a mistake, or is
> there something I am missing?

You didn't make a mistake and this is the right answer as far as I can tell.
</s> needs to get a probability in order to be able to compute 
a probability for the whole "sentence".

Are you saying that the CMU software doesn't give any probability to </s>?
That would be quite odd.

Maybe someone on this list who is more familiar with the CMU toolkit can
contribute an explanation.
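
For reference, the two perplexity values discussed above (3 and 4) can be
reproduced with a small sketch. This is not the code of SRILM or of the CMU
toolkit; it only assumes undiscounted maximum-likelihood unigram estimates,
that <s> is never a predicted event, and that </s> is either counted as a
predicted event or left out. The function name unigram_perplexity and its
count_end_token parameter are made up for this illustration.

```python
import math
from collections import Counter


def unigram_perplexity(sentences, count_end_token=True):
    """Perplexity of an undiscounted ML unigram model on its own training data.

    Each sentence is a string of words WITHOUT <s> and </s> markers.
    <s> is never predicted; </s> is predicted only if count_end_token is True.
    (Hypothetical helper for illustration, not a toolkit API.)
    """
    events = []
    for sentence in sentences:
        words = sentence.split()
        if count_end_token:
            words = words + ["</s>"]
        events.extend(words)

    counts = Counter(events)
    total = sum(counts.values())

    # Sum of log10 probabilities over all predicted events.
    log_prob = sum(math.log10(counts[w] / total) for w in events)

    # Perplexity = 10 ** (-average log10 probability per predicted event).
    return 10 ** (-log_prob / len(events))


corpus = ["a b c"]
print(unigram_perplexity(corpus, count_end_token=True))   # approximately 4
print(unigram_perplexity(corpus, count_end_token=False))  # approximately 3
```

With </s> counted, there are four predicted events (a, b, c, </s>), each with
probability 1/4, which gives PP=4. Without it, there are three events with
probability 1/3 each, which gives PP=3.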

