[SRILM User List] How do you calculate perplexity given a test sentence?
Andreas Stolcke
stolcke at icsi.berkeley.edu
Fri May 18 13:45:48 PDT 2012
On 5/18/2012 10:39 AM, Burkay Gur wrote:
> This is still not clear to me. When we calculate the perplexity of a
> language model alone, we just take p as the language model itself.
> This tells us how perplexed that language model is.
>
> This is H(p) = - Sum_i(p_i*log(p_i))
>
> Now when we introduce a test sentence, I am not sure what we are
> calculating. In your example you do not mention q in the equation.
>
> H(p,q) = -Sum_i(p_i * log(q_i))
First, exchange p and q, if p is your LM, so you have
H(p,q) = -Sum_i(q_i * log(p_i))
q_i is approximated by the empirical distribution of words in the test
data. So effectively, q_i = number of occurrences of word i / length
of test corpus.
Of course for many (most) words q_i will be zero (they don't occur in
the test data).
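A minimal sketch of that empirical distribution (the test corpus and tokens
here are made up for illustration; they are not from SRILM):

```python
from collections import Counter

# Hypothetical tokenized test corpus.
test_tokens = ["the", "cat", "sat", "on", "the", "mat"]

# Empirical distribution: q_i = occurrences of word i / length of test corpus.
counts = Counter(test_tokens)
n = len(test_tokens)
q = {word: c / n for word, c in counts.items()}

# "the" occurs 2 times out of 6 tokens, so q["the"] = 2/6.
# Any vocabulary word absent from the test data implicitly gets q_i = 0,
# which is why those terms drop out of the sum below.
```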
With this approximation you get
H(p,q) = - (1/N) Sum_j log (p_j)
where j now ranges over the N word occurrences (tokens, not types) in the
test set, and p_j is the probability of the j-th word.
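Putting it together: averaging the negative log probabilities of the test
tokens gives the cross-entropy, and exponentiating gives the perplexity. A
small self-contained sketch (the probabilities are invented for
illustration; a real LM would supply them):

```python
import math

def perplexity(token_probs):
    """Cross-entropy H(p,q) = -(1/N) * sum_j log2(p_j) over the N test
    tokens, then perplexity = 2 ** H (base 2, i.e. entropy in bits)."""
    n = len(token_probs)
    h = -sum(math.log2(p) for p in token_probs) / n
    return 2 ** h

# Hypothetical per-token probabilities assigned by some LM to a 4-token
# test sentence:
probs = [0.25, 0.5, 0.125, 0.25]
# H = (2 + 1 + 3 + 2) / 4 = 2 bits, so perplexity = 2**2 = 4
```

The base of the logarithm is a free choice as long as the same base is used
for exponentiation; base 10 or e gives the same perplexity value.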
Andreas