[SRILM User List] big difference between ppl and ppl1
Saman Noorzadeh
saman_2004 at yahoo.com
Tue Dec 27 03:58:34 PST 2011
Yes, both of my texts have one sentence per line (though some sentences are a little long!).
I tried the gtmax options, but the results were almost the same.
The commands I use are as follows:
To count:
ngram-count -order 3 -write-vocab language.voc -text language_tain.txt -write language.bo
To make the model:
ngram-count -order 3 -read language.bo -lm language.BO -gt2min 1 -gt3min 2
To test perplexity:
ngram -lm language.BO -ppl language_test.txt
Thank you
Saman
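
For readers following the thread, here is a minimal sketch of how the two figures printed by `ngram -ppl` relate to each other. It assumes SRILM's documented definitions: ppl normalizes the total log10 probability by (words - OOVs - zeroprobs + sentences), so it counts the end-of-sentence tokens, while ppl1 normalizes by (words - OOVs - zeroprobs) only. The function name and the counts in the example are hypothetical and were chosen only for illustration; they are not taken from the runs reported in this thread.

```python
def srilm_perplexities(logprob, words, sentences, oovs=0, zeroprobs=0):
    """Return (ppl, ppl1) from the summary counts that `ngram -ppl` reports.

    logprob is the total log10 probability of the test text.
    OOV and zero-probability words are excluded from both denominators.
    """
    scored_words = words - oovs - zeroprobs
    if scored_words <= 0:
        raise ValueError("no scored words: ppl1 is undefined")
    # ppl also counts one end-of-sentence token per sentence
    ppl = 10 ** (-logprob / (scored_words + sentences))
    # ppl1 counts only the scored words
    ppl1 = 10 ** (-logprob / scored_words)
    return ppl, ppl1


if __name__ == "__main__":
    # Hypothetical test set: 1000 sentences, 20000 words, 19900 of them OOV
    ppl, ppl1 = srilm_perplexities(-1914.0, words=20000, sentences=1000, oovs=19900)
    print(f"ppl = {ppl:.1f}, ppl1 = {ppl1:.3g}")
```

With counts like these, almost every scored event is an end-of-sentence token, so ppl stays small while ppl1 becomes enormous. Something similar may happen when a test text in one language is scored against a model of another language and most of its words are OOV; checking the OOV count in the `ngram -ppl` output would confirm or rule this out.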
________________________________
From: Burkay Gur <burkay at MIT.EDU>
To: Saman Noorzadeh <saman_2004 at yahoo.com>
Cc: Srilm group <srilm-user at speech.sri.com>
Sent: Tuesday, December 27, 2011 12:56 AM
Subject: Re: [SRILM User List] big difference between ppl and ppl1
Is your Dutch text arranged so that there is one sentence on each line? Also, which commands are you using? I recommend using -gt1max 1 -gt2max 1 -gt3max 1 and -ukndiscount for Kneser-Ney smoothing. These will give you more accurate perplexities.
-Burkay
On Dec 27, 2011, at 6:26 AM, Saman Noorzadeh <saman_2004 at yahoo.com> wrote:
>
>I built two models, one for Dutch and one for English, to do language recognition.
>I got the following perplexities:
>
>
>Model: Dutch    Test: English   ppl: 55    ppl1: 2 * 10^18
>Model: Dutch    Test: Dutch     ppl: 303   ppl1: 400
>Model: English  Test: Dutch     ppl: 600   ppl1: 3122
>Model: English  Test: English   ppl: 227   ppl1: 1897
>
>
>I think it is reasonable to get a large perplexity when the model and the test text are in different languages, but why is ppl = 55 with a Dutch model and an English test?
>
>And why is there such a BIG difference between ppl and ppl1?
>
>
>Thanks in advance
_______________________________________________
>SRILM-User site list
>SRILM-User at speech.sri.com
>http://www.speech.sri.com/mailman/listinfo/srilm-user