[SRILM User List] big difference between ppl and ppl1

Saman Noorzadeh saman_2004 at yahoo.com
Tue Dec 27 03:58:34 PST 2011


Yes, both of my texts have one sentence per line (though some sentences are a little long!).
I tried the gtmax options, but the results were almost the same.
The commands I use are as follows:

to count:

ngram-count -order 3 -write-vocab language.voc -text language_tain.txt -write language.bo

to make the model:

ngram-count -order 3 -read language.bo -lm language.BO -gt2min 1 -gt3min 2


testing Perplexity:

ngram -lm language.BO -ppl language_test.txt 
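
For reference, as far as I understand the SRILM documentation, the two figures printed by ngram -ppl come from the same total log10 probability and differ only in the denominator: ppl also counts one end-of-sentence token per sentence, while ppl1 counts only the scored words (words minus OOVs minus zeroprobs). The sketch below recomputes both values from those counts. The function name ppl_pair and the numbers passed to it are made up for illustration and are not part of SRILM or of this thread.

```shell
# Hypothetical helper (not an SRILM tool): recompute ppl and ppl1 from the
# counts that "ngram -ppl" reports.
# usage: ppl_pair LOGPROB WORDS OOVS ZEROPROBS SENTENCES
ppl_pair() {
    awk -v lp="$1" -v w="$2" -v oov="$3" -v zp="$4" -v s="$5" 'BEGIN {
        scored = w - oov - zp              # words that actually received a probability
        if (scored <= 0) { print "no scored words"; exit 1 }
        # logprob is base 10, so 10^x is computed as exp(x * log(10))
        ppl  = exp(-lp / (scored + s) * log(10))   # end-of-sentence tokens counted
        ppl1 = exp(-lp / scored * log(10))         # end-of-sentence tokens not counted
        printf "%.2f %.2f\n", ppl, ppl1
    }'
}

# Made-up example: 100 one-word sentences, no OOVs, logprob = -200
ppl_pair -200 100 0 0 100    # prints: 10.00 100.00
```

When the number of scored words is small compared with the number of sentences (for example, if most test words are out of vocabulary for the model), the two denominators differ a lot, which may be one reason the two values end up far apart.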


Thank you
Saman


________________________________
 From: Burkay Gur <burkay at MIT.EDU>
To: Saman Noorzadeh <saman_2004 at yahoo.com> 
Cc: Srilm group <srilm-user at speech.sri.com> 
Sent: Tuesday, December 27, 2011 12:56 AM
Subject: Re: [SRILM User List] big difference between ppl and ppl1
 

Is your Dutch model arranged so that there is one sentence on each line? Also, which command are you using? I recommend using -gt1max 1 -gt2max 1 -gt3max 1, and -ukndiscount for Kneser-Ney smoothing. These will give you more accurate perplexities.

-Burkay


On Dec 27, 2011, at 6:26 AM, Saman Noorzadeh <saman_2004 at yahoo.com> wrote:



>
>I built two language models, one for Dutch and one for English, for language recognition.
>I got the following perplexities:
>
>
>Model: Dutch     Test: English   ppl: 55    ppl1: 2 * 10^18
>Model: Dutch     Test: Dutch     ppl: 303   ppl1: 400
>Model: English   Test: Dutch     ppl: 600   ppl1: 3122
>Model: English   Test: English   ppl: 227   ppl1: 1897
>
>
>I think it is reasonable to get a large perplexity when the model and the test are in different languages, but why is ppl = 55 with a Dutch model and an English test?
>
>And why is there such a big difference between ppl and ppl1?
>
>
>Thanks in advance
>
>
>
>
>
_______________________________________________
>SRILM-User site list
>SRILM-User at speech.sri.com
>http://www.speech.sri.com/mailman/listinfo/srilm-user