[SRILM User List] big difference between ppl and ppl1

Burkay Gur burkay at mit.edu
Tue Dec 27 05:32:16 PST 2011


To get lower and more meaningful perplexities, I'd recommend dropping the -order 3 and adding Kneser-Ney smoothing. Also make sure the corpora are not too small.
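
On the ppl versus ppl1 gap asked about in this thread: as far as the SRILM FAQ describes it, both numbers come from the same total log probability and differ only in the denominator. ppl counts one end-of-sentence token per sentence, ppl1 does not, and OOV words are left out of both. The sketch below reproduces that arithmetic in Python. The function name and the counts in the example are made up for illustration and are not taken from the results quoted below.

```python
def srilm_perplexities(logprob, words, oovs, sentences, zeroprobs=0):
    """Compute (ppl, ppl1) from the summary counts that `ngram -ppl` prints.

    logprob is the total base-10 log probability of the test text.
    Assumes at least one in-vocabulary word was scored.
    """
    # OOV and zero-probability words are excluded from the word count.
    scored_words = words - oovs - zeroprobs
    if scored_words <= 0:
        raise ValueError("no scored words; ppl1 is undefined")
    # ppl normalizes by scored words plus one </s> token per sentence.
    ppl = 10 ** (-logprob / (scored_words + sentences))
    # ppl1 normalizes by scored words only.
    ppl1 = 10 ** (-logprob / scored_words)
    return ppl, ppl1


# Hypothetical mismatched-language case: 100 sentences, 2000 words,
# 1990 of them out of vocabulary.
ppl, ppl1 = srilm_perplexities(logprob=-190.0, words=2000, oovs=1990, sentences=100)
print(f"ppl = {ppl:.1f}, ppl1 = {ppl1:.3g}")
```

When nearly every test word is OOV, the ppl denominator is dominated by end-of-sentence tokens, so ppl can look deceptively low while ppl1 explodes. Checking the OOV count that ngram -ppl reports is a reasonable first step before comparing models.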

On Dec 27, 2011, at 1:58 PM, Saman Noorzadeh <saman_2004 at yahoo.com> wrote:

> Yes, both of my texts have one sentence per line (though some sentences are a little long).
> I tried the gtmax options, but the results were almost the same.
> The commands I use are as follows:
> 
> to count:
> ngram-count -order 3 -write-vocab language.voc -text language_train.txt -write language.bo
> 
> to make the model:
> ngram-count -order 3 -read language.bo -lm language.BO -gt2min 1 -gt3min 2
> 
> testing Perplexity:
> ngram -lm language.BO -ppl language_test.txt 
> 
> Thank you
> Saman
> From: Burkay Gur <burkay at MIT.EDU>
> To: Saman Noorzadeh <saman_2004 at yahoo.com> 
> Cc: Srilm group <srilm-user at speech.sri.com> 
> Sent: Tuesday, December 27, 2011 12:56 AM
> Subject: Re: [SRILM User List] big difference between ppl and ppl1
> 
> Is the text for your Dutch model arranged so that there is one sentence on each line? Also, which command are you using? I recommend using -gt1max 1 -gt2max 1 -gt3max 1, and -ukndiscount for Kneser-Ney smoothing. These should give you more accurate perplexities.
> 
> -Burkay
> 
> On Dec 27, 2011, at 6:26 AM, Saman Noorzadeh <saman_2004 at yahoo.com> wrote:
> 
>> 
>> I built two language models, one for Dutch and one for English, to do language recognition.
>> I got the following perplexities:
>> 
>> Model: Dutch      Test: English    ppl: 55     ppl1: 2 * 10^18
>> Model: Dutch      Test: Dutch      ppl: 303    ppl1: 400
>> Model: English    Test: Dutch      ppl: 600    ppl1: 3122
>> Model: English    Test: English    ppl: 227    ppl1: 1897
>> 
>> I think it is reasonable to get a large perplexity when the model and the test text are in different languages, but why is ppl = 55 with a Dutch model and an English test?
>> And why is there such a big difference between ppl and ppl1?
>> 
>> Thanks in advance
>> 
>> 
>> _______________________________________________
>> SRILM-User site list
>> SRILM-User at speech.sri.com
>> http://www.speech.sri.com/mailman/listinfo/srilm-user
> 