[SRILM User List] big difference between ppl and ppl1
Burkay Gur
burkay at mit.edu
Tue Dec 27 05:32:16 PST 2011
To get lower and more meaningful perplexities, I'd recommend dropping the -order 3 option and adding Kneser-Ney smoothing. Also make sure the corpora are not too small.
Sent from my iPad
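
For readers following the ppl versus ppl1 question in the quoted thread below: a minimal sketch of how the two figures relate, assuming the normalization described in the SRILM documentation (ppl counts one end-of-sentence token per sentence, ppl1 does not, and OOV and zero-probability words are excluded from both). The function name and the example counts are made up for illustration and are not taken from the original poster's data.

```python
def srilm_perplexities(logprob, words, sentences, oovs=0, zeroprobs=0):
    """Recompute (ppl, ppl1) from the summary counts that `ngram -ppl` prints.

    logprob is the total base-10 log probability of the test text.
    Assumed normalization:
      ppl  = 10 ** (-logprob / (words - oovs - zeroprobs + sentences))
      ppl1 = 10 ** (-logprob / (words - oovs - zeroprobs))
    """
    scored_words = words - oovs - zeroprobs
    if scored_words <= 0:
        raise ValueError("no in-vocabulary words were scored")
    ppl = 10 ** (-logprob / (scored_words + sentences))  # includes </s> tokens
    ppl1 = 10 ** (-logprob / scored_words)               # excludes </s> tokens
    return ppl, ppl1


# Hypothetical test set in which almost every word is out of vocabulary,
# as can happen when text in one language is scored by another language's model.
ppl, ppl1 = srilm_perplexities(logprob=-180.0, words=1000, sentences=100, oovs=990)
print(f"ppl = {ppl:.1f}, ppl1 = {ppl1:.3g}")
```

With counts like these, only a handful of words are actually scored, so the end-of-sentence tokens dominate the ppl denominator and ppl stays small while ppl1 explodes. Whether this is what happened in the Dutch-model, English-test case is an assumption; checking the OOV count in the `ngram -ppl` output would confirm or rule it out.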
On Dec 27, 2011, at 1:58 PM, Saman Noorzadeh <saman_2004 at yahoo.com> wrote:
> Yes, both of my texts have one sentence per line (though some sentences are a little long!).
> I tried the gtmax options, but the results were almost the same.
> The commands I use are as follows:
>
> to count:
> ngram-count -order 3 -write-vocab language.voc -text language_tain.txt -write language.bo
>
> to make the model:
> ngram-count -order 3 -read language.bo -lm language.BO -gt2min 1 -gt3min 2
>
> testing Perplexity:
> ngram -lm language.BO -ppl language_test.txt
>
> Thank you
> Saman
> From: Burkay Gur <burkay at MIT.EDU>
> To: Saman Noorzadeh <saman_2004 at yahoo.com>
> Cc: Srilm group <srilm-user at speech.sri.com>
> Sent: Tuesday, December 27, 2011 12:56 AM
> Subject: Re: [SRILM User List] big difference between ppl and ppl1
>
> Is your Dutch model arranged so that there is one sentence on each line? Also, which command are you using? I recommend using -gt1max 1 -gt2max 1 -gt3max 1 and -ukndiscount for Kneser-Ney smoothing. These will give you more accurate perplexities.
>
> -Burkay
>
> Sent from my iPad
>
> On Dec 27, 2011, at 6:26 AM, Saman Noorzadeh <saman_2004 at yahoo.com> wrote:
>
>>
>> I built two models for two languages, Dutch and English, to do language recognition.
>> I got the following perplexities:
>>
>> Model: Dutch    Test: English  ppl: 55   ppl1: 2*10^18
>> Model: Dutch    Test: Dutch    ppl: 303  ppl1: 400
>> Model: English  Test: Dutch    ppl: 600  ppl1: 3122
>> Model: English  Test: English  ppl: 227  ppl1: 1897
>>
>> I think it is reasonable to get a large perplexity when the model and the test text are in different languages, but why is ppl = 55 with a Dutch model and an English test?
>> Also, why is there such a BIG difference between ppl and ppl1?
>>
>> Thanks in advance
>>
>>
>> _______________________________________________
>> SRILM-User site list
>> SRILM-User at speech.sri.com
>> http://www.speech.sri.com/mailman/listinfo/srilm-user