FW: A simple question about SRILM

Mon May 17 11:25:52 PDT 2004

Hi,

I have the same problem. I want the LM to give maximum-likelihood estimates.
That is, all the backoff weights should be zero.
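
To make "maximum-likelihood estimates" concrete, here is a minimal Python sketch, independent of SRILM. It computes each trigram probability as its count divided by the count of its two-word context. The function name `ml_trigram_probs` and the toy tag sequences are hypothetical, and the sketch ignores sentence-boundary handling.

```python
from collections import Counter

def ml_trigram_probs(sentences):
    """Return {(w1, w2, w3): count(w1 w2 w3) / count(w1 w2 as a trigram context)}."""
    trigram_counts = Counter()
    context_counts = Counter()
    for tokens in sentences:
        for i in range(len(tokens) - 2):
            w1, w2, w3 = tokens[i:i + 3]
            trigram_counts[(w1, w2, w3)] += 1
            context_counts[(w1, w2)] += 1
    # Relative frequency only: no discounting, so singletons keep their mass.
    return {tri: c / context_counts[tri[:2]] for tri, c in trigram_counts.items()}

# Hypothetical toy corpus of tag sequences (not taken from corpus.tags).
toy = [["AT", "NN", "VB"], ["AT", "NN", "VB"], ["AT", "NN", "IN"]]
probs = ml_trigram_probs(toy)
```

In this toy corpus the trigram "AT NN IN" occurs once and still receives probability 1/3, which is the behavior I am after.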

I applied the solution below, but I still get backoff weights.

For example, when I build the lm like this:
ngram-count -order 3 -gt1max 0 -gt2max 0 -gt3max 0 -text corpus.tags -lm corpus.tags.lm

I found that the once-occurring trigrams DO NOT APPEAR in the LM, so probability mass is still discounted.

When I turned on the debug messages, I saw many messages like: 
warning: 0 backoff probability mass left for "AT SCLN" -- incrementing denominator 
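
The wording of this warning is consistent with the arithmetic of unsmoothed estimates: if every observed trigram in a context keeps its relative frequency, the probabilities for that context sum to 1 and nothing remains for backoff. The Python sketch below illustrates only that arithmetic. The helper `leftover_mass`, the tags following "AT SCLN", and the probabilities are invented for illustration, and the sketch does not reproduce SRILM's internal computation.

```python
from collections import defaultdict

def leftover_mass(trigram_probs):
    """For each two-word context, return 1 minus the summed trigram probabilities."""
    seen = defaultdict(float)
    for (w1, w2, _w3), p in trigram_probs.items():
        seen[(w1, w2)] += p
    return {ctx: 1.0 - total for ctx, total in seen.items()}

# Hypothetical context "AT SCLN" with counts 3 and 1 out of 4.
ml_model = {("AT", "SCLN", "NN"): 0.75, ("AT", "SCLN", "VB"): 0.25}
# Same context if the once-occurring trigram were left out of the model.
pruned_model = {("AT", "SCLN", "NN"): 0.75}

ml_left = leftover_mass(ml_model)[("AT", "SCLN")]
pruned_left = leftover_mass(pruned_model)[("AT", "SCLN")]
```

Here `ml_left` is 0.0 and `pruned_left` is 0.25, so dropping the singleton is what frees mass for the backoff weight.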

Does it mean that smoothing is enforced here?

Is there a way to get a pure maximum-likelihood language model, without backoff weights at all, using ngram-count?

Thanks,
Roy.
> -----Original Message-----
> From: owner-srilm-user at speech.sri.com 
> [mailto:owner-srilm-user at speech.sri.com] On Behalf Of Andreas Stolcke
> Sent: Tuesday, April 06, 2004 6:34 PM
> To: David Picó
> Cc: srilm-user at speech.sri.com; Jorge González
> Subject: Re: A simple question about SRILM 
> 
> 
> 
> The ngram-count man page says
> 
>        -gtnmax count
>               where n is 1, 2, 3, 4, 5, 6, 7, 8, or 9. Set the maximal
>               count of N-grams of order n that are discounted under
>               Good-Turing. All N-grams more frequent than that will
>               receive maximum likelihood estimates. Discounting can be
>               effectively disabled by setting this to 0.
> 
> Therefore, you can disable smoothing with 
> 
> 	ngram-count -gt1max 0 -gt2max 0 -gt3max 0 ...
> 
> --Andreas
> 
> In message <40726957.3070101 at dsic.upv.es> you wrote:
> > Hello,
> > 
> > I also have a little question about SRILM. How can I infer a trigram (or
> > bigram, or tetragram...) with no smoothing at all? I need to do some
> > experiments to check the effect of n-gram smoothing in my models and I
> > need a pure trigram with no probability mass derived to lower levels. Is
> > this possible in SRILM? I need to be sure that I really get a trigram
> > (with the whole trigram probabilities).
> > 
> > Thank you very much in advance for your help and attention! David
> > 
> > --
> > David Picó-Vila
> > Universitat Politècnica de València
> > Departament de Sistemes Informàtics i Computació
> > València, Spain
> > 
> 
>