FW: A simple question about SRILM
Roy Bar Haim
barhaim at cs.technion.ac.il
Mon May 17 11:25:52 PDT 2004
Hi,
I have the same problem. I want the LM to give maximum-likelihood estimates.
That is, all the backoff weights should be zero.
I applied the solution below, but I still get backoff weights.
For example, when I build the LM like this:
ngram-count -order 3 -gt1max 0 -gt2max 0 -gt3max 0 -text corpus.tags -lm corpus.tags.lm
I find that the once-occurring trigrams DO NOT APPEAR in the LM, so probability mass is still being discounted.
When I turned on the debug messages, I saw many messages like:
warning: 0 backoff probability mass left for "AT SCLN" -- incrementing denominator
Does this mean that smoothing is still being enforced here?
Is there a way to get a pure maximum-likelihood language model, without backoff weights at all, using ngram-count?
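To make the target concrete: a pure maximum-likelihood trigram estimate is just the relative frequency count(w1 w2 w3) / count(w1 w2), with no mass held back for backoff. The following is a minimal Python sketch of that computation, independent of SRILM. The function name, the two-token <s> padding, and the toy tag corpus are illustrative assumptions, not anything taken from ngram-count.

```python
from collections import Counter

def mle_trigram_probs(sentences):
    """Return {(w1, w2, w3): count(w1 w2 w3) / count(w1 w2)} with no discounting."""
    trigram_counts = Counter()
    history_counts = Counter()
    for sentence in sentences:
        # Assumed padding convention: two start symbols, one end symbol.
        tokens = ["<s>", "<s>"] + sentence.split() + ["</s>"]
        for i in range(2, len(tokens)):
            trigram = (tokens[i - 2], tokens[i - 1], tokens[i])
            trigram_counts[trigram] += 1
            history_counts[trigram[:2]] += 1
    # Relative frequency: every observed trigram keeps its full share,
    # including trigrams that occur only once.
    return {tri: c / history_counts[tri[:2]] for tri, c in trigram_counts.items()}

# Hypothetical toy corpus of tag sequences, one sentence per string.
toy_corpus = ["AT NN VB", "AT NN SCLN", "AT JJ NN"]
probs = mle_trigram_probs(toy_corpus)
```

In this sketch the probabilities for each observed history sum to exactly 1, so a model built this way would have nothing left over to assign to backoff, and singleton trigrams such as "AT NN VB" are retained rather than dropped.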
Thanks,
Roy.
> -----Original Message-----
> From: owner-srilm-user at speech.sri.com
> [mailto:owner-srilm-user at speech.sri.com] On Behalf Of Andreas Stolcke
> Sent: Tuesday, April 06, 2004 6:34 PM
> To: David Picó
> Cc: srilm-user at speech.sri.com; Jorge González
> Subject: Re: A simple question about SRILM
>
>
>
> The ngram-count man page says
>
> -gtnmax count
>     where n is 1, 2, 3, 4, 5, 6, 7, 8, or 9. Set the maximal count of
>     N-grams of order n that are discounted under Good-Turing. All
>     N-grams more frequent than that will receive maximum likelihood
>     estimates. Discounting can be effectively disabled by setting
>     this to 0.
>
> Therefore, you can disable smoothing with
>
> ngram-count -gt1max 0 -gt2max 0 -gt3max 0 ...
>
> --Andreas
>
> In message <40726957.3070101 at dsic.upv.es> you wrote:
> > Hello,
> >
> > I also have a little question about SRILM. How can I infer a trigram
> > (or bigram, or tetragram...) with no smoothing at all? I need to do some
> > experiments to check the effect of n-gram smoothing in my models and I
> > need a pure trigram with no probability mass derived to lower levels. Is
> > this possible in SRILM? I need to be sure that I really get a trigram
> > (with the whole trigram probabilities).
> >
> > Thank you very much in advance for your help and attention! David
> >
> > --
> > David Picó-Vila
> > Universitat Politècnica de València
> > Departament de Sistemes Informàtics i Computació
> > València, Spain
> >
>
>