[SRILM User List] Interpolate trigram Probabilities to an n-gram LM

Andreas Stolcke stolcke at icsi.berkeley.edu
Tue Sep 24 12:04:00 PDT 2013


On 9/23/2013 1:33 PM, Md. Akmal Haidar wrote:
> Hi,
>
> 1. Is it possible to interpolate some trigram probabilities (say they 
> are in file t.txt) with an n-gram LM ?
> SRILM gives results with the warning (no bow for prefix of trigram of 
> t.txt).
> -lm n-gram.lm -lambda .9 -mix-lm t.txt -ppl test.txt
> 2. When the trigram probabilities in t.txt change (newt.txt), the 
> results are exactly the same as above.
> -lm n-gram.lm -lambda .9 -mix-lm newt.txt -ppl test.txt
>
> Is the above interpolation OK? Are there any other methods that are 
> required to interpolate these trigram probabilities into an n-gram LM?

The above would be fine if newt.txt contained a well-formed LM, but the 
format you generated is incomplete.
As implied by the warning message, for each trigram "a b c" you also need 
the history portion ("a b") to be included as a bigram.
Therefore, you should include a line

-99    a b     0

for every such history (plus the appropriate ngram count information in 
the header). You also need a unigram section containing all words of 
your vocabulary, with one line per word:

-99  a    0

(the final 0's are the log backoff weights).
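
The completion described above could be scripted roughly as follows. This is only a sketch: build_arpa and extra_vocab are hypothetical names, it assumes the trigram log10 probabilities are already in a Python dict, and it assumes the vocabulary is the set of words seen in the trigrams plus anything passed in extra_vocab.

```python
def build_arpa(trigrams, extra_vocab=()):
    """Return ARPA-format text for a dict {(a, b, c): log10_prob}.

    Unigrams and history bigrams get the placeholder log probability -99
    and a log backoff weight of 0. (Hypothetical helper, not part of SRILM.)
    """
    vocab = set(extra_vocab)
    histories = set()
    for a, b, c in trigrams:
        vocab.update((a, b, c))   # every word must appear as a unigram
        histories.add((a, b))     # every trigram history must appear as a bigram

    out = ["\\data\\",
           "ngram 1=%d" % len(vocab),
           "ngram 2=%d" % len(histories),
           "ngram 3=%d" % len(trigrams),
           "",
           "\\1-grams:"]
    out += ["-99\t%s\t0" % w for w in sorted(vocab)]
    out += ["", "\\2-grams:"]
    out += ["-99\t%s %s\t0" % h for h in sorted(histories)]
    out += ["", "\\3-grams:"]
    # Highest-order n-grams carry no backoff weight.
    out += ["%g\t%s %s %s" % ((p,) + g) for g, p in sorted(trigrams.items())]
    out += ["", "\\end\\"]
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    print(build_arpa({("a", "b", "c"): -0.5, ("a", "b", "d"): -0.3}), end="")
```

The header counts are derived from the sections actually written, so they stay consistent with the file contents.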

Now, giving 0 (log = -99) probabilities to all your unigrams and bigrams 
is suboptimal because there will be cases where you don't have a 
matching trigram, and then the backoff will result in probability 0. 
This is not the end of the world since you presumably are interpolating 
with another model that will yield a non-zero probability, but it would 
be better to estimate a non-zero probability for those unigrams and 
bigrams. If you do, then run the resulting model through

ngram -lm newt.txt -renorm -write-lm newt-norm.txt

to recompute the backoff weights. Finally, interpolate the renormalized 
model (newt-norm.txt) with your n-gram LM as before.
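
To see why a zero probability from the trigram-only model is not fatal after interpolation, here is a small sketch of the linear mixture for a single word. It assumes -lambda is the weight on the main (-lm) model, as the ngram man page describes, and that probabilities are log10 values as in the ARPA file; interpolate_log10 is a hypothetical helper, not an SRILM function.

```python
import math


def interpolate_log10(logp_main, logp_mix, lam=0.9):
    """Linearly interpolate two log10 probabilities; lam weights the main model."""
    p = lam * 10.0 ** logp_main + (1.0 - lam) * 10.0 ** logp_mix
    return math.log10(p)


if __name__ == "__main__":
    # Main model gives log10 p = -2; the mixture model backs off to -99 (treated as 0).
    print(interpolate_log10(-2.0, -99.0))
```

With lam = 0.9 the mixture can never fall below 0.9 times the main model's probability, so the placeholder -99 entries only cost a small constant factor in those cases.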

Andreas

>
> Format of t.txt/newt.txt
> \data\
> ngram 3=242
> \3-grams:
> ....
> \end\
>
> Thanks
> Best Regards
> Akmal
