ngram manipulation

Joel Pinto joel.pinto at idiap.ch
Thu Mar 8 06:33:52 PST 2007


Hello SRILM users,

I have a question about using the SRILM toolkit for LM manipulation.

A language model in ARPA format gives conditional probabilities,
e.g. p(wd3|wd1,wd2).
Is there a utility that computes the joint probability p(wd1,wd2,wd3)?

I have a large LM (ngram 1=50002, ngram 2=29077135, ngram 3=40083381).


Any help would be greatly appreciated.
Thanks,
joel.


ARPA format:
p(wd3|wd1,wd2) = if(trigram exists)             p_3(wd1,wd2,wd3)
                 else if(bigram wd1,wd2 exists) bo_wt_2(wd1,wd2)*p(wd3|wd2)
                 else                           p(wd3|wd2)

p(wd2|wd1) = if(bigram exists) p_2(wd1,wd2)
             else              bo_wt_1(wd1)*p_1(wd2)
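
For illustration only, the backoff rule above can be combined with the
chain rule p(wd1,wd2,wd3) = p(wd1) * p(wd2|wd1) * p(wd3|wd1,wd2) in a few
lines of Python. This is a minimal sketch, not an SRILM utility: the
LOGPROB and BACKOFF tables and their values are hypothetical stand-ins for
the n-gram sections of an ARPA file (which stores log10 probabilities and
log10 backoff weights), and p(wd1) is assumed to be the plain unigram
probability with no sentence-start context.

```python
# Toy tables standing in for the ARPA n-gram sections (hypothetical values).
# Keys are word tuples; values are log10 probabilities / log10 backoff weights.
LOGPROB = {
    ("a",): -1.0,
    ("b",): -1.0,
    ("c",): -1.0,
    ("a", "b"): -0.5,
    ("b", "c"): -0.3,
    ("a", "b", "c"): -0.2,
}
BACKOFF = {
    ("a",): -0.1,
    ("b",): -0.2,
    ("a", "b"): -0.4,
}


def cond_logprob(word, context):
    """Return log10 p(word | context), backing off to shorter contexts."""
    ngram = context + (word,)
    if ngram in LOGPROB:
        return LOGPROB[ngram]
    if not context:
        raise KeyError(f"unknown word: {word!r}")
    # A context with no stored backoff weight is treated as weight 1,
    # i.e. 0.0 in the log10 domain.
    return BACKOFF.get(context, 0.0) + cond_logprob(word, context[1:])


def joint_logprob(wd1, wd2, wd3):
    """Return log10 p(wd1, wd2, wd3) via the chain rule."""
    return (cond_logprob(wd1, ())
            + cond_logprob(wd2, (wd1,))
            + cond_logprob(wd3, (wd1, wd2)))


if __name__ == "__main__":
    # Convert from log10 back to a probability for display.
    print(10 ** joint_logprob("a", "b", "c"))
```

Working in the log10 domain turns the products in the rule into sums, which
avoids underflow when many small probabilities are multiplied. For a model
with tens of millions of n-grams the tables would have to be loaded from
the ARPA file rather than written inline.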



