[SRILM User List] Factored Language Model - Backoff Weight Differs
Gebhardt, Jan
jan.gebhardt at student.kit.edu
Thu Aug 5 10:03:40 PDT 2010
Hello,
I am working with Factored Language Models and want to start with a Factored Language Model that is equivalent to a standard 4-gram language model. Therefore I use the following factor language model specification file:
1
W : 3 W(-1) W(-2) W(-3) trainW.count trainW.flm.lm 4
W1,W2,W3 W3 ukndiscount gtmin 0
W1,W2 W2 ukndiscount gtmin 0
W1 W1 ukndiscount gtmin 0
0 0 ukndiscount gtmin 0
When I build the factored language model and write it out using fngram-count -lm, I noticed that the backoff weights in the resulting language model differ significantly from the backoff weights in the standard n-gram. Both language models use ukndiscount and a cutoff of 0.
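For reference, the commands I use look roughly like this (the file names such as train.txt, train.factored.txt and trainW.flm are placeholders for my data and spec file, and the options are only meant to mirror the discounting and cutoff settings described above):

ngram-count -text train.txt -order 4 -ukndiscount1 -gt1min 0 -ukndiscount2 -gt2min 0 -ukndiscount3 -gt3min 0 -ukndiscount4 -gt4min 0 -lm trainW.lm

fngram-count -factor-file trainW.flm -text train.factored.txt -lm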
For example, while my normal 4-gram contains the following entries:
-2.401827 A BEAUTIFUL 0.01767567
-2.401827 A BETTER 0.01767567
the factored language model has:
-2.401827 W-A W-BEAUTIFUL -0.1628703
-2.401827 W-A W-BETTER
So both language models assign the same probabilities, but the backoff weights are different or even missing.
If I evaluate the language model written by fngram-count using ngram, I get a lot of warnings like:
trainWX.flm.lm: line 2678: warning: no bow for prefix of ngram "A BEAUTIFUL" .
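The evaluation command is roughly the following (test.txt stands for my test data):

ngram -order 4 -lm trainWX.flm.lm -ppl test.txt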
If I use the factored language model for decoding, I get a higher WER than with the standard 4-gram.
I would like to know how to get backoff weights for FLMs that match those of a standard n-gram. An explanation of why the backoff weights are missing or different in the FLM would also help.
Thank you for your help.
Jan