Format of LMs

> Hi,
> I have recently been evaluating some of the SRI training tools.
> I ran into problems when I tried to use skipping LMs and factored LMs.
> What is the format of these models?
> As for skipping LMs, what is the meaning of the last part at the end 
> of the LM file?
> \end\ ## the end of a normal LM file
> -pau- 0.5
> </s> 0.5
> <s> 0
> <unk> 0.0041594 (how can these coefficients be applied in a beam-search engine?)
I cannot answer the last question, but the numbers in the word list
following \end\ are the probabilities with which a word in the history
is "skipped". If the skip probability of a word x is p, and x occurs in
the history before a word w, then the probability of w is estimated as
(1-p) times the regular ngram probability plus p times the ngram
probability with x removed from the history.
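
To make the interpolation concrete, here is a minimal Python sketch. It is
not SRILM code: the function names mix_skip and read_skip_probs are made up
for illustration, and it assumes the skip list is simply one "word
probability" pair per line after \end\, as in the excerpt quoted above. The
two ngram probabilities are passed in as plain numbers, since how they are
looked up depends on your decoder.

```python
def mix_skip(p_skip, p_regular, p_skipped):
    # p_skip:    skip probability of the history word x (from the list after \end\)
    # p_regular: ngram probability of w given the full history
    # p_skipped: ngram probability of w given the history with x removed
    if not 0.0 <= p_skip <= 1.0:
        raise ValueError("skip probability must lie in [0, 1]")
    return (1.0 - p_skip) * p_regular + p_skip * p_skipped


def read_skip_probs(lines):
    # Collect the "word probability" pairs that follow the \end\ marker.
    # Assumption: everything before \end\ is the ordinary ngram model
    # and is ignored here.
    skip = {}
    seen_end = False
    for line in lines:
        if not seen_end:
            seen_end = line.strip() == "\\end\\"
            continue
        fields = line.split()
        if len(fields) == 2:
            skip[fields[0]] = float(fields[1])
    return skip


if __name__ == "__main__":
    tail = ["\\end\\", "-pau- 0.5", "</s> 0.5", "<s> 0", "<unk> 0.0041594"]
    skip = read_skip_probs(tail)
    # Illustrative probabilities only; real values come from the ngram model.
    print(mix_skip(skip["-pau-"], 0.2, 0.4))
```

With a skip probability of 0 (as for <s> above) the estimate reduces to the
regular ngram probability, so such words never affect the result.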
> As for the factored LMs, I trained a bigram, and the result seems to
> have no backoff weights in the unigram section.
> And what do the coefficients right after the 2-gram probabilities mean?
> ...
> \0x0-grams:
> -1.071043 </s>
> -1.281587 <unk> (where is the backoff weight?)
> ...
> \0x1-grams:
> -2.178066 86AA B2BB -0.7455529 (what do these coefficients mean?)
> -0.9450388 86AA B6BA_BAC5
> -1.72854 86AA CBF4
> -1.281777 86AA CECA_BAC5
> -6.393295 <s> </s> -0.9474632
> Can anyone point me to a helpful reference? Thanks a lot.
The best documentation of FLMs can be found at, but I 
don't see an explanation of the modified backoff model file format 
there. It is probably best to either read the code, or contact 
bilmes at, who wrote most of the code.


