[SRILM User List] lattice rescoring with conventional LM and FLM

Andreas Stolcke stolcke at icsi.berkeley.edu
Tue Oct 16 09:59:46 PDT 2012


On 10/13/2012 2:37 AM, yuan liang wrote:
> Hi srilm users,
>
> Now I'm using the 'lattice-tool' to rescore the lattice, my goal is 
> using a Factored Language Model (FLM) score to replace the original 
> language model score in the word lattice.
>
> 1) First in the baseline system, I used conventional Bigram LM to do 
> speech recognition and generate the htk word lattice (we name it 
> "Lattice_1"). Then I try to use a conventional Trigram LM to rescore 
> the "Lattice_1", using:
>
>    "lattice-tool -in-lattice Lattice_1 -unk -vocab [voc_file] 
> -read-htk -no-nulls -no-htk-nulls -lm [Trigram_file] -htk-lmscale 15 
> -htk-logbase 2.71828183 -posterior-scale 15  -write-htk -out-lattice 
> Lattice_2"
Two factors come into play here:

1) When you apply a trigram model to a bigram lattice, the lattice is 
expanded so that trigram contexts (i.e., the last two words) are encoded 
uniquely at each node.  Hence the size increase.

2) The options -no-nulls and -no-htk-nulls imply a size increase 
all on their own because of the way HTK lattices are represented 
internally (arcs are encoded as nodes, and then mapped back to arcs on 
output).  You should not use them.
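
As an illustration of point 1, here is a minimal Python sketch of why 
the expansion grows the lattice.  It is not SRILM code: the toy lattice, 
the function name expand_to_trigram, and the simplification that each 
word labels exactly one node are all assumptions made for the example.  
Each expanded node is a (previous word, current word) pair, so a word 
reachable from two different predecessors has to be split in two.

```python
from collections import deque

def expand_to_trigram(edges, start):
    """Split nodes so that each one encodes a unique two-word history.

    `edges` maps a word node to its successor word nodes (toy format,
    assumed for illustration; real HTK lattices carry times and scores).
    Returns the expanded node set and the expanded arc set.
    """
    start_state = (None, start)          # no predecessor before the start node
    seen = {start_state}
    new_arcs = set()
    queue = deque([start_state])
    while queue:
        prev, cur = queue.popleft()
        for nxt in edges.get(cur, ()):
            state = (cur, nxt)           # last two words seen on this path
            new_arcs.add(((prev, cur), state))
            if state not in seen:
                seen.add(state)
                queue.append(state)
    return seen, new_arcs

# Toy bigram lattice: "c" can be reached from either "a" or "b".
toy = {
    "<s>": ["a", "b"],
    "a": ["c"],
    "b": ["c"],
    "c": ["d", "e"],
    "d": ["</s>"],
    "e": ["</s>"],
}

orig_nodes = set(toy) | {w for succ in toy.values() for w in succ}
orig_arcs = sum(len(succ) for succ in toy.values())
states, arcs = expand_to_trigram(toy, "<s>")
print(len(orig_nodes), orig_arcs)   # 7 8
print(len(states), len(arcs))       # 9 10
```

The node "c" becomes two nodes, (a, c) and (b, c), because the trigram 
probability of whatever follows "c" depends on which word preceded it.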

>
> I just want to use the new Trigram LM score to replace the old LM 
> score in "Lattice_1", so I think "Lattice_2" and "Lattice_1" should 
> have the same size, just each word's LM score will be different. But I 
> found the size of "Lattice_2" is larger than "Lattice_1". Did I miss 
> something? How can I only replace the LM score without expanding the 
> size of the lattice?
>
>
>
> 2) I used a Trigram in FLM format to rescore "Lattice_1":
>
>     First I converted all word nodes (HTk format) to FLM representation;
>
>     Then rescored with:
>
>   " lattice-tool  -in-lattice  Lattice_1  -unk  -vocab [voc_file]  
> -read-htk  -no-nulls  -no-htk-nulls  -factored  -lm 
> [FLM_specification_file]  -htk-lmscale  15  -htk-logbase 2.71828183  
> -posterior-scale  15  -write-htk  -out-lattice Lattice_3"
>
>    I think "Lattice_2" and "Lattice_3" should be the same, since the 
> perplexity of using Trigram and using Trigram in FLM format are same. 
> However, they are different. Did I miss something?

This is a question about the equivalent encoding of standard word-based 
LMs as FLMs, and I'm not an expert here.
However, as a sanity check, I would first do a simple perplexity 
computation (ngram -debug 2 -ppl) with both models on some test set and 
make sure you get the same word-for-word conditional probabilities.  If 
not, you can spot where the differences are and present a specific case 
of different probabilities to the group for debugging.
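
The comparison could be automated along the following lines.  This is 
only a sketch: it assumes that each per-word line in the saved 
"ngram -debug 2 -ppl" output looks roughly like 
"p( cat | the ...) = [2gram] 0.0123 [ -1.91 ]", with the log 
probability in the final brackets, and that both runs score the same 
words in the same order.  The function names are made up for the 
example; adjust the pattern if your output differs.

```python
import re

# Assumed line shape:  p( word | context ) = [Ngram] prob [ logprob ]
PROB_LINE = re.compile(
    r"p\(\s*(\S+)\s*\|.*\)\s*=\s*(?:\[\S+\]\s*)?(\S+)\s*\[\s*(\S+)\s*\]"
)

def word_logprobs(text):
    """Return [(token, logprob), ...] for every per-word line in `text`."""
    result = []
    for line in text.splitlines():
        match = PROB_LINE.search(line)
        if match:
            result.append((match.group(1), float(match.group(3))))
    return result

def first_mismatches(a, b, tol=1e-4, limit=10):
    """List up to `limit` positions where the two log probabilities differ.

    Only log probabilities are compared, since the FLM run may print
    tokens in a factored form.  Extra entries in the longer list are
    ignored, so compare len(a) and len(b) separately.
    """
    diffs = []
    for i, ((tok_a, lp_a), (tok_b, lp_b)) in enumerate(zip(a, b)):
        if abs(lp_a - lp_b) > tol:
            diffs.append((i, tok_a, lp_a, tok_b, lp_b))
            if len(diffs) >= limit:
                break
    return diffs

# Tiny inline example with made-up numbers (not real model output).
run_a = "the cat\n\tp( the | <s> ) \t= [2gram] 0.1 [ -1 ]\n" \
        "\tp( cat | the ...) \t= [3gram] 0.01 [ -2 ]\n"
run_b = "the cat\n\tp( the | <s> ) \t= [2gram] 0.1 [ -1 ]\n" \
        "\tp( cat | the ...) \t= [3gram] 0.003 [ -2.5 ]\n"
print(first_mismatches(word_logprobs(run_a), word_logprobs(run_b)))
```

In practice you would read the two saved outputs from files instead of 
the inline strings, and post the first reported mismatch to the list.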

>
>
>
>  3) Also I checked the accuracy from the decoding result of using 
> "Lattice_2" and "Lattice_3", the result are:
>
>                     viterbi decode result is the same;
>                     n-best list are almost same, but using "Lattice_2" 
> is better than using "Lattice_3";
>                     posterior decode result is quite different, using 
> "Lattice_2" is better than using "Lattice_3";
>
>      Did I miss something when using FLM to rescore the lattice?
You need to resolve question 2 above first before tackling this one.

Andreas


