[SRILM User List] lattice rescoring with conventional LM and FLM
Andreas Stolcke
stolcke at icsi.berkeley.edu
Tue Oct 16 09:59:46 PDT 2012
On 10/13/2012 2:37 AM, yuan liang wrote:
> Hi srilm users,
>
> Now I'm using 'lattice-tool' to rescore lattices; my goal is to
> replace the original language model score in the word lattice with a
> Factored Language Model (FLM) score.
>
> 1) First, in the baseline system, I used a conventional bigram LM for
> speech recognition and generated an HTK word lattice (call it
> "Lattice_1"). Then I tried to rescore "Lattice_1" with a conventional
> trigram LM, using:
>
> "lattice-tool -in-lattice Lattice_1 -unk -vocab [voc_file]
> -read-htk -no-nulls -no-htk-nulls -lm [Trigram_file] -htk-lmscale 15
> -htk-logbase 2.71828183 -posterior-scale 15 -write-htk -out-lattice
> Lattice_2"
Two factors come into play here:
1) When you apply a trigram model to a bigram lattice, the lattice is
expanded so that trigram contexts (i.e., the last two words) are encoded
uniquely at each node. Hence the size increase.
2) The options -no-nulls and -no-htk-nulls actually imply a size increase
all on their own, because of the way HTK lattices are represented
internally (arcs are encoded as nodes, then mapped back to arcs on
output). You should not use them.
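For instance, the rescoring step without those two options might look as
follows (a sketch; vocab.txt and trigram.lm stand in for the [voc_file]
and [Trigram_file] placeholders in the original command):

```shell
# Rescore the bigram lattice "Lattice_1" with a trigram LM, leaving the
# HTK null-node handling at its defaults (no -no-nulls / -no-htk-nulls).
# Note that trigram context expansion of the lattice is still expected.
lattice-tool -in-lattice Lattice_1 -unk -vocab vocab.txt \
    -read-htk -lm trigram.lm \
    -htk-lmscale 15 -htk-logbase 2.71828183 -posterior-scale 15 \
    -write-htk -out-lattice Lattice_2
```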
>
> I just want to use the new trigram LM score to replace the old LM
> score in "Lattice_1", so I think "Lattice_2" and "Lattice_1" should
> have the same size; only each word's LM score should differ. But I
> found that "Lattice_2" is larger than "Lattice_1". Did I miss
> something? How can I replace only the LM score without expanding the
> lattice?
>
>
>
> 2) I used a Trigram in FLM format to rescore "Lattice_1":
>
> First I converted all word nodes (HTK format) to the FLM representation;
>
> Then rescored with:
>
> " lattice-tool -in-lattice Lattice_1 -unk -vocab [voc_file]
> -read-htk -no-nulls -no-htk-nulls -factored -lm
> [FLM_specification_file] -htk-lmscale 15 -htk-logbase 2.71828183
> -posterior-scale 15 -write-htk -out-lattice Lattice_3"
>
> I think "Lattice_2" and "Lattice_3" should be the same, since the
> perplexities of the trigram and its FLM-format equivalent are the
> same. However, they are different. Did I miss something?
This is a question about the equivalent encoding of standard word-based
LMs as FLMs, and I'm not an expert here.
However, as a sanity check, I would first do a simple perplexity
computation (ngram -debug 2 -ppl) with both models on some test set and
make sure you get the same word-for-word conditional probabilities. If
not, you can spot where the differences are and present a specific case
of different probabilities to the group for debugging.
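That sanity check might look like the following sketch (trigram.lm,
trigram.flm, test.txt, and test.flm.txt are assumed stand-ins for your
word-based LM, FLM specification, test set, and its factored-format
version):

```shell
# Per-word conditional probabilities from the standard trigram.
ngram -lm trigram.lm -ppl test.txt -debug 2 > ppl.word.txt

# The same computation with the FLM; the test set here must be in the
# same factored representation used for the lattice word nodes.
ngram -factored -lm trigram.flm -ppl test.flm.txt -debug 2 > ppl.flm.txt

# Any differing line points at a word whose conditional probability
# disagrees between the two models.
diff ppl.word.txt ppl.flm.txt
```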
>
>
>
> 3) I also checked the decoding accuracy when using "Lattice_2" and
> "Lattice_3"; the results are:
>
> the Viterbi decoding results are the same;
> the n-best lists are almost the same, but "Lattice_2"
> is better than "Lattice_3";
> the posterior decoding results are quite different; "Lattice_2"
> is better than "Lattice_3";
>
> Did I miss something when using the FLM to rescore the lattice?
You need to resolve question 2 above before tackling this one.
Andreas