[SRILM User List] lattice rescoring with conventional LM and FLM

Andreas Stolcke stolcke at icsi.berkeley.edu
Tue Oct 16 21:52:44 PDT 2012


On 10/16/2012 5:33 PM, yuan liang wrote:
> Hi Andreas,
>
> Thank you very much!
>
>
>         2) I used a Trigram in FLM format to rescore "Lattice_1":
>
>             First I converted all word nodes (HTk format) to FLM
>         representation;
>
>             Then rescored with:
>
>           " lattice-tool  -in-lattice  Lattice_1  -unk  -vocab
>         [voc_file]  -read-htk  -no-nulls  -no-htk-nulls  -factored
>          -lm [FLM_specification_file]  -htk-lmscale  15  -htk-logbase
>         2.71828183  -posterior-scale  15  -write-htk  -out-lattice
>         Lattice_3"
>
>            I think "Lattice_2" and "Lattice_3" should be the same,
>         since the perplexity of using Trigram and using Trigram in FLM
>         format are same. However, they are different. Did I miss
>         something?
>
>
>     This is a question about the equivalent encoding of standard
>     word-based LMs as FLMs, and I'm not an expert here.
>     However, as a sanity check, I would first do a simple perplexity
>     computation (ngram -debug 2 -ppl) with both models on some test
>     set and make sure you get the same word-for-word conditional
>     probabilities.  If not, you can spot where the differences are and
>     present a specific case of different probabilities to the group
>     for debugging.
>
>
> Actually I did the perplexity test on a test set of 6564 sentences 
> (72854 words). The total perplexity is the same for the standard 
> word-based trigram LM and the FLM trigram. I also checked the 
> word-for-word conditional probabilities in detail: of the 72854 words, 
> only 442 have conditional probabilities that are not exactly the same, 
> and those differences are negligible (e.g. 0.00531048 vs. 0.00531049, 
> or 5.38809e-07 vs. 5.38808e-07). So I think we can say both models 
> give the same word-for-word conditional probabilities.
>
> I also considered that it might be due to the FLM format: lattice 
> expansion with the standard trigram seems to behave differently from 
> expansion with the FLM trigram. With the FLM trigram, the expanded 
> lattice is about 300 times larger than with the standard trigram, so 
> perhaps the expansion method is different. I'm not sure; I still need 
> to investigate more.

The lattice expansion algorithm makes use of the backoff structure of 
the standard LM to minimize the number of nodes that need to be 
duplicated to correctly apply the probabilities.  The FLM code is more 
conservative and always assumes two words of context are needed, 
leading to more nodes after expansion.  That would explain the size 
difference.

You can also check the probabilities in expanded lattices.  The command

     lattice-tool -in-lattice LATTICE -ppl TEXT -debug 2 ...

will compute the probabilities assigned to the words in TEXT by 
traversing the lattice.  It is worth checking first that expansion with 
FLMs yields the right probabilities.
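One way to check this automatically is to parse the per-word probability 
lines from the -debug 2 output of both runs and flag any pair that 
differs by more than rounding noise.  A minimal sketch (assuming the 
usual "p( word | context ) = [ngram] prob [ logprob ]" line format, 
which may vary slightly across SRILM versions):

```python
import re

# Matches SRILM -debug 2 lines such as:
#   p( the | <s> )  = [2gram] 0.0628738 [ -1.20149 ]
# (format assumed; adjust the pattern if your SRILM version differs)
LINE_RE = re.compile(r"p\(\s*(\S+)\s*\|.*?=\s*\[\S+\]\s*([\d.eE+-]+)")

def word_probs(debug_output):
    """Extract (word, probability) pairs from -debug 2 output text."""
    return [(m.group(1), float(m.group(2)))
            for m in LINE_RE.finditer(debug_output)]

def compare(out_a, out_b, rel_tol=1e-4):
    """Return positions where the two runs disagree beyond rel_tol."""
    diffs = []
    for i, ((wa, pa), (wb, pb)) in enumerate(zip(word_probs(out_a),
                                                 word_probs(out_b))):
        if wa != wb or abs(pa - pb) > rel_tol * max(pa, pb):
            diffs.append((i, wa, pa, wb, pb))
    return diffs
```

With a relative tolerance like 1e-4, rounding-level differences (such as 
the 0.00531048 vs. 0.00531049 cases above) are ignored, and only 
genuine mismatches are reported.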

You say that Viterbi decoding gives almost the same results (which 
suggests the expansion works correctly), but posterior (confusion 
network) decoding doesn't.  It is possible there is a problem with 
building CNs from lattices with factored vocabularies; I don't think I 
ever tried that.  It would help to find a minimal test case that shows 
the problem.

Andreas

>
>
> Thank you very much for your advice!
>
> Regards,
> Yuan


