[SRILM User List] lattice rescoring with conventional LM and FLM
yuan liang
yuan at ks.cs.titech.ac.jp
Wed Oct 17 03:04:53 PDT 2012
Hi Andreas,
Thank you very much!
I will test more.
Regards,
Yuan
On Wed, Oct 17, 2012 at 1:52 PM, Andreas Stolcke
<stolcke at icsi.berkeley.edu> wrote:
> On 10/16/2012 5:33 PM, yuan liang wrote:
>
> Hi Andreas,
>
> Thank you very much!
>
>
>>> 2) I used a Trigram in FLM format to rescore "Lattice_1":
>>>
>>> First I converted all word nodes (HTK format) to the FLM representation;
>>>
>>> Then rescored with:
>>>
>>> " lattice-tool -in-lattice Lattice_1 -unk -vocab [voc_file]
>>> -read-htk -no-nulls -no-htk-nulls -factored -lm
>>> [FLM_specification_file] -htk-lmscale 15 -htk-logbase 2.71828183
>>> -posterior-scale 15 -write-htk -out-lattice Lattice_3"
>>>
>>> I think "Lattice_2" and "Lattice_3" should be the same, since the
>>> perplexity using the Trigram and using the Trigram in FLM format is the
>>> same. However, they are different. Did I miss something?
>>>
>>
>> This is a question about the equivalent encoding of standard word-based
>> LMs as FLMs, and I'm not an expert here.
>> However, as a sanity check, I would first do a simple perplexity
>> computation (ngram -debug 2 -ppl) with both models on some test set and
>> make sure you get the same word-for-word conditional probabilities. If
>> not, you can spot where the differences are and present a specific case of
>> different probabilities to the group for debugging.
>>
>>
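A concrete version of this sanity check could look roughly like the following; lm.3gram.gz, spec.flm, test.txt, and test.factored.txt are placeholder file names, and the exact FLM invocation depends on how the factored data was prepared:

    ngram -order 3 -lm lm.3gram.gz -ppl test.txt -debug 2
    fngram -factor-file spec.flm -ppl test.factored.txt -debug 2

With -debug 2 both tools print a conditional probability for every word, so the two outputs can be compared word by word to locate any mismatch.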
> Actually, I did the perplexity test on a test set of 6564 sentences
> (72854 words). The total perplexity is the same with the standard
> word-based Trigram LM as with the FLM Trigram. I also checked the
> word-for-word conditional probabilities in detail: of these 72854 words,
> only 442 have conditional probabilities that differ, and the differences
> are negligible (e.g. 0.00531048 vs. 0.00531049, or 5.38809e-07 vs.
> 5.38808e-07). So I think we can say both models give the same
> word-for-word conditional probabilities.
>
> I also considered that it might be because of the FLM format: lattice
> expansion with the standard Trigram seems to differ from expansion with
> the FLM Trigram. The lattice expanded with the FLM Trigram is around 300
> times larger than the one expanded with the standard Trigram, so perhaps
> the expansion method is different. I'm not sure; I still need to
> investigate more.
>
>
> The lattice expansion algorithm makes use of the backoff structure of the
> standard LM to minimize the number of nodes that need to be duplicated to
> apply the probabilities correctly. The FLM code is more conservative and
> always assumes two words of context are needed, leading to more nodes
> after expansion. That would explain the size difference.
>
> You can also check the probabilities in expanded lattices. The command
>
> lattice-tool -in-lattice LATTICE -ppl TEXT -debug 2 ...
>
> will compute the probabilities assigned to the words in TEXT by traversing
> the lattice. It is worth checking first that expansion with FLMs yields
> the right probabilities.
>
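A concrete form of this check on the two expanded lattices from above might be (test.txt again a placeholder reference text, with the HTK options from the rescoring command):

    lattice-tool -in-lattice Lattice_2 -read-htk -htk-logbase 2.71828183 -ppl test.txt -debug 2
    lattice-tool -in-lattice Lattice_3 -read-htk -htk-logbase 2.71828183 -ppl test.txt -debug 2

If both expansions encode the same model, the per-word probabilities printed for the two lattices should match up to rounding.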
> You say that Viterbi decoding gives almost the same results (this
> suggests the expansion works correctly), but posterior (confusion network)
> decoding doesn't. It is possible there is a problem with building CNs
> from lattices with factored vocabularies; I don't think I ever tried
> that. It would help to find a minimal test case that shows the problem.
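One way to hunt for such a test case could be to decode a small expanded lattice both ways and compare the outputs, e.g. with lattice-tool's -viterbi-decode and -posterior-decode options (a sketch, assuming Lattice_3 has been cut down to a few nodes):

    lattice-tool -in-lattice Lattice_3 -read-htk -htk-logbase 2.71828183 \
        -posterior-scale 15 -viterbi-decode
    lattice-tool -in-lattice Lattice_3 -read-htk -htk-logbase 2.71828183 \
        -posterior-scale 15 -posterior-decode

The first prints the best-path hypothesis, the second the confusion-network hypothesis; a small lattice on which only the second goes wrong would make a good minimal example to post to the list.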
>
> Andreas
>
>
>
>
> Thank you very much for your advice!
>
> Regards,
> Yuan
>
>
>