[SRILM User List] lattice rescoring with conventional LM and FLM

yuan liang yuan at ks.cs.titech.ac.jp
Wed Oct 17 03:04:53 PDT 2012


Hi Andreas,

Thank you very much!
I will test more.

Regards,
Yuan

On Wed, Oct 17, 2012 at 1:52 PM, Andreas Stolcke
<stolcke at icsi.berkeley.edu> wrote:

>  On 10/16/2012 5:33 PM, yuan liang wrote:
>
> Hi Andreas,
>
> Thank you very much!
>
>
>>> 2) I used a Trigram in FLM format to rescore "Lattice_1":
>>>
>>>     First I converted all word nodes (HTK format) to the FLM representation;
>>>
>>>     Then rescored with:
>>>
>>>   " lattice-tool  -in-lattice  Lattice_1  -unk  -vocab [voc_file]
>>>  -read-htk  -no-nulls  -no-htk-nulls  -factored  -lm
>>> [FLM_specification_file]  -htk-lmscale  15  -htk-logbase 2.71828183
>>>  -posterior-scale  15  -write-htk  -out-lattice Lattice_3"
>>>
>>>    I think "Lattice_2" and "Lattice_3" should be the same, since the
>>> perplexity is the same with the Trigram and with the Trigram in FLM
>>> format. However, they are different. Did I miss something?
>>>
>>
>>  This is a question about the equivalent encoding of standard word-based
>> LMs as FLMs, and I'm not an expert here.
>> However, as a sanity check, I would first do a simple perplexity
>> computation (ngram -debug 2 -ppl) with both models on some test set and
>> make sure you get the same word-for-word conditional probabilities.  If
>> not, you can spot where the differences are and present a specific case of
>> different probabilities to the group for debugging.
>>
>>
> Actually, I did the perplexity test on a test set of 6564 sentences
> (72854 words). The total perplexity is the same with the standard
> word-based Trigram LM as with the FLM Trigram. I also checked the
> word-for-word conditional probabilities: of these 72854 words, only 442
> have conditional probabilities that are not exactly the same. However,
> those differences are negligible (e.g. 0.00531048 vs. 0.00531049, or
> 5.38809e-07 vs. 5.38808e-07). So I think we can say both models give the
> same word-for-word conditional probabilities.
>
> I also considered that it may be because of the FLM format: lattice
> expansion with the standard Trigram seems to differ from expansion with
> the FLM Trigram. With the FLM Trigram the expanded lattice is around 300
> times larger than with the standard Trigram, so maybe the expansion
> method is different. I'm not sure; I still need to investigate more.
>
>
> The lattice expansion algorithm makes use of the backoff structure of the
> standard LM to minimize the number of nodes that need to be duplicated to
> correctly apply the probabilities.  The FLM makes more conservative
> assumptions and always assumes you need two words of context, leading to
> more nodes after expansion.  That would explain the size difference.
>
> You can also check the probabilities in expanded lattices.  The command
>
>     lattice-tool -in-lattice LATTICE -ppl TEXT -debug 2 ...
>
> will compute the probabilities assigned to the words in TEXT by traversing
> the lattice.  It is worth checking first that expansion with FLMs yields
> the right probabilities.
>
> You say that Viterbi decoding gives almost the same results (this suggests
> the expansion works correctly), but posterior (confusion network) decoding
> doesn't.  It is possible there is a problem with building CNs from lattices
> with factored vocabularies.  I don't think I ever tried that.  It would
> help to find a minimal test case that shows the problem.
>
> Andreas
>
>
>
>
> Thank you very much for your advice!
>
> Regards,
> Yuan
>
>
>
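
The word-for-word check discussed in this thread (comparing the conditional
probabilities printed by "-debug 2 -ppl" for the standard Trigram and for the
FLM Trigram) could be automated roughly as below. This is only a sketch under
assumptions: the regular expression assumes debug lines shaped like
"p( word | context ) = [3gram] 0.00531048 [ -2.27487 ]", which may not match
the FLM tool's output exactly, and the helper names read_probs and compare
are hypothetical, not part of SRILM.

```python
import math
import re

# Assumed shape of one "-debug 2" line (not verified against FLM output):
#   p( word | context ) = [3gram] 0.00531048 [ -2.27487 ]
PROB_LINE = re.compile(
    r"p\(\s*(\S+)\s*\|[^)]*\)\s*=\s*(?:\[[^\]]*\]\s*)?(\S+)\s*\["
)


def read_probs(lines):
    """Return (word, probability) pairs found in debug output lines."""
    probs = []
    for line in lines:
        match = PROB_LINE.search(line)
        if match:
            probs.append((match.group(1), float(match.group(2))))
    return probs


def compare(probs_a, probs_b, rel_tol=1e-5):
    """Return (position, word, p_a, p_b) for entries that disagree.

    Two probabilities count as equal when they agree within rel_tol, so
    last-digit rounding differences are not reported.
    """
    if len(probs_a) != len(probs_b):
        raise ValueError("outputs cover a different number of words")
    diffs = []
    for i, ((w_a, p_a), (w_b, p_b)) in enumerate(zip(probs_a, probs_b)):
        if w_a != w_b or not math.isclose(p_a, p_b, rel_tol=rel_tol):
            diffs.append((i, w_a, p_a, p_b))
    return diffs
```

To use it, pass the lines of each saved debug output to read_probs and hand
both results to compare. With the default tolerance, pairs such as 0.00531048
and 0.00531049 are treated as equal, so anything that is still reported is a
candidate for the "specific case of different probabilities" suggested above.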