Fwd: Bug in lattice-tool?

Tom Murray yozhik at computer.org
Thu Jan 18 09:55:01 PST 2007


Thanks, Andreas. I'm forwarding this to the list because I think it
may be quite useful to a number of people.

---------- Forwarded message ----------
From: Andreas Stolcke <stolcke at speech.sri.com>
Date: Jan 17, 2007 10:57 PM
Subject: Re: Bug in lattice-tool?
To: Tom Murray <yozhik at computer.org>



Tom,

what you are trying to do can be done with lattice-tool as it is,
but it requires two passes.  That's how we rescore lattices ourselves.

step 1: expand lattice with new LM, write new lattices
step 2: read rescored lattices, choosing scaling factors and decoding
        1-best or n-best.

You are trying to combine these steps into one, and it fails because
the LM rescoring function overrides the combined scores.
This behavior is by design and some other functions depend on it,
but it needs to be better documented.

BTW, I don't think your patch will necessarily do the right thing.
It simply adds the new LM score to the old combined score, instead
of replacing the old LM score in the combination of scores.
There are ways to fix this, but it would require more extensive code
changes.
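
To make the difference between adding and replacing concrete, here is a
small arithmetic sketch. It is not lattice-tool code. It assumes a
simplified transition score made of only an acoustic and an LM log score
(no word penalty or pronunciation scores), it assumes the added score is
scaled by the same lmscale, and all numbers are made up.

```python
# Simplified model: only acoustic and LM log scores on one transition.
# All values below are hypothetical, chosen only for illustration.

def combined(ac, lm, acscale, lmscale):
    """Weighted sum of the log scores on one lattice transition."""
    return acscale * ac + lmscale * lm

ac_score = -120.0      # acoustic log score from the HTK lattice
old_lm_score = -8.0    # LM log score stored in the HTK lattice
new_lm_score = -5.0    # log score from the external LM
acscale, lmscale = 1.0, 10.0

old_total = combined(ac_score, old_lm_score, acscale, lmscale)

# Adding: the new LM score goes on top of the old combined score,
# so the old LM score is still counted (assumption: same lmscale).
added = old_total + lmscale * new_lm_score

# Replacing: the new LM score takes the place of the old LM score.
replaced = combined(ac_score, new_lm_score, acscale, lmscale)

# The gap between the two is exactly the scaled old LM score.
difference = added - replaced

print(old_total, added, replaced, difference)
```

Under these assumptions the two results differ by lmscale times the old
LM score, which is why simply adding is not the same as rescoring.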

I would recommend the 2-step approach.  It also has the advantage
that you can rerun step 2 (n-best decoding) multiple times to try different
scaling factors.

One more thing:  since your LM does not contain multiwords you need
to split the multiwords prior to LM expansion. Simply add the -split-multiwords
option in step 1.
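
The two passes might be scripted roughly as follows. This is only a
sketch: it builds and prints the command lines rather than running them,
the file names, LM order, n-best size and scale values are hypothetical,
and the options used (-read-htk, -write-htk, -in-lattice, -out-lattice,
-lm, -order, -nbest-decode, -out-nbest-dir) should be checked against the
lattice-tool man page for your SRILM version.

```python
import shlex

def rescoring_commands(in_lattice, lm, rescored_lattice, nbest_dir,
                       lmscale, acscale, nbest=10, order=3):
    """Return the argument lists for the two lattice-tool passes."""
    # Step 1: split multiwords, expand with the new LM, write new lattice.
    step1 = [
        "lattice-tool", "-read-htk", "-in-lattice", in_lattice,
        "-split-multiwords",
        "-lm", lm, "-order", str(order),
        "-write-htk", "-out-lattice", rescored_lattice,
    ]
    # Step 2: read the rescored lattice, apply scaling factors, decode n-best.
    step2 = [
        "lattice-tool", "-read-htk", "-in-lattice", rescored_lattice,
        "-htk-lmscale", str(lmscale), "-htk-acscale", str(acscale),
        "-nbest-decode", str(nbest), "-out-nbest-dir", nbest_dir,
    ]
    return step1, step2

def show(args):
    """Render an argument list as a shell-quoted command line."""
    return " ".join(shlex.quote(a) for a in args)

# Hypothetical file names and scale values.
step1, step2 = rescoring_commands("utt1.lat", "new.lm", "utt1.rescored.lat",
                                  "nbest", lmscale=12, acscale=1)
print(show(step1))
print(show(step2))

# Step 1 runs once; step 2 can be repeated with different scaling factors.
for scale in (8, 10, 12):
    _, retry = rescoring_commands("utt1.lat", "new.lm", "utt1.rescored.lat",
                                  "nbest-lm%d" % scale, lmscale=scale,
                                  acscale=1)
    print(show(retry))
```

Keeping the LM out of step 2 is the point of the split: the second pass
only reads the scores already stored in the rescored lattice.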

Andreas

In message <39abe3570701171423p4bb5d962qf6dbed50cca8aeda at mail.gmail.com> you wrote:
> Hi, Andreas--
>
> What we want to do with lattice-tool is this: generate an n-best list
> from a lattice using an external LM, where the path scores are a
> weighted sum of the AM and LM scores in the lattice and the scores of
> the external LM.
>
> Attached is a tarred directory with an HTK lattice, an LM, and a test
> script test-lattice.sh. Also included is the output of v1.5.1
> lattice-tool, compared with my patched version which adds the
> transition log weights as I described.
>
> The script runs lattice-tool three times, first with default
> -htk-lmscale and -htk-acscale, and then with the lmscale and the
> acscale zeroed out. You can see that the n-best list is the same for
> all three for the v1.5.1 output. For mine it differs.
>
> To give a little more detail of where I think the bug is, according to
> my understanding of what's going on:
>
> When you load the HTK file, you create a node for each HTK edge, and
> then connect this new node from the start node and to the end node.
> The weight of the connection from the start to the new node is the
> weighted sum (according to lmscale, acscale, etc.) of the various
> scores from the HTK edge.
>
> Now, during expansion, old nodes and transitions are replaced by new
> ones, with the old nodes deleted. I printed out all the node indices,
> and the initial nodes corresponding to the HTK edges are deleted
> during this stage. I became convinced of this when I added a line to
> zero out the probs from the external LM, and all the hyp scores during
> n-best output had score = 0.
>
> Please let me know if I'm misunderstanding something. Thanks for your help,
>
> tm


