[SRILM User List] Lattice decoding problems
Andreas Stolcke
stolcke at icsi.berkeley.edu
Tue May 24 14:22:40 PDT 2011
Stephan Gouws wrote:
>> Lattice-tool has the -read-mesh option which allows it to read CNs directly.
>>
>
> Thank you for the reply, Andreas. I am going to use SRI-LM's
> -read-mesh function. Just to be very clear on the mesh-format:
>
> From the documentation, the format is given as
> """
> name s
> numaligns N
> posterior P
> align a w1 p1 w2 p2 ...
> """
>
> Now, please correct me where I am wrong here:
> - name s can be any string, e.g. name "somename". Do I need quotes?
> - numaligns == the number of confusion sets in the CN, plus the
> initial and end nodes? Do I need explicit initial and end nodes?
> - what exactly is P??
>
The total posterior probability mass represented by the CN. This is
usually 1 but could be something else in certain scenarios.
> - a gives the current confusion set position, starting with 0 for
> "initial", 1 for the next, etc, and N-1 for "final" ?
>
Create a CN from a simple nbest list, e.g.
    nbest-lattice -nbest $SRILM/lm/test/tests/nbest-rover/nbest-lists/sw_40008_A_0003136_0003462.score.gz -use-mesh -write -
and the answers to the above questions will be obvious.
> - each individual confusion set's pi's must sum to 1?
>
yes, though this is not enforced when the file is read in.
Andreas
> So for this CN:
> [
> [(0.2, "a"), (0.8, "b")],
> [(0.3, "c"),(0.7, "d")]
> ],
>
> I would encode it as:
>
> name "somename"
> numaligns 4
> posterior P
> align 0 "initial" 1.0
> align 1 "a" 0.2 "b" 0.8
> align 2 "c" 0.3 "d" 0.7
> align 3 "final" 1.0
>
> Is this correct? And how do I compute P?
>
> Thank you very much for your help!
> Stephan
>
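
Since the per-set posteriors are expected to sum to 1 but this is not
enforced when the file is read, it may be worth checking a hand-written
mesh file before passing it to lattice-tool. The following is a minimal
sketch, not part of SRILM: the function name check_align_sums, the
tolerance, and the sample lines are all hypothetical. It assumes only the
documented "align a w1 p1 w2 p2 ..." layout with whitespace-separated
fields, and it ignores every other line. The sample is illustrative input
for the checker, not a confirmed-valid mesh file.

```python
def check_align_sums(text, tol=1e-6):
    """Return (position, total) for each align line whose posteriors
    do not sum to 1 within tol. Hypothetical helper, not an SRILM API."""
    bad = []
    for line in text.splitlines():
        fields = line.split()
        # Only "align" lines are inspected; name/numaligns/posterior are skipped.
        if not fields or fields[0] != "align":
            continue
        if len(fields) < 2:
            raise ValueError("align line has no position: %r" % line)
        position = int(fields[1])
        # Remaining fields are assumed to alternate: w1 p1 w2 p2 ...
        pairs = fields[2:]
        if len(pairs) % 2 != 0:
            raise ValueError("odd number of word/posterior fields: %r" % line)
        total = sum(float(p) for p in pairs[1::2])
        if abs(total - 1.0) > tol:
            bad.append((position, total))
    return bad


# Illustrative input: the second set deliberately sums to 0.9.
sample = """\
align 0 a 0.2 b 0.8
align 1 c 0.3 d 0.6
"""

bad = check_align_sums(sample)
for position, total in bad:
    print("align %d sums to %.4f" % (position, total))
```

The check is deliberately separate from any file writing, so it can be run
on the output of nbest-lattice as well as on hand-built files.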