bug in lattice-tool?

ilya oparin ioparin at yahoo.co.uk
Wed Nov 8 00:28:43 PST 2006


We may have found a bug in lattice-tool. Here in Brno
we work with Czech, which uses letters with
diacritics. lattice-tool handles all the lattice
calculations correctly; the problem appears only when
the best path is matched against the reference file to
count deletions, substitutions and insertions, and
finally to compute WER. It appears that if both files
are in an ISO encoding and a word in the reference
contains a letter with a diacritic, it can be matched
to a word without the diacritic in the output, which
is actually a different word. As a result, the
reported WER is significantly lower than the true
value (which HResults in HTK reports correctly).
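
To illustrate the effect described above, here is a small
Python sketch. It is not SRILM or HTK code, and it makes
no claim about what actually goes wrong inside
lattice-tool: the function strip_diacritics() is only a
hypothetical stand-in for whatever makes the two words
compare as equal. The helper names and the Czech example
sentence are assumptions chosen for illustration. The
sketch counts substitutions, deletions and insertions
with a standard minimum-edit-distance alignment and
compares WER under exact and diacritic-insensitive word
matching.

```python
import unicodedata


def edit_counts(ref, hyp, same=lambda a, b: a == b):
    """Return (subs, dels, ins) for a minimum-edit alignment of two word lists."""
    rows, cols = len(ref) + 1, len(hyp) + 1
    # cost[i][j] = (total, subs, dels, ins) for ref[:i] aligned with hyp[:j]
    cost = [[None] * cols for _ in range(rows)]
    cost[0][0] = (0, 0, 0, 0)
    for i in range(1, rows):
        cost[i][0] = (i, 0, i, 0)      # only deletions
    for j in range(1, cols):
        cost[0][j] = (j, 0, 0, j)      # only insertions
    for i in range(1, rows):
        for j in range(1, cols):
            t, s, d, n = cost[i - 1][j - 1]
            if same(ref[i - 1], hyp[j - 1]):
                best = (t, s, d, n)            # match, no cost
            else:
                best = (t + 1, s + 1, d, n)    # substitution
            t, s, d, n = cost[i - 1][j]
            best = min(best, (t + 1, s, d + 1, n))   # deletion
            t, s, d, n = cost[i][j - 1]
            best = min(best, (t + 1, s, d, n + 1))   # insertion
            cost[i][j] = best
    return cost[-1][-1][1:]


def wer(ref, hyp, same=lambda a, b: a == b):
    """Word error rate: (subs + dels + ins) / number of reference words."""
    s, d, n = edit_counts(ref, hyp, same)
    return (s + d + n) / len(ref)


def strip_diacritics(word):
    """Hypothetical stand-in for the faulty comparison: drop combining marks."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def loose(a, b):
    # Assumed faulty behaviour: words equal once diacritics are ignored.
    return strip_diacritics(a) == strip_diacritics(b)


# "b\u00fdt" is "byt" with an acute accent on the y ("to be");
# plain "byt" is a different Czech word ("flat").
ref = ["on", "chce", "b\u00fdt", "doma"]
hyp = ["on", "chce", "byt", "doma"]

print(wer(ref, hyp))         # exact matching: 0.25
print(wer(ref, hyp, loose))  # diacritics ignored: 0.0
```

With exact matching the misrecognized word counts as one
substitution; with the loose comparison it is scored as
correct, so the WER is too low, which is the kind of
discrepancy we see against HResults.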

best regards,

More information about the SRILM-User mailing list