[SRILM User List] SRILM trigram worse than HTK bigram?

Dmytro Prylipko dmytro.prylipko at ovgu.de
Thu Nov 22 05:06:38 PST 2012


Hi,

I found that the recognition accuracy obtained with HVite is about 5% 
better than that of the hypotheses obtained by rescoring the lattices 
with lattice-tool.

HVite does not really use an N-gram; it uses a word net. Still, I cannot 
figure out why it works so much better than the SRILM models.

I use the following script to generate lattices (60-best):

HVite -A -T 1 \
-C GENLATTICES.conf \
-n 20 60 \
-l outLatDir \
-z lat \
-H hmmDefs \
-S test.list \
-i out.bigram.HLStats.mlf \
-w bigram.HLStats.lat \
-p 0.0 \
-s 8.0 \
lexicon \
hmm.mono.list

The lattices are then rescored with:

lattice-tool \
-read-htk \
-write-htk \
-htk-lmscale 10.0 \
-htk-words-on-nodes \
-order 3 \
-in-lattice-list srclat.list \
-out-lattice-dir rescoredLatDir \
-lm trigram.SRILM.lm \
-overwrite

find rescoredLatDir -name "*.lat" > rescoredLat.list

lattice-tool \
-read-htk \
-write-htk \
-htk-lmscale 10.0 \
-htk-words-on-nodes \
-order 3 \
-in-lattice-list rescoredLat.list \
-viterbi-decode \
-output-ctm | ctm2mlf_r > out.trigram.SRILM.mlf

Decoded with HVite (92.86%):

  LAB: <A> wie sieht es aus mit einem weiteren zweitaegigen mit einer weiteren zweitaegigen arbeitssitzu
  REC: <A> wie sieht es aus mit einem weiteren zweitaegigen in  einer weiteren zweitaegigen arbeitssitzu

... and with lattice-tool (64.29%):

  LAB: <A> wie sieht es aus mit einem weiteren zweitaegigen mit  einer weiteren zweitaegigen arbeitssitzu
  REC: <A> wie sieht es aus mit einen weiteren zweitaegigen dann bei   einem    zweitaegigen arbeitssitzung
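
For reference, the two percentages above can be reproduced by hand. The 
sketch below is only an illustration, not the HResults implementation: it 
assumes HTK-style word accuracy, Acc = (N - S - D - I) / N, and uses a plain 
unit-cost edit-distance alignment, whereas HResults applies its own weighted 
alignment. The function names align_counts and word_accuracy are 
hypothetical, and the word strings are copied exactly as printed above, 
including the truncated last label.

```python
def align_counts(ref, hyp):
    """Return (substitutions, deletions, insertions) of one minimum-edit
    alignment between two word lists (unit cost for every edit)."""
    rows, cols = len(ref) + 1, len(hyp) + 1
    # table[i][j] = (total edits, subs, dels, ins) for ref[:i] vs hyp[:j]
    table = [[None] * cols for _ in range(rows)]
    table[0][0] = (0, 0, 0, 0)
    for i in range(1, rows):
        table[i][0] = (i, 0, i, 0)      # only deletions
    for j in range(1, cols):
        table[0][j] = (j, 0, 0, j)      # only insertions
    for i in range(1, rows):
        for j in range(1, cols):
            e, s, d, n = table[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                best = (e, s, d, n)             # match, no cost
            else:
                best = (e + 1, s + 1, d, n)     # substitution
            e, s, d, n = table[i - 1][j]
            best = min(best, (e + 1, s, d + 1, n))  # deletion
            e, s, d, n = table[i][j - 1]
            best = min(best, (e + 1, s, d, n + 1))  # insertion
            table[i][j] = best
    _, subs, dels, ins = table[-1][-1]
    return subs, dels, ins


def word_accuracy(ref_text, hyp_text):
    """HTK-style accuracy in percent: 100 * (N - S - D - I) / N."""
    ref, hyp = ref_text.split(), hyp_text.split()
    subs, dels, ins = align_counts(ref, hyp)
    return 100.0 * (len(ref) - subs - dels - ins) / len(ref)


# Word strings copied from the LAB/REC lines above (as printed).
lab = ("<A> wie sieht es aus mit einem weiteren zweitaegigen "
       "mit einer weiteren zweitaegigen arbeitssitzu")
rec_hvite = ("<A> wie sieht es aus mit einem weiteren zweitaegigen "
             "in einer weiteren zweitaegigen arbeitssitzu")
rec_srilm = ("<A> wie sieht es aus mit einen weiteren zweitaegigen "
             "dann bei einem zweitaegigen arbeitssitzung")

print("HVite:        %.2f%%" % word_accuracy(lab, rec_hvite))
print("lattice-tool: %.2f%%" % word_accuracy(lab, rec_srilm))
```

With the strings as printed, this gives one substitution out of 14 words for 
the HVite output and five edits out of 14 words for the lattice-tool output, 
which matches the 92.86% and 64.29% quoted above.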

The corresponding word nets and LMs were built from the same vocabulary 
and training data. For some sentences SRILM outperforms HTK, but overall 
it is roughly 5-7% behind.
Could you please suggest why this is so? Maybe some parameter values are 
wrong?
Or is this the expected behavior?

I would greatly appreciate any help.

Yours,
Dmytro Prylipko.