[SRILM User List] SRILM trigram worse than HTK bigram?

Dmytro Prylipko dmytro.prylipko at ovgu.de
Sun Nov 25 08:51:03 PST 2012


1. The output is identical. Thus, the LM scale factor does not play a
decisive role, and the conversion from CTM to MLF is fine too.
(Sketches of both checks follow below.)

2. I built a bigram in ARPA format with HTK (using HLStats). After
rescoring and decoding with it, I got the same recognition result as
with the LM built with SRILM. I also tried changing the LM scale
factor from 10 to 8 (the lattice was obtained with LM scale factor 8),
but it made no difference.

Thus, the changes are introduced during rescoring.
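
For reference, sketches of both checks (ref.mlf, wordlist, out.nolm.mlf,
rescoredLatDir.bigram and bigram.HLStats.arpa are placeholder names, not
my actual files):

# Check 1: score the 1-best output decoded without rescoring against
# the reference transcriptions using HTK's HResults.
HResults -I ref.mlf wordlist out.nolm.mlf

# Check 2: the same lattice-tool call as in my original mail below,
# only with -order 2 and the HTK-built bigram.
lattice-tool \
-read-htk \
-write-htk \
-htk-lmscale 10.0 \
-htk-words-on-nodes \
-order 2 \
-in-lattice-list srclat.list \
-out-lattice-dir rescoredLatDir.bigram \
-lm bigram.HLStats.arpa \
-overwrite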

I suspected the reason might be the difference between the sentence
start/end markers: HTK uses !ENTER and !EXIT, while SRILM uses <s> and
</s>. I do take this into account: I replace !ENTER and !EXIT with <s>
and </s> in the lattice file, and the SRILM models are trained on data
where <s> and </s> mark the sentence boundaries. Likewise, I replaced
these markers in the language model built with HTK so that it can
process the existing lattices correctly.
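
The replacement itself is a simple per-file substitution, e.g. (in.lat
and out.lat are placeholder names):

# Swap the HTK sentence markers for the SRILM ones in a lattice.
sed -e 's|!ENTER|<s>|g' -e 's|!EXIT|</s>|g' in.lat > out.lat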

However, playing around with those markers did not change anything.
Namely, I tried using the HTK conventions only: both the generated
lattice and the language model use !ENTER and !EXIT. Unfortunately,
the output was still the same.

Do you have any further suggestions?

Yours,
Dmytro.


On Fri 23 Nov 2012 07:12:50 PM CET, Andreas Stolcke wrote:
> You need to run a few sanity checks to make sure things are working as
> you expect them to.
>
> 1.  Decode 1-best from the HTK lattice WITHOUT rescoring (a sketch
> follows after this list).  The results should be the same as from
> the HTK decoder.  If not, there might be a difference in the LM
> scaling factor, and you may have to adjust it via the command line
> option. There might also be issues with the CTM output and the
> conversion back to MLF.
>
> 2. Rescore the lattices with the same LM that is used in the HTK
> decoder.   Again, the results should be essentially identical.
> I'm not familiar with the bigram format used by HTK, but you may have
> to convert it to ARPA format.
>
> 3. Then try rescoring with a trigram.
>
> Approaching your goal in steps will hopefully help you pinpoint the
> problem(s).
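>
> A minimal sketch of step 1, assuming lattice-tool and the CTM-to-MLF
> converter from your scripts below (out.nolm.mlf is a placeholder
> name):
>
> # Decode the best paths from the original lattices without applying
> # any new LM, i.e. using only the scores already in the lattices.
> lattice-tool \
> -read-htk \
> -htk-lmscale 8.0 \
> -in-lattice-list srclat.list \
> -viterbi-decode \
> -output-ctm | ctm2mlf_r > out.nolm.mlf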
>
> Andreas
>
> On 11/22/2012 5:06 AM, Dmytro Prylipko wrote:
>> Hi,
>>
>> I found that the accuracy of the recognition results obtained with
>> HVite is about 5% better than that of the hypotheses obtained after
>> rescoring the lattices with lattice-tool.
>>
>> HVite does not really use an N-gram but rather a word net; still, I
>> cannot figure out why it works so much better than the SRILM models.
>>
>> I use the following script to generate lattices (60-best):
>>
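>> # Generate 60-best lattices: -n 20 60 = 20 tokens per state and 60
>> # output hypotheses, -s 8.0 = LM scale factor, -p 0.0 = word
>> # insertion penalty, -w = input word network, -z lat = extension
>> # for the output lattice files.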
>> HVite -A -T 1 \
>> -C GENLATTICES.conf \
>> -n 20 60 \
>> -l outLatDir \
>> -z lat \
>> -H hmmDefs \
>> -S test.list \
>> -i out.bigram.HLStats.mlf \
>> -w bigram.HLStats.lat \
>> -p 0.0 \
>> -s 8.0 \
>> lexicon \
>> hmm.mono.list
>>
>> Which are then rescored with:
>>
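>> # Rescore the lattices with the SRILM trigram: -order 3 expands
>> # the lattice to trigram contexts, -htk-lmscale 10.0 sets the LM
>> # weight for the HTK lattice scores, and -htk-words-on-nodes reads
>> # the word labels from nodes rather than links.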
>> lattice-tool \
>> -read-htk \
>> -write-htk \
>> -htk-lmscale 10.0 \
>> -htk-words-on-nodes \
>> -order 3 \
>> -in-lattice-list srclat.list \
>> -out-lattice-dir rescoredLatDir \
>> -lm trigram.SRILM.lm \
>> -overwrite
>>
>> find rescoredLatDir -name "*.lat" > rescoredLat.list
>>
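>> # Viterbi-decode the rescored lattices, write the best paths as
>> # CTM to stdout, and convert the CTM back to an MLF.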
>> lattice-tool \
>> -read-htk \
>> -write-htk \
>> -htk-lmscale 10.0 \
>> -htk-words-on-nodes \
>> -order 3  \
>> -in-lattice-list rescoredLat.list \
>> -viterbi-decode \
>> -output-ctm | ctm2mlf_r > out.trigram.SRILM.mlf
>>
>> Decoded with HVite (92.86%):
>>
>>  LAB: <A> wie sieht es aus mit einem weiteren zweitaegigen mit einer
>> weiteren zweitaegigen arbeitssitzu
>>  REC: <A> wie sieht es aus mit einem weiteren zweitaegigen in  einer
>> weiteren zweitaegigen arbeitssitzu
>>
>> ... and with lattice-tool (64.29%):
>>
>>  LAB: <A> wie sieht es aus mit einem weiteren zweitaegigen mit  einer
>> weiteren zweitaegigen arbeitssitzu
>>  REC: <A> wie sieht es aus mit einen weiteren zweitaegigen dann bei
>> einem    zweitaegigen arbeitssitzung
>>
>> The corresponding word nets and LMs have been built using the same
>> vocabulary and training data. I should say that for some sentences
>> SRILM outperforms HTK, but in general it is roughly 5-7% behind.
>> Could you please suggest why this is so? Maybe some parameter values
>> are wrong, or is this to be expected?
>>
>> I would greatly appreciate any help.
>>
>> Yours,
>> Dmytro Prylipko.
>>
>>
>