[SRILM User List] HVite Lattice Generation Problem

Sam Bowman sbowman at uchicago.edu
Mon Apr 5 15:24:22 PDT 2010


I am trying to generate lattices in HTK, which I would then rescore using a language model in SRILM. The quoted sections below show the contents of the files being operated on, or the output printed by the preceding command.

> config/dev.crop.filelist.old:
> data/video004
> data/video013
> data/video014
> data/video016
> ...


#Generate lattices
HVite -w config/train.wdnet -A -T 1 -D -H models/mono-nmix2-npass1/MMF -S config/dev.crop.filelist.old  -p -100.0 -s 0.0 -l lattices -z lat -n 4 config/train.dictionary config/phonelist

> No HTK Configuration Parameters Set
> 
> Read 248 physical / 248 logical HMMs
> Read lattice with 118 nodes / 341 arcs
> Created network with 480 nodes / 703 links
> File: data/video004
> SOMETHING_ONE .  ==  [86 frames] -490.2613 [Ac=-41862.5 LM=-300.0] (Act=453.2)
> File: data/video013
> sil GO1 .  ==  [68 frames] -493.8586 [Ac=-33182.4 LM=-400.0] (Act=446.6)
> File: data/video014
> sil GO1 .  ==  [76 frames] -481.3506 [Ac=-36182.6 LM=-400.0] (Act=449.9)
> File: data/video016
> IX .  ==  [93 frames] -501.9774 [Ac=-46383.9 LM=-300.0] (Act=455.1)
> ...
> File: data/video192
> sil IX GIVE1 .  ==  [97 frames] -525.5662 [Ac=-50479.9 LM=-500.0] (Act=456.0)
> 
> No HTK Configuration Parameters Set


ls -1 lattices/*.lat | awk '{print $1}' > lattice_list

> lattice_list:
> lattices/video004.lat
> lattices/video013.lat
> lattices/video014.lat
> lattices/video016.lat
> ...
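
In case it helps anyone look at this, here is a rough check of whether the lattices written by HVite actually contain anything. It is only a sketch: it assumes the .lat files are plain-text HTK SLF with a size line of the form "N=<nodes> L=<links>" (as in the "Read lattice with 118 nodes / 341 arcs" message above), and lattice_sizes is a helper name made up for this message, not part of HTK or SRILM.

```shell
#!/bin/sh
# Print "<file> <nodes> <links>" for every lattice named in a list file.
# Assumption: each lattice is HTK SLF text with a size line like "N=118 L=341".
# lattice_sizes is a hypothetical helper name, not an HTK or SRILM tool.
lattice_sizes() {
    while IFS= read -r lat; do
        # Skip blank lines in the list file.
        [ -n "$lat" ] || continue
        # Flag lattices that are missing or zero bytes long.
        if [ ! -s "$lat" ]; then
            echo "$lat MISSING_OR_EMPTY"
            continue
        fi
        # Use the first line starting with N= and extract both counts.
        awk -v f="$lat" '
            /^N=/ {
                n = "?"; l = "?"
                for (i = 1; i <= NF; i++) {
                    if ($i ~ /^N=/) n = substr($i, 3)
                    if ($i ~ /^L=/) l = substr($i, 3)
                }
                print f, n, l
                found = 1
                exit
            }
            END { if (!found) print f, "NO_SIZE_LINE" }
        ' "$lat"
    done < "$1"
}

# Usage: lattice_sizes lattice_list
```

Running it on both lattice_list and trigram_lattice.lst should show whether the lattices shrink or go missing between the HVite step and the LM step.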


#apply LM
lattice-tool -read-htk -write-htk -in-lattice-list lattice_list -out-lattice-dir lattices_out/ -order 3 -lm ../lm/ukn.3.lm.gz -no-nulls

ls -1 lattices_out/ | awk '{print "lattices_out/" $1}' > trigram_lattice.lst

> trigram_lattice.lst:
> lattices_out/video004.lat
> lattices_out/video013.lat
> lattices_out/video014.lat
> lattices_out/video016.lat
> ...
> lattices_out/video144.lat
> lattices_out/video151.lat


#decode
lattice-tool -read-htk -htk-lmscale 10 -htk-wdpenalty 0 -in-lattice-list trigram_lattice.lst -viterbi-decode > rescore_trigram.out

> rescore_trigram.out:
> data/video004
> data/video013
> data/video014
> data/video016
> ...
> data/video151

I believe this is where I should see the result of the rescored lattices, in the form of transcriptions, but I get nothing beyond the file names shown above.
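
To put a number on it, something like the following could count how many utterances come back with no words. This is a sketch under an assumption I have not confirmed against the SRILM documentation: that each line of the -viterbi-decode output is the utterance name followed by its decoded words, so a line with a single field means an empty hypothesis. count_empty_hyps is a made-up helper name.

```shell
#!/bin/sh
# Count output lines that carry only an utterance name and no words.
# Assumption: one utterance per line, name first, decoded words after it.
# count_empty_hyps is a hypothetical helper name, not an SRILM tool.
count_empty_hyps() {
    awk 'NF == 1 { empty++ }
         NF > 0  { total++ }
         END { printf "%d of %d utterances have no words\n", empty, total }' "$1"
}

# Usage: count_empty_hyps rescore_trigram.out
```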

For what it's worth, substituting the following commands for the first HVite command does produce some output in rescore_trigram.out, though only for a few data files, and the results are poor. If possible, we would rather not depend on recout.mlf, the naive output of the recognizer before the LM is applied.

cat models/mono-nmix2-npass1/recout.mlf | perl -pe 's/rec/lab/g' > models/mono-nmix2-npass1/recout1.mlf

HVite -A -T 1 -D -I models/mono-nmix2-npass1/recout1.mlf -H models/mono-nmix2-npass1/MMF -S config/dev.crop.filelist.old  -p -100.0 -s 0.0 -l lattices -z lat -n 4 config/train.dictionary config/phonelist 


Does anyone see anything wrong with the commands as they stand that would prevent any meaningful output?

Any help would be greatly appreciated.

Thanks,

Sam Bowman
University of Chicago
