[SRILM User List] Does keep-unk work with lattice-tool and htk format?

Andreas Stolcke stolcke at icsi.berkeley.edu
Fri Aug 24 00:07:40 PDT 2012


Congratulations, you found a bug! The patch attached to this message (to 
HTKLattice.cc) should fix this problem.

Andreas

On 8/21/2012 2:43 PM, Lluís Formiga i Fanals wrote:
> Hi Andreas,
>
> Sorry to bother you with this old issue.
>
> The two-step lattice-tool process worked perfectly. First the 
> rescoring and second the conversion to CN.
>
> But, unfortunately, I still see a few <unk> tokens when rescoring the 
> lattice (though not as many as when writing the mesh).
>
> The command I use to rescore is:
>
> lattice-tool -lm ../../lm/interpolated-lm.en -in-lattice 
> wordlattice0.slf -read-htk -out-lattice out.slf -write-htk -keep-unk 
> -print-sent-tags -htk-logbase 2.71828
>
> And I find lines like these (in these lines, the <unk> tag should 
> be "queit"):
>
> J=26 S=19 E=24 W=qu a=0 l=-13.8261
> J=27 S=19 E=25 W=que a=0 l=-11.4986
> J=28 S=19 E=26 W=<unk> a=0 l=-2.76367
> J=29 S=19 E=27 W=quest a=0 l=-10.831
> J=30 S=19 E=28 W=quiet a=0 l=-10.57
> J=31 S=19 E=29 W=quit a=0 l=-10.4455
> J=32 S=20 E=21 W=row a=0 l=-10.1076
> J=33 S=21 E=24 W=qu a=0 l=-14.9448
> J=34 S=21 E=25 W=que a=0 l=-12.6173
> J=35 S=21 E=26 W=<unk> a=0 l=-3.88236
> J=36 S=21 E=27 W=quest a=0 l=-11.9497
> J=37 S=21 E=28 W=quiet a=0 l=-11.6887
> J=38 S=21 E=29 W=quit a=0 l=-11.0153
> J=39 S=22 E=19 W=arrow a=0 l=-12.6258
>
> I should add that I use the rescoring to assign probabilities to the 
> arcs from misspelling corrections, so I do not have any acoustic 
> scores (I set them all equal).
>
>

-------------- next part --------------
*** lattice/src/HTKLattice.cc	3 Aug 2012 01:11:34 -0000	1.60
--- lattice/src/HTKLattice.cc	24 Aug 2012 07:02:40 -0000
***************
*** 1769,1776 ****
  					toNode->word == vocab.seIndex()) ||
  				   toNode->word == Vocab_None) ?
  				   HTK_null_word :
! 				    (node->htkinfo && node->htkinfo->wordLabel ?
! 					node->htkinfo->wordLabel :
  					vocab.getWord(toNode->word)),
  			    htkheader.useQuotes);
  	    }
--- 1769,1776 ----
  					toNode->word == vocab.seIndex()) ||
  				   toNode->word == Vocab_None) ?
  				    HTK_null_word :
! 				    (toNode->htkinfo && toNode->htkinfo->wordLabel ?
! 					toNode->htkinfo->wordLabel :
  					vocab.getWord(toNode->word)),
  			    htkheader.useQuotes);
  	    }
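
For readers who want to see why the one-identifier change in the patch matters, here is a minimal sketch of the label selection when a link is written. It is not SRILM code: the Node struct, kVocab, vocabWord, linkLabelBuggy and linkLabelFixed are all hypothetical stand-ins, and the sketch assumes that with -keep-unk the original spelling of an out-of-vocabulary word is kept in a per-node label while its vocabulary index maps to <unk>.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a lattice node (not the SRILM type).
struct Node {
    int word;               // index into the toy vocabulary below
    const char *wordLabel;  // original spelling if preserved, else nullptr
};

// Toy vocabulary (assumption): index 0 plays the role of <unk>.
const std::vector<std::string> kVocab = {"<unk>", "row", "quiet"};

std::string vocabWord(int idx) {
    assert(idx >= 0 && static_cast<std::size_t>(idx) < kVocab.size());
    return kVocab[idx];
}

// Before the patch: the preserved label is looked up on the SOURCE node,
// although the word being written belongs to the DESTINATION node.
std::string linkLabelBuggy(const Node &node, const Node &toNode) {
    return node.wordLabel ? std::string(node.wordLabel)
                          : vocabWord(toNode.word);
}

// After the patch: both the check and the label use the destination node.
std::string linkLabelFixed(const Node &node, const Node &toNode) {
    (void)node;  // the source node plays no part in choosing the label
    return toNode.wordLabel ? std::string(toNode.wordLabel)
                            : vocabWord(toNode.word);
}
```

With a source node Node{1, nullptr} ("row") and a destination node Node{0, "queit"}, the buggy variant falls back to the vocabulary and yields <unk>, while the fixed variant yields the preserved spelling "queit". This matches the symptom in the report only under the assumptions stated above.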

