Interpolation of word-based and POS-based ngrams

Katrin Kirchhoff katrin at ssli-mail.ee.washington.edu
Tue Jul 20 11:25:54 PDT 2004



As far as I know, you need to write your own script to compute
P(word|POS) and create the classes file. There's
a compute-best-mix.gawk script in SRILM for estimating
interpolation weights.
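
A script for the first step could look roughly like the sketch below. It
is only a starting point and makes some assumptions: the input format
(one sentence per line, tokens written as word/POS) is a guess, the
function names are made up for illustration, and the output layout
(class name, probability, word on each line) should be checked against
the classes-format(5) man page. P(word|POS) is a plain relative
frequency, with no smoothing.

```python
from collections import Counter


def parse_tagged_line(line, sep="/"):
    """Split 'the/DT dog/NN' into [('the', 'DT'), ('dog', 'NN')].

    Assumes every token contains the separator (hypothetical input format).
    """
    return [tuple(token.rsplit(sep, 1)) for token in line.split()]


def class_expansions(tagged_sentences):
    """Relative-frequency estimate of P(word | POS)."""
    pair_counts = Counter()
    tag_counts = Counter()
    for sentence in tagged_sentences:
        for word, tag in sentence:
            pair_counts[(word, tag)] += 1
            tag_counts[tag] += 1
    return {
        (word, tag): count / tag_counts[tag]
        for (word, tag), count in pair_counts.items()
    }


def format_classes(probs):
    """One expansion per line: class name, probability, word."""
    ordered = sorted(probs.items(), key=lambda item: (item[0][1], item[0][0]))
    return [f"{tag} {p:.6g} {word}" for (word, tag), p in ordered]


if __name__ == "__main__":
    sample = ["the/DT dog/NN barks/VBZ", "the/DT cat/NN sleeps/VBZ"]
    probs = class_expansions(parse_tagged_line(line) for line in sample)
    print("\n".join(format_classes(probs)))
```

Note that a word seen with several POS tags ends up in several classes
here, so the result is not a set of "simple classes" in the sense of
the -simple-classes option.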

KK
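
P.S. For intuition, the usual way to estimate a single interpolation
weight is EM on held-out data. The sketch below illustrates that general
technique; it is not a transcription of compute-best-mix.gawk. It
assumes you already have, for each held-out word, the probability
assigned by the word model and by the POS class model (both as
probabilities of the word itself, so the class model's value includes
P(word|POS)). The function name and arguments are invented for the
example.

```python
def estimate_lambda(p_word, p_pos, iterations=50, lam=0.5):
    """EM estimate of lam in  lam * p_word + (1 - lam) * p_pos.

    p_word, p_pos: per-token probabilities from the two models,
    aligned on the same held-out word sequence.
    """
    pairs = list(zip(p_word, p_pos))
    for _ in range(iterations):
        posteriors = []
        for a, b in pairs:
            mix = lam * a + (1.0 - lam) * b
            if mix == 0.0:
                continue  # neither model can explain this token; skip it
            # E-step: posterior that the word model generated this token
            posteriors.append(lam * a / mix)
        if not posteriors:
            break
        # M-step: the new weight is the average posterior
        lam = sum(posteriors) / len(posteriors)
    return lam


if __name__ == "__main__":
    # Toy numbers, not real model output.
    print(estimate_lambda([0.4, 0.1, 0.3], [0.1, 0.4, 0.2]))
```

The held-out text should be different from the training data of either
model, otherwise the weight will favour whichever model overfits more.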

On Tue, Jul 20, 2004 at 07:01:10PM +0200, Robert Wagner wrote:
> Hi Andreas,
>  my problem is that I use different data for the two models. The
> word-based model uses a text consisting of the recognized words; the
> POS-based class model uses a text consisting of those words' POS tags. I
> estimated the latter simply by running the ngram-count tool on the
> text in which the words were replaced by their POS tags.
>  POS-based classes are also not typical "simple classes"...
>   
> Robert
> 
> P.S.
>  It would be ideal to obtain the interpolation weights with SRILM as well ;-)
> 
> > 
> > In message <200407201526.i6KFQNr4006030 at www3.pobox.sk> you wrote:
> > > Hello SRILM users!
> > >  Does anybody know if interpolation weights are implemented
> > > in SRILM? I have an ordinary word-based ngram model and a
> > > part-of-speech-based ngram model and want to interpolate them to
> > > create an HMM for disfluency detection (using the hidden-ngram
> > > tool). Is it possible to do this directly in SRILM?
> > 
> > By using the options
> > 
> > 	-lm
> > 	-classes
> > 	-simple-classes
> > 	-lambda
> > 	-mix-lm
> > 
> > with hidden-ngram you can tell it to use an interpolated LM where
> > (one or both of) the component models are class-based.
> > 
> > For details see the man page.
> > 
> > --Andreas 
> > 
> 

-- 
-----------------------------------------------------------------
Katrin Kirchhoff
Dept of Electrical Engineering, University of Washington
M422 EE/CS Building, Box 352500, Seattle, WA, 98195
Phone: (206) 616 5494
katrin at ee.washington.edu
-----------------------------------------------------------------


