general help
Andreas Stolcke
stolcke at speech.sri.com
Tue Sep 3 08:45:06 PDT 2002
Hongqin,
Two suggestions:
- interpolate your class-based LM with the word-based one
(class-based LMs alone usually don't give an improvement over word-based ones
except in very limited domains).
- use Kneser-Ney smoothing (with interpolation) for the 4gram LM:
-kndiscount1 -interpolate1 -kndiscount2 -interpolate2
-kndiscount3 -interpolate3 -kndiscount4 -interpolate4
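The two suggestions can be sketched as SRILM command lines. File names (train.txt, class.lm, classes.defs, heldout.txt) and the 0.3 mixture weight are placeholders, not from the original message; in practice the weight is tuned on held-out data:

```shell
# Train a 4gram word LM with interpolated Kneser-Ney discounting
# (the flags listed above):
ngram-count -order 4 -text train.txt -lm word4.lm \
    -kndiscount1 -interpolate1 -kndiscount2 -interpolate2 \
    -kndiscount3 -interpolate3 -kndiscount4 -interpolate4

# Interpolate the class-based LM with the word-based one and
# score a held-out set; -lambda is the weight on the -lm model.
ngram -order 4 -lm class.lm -classes classes.defs \
    -mix-lm word4.lm -lambda 0.3 -ppl heldout.txt
```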
You should see a perplexity reduction over the 3gram, and over GT
(Good-Turing) discounting. Of course you never know about WER...
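For intuition, the interpolation and the perplexity comparison can be illustrated with a toy example. All probabilities and the 0.5 weight below are made up for demonstration; real n-gram LMs condition on history and the weight is tuned on held-out data:

```python
import math

# Linear interpolation of two LMs:
#   P_mix(w) = lam * P_class(w) + (1 - lam) * P_word(w)
# Toy unigram probabilities (illustrative only):
p_word = {"the": 0.5, "cat": 0.3, "sat": 0.2}
p_class = {"the": 0.4, "cat": 0.4, "sat": 0.2}
lam = 0.5  # mixture weight; tuned on held-out data in practice

def p_mix(w):
    # Weighted combination of the two component probabilities.
    return lam * p_class[w] + (1 - lam) * p_word[w]

def perplexity(words):
    # Perplexity = exp(-average log-probability per word);
    # lower is better.
    logp = sum(math.log(p_mix(w)) for w in words)
    return math.exp(-logp / len(words))

print(round(perplexity(["the", "cat", "sat"]), 3))
```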
--Andreas
In message <3D74D4C4.2A5F65BB at inzigo.com> you wrote:
> Hi, Crouching tigers & hidden dragons:
>
> I am using a word-based trigram (GT backoff) for an application, and
> am trying to make further improvements. I tried a class-based LM, but
> it seemed not as good as the word-based one. A higher-order (4gram)
> model also seems worse than the 3gram. The WER (word error rate) I get
> now is about 8-10%, so it seems there is still some room for
> improvement. Anyone got good ideas -- within ngram? Thanks in advance.
>
> Hongqin Liu
>
>
More information about the SRILM-User mailing list