update
Andreas Stolcke
stolcke at speech.sri.com
Wed Sep 4 13:21:09 PDT 2002
Hongqin,
let me venture a guess: you are using your LM training data to do the mixture
optimization. You should be using a held-out data set that has NOT been
used to estimate the component models. If you are optimizing on the LM training
data, then it is no surprise that the word n-gram gets weight 1.
--Andreas
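(For illustration only, not from the original thread: the weight optimization Andreas describes can be sketched as an EM loop that re-estimates the interpolation weight lambda from per-word probabilities of the two component models on held-out text, then reports the mixture perplexity. The probability lists below are made-up toy values, not real LM output.)

```python
import math

def em_mixture_weight(p1, p2, lam=0.5, iters=20):
    """Re-estimate the weight of component 1 by EM.

    p1, p2: per-word probabilities assigned by the two component
    models to the same held-out word sequence.
    Each iteration computes the posterior responsibility of
    component 1 for every word and averages it to get the new lambda.
    """
    for _ in range(iters):
        post = [lam * a / (lam * a + (1.0 - lam) * b)
                for a, b in zip(p1, p2)]
        lam = sum(post) / len(post)
    return lam

def perplexity(p1, p2, lam):
    """Perplexity of the interpolated model on the held-out words."""
    log_sum = sum(math.log(lam * a + (1.0 - lam) * b)
                  for a, b in zip(p1, p2))
    return math.exp(-log_sum / len(p1))

# Toy held-out probabilities (hypothetical values).
p1 = [0.20, 0.10, 0.30, 0.05]   # e.g. word-trigram model
p2 = [0.15, 0.20, 0.10, 0.10]   # e.g. class-based model

lam = em_mixture_weight(p1, p2)
print(lam, perplexity(p1, p2, lam))
```

Because each EM step cannot decrease the held-out likelihood, the perplexity at the converged lambda is never worse than at the starting point 0.5 — which is also why optimizing on the *training* data drives lambda toward the model that was estimated on that same data.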
In message <3D761349.864CCE73 at inzigo.com> you wrote:
> Hi,
>
> I got iteration 3:
>
> iteration 1, lambda = (0.5 0.5), ppl = 4.91204
> iteration 2, lambda = (0.514374 0.485626), ppl = 4.908
> iteration 3, lambda = (0.528604 0.471396), ppl = 4.90404
>
> It seems that the final ppl will not be less than that from the word-based
> trigram (4.80); in other words, there is no minimum between the two end
> points. The other end (class-based) is 5.08, not too bad. I'll wait until
> the iteration stops.
>
> Good day,
>
> Hongqin
>
>
>
More information about the SRILM-User mailing list