update

Andreas Stolcke stolcke at speech.sri.com
Wed Sep 4 13:21:09 PDT 2002


Hongqin,

let me venture a guess: you are using your LM training data for the mixture
optimization.  You should be using a held-out data set that has NOT been
used to estimate the component models.  If you are optimizing on the LM training
data, then it is no surprise that the word-ngram gets weight 1.
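To illustrate why: the interpolation weight is typically re-estimated by EM, and on the training data itself the word-ngram component (which was fit to exactly that data) will dominate. Below is a minimal sketch of the EM re-estimation of a two-component mixture weight, using made-up per-word probabilities standing in for the word-ngram and class-based models; the numbers and variable names are illustrative, not output from SRILM.

```python
import math

# Hypothetical per-word probabilities from the two component LMs,
# evaluated on a (held-out) word sequence.  Illustrative values only.
p_word = [0.20, 0.05, 0.10, 0.30]   # word-ngram model
p_class = [0.15, 0.08, 0.12, 0.10]  # class-based model

def ppl(lam, p1, p2):
    """Perplexity of the mixture lam*p1 + (1-lam)*p2."""
    log_sum = sum(math.log(lam * a + (1 - lam) * b) for a, b in zip(p1, p2))
    return math.exp(-log_sum / len(p1))

lam = 0.5  # start from equal weights, as in the iteration log below
for it in range(1, 11):
    # E-step: posterior probability that each word was generated
    # by the word-ngram component under the current mixture.
    post = [lam * a / (lam * a + (1 - lam) * b)
            for a, b in zip(p_word, p_class)]
    # M-step: the new weight is the average posterior.
    lam = sum(post) / len(post)
    print(f"iteration {it}, lambda = ({lam:.6f} {1 - lam:.6f}), "
          f"ppl = {ppl(lam, p_word, p_class):.5f}")
```

Each EM step is guaranteed not to increase the held-out perplexity; run on training data instead, the same procedure drives the weight toward the overfit component.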

--Andreas

In message <3D761349.864CCE73 at inzigo.com> you wrote:
> Hi,
> 
> I got iteration 3:
> 
> iteration 1, lambda = (0.5 0.5), ppl = 4.91204
> iteration 2, lambda = (0.514374 0.485626), ppl = 4.908
> iteration 3, lambda = (0.528604 0.471396), ppl = 4.90404
> 
> It seems that the final ppl will not be less than that of the word-based
> trigram (4.80); in other words, there is no minimum between the two end
> points. The other end (class-based) is 5.08, not too bad.  I'll wait until
> the iteration stops.
> 
> Good day,
> 
> Hongqin
> 
> 
> 




More information about the SRILM-User mailing list