Interpolating Language Models for Indonesian ASR

Andreas Stolcke stolcke at
Wed Sep 7 09:32:09 PDT 2005

My first suggestion is to make sure that all LMs you are comparing and
interpolating use the same vocabulary.  In SRILM you can enforce this 
by using the -vocab option when building the LM.

PPLs over different vocabularies are of course not comparable, and it is 
easy to mess this up when building LMs from different data sets.
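As a toy illustration (the per-token probabilities below are invented, not from any real LM): under a limited vocabulary, OOV tokens are simply skipped during scoring, which shrinks the perplexity even though the model has not improved at all.

```python
import math

def perplexity(probs):
    # PPL = exp(-(1/N) * sum(log p)) over the tokens actually scored
    return math.exp(-sum(math.log(p) for p in probs) / len(probs))

# Hypothetical per-token probabilities on a 6-token test text.
# Under the full vocabulary all six tokens are scored, including
# two rare words that receive very small probabilities.
full_vocab = [0.1, 0.05, 0.2, 0.001, 0.0005, 0.1]

# Under a limited vocabulary the two rare words become OOV and
# never enter the sum, so only four "easy" tokens are scored.
limited_vocab = [0.1, 0.05, 0.2, 0.1]

print(perplexity(full_vocab))     # ~52, scoring all 6 tokens
print(perplexity(limited_vocab))  # ~10, scoring only 4 tokens
```

The limited-vocab number looks much better only because the hard tokens were excluded from scoring, which is why PPLs computed over different vocabularies cannot be compared directly.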


In message <20050907105509.BSN53884 at> you wrote:
> Hi All,
> At present I am trying to use the SRI tools to improve the 
> LM for an Indonesian ASR system we are building. We have 
> just over ten hours of Australian Broadcasting Commission 
> training data, and at present the system gets just over 80% 
> on a held-out test set with a bigram LM trained on the 
> training data. However, we also have approximately 12 
> million words of text from the Indonesian papers Kompass and 
> Tempo, and we were hoping that we could interpolate these with 
> the existing ABC LM to improve the n-gram estimates and 
> the subsequent perplexity.
> Evaluating the ABC LM on a separate dev transcript 
> gives ppl=297.
> Noting the advice given in the package notes about using 
> limited vocabs (our vocab is 11,000 words), I computed the 
> discount coefficients first on the unlimited vocab, and then 
> used these as input to a second pass of ngram-count to get 
> the LM. I used Good-Turing discounting.
> I then ran 
> ngram -lm $sPATH_OUTPUT/lm/ -order 2 -vocab $DESIRED_VOCAB -limit-vocab -ppl output/sri_trans/$
> to get the perplexity score.
> Using a similar technique on the much larger set of 
> Kompass text produces a ppl score of 808 when evaluated on 
> the ABC dev set.
> All is well and good until I try to interpolate the two. I 
> have trialled two approaches. The first uses the dynamic 
> interpolation capability incorporated in ngram. Using 
> ngram -bayes 0 -lm ./ABC/lm/ -mix-lm ./Kompass/lm/ -debug 2 -ppl ./sri_trans/
> gives a ppl of 342, i.e. 
> much worse than the original 297.
> I then tried the "compute-best-mix" utility, which 
> starts off as expected at lambda values 0.5 and 0.5 and 
> iterates to 0.66 and 0.33. Plugging these values into 
> ngram -lm ./ABC/lm/ -lambda 0.66 -mix-lm ./Kompass/lm/ -debug 1 -ppl output/sri_trans/$ | tail
> yields 
>          ppl= 331.8 ppl1= 608.504
> still worse. I would have expected it at worst to stay the same, 
> iterating to lambda values which excluded the Kompass data, 
> but the estimated weights seem to be at odds with the ppl 
> score. I then trialled the same technique using Switchboard 
> and Gigaword and got the expected behaviour, i.e. an improvement.
> Unsure whether this was because the Kompass data was 
> unsuitable or I was just making a foolish error somewhere, I 
> trialled the CMU LM toolkit. Again using GT discounting to 
> build an LM and evaluating on the ABC dev set gives a ppl of 
> 268, which was a little surprising. More surprising was when 
> I used their interpolation tools. To cut the story short, 
> they produce:
> 			weights: 0.547  0.453  (7843 items) --> PP=152.624029
> 	=============>  TOTAL PP = 152.624
> No doubt the devil is in the detail, but does anyone have 
> any suggestions? 
> Cheers 
> Terry Martin
> QUT Speech Lab
> Australia
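For what it's worth, the weight search that compute-best-mix performs is a simple EM re-estimation over per-token probabilities from the two component models. Here is a minimal sketch; the probabilities are invented, and it assumes both models score the same token stream over the same vocabulary (exactly the precondition for the mixture weights and PPLs to be meaningful).

```python
import math

# Invented per-token probabilities that two LMs assign to the SAME
# held-out token stream over the SAME vocabulary.
p_abc     = [0.10, 0.20, 0.01, 0.30, 0.005, 0.15]   # "ABC" model
p_kompass = [0.02, 0.05, 0.10, 0.05, 0.08,  0.04]   # "Kompass" model

def perplexity(probs):
    return math.exp(-sum(math.log(p) for p in probs) / len(probs))

def best_mix(p1, p2, lam=0.5, iters=100):
    # EM re-estimation of the interpolation weight: the E-step computes
    # the posterior that each token came from model 1, and the M-step
    # sets the new weight to the average of those posteriors.
    for _ in range(iters):
        post = [lam * a / (lam * a + (1 - lam) * b)
                for a, b in zip(p1, p2)]
        lam = sum(post) / len(post)
    return lam

lam = best_mix(p_abc, p_kompass)
mixture = [lam * a + (1 - lam) * b for a, b in zip(p_abc, p_kompass)]

print(lam)                   # estimated interpolation weight for model 1
print(perplexity(mixture))   # lower than either component's PPL
print(perplexity(p_abc), perplexity(p_kompass))
```

On the tuning text itself, the optimal mixture can never have a higher perplexity than the better of the two components (lambda = 1 recovers the pure component), so a mixture that scores worse than one of its components, as reported above, usually points to mismatched vocabularies or differently tokenized test streams rather than to the interpolation itself.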

More information about the SRILM-User mailing list