[SRILM User List] How to compare LMs trained with different vocabularies?

Mon Nov 5 22:46:16 PST 2012

Hi, I'm training LMs for a Mandarin Chinese ASR task with two different
vocabularies, vocab1 (100635 words) and vocab2 (102541 words). To compare
the two vocabularies, I kept the training corpus, the test corpus, and the
word segmentation method (Forward Maximum Match) the same. The only
difference is the vocabulary, which is used both for segmentation and for
LM training. I trained LM1 with vocab1 and LM2 with vocab2, and evaluated
both on the test set. The results are as follows:

LM1: logprobs = -84069.7, PPL = 416.452.
LM2: logprobs = -82921.7, PPL = 189.564.

It seems LM2 is much better than LM1, judging by either logprob or PPL.
However, when I decode with the corresponding acoustic model, the CER
(Character Error Rate) of LM2 is higher than that of LM1, so I'm really
confused. What is the relationship between PPL and CER? How should LMs
with different vocabularies be compared? Can you give me some suggestions
or references?
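
One way to see why the two PPL values may not be directly comparable:
SRILM reports ppl = 10^(-logprob / (words - OOVs + sentences)), so the
normalizing count can be recovered from each reported pair. The sketch
below does that for the numbers above, and also shows a perplexity
normalized by a count shared by both LMs, such as the number of characters
in the test set. The function names are illustrative, not part of SRILM,
and the character count is not given in this post, so it has to be
supplied. This is a rough sketch, and it assumes both LMs skip a
comparable amount of OOV text.

```python
import math

def scored_tokens(logprob, ppl):
    """Recover the count SRILM divided by (words - OOVs + sentences)
    from a reported log10 probability and perplexity."""
    return -logprob / math.log10(ppl)

def per_unit_ppl(logprob, n_units):
    """Perplexity normalized by a count that is identical for both LMs,
    e.g. the number of characters in the test set (must be supplied)."""
    return 10 ** (-logprob / n_units)

# Values reported above for the two models.
lm1_tokens = scored_tokens(-84069.7, 416.452)
lm2_tokens = scored_tokens(-82921.7, 189.564)

# Different counts mean the test set was split into a different number
# of scored tokens, so the per-word PPLs use different denominators.
print(round(lm1_tokens), round(lm2_tokens))
```

The two recovered counts come out at roughly 32,100 and 36,400, so the
test set is scored over different numbers of tokens under the two
vocabularies. Calling per_unit_ppl with each logprob and the same
character count would give figures on a common scale.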

P.S. There was a mistake in my last mail, so I am sending it again.

Thanks!

Meng CHEN