[SRILM User List] Question of replace-words-with-classes

Meng Chen chenmengdx at gmail.com
Sat Mar 31 20:00:35 PDT 2012


Hi, I ran into a problem when training a class-based language model with
the replace-words-with-classes command. My commands are as follows:


   - ngram-class -vocab wlist -text training_set -numclasses 200 -incremental -classes output.classes
   - replace-words-with-classes classes=output.classes training_set > training_set_classes

After these two steps, I found that training_set_classes contains both
words and classes. The remaining words are OOVs with respect to wlist, and
I don't need them at all. Shouldn't these words be treated as <unk>, which
is in CLASS-00001? How should I handle this situation? Does SRILM provide
a script to map these OOVs to CLASS-00001, or do I need to write one
myself?
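
For the last option, a minimal sketch of such a script in Python is shown
here. It is not part of SRILM; the function names are hypothetical, and it
assumes that wlist holds one word per line and that the text is
whitespace-tokenized. It maps every token outside the vocabulary to an
unknown-word token before class replacement.

```python
def load_vocab(lines):
    """Collect the first whitespace-separated field of each non-empty line."""
    vocab = set()
    for line in lines:
        fields = line.split()
        if fields:
            vocab.add(fields[0])
    return vocab


def map_oovs(line, vocab, unk="<unk>"):
    """Replace every token that is not in vocab with the unk token."""
    return " ".join(tok if tok in vocab else unk for tok in line.split())


def filter_stream(vocab, in_stream, out_stream, unk="<unk>"):
    """Apply map_oovs to each line of in_stream and write it to out_stream."""
    for line in in_stream:
        out_stream.write(map_oovs(line, vocab, unk) + "\n")
```

One way to use it would be to call load_vocab on the opened wlist file and
filter_stream on training_set, then run replace-words-with-classes on the
result. Whether <unk> then becomes CLASS-00001 depends on <unk> actually
being listed in output.classes; if it is not, passing unk="CLASS-00001"
writes the class label directly.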

Thanks!

Meng Chen