training with different weights

Behrang Mohit i_am_behrang at
Mon Apr 20 20:44:42 PDT 2009


Is there an option to give weights to certain training instances (sentences)? For example, suppose I have some sentences that are more relevant to my translation domain, and I want them to influence the LM four times as much as the rest of the data.

Currently I do this by simply repeating those important sentences in the training corpus, but training then takes much longer. Is there an alternative way to do this?
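
To make the repetition workaround concrete: on raw n-gram counts, repeating a sentence four times should be the same as counting it once and multiplying its counts by 4. Below is a minimal Python sketch of that equivalence. It is not SRILM code; the function names (`ngrams`, `weighted_ngram_counts`) and the toy sentences are made up for illustration only.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all n-grams of order n from a list of tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def weighted_ngram_counts(weighted_sentences, n=3):
    """Count n-grams, adding each sentence's weight once per occurrence."""
    counts = Counter()
    for weight, sentence in weighted_sentences:
        for gram in ngrams(sentence.split(), n):
            counts[gram] += weight
    return counts


# Toy data (hypothetical): one in-domain sentence, two general ones.
in_domain = ["the patient was given aspirin"]
general = ["the patient was late", "the meeting was late"]

# Workaround from the post: repeat the in-domain sentences 4 times.
repeated = [(1, s) for s in in_domain * 4 + general]

# Weighted alternative: keep one copy, but count it with weight 4.
weighted = [(4, s) for s in in_domain] + [(1, s) for s in general]

counts_repeated = weighted_ngram_counts(repeated)
counts_weighted = weighted_ngram_counts(weighted)

# Both routes give identical trigram counts.
assert counts_repeated == counts_weighted
```

The sketch only shows that the counts agree; whether a given toolkit can take per-sentence weights or pre-multiplied counts as input is a separate question.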

I was also wondering why there is such a slowdown.
My guess is that the repetition dramatically increases the number of n-grams (mainly trigrams) kept in the model: many of the infrequent bigrams and trigrams that are filtered out of the baseline model are retained in the new model. Is that right?
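
The guess above can be illustrated with a small Python sketch. It assumes that trigrams seen fewer than 2 times are dropped (the `MIN_COUNT` value is an assumption for this example, not a confirmed setting), and the helper names and toy sentences are hypothetical.

```python
from collections import Counter

MIN_COUNT = 2  # assumed cutoff: trigrams with a lower count are dropped


def trigram_counts(sentences):
    """Count all trigrams in a list of whitespace-tokenized sentences."""
    counts = Counter()
    for sentence in sentences:
        toks = sentence.split()
        counts.update(zip(toks, toks[1:], toks[2:]))
    return counts


def surviving(counts, min_count=MIN_COUNT):
    """Return the set of trigrams whose count reaches the cutoff."""
    return {gram for gram, c in counts.items() if c >= min_count}


# Toy data (hypothetical): one in-domain sentence, two general ones.
in_domain = ["the patient was given aspirin"]
general = ["the patient was late", "the meeting was late"]

baseline = surviving(trigram_counts(in_domain + general))
boosted = surviving(trigram_counts(in_domain * 4 + general))

print(len(baseline), len(boosted))  # prints "1 3"
```

In this toy case, the in-domain trigrams that occur only once in the baseline fall below the cutoff, but after fourfold repetition they all pass it, so the boosted model keeps more trigrams even though no new text was added.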




More information about the SRILM-User mailing list