[SRILM User List] How to use SRILM with trigrams only

Andreas Stolcke stolcke at icsi.berkeley.edu
Wed May 24 08:26:59 PDT 2017

On 5/24/2017 12:08 AM, claude.vividsky at gmail.com wrote:
> Hi,
> which command line parameters must be specified for ngram-count and ngram
> when only trigram probabilities should be applied?
> At the moment I use:
>    ngram-count -order 3 -gt3min 1 ...
>    ngram       -order 3 ...
> The documentation says on "-order":
>    Set the *maximal* N-gram order to be used ...
> Does this mean that bigrams and unigrams will be used too with "-order 3"?
> What does "use" mean here: are bigrams and unigrams used only for discounting,
> or are they used for the calculation of probabilities too?

The standard model type in SRILM is a backoff ngram LM. That means you 
always need the lower-order ngrams (unigrams, bigrams) for cases where 
the highest-order ngram (trigram) in the test data was not found in the 
training data, and is therefore missing from the model itself.
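
To make the backoff idea concrete, here is a minimal Python sketch of the 
lookup. It is not SRILM code: the tables LOGPROB and BOW, the function name 
backoff_logprob, and all the numbers are invented for illustration. It only 
assumes the usual backoff recursion, where a missing ngram is scored by 
adding the backoff weight of its context to the score obtained with a 
shortened context.

```python
# Toy backoff model: log10 probabilities and backoff weights keyed by word tuples.
# All values are made up for illustration; they were not produced by SRILM.
LOGPROB = {
    ("the",): -1.0,
    ("cat",): -2.0,
    ("sat",): -2.0,
    ("the", "cat"): -0.5,
    ("cat", "sat"): -0.3,
    ("the", "cat", "sat"): -0.1,
}
BOW = {
    ("the",): -0.2,
    ("cat",): -0.3,
    ("the", "cat"): -0.4,
}


def backoff_logprob(word, context):
    """Return log10 p(word | context) using the backoff recursion."""
    context = tuple(context)
    ngram = context + (word,)
    if ngram in LOGPROB:
        # The ngram itself is in the model: use its probability directly.
        return LOGPROB[ngram]
    if not context:
        raise KeyError(f"unknown word: {word!r}")
    # Ngram missing: add the backoff weight of the context (0.0 if the
    # context has none) and retry with the oldest context word dropped.
    return BOW.get(context, 0.0) + backoff_logprob(word, context[1:])


if __name__ == "__main__":
    print(backoff_logprob("sat", ("the", "cat")))  # trigram found directly
    print(backoff_logprob("cat", ("the", "cat")))  # backs off to the unigram
```

The second call is the case described above: the trigram and the bigram are 
both missing, so the unigram probability ends up being used, scaled by the 
backoff weights collected along the way.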

See the ngram-format(5) man page for a description of the file format 
storing all orders of ngrams, and the ngram-discount(7) man page for a 
detailed description of how the parameters associated with those ngrams 
are computed.
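
As a rough illustration of that file layout, the Python sketch below reads 
a tiny ARPA-style backoff file into two dictionaries. The sample text, its 
numbers, and the function name parse_arpa are invented for this example, 
and the parser assumes whitespace-separated fields and handles only the 
basic sections; treat it as a reading aid, not a replacement for the 
format description.

```python
# Invented sample in the ARPA-style layout: a header with ngram counts,
# one section per order, and a closing marker.
SAMPLE = r"""
\data\
ngram 1=2
ngram 2=1

\1-grams:
-1.0 the -0.2
-2.0 cat

\2-grams:
-0.5 the cat

\end\
"""


def parse_arpa(text):
    """Return (logprob, bow) dicts keyed by word tuples."""
    logprob, bow = {}, {}
    order = 0
    for line in text.splitlines():
        line = line.strip()
        if not line or line == "\\data\\" or line.startswith("ngram "):
            continue  # blank lines and header counts carry no ngram entries
        if line == "\\end\\":
            break
        if line.startswith("\\") and line.endswith("-grams:"):
            # Section marker such as \2-grams: sets the current order.
            order = int(line[1:line.index("-")])
            continue
        fields = line.split()
        words = tuple(fields[1:1 + order])
        logprob[words] = float(fields[0])
        if len(fields) > 1 + order:
            # An optional trailing field is the backoff weight.
            bow[words] = float(fields[1 + order])
    return logprob, bow


if __name__ == "__main__":
    probs, weights = parse_arpa(SAMPLE)
    print(probs)
    print(weights)
```

Note that the unigram section sits in the same file as the bigram section; 
the lower orders are stored alongside the highest order.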

If you want to disable the backing-off (i.e., smoothing) in a trigram 
LM, use -gt3max 0. However, the file format will still contain all the 
lower-order ngrams.

