<html>

  <head>

    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">

  </head>

  <body text="#000000" bgcolor="#FFFFFF">

    <div class="moz-cite-prefix">On 5/24/2017 12:08 AM,

      <a class="moz-txt-link-abbreviated" href="mailto:claude.vividsky@gmail.com">claude.vividsky@gmail.com</a> wrote:<br>

    </div>

    <blockquote type="cite"

      cite="mid:002001d2d45c$8ad8cef0$a08a6cd0$@gmail.com">

      <pre wrap="">Hi,

which command line parameters must be specified for ngram-count and ngram

when only trigram probabilities should be applied?

At the moment I use:

  ngram-count -order 3 -gt3min 1 ...

  ngram       -order 3 ...

The documentation says on "-order": 

  Set the *maximal* N-gram order to be used ...

Does this mean that bigrams and unigrams will be used too with "-order 3"? 

What means "use" here: Are bigrams and unigrams used only for discounting or

are they used for the calculation of probabilities too?

</pre>

    </blockquote>

    Claude,<br>

    <br>

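    The standard model type in SRILM is a backoff ngram LM.  That means

    the lower-order ngrams (unigrams, bigrams) are always needed, and

    they are used to compute probabilities, not only for discounting:

    whenever the highest-order ngram (trigram) in the test data was not

    found in the training data, and is therefore missing from the model

    itself, the probability is taken from the lower-order estimate,

    scaled by a backoff weight.<br>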
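    <br>

    As a rough illustration of that lookup, here is a minimal Python

    sketch of a backoff trigram query.  It is not SRILM code: the toy

    tables, their values, and the name backoff_logprob are all made up

    for this example, and a missing backoff weight is assumed to be 0

    in the log10 domain.<br>

    <pre wrap="">
```python
# Toy backoff model. Every entry below is invented for illustration only.
# Values are log10 probabilities, keyed by the n-gram as a tuple of words.
LOGPROB = {
    ("the",): -1.0,
    ("cat",): -2.0,
    ("sat",): -2.5,
    ("the", "cat"): -0.5,
    ("cat", "sat"): -0.7,
    ("the", "cat", "sat"): -0.3,
}

# log10 backoff weights, keyed by the context they apply to.
BACKOFF = {
    ("the",): -0.4,
    ("cat",): -0.3,
    ("the", "cat"): -0.2,
}


def backoff_logprob(word, context):
    """Return log10 P(word | context), backing off to shorter contexts."""
    ngram = tuple(context) + (word,)
    if ngram in LOGPROB:
        # The full n-gram is in the model: use its probability directly.
        return LOGPROB[ngram]
    if not context:
        # Not even the unigram is known in this toy model.
        return float("-inf")
    # Otherwise add the backoff weight of the context (assumed 0.0 if the
    # context has none) and retry with the oldest context word dropped.
    weight = BACKOFF.get(tuple(context), 0.0)
    return weight + backoff_logprob(word, tuple(context)[1:])
```
</pre>

    With these tables, backoff_logprob("sat", ("the", "cat")) is read

    straight from the trigram entry, while backoff_logprob("sat",

    ("cat", "the")) finds neither the trigram nor the bigram and ends

    up at the unigram for "sat" plus the backoff weight of "the".<br>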

    <br>

    See the <a moz-do-not-send="true"

href="http://www.speech.sri.com/projects/srilm/manpages/ngram-format.5.html">ngram-format(5)</a>

    man page for a description of the file format storing all orders of

    ngrams, and the <a moz-do-not-send="true"

href="http://www.speech.sri.com/projects/srilm/manpages/ngram-discount.7.html">ngram-discount(7)</a>

    man page for a detailed description of how the parameters associated

    with those ngrams are computed.<br>

    <br>

    If you want to disable the backing-off (i.e., smoothing) in a

    trigram LM, use -gt3max 0, which turns off discounting of the

    trigram counts.  However, the model file will still contain all

    the lower-order ngrams.<br>
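    <br>

    To see why turning off discounting amounts to turning off backoff,

    the following Python sketch computes plain maximum-likelihood

    trigram estimates on a toy token sequence.  It only illustrates the

    general idea and is not how SRILM is implemented; the function

    name ml_trigram_probs and the sample data are assumptions made for

    this example.<br>

    <pre wrap="">
```python
from collections import Counter


def ml_trigram_probs(tokens):
    """Undiscounted estimates: count(w1 w2 w3) / count of w1 w2 as a context."""
    trigrams = Counter(zip(tokens, tokens[1:], tokens[2:]))
    contexts = Counter()
    for gram, count in trigrams.items():
        contexts[gram[:2]] += count
    return {gram: count / contexts[gram[:2]] for gram, count in trigrams.items()}


# Toy training data, invented for illustration.
tokens = "a b c a b d a b c".split()
probs = ml_trigram_probs(tokens)

# Sum the probability mass that the seen trigrams take up in each context.
mass = {}
for gram, p in probs.items():
    mass[gram[:2]] = mass.get(gram[:2], 0.0) + p
```
</pre>

    Every context in mass sums to 1 (up to floating-point rounding), so

    the seen trigrams already use all of the probability mass and

    nothing is left over to hand to unseen trigrams via the bigrams

    and unigrams.<br>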

    <br>

    Andreas<br>

    <br>

  </body>

</html>