<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">On 8/2/2017 12:41 AM, 徐 wrote:<br>
</div>
<blockquote type="cite"
cite="mid:3de486a8.951e.15da1e45479.Coremail.xulikui123321@163.com">
<div
style="line-height:1.7;color:#000000;font-size:14px;font-family:Arial">
<div>Hi,</div>
<div> I trained an LM. My boss then gave me some texts and
asked me to strengthen the probability of the ngrams in those
texts. What I usually do is generate counts from the new text,
merge them with the old ngram counts, and then retrain a model.
Is there a command or method to do this faster?</div>
</div>
</blockquote>
<br>
Combining the counts of your main training data with those from the
adaptation data is one approach. There is no shortcut for this:
you have to actually combine the counts (which you can do by just
cat'ing the two files together), then train a new model.<br>
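Concretely, the count-merging workflow can be sketched as follows (filenames are illustrative, and this assumes SRILM's ngram-count is on your PATH):

```shell
# Generate ngram counts for each corpus (hypothetical filenames)
ngram-count -order 3 -text base.txt  -write base.counts
ngram-count -order 3 -text adapt.txt -write adapt.counts

# Concatenate the count files; duplicate ngram entries are
# summed when the counts are read back in
cat base.counts adapt.counts > merged.counts

# Retrain a model from the merged counts
ngram-count -order 3 -read merged.counts -lm merged.lm
```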
<br>
The other approach is to train a separate model on the adaptation
data, then interpolate that model with the base model. This is
usually more convenient because (1) you process the training data
for the base model only once and (2) you can control the influence
of the adaptation data by adjusting the interpolation weight.<br>
<br>
To interpolate two ngram models use<br>
<br>
ngram -order N -lm BASEMODEL -mix-lm NEWMODEL
-lambda WEIGHT -write-lm ADAPTEDMODEL<br>
<br>
WEIGHT is the weight of the BASEMODEL, typically something close to
1, like 0.9, assuming the adaptation data is small compared to the
main training corpus.<br>
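For intuition, static interpolation combines the two models' probabilities linearly. A minimal sketch of the arithmetic, with made-up probability values:

```shell
# For each ngram, the interpolated model assigns
#   P_adapted(w|h) = WEIGHT * P_base(w|h) + (1 - WEIGHT) * P_new(w|h)
# Example with WEIGHT=0.9, P_base=0.2, P_new=0.6:
awk 'BEGIN { w = 0.9; printf "%.2f\n", w*0.2 + (1-w)*0.6 }'   # prints 0.24
```

So even with a small weight of 0.1, the adaptation model can noticeably raise the probability of ngrams that are frequent in the adaptation data.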
<br>
For a comparison of the two LM adaptation approaches and more
background see
<a class="moz-txt-link-freetext" href="http://www.sciencedirect.com/science/article/pii/S0167639303001055">http://www.sciencedirect.com/science/article/pii/S0167639303001055</a> .<br>
<br>
Make sure you are not adapting on the test data that you use to get
a realistic performance estimate. Otherwise your result will be
overly optimistic and your boss will be disappointed later ;-)<br>
<br>
Andreas<br>
<br>
</body>
</html>