<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix"><br>
This is as expected. You have two estimators (of conditional word
probabilities, i.e., LMs), each with random deviations from the
true probabilities. By averaging their predictions you reduce the
deviation from the truth (assuming the two models' deviations are
at least partly independent, so they tend to cancel).<br>
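<br>
To make the averaging argument concrete, here is a minimal Python
sketch (plain Python, not SRILM; the per-word probability lists and
the 0.5 weight are made-up placeholders) that interpolates two models'
per-word conditional probabilities on a held-out text and compares
perplexities:<br>
<pre>
import math

def perplexity(word_probs):
    """Perplexity from a list of per-word conditional probabilities."""
    avg_logprob = sum(math.log(p) for p in word_probs) / len(word_probs)
    return math.exp(-avg_logprob)

# P(w_i | history) assigned by each model to the i-th held-out word.
# The two models "stumble" on different words, so their errors partly cancel.
p_model_a = [0.200, 0.010, 0.150, 0.008]
p_model_b = [0.015, 0.180, 0.009, 0.120]

lam = 0.5   # interpolation weight; in practice tuned on held-out data
p_mix = [lam * a + (1 - lam) * b for a, b in zip(p_model_a, p_model_b)]

print("ppl A  :", perplexity(p_model_a))
print("ppl B  :", perplexity(p_model_b))
print("ppl mix:", perplexity(p_mix))  # lower than either with these numbers
</pre>
(SRILM's ngram -mix-lm does this same kind of linear interpolation on
real LMs, with the compute-best-mix script used to find the weight.)<br>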
<br>
For this reason you can almost always get a win out of
interpolating models that are approximately on par in their
individual performance. Other examples are<br>
<br>
- random forest models<br>
- sets of neural LMs initialized with different random weights<br>
- log-linear combination of forward- and backward-running LMs<br>
- sets of LMs trained on random samples from the same training set
(a toy version of this last item is sketched below)<br>
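<br>
A toy illustration of that last item (hypothetical data, a plain
unigram model for brevity, and nothing SRILM-specific):<br>
<pre>
import random
from collections import Counter

corpus = "the cat sat on the mat . the dog sat on the rug .".split()
vocab = sorted(set(corpus))

def unigram_lm(tokens, alpha=1.0):
    """Add-alpha smoothed unigram probabilities over the fixed vocabulary."""
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}

random.seed(0)
n_models = 10
# Each model is trained on a bootstrap resample of the same corpus.
models = [unigram_lm([random.choice(corpus) for _ in corpus])
          for _ in range(n_models)]

# The ensemble prediction is the plain average of the individual models;
# it varies less from resample to resample than any single model does.
p_avg = {w: sum(m[w] for m in models) / n_models for w in vocab}
print("single model P('the'):", models[0]["the"])
print("averaged     P('the'):", p_avg["the"])
</pre>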
<br>
These techniques all reduce the "variance" part of the <a
href="https://en.wikipedia.org/wiki/Bias%E2%80%93variance_tradeoff">modeling
error</a>. Other techniques (like interpolating models trained
on different genres) do a similar thing for the "bias" part of
the error.<br>
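<br>
If you want to see that decomposition numerically, a purely
illustrative Monte-Carlo sketch (made-up numbers, nothing LM-specific)
is below: each "model" estimates a true probability with a shared
systematic offset (the bias) plus its own independent noise (the
variance), and averaging K such estimates shrinks the variance term
roughly by 1/K while leaving the shared bias untouched.<br>
<pre>
import random, statistics

random.seed(0)
p_true, bias, noise_sd, K, trials = 0.20, 0.02, 0.05, 4, 100000

mse_single, mse_avg = [], []
for _ in range(trials):
    estimates = [p_true + bias + random.gauss(0, noise_sd) for _ in range(K)]
    mse_single.append((estimates[0] - p_true) ** 2)
    mse_avg.append((sum(estimates) / K - p_true) ** 2)

print("MSE, one model :", statistics.mean(mse_single))  # about bias**2 + noise_sd**2
print("MSE, K averaged:", statistics.mean(mse_avg))     # about bias**2 + noise_sd**2 / K
</pre>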
<br>
Andreas<br>
<br>
On 7/17/2018 9:22 PM, Fed Ang wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CANXzshR1azT8RCqGZ0L-0Nquxvpr2Lwm87cwT3Pv_xecm79nBg@mail.gmail.com">
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<div dir="ltr">
<div>Hi,</div>
<div><br>
</div>
I don't know if it has been asked before, but does it make sense
to interpolate on the basis of smoothing method instead of
domain/genre? What assumptions should one make in considering
this when the resulting perplexity is lower than that of either
model separately?
<div><br>
</div>
<div>Let's say: 5-gram Katz yields 100, and 5-gram Modified KN
yields 90</div>
<div>Then best-mix of the two yields 87</div>
<div><br>
</div>
<div class="cye-lm-tag">On a theoretical perspective, is it
sound to simply trust that the interpolated LM is
better/generalizable to different smoothing combinations?</div>
<div><br>
</div>
<div>-Fred</div>
</div>
<br>
<fieldset class="mimeAttachmentHeader"></fieldset>
<br>
<pre wrap="">_______________________________________________
SRILM-User site list
<a class="moz-txt-link-abbreviated" href="mailto:SRILM-User@speech.sri.com">SRILM-User@speech.sri.com</a>
<a class="moz-txt-link-freetext" href="http://mailman.speech.sri.com/cgi-bin/mailman/listinfo/srilm-user">http://mailman.speech.sri.com/cgi-bin/mailman/listinfo/srilm-user</a></pre>
</blockquote>
<p><br>
</p>
</body>
</html>