<div dir="ltr">Thank you for your insights.<div><br></div><div>-Fred</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Jul 19, 2018 at 6:29 AM, Anand Venkataraman <span dir="ltr"><<a href="mailto:venkataraman.anand@gmail.com" target="_blank">venkataraman.anand@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Cool - Central Limit Theorem in action :-)<div><br></div><div>&</div><div><div class="h5"><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Jul 18, 2018 at 11:06 AM, Andreas Stolcke <span dir="ltr"><<a href="mailto:stolcke@icsi.berkeley.edu" target="_blank">stolcke@icsi.berkeley.edu</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF">
<div class="m_1097336764512314132m_-7679451286045694861m_1430527651308571860moz-cite-prefix"><br>
This is as expected. You have two estimators (of conditional word
probabilities, i.e., LMs), each with random deviations from the
true probabilities. By averaging their predictions you reduce the
deviation from the truth (assuming the deviations are randomly
distributed).
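Here is a minimal numpy sketch of that effect (the "true" distribution and sample sizes are invented for illustration); two finite-sample estimates of the same distribution stand in for the two LMs:

    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical "true" conditional word distribution.
    true_p = np.array([0.5, 0.3, 0.15, 0.05])

    def noisy_estimate(n_samples):
        # A finite-sample estimate of true_p; sampling noise plays the
        # role of each LM's random deviation from the truth.
        return rng.multinomial(n_samples, true_p) / n_samples

    single, averaged = [], []
    for _ in range(10000):
        a, b = noisy_estimate(200), noisy_estimate(200)
        single.append(np.sum((a - true_p) ** 2))
        averaged.append(np.sum((0.5 * (a + b) - true_p) ** 2))

    print("MSE of one estimator:    ", np.mean(single))
    print("MSE of the averaged pair:", np.mean(averaged))  # about half

Since the two deviations are independent and (near) zero-mean, the average has roughly half the variance of either estimator alone.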
For this reason you can almost always get a win out of
interpolating models that are approximately on par in their
individual performance. Other examples are:
- random forest models
- sets of neural LMs initialized with different random weights
- log-linear combinations of forward- and backward-running LMs
- sets of LMs trained on random samples from the same training set
These techniques all reduce the "variance" part of the modeling error (https://en.wikipedia.org/wiki/Bias%E2%80%93variance_tradeoff). Other techniques (like interpolating models trained on different genres) do a similar thing for the "bias" part of the error.
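To make the LM case concrete, here is a sketch of probability-level interpolation with invented per-word probabilities; in SRILM you would read these off the per-word output of ngram -debug 2 -ppl, and the compute-best-mix script estimates the weight by EM rather than the grid search used here:

    import numpy as np

    # Invented probabilities that two LMs (say, Katz and modified KN)
    # assign to the same five held-out words.
    p_katz = np.array([0.20, 0.03, 0.15, 0.04, 0.10])
    p_kn   = np.array([0.08, 0.12, 0.05, 0.18, 0.06])

    def perplexity(p):
        # ppl = exp(-mean log p(w_i | history_i))
        return np.exp(-np.mean(np.log(p)))

    # Linear interpolation at the probability level; sweep the weight.
    w, ppl = min(
        ((w, perplexity(w * p_katz + (1 - w) * p_kn))
         for w in np.linspace(0.0, 1.0, 101)),
        key=lambda t: t[1],
    )
    print("ppl(Katz) = %.1f" % perplexity(p_katz))
    print("ppl(KN)   = %.1f" % perplexity(p_kn))
    print("best weight = %.2f -> ppl(mix) = %.1f" % (w, ppl))

Because the two models err on different words, the mixture's held-out perplexity comes out below both components' — the same pattern as the Katz/KN numbers quoted below.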
Andreas

On 7/17/2018 9:22 PM, Fed Ang wrote:
Hi,
I don't know if this has been asked before, but does it make sense to interpolate LMs on the basis of smoothing method instead of domain/genre? What assumptions should one make when the resulting perplexity is lower than that of either of the two models separately?
Let's say a 5-gram Katz LM yields a perplexity of 100 and a 5-gram modified KN LM yields 90. The best-mix of the two then yields 87.
<div class="m_1097336764512314132m_-7679451286045694861m_1430527651308571860cye-lm-tag">On a theoretical perspective, is it
sound to simply trust that the interpolated LM is
better/generalizable to different smoothing combinations?</div>
-Fred
_______________________________________________
SRILM-User site list
SRILM-User@speech.sri.com
http://mailman.speech.sri.com/cgi-bin/mailman/listinfo/srilm-user