[SRILM User List] OOV terminology

yangyang shi shiyang1983 at gmail.com
Wed Jul 3 13:18:45 PDT 2013


Hi Joris,

Is this a type of cut-off? If you set the cut-off to 3, words that
occur fewer than 3 times will be considered OOV.
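
A minimal sketch of that idea in Python, in case it helps. It is not how
SRILM implements cut-offs; the function names, the toy corpus, and the
labels "OOV (cut off)" and "OOV (unseen)" are made up for illustration and
are not standard terms from the literature. It assumes whitespace-tokenized
text and a cut-off of 3.

```python
from collections import Counter


def build_vocabs(tokens, cutoff):
    """Return (v_train, v_final) for a list of training tokens.

    v_train: every word type seen in training.
    v_final: word types seen at least `cutoff` times (hypothetical rule).
    """
    counts = Counter(tokens)
    v_train = set(counts)
    v_final = {word for word, count in counts.items() if count >= cutoff}
    return v_train, v_final


def classify(word, v_train, v_final):
    """Label a word using illustrative (non-standard) category names."""
    if word in v_final:
        return "in-vocabulary"
    if word in v_train:
        # Seen in training, but removed by the count cut-off.
        return "OOV (cut off)"
    # Never seen in training at all.
    return "OOV (unseen)"


# Toy training corpus (an assumption for this example).
train_tokens = "the the the cat cat cat dog dog bird".split()
v_train, v_final = build_vocabs(train_tokens, cutoff=3)

for w in ["the", "dog", "fish"]:
    print(w, "->", classify(w, v_train, v_final))
```

Here "dog" is in V_train but not in V_final, while "fish" is in neither,
which matches the two kinds of OOV in the question.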

Cheers,

Yangyang Shi


On Wed, Jul 3, 2013 at 8:22 PM, Joris Pelemans <
Joris.Pelemans at esat.kuleuven.be> wrote:

> Hello all,
>
> My question is perhaps a little bit off topic, but I'm hoping for your
> cooperation, since it's LM related.
>
> Say we have a training corpus with lexicon V_train. Since some of the
> words have near-zero counts, we choose to exclude them from our LM. This
> gives us a new lexicon, let's call it V_final. However, this also gives us
> two types of OOV words: those not in V_train, and those in V_train but not
> in V_final. I was wondering whether there are standard terms in the
> literature for these two types of OOVs. I have read my share of papers,
> but none of them seems to make this distinction.
>
> Kind regards,
>
> Joris



-- 
Kind regards,

Yangyang Shi

TU Delft / Interactive Intelligence Group
HB12.290, EWI,
Mekelweg 4,
2628 CD Delft,
T +31 (0) 152782549
E shiyang1983 at gmail.com; yangyangshi at ieee.org