[SRILM User List] FW: question about SRILM non-events feature
Andreas Stolcke
stolcke at icsi.berkeley.edu
Mon Jun 9 14:38:08 PDT 2014
> *From:* K. Richardson [mailto:kazimir.richardson at gmail.com]
>
> *Sent:* Monday, June 9, 2014 3:56 AM
> *To:* Andreas Stolcke
> *Subject:* question about SRILM non-events feature
>
> Hi Andreas,
>
> I apologize if there is some other official channel for asking SRILM
> technical questions (I tried writing to the srilm mailing list, but it
> bounced).
>
You need to join the mailing list to be able to post questions.
> I am using SRILM as a black box in an MT system. I am trying to build
> a LM that enforces that every sequence start with some default value,
> e.g. <s> X, such that X never occurs elsewhere in some other n-gram.
>
So do you want to (1) force X to always occur after <s>, (2) prevent it
from occurring anywhere else, or both?

You can do (1) by setting the conditional probability of the bigram
"<s> X" to 1, and to 0 for all other bigrams starting with <s>.

You can do (2) by giving X a unigram probability of 0 and making sure it
does not occur in any other N-grams (other than those starting with <s>).
The zero probability prevents X from getting probability via backoff.

After you manipulate the probabilities, use ngram -renorm to recompute
the backoff weights.
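
As a rough illustration of the editing step, here is a minimal Python sketch
that rewrites the probabilities in an ARPA-format LM held as a list of lines.
The function name force_start_word and the file names in the comments are
hypothetical. It assumes the bigram "<s> X" is already present in the model,
and it uses -99, which to my knowledge is the value SRILM conventionally
writes for log10(0). It does not delete other N-grams that contain X; that
would also require updating the counts in the \data\ header.

```python
ZERO_LOGPROB = "-99"  # assumed SRILM convention for log10(0) in ARPA files


def force_start_word(arpa_lines, x, bos="<s>"):
    """Return edited ARPA lines: P(x | <s>) = 1, other <s> bigrams and unigram x = 0."""
    out = []
    section = None  # order of the N-gram section currently being read
    for line in arpa_lines:
        stripped = line.strip()
        if stripped.startswith("\\") and stripped.endswith("-grams:"):
            # Section header such as "\2-grams:"
            section = int(stripped[1:stripped.index("-")])
            out.append(line)
            continue
        if stripped in ("\\data\\", "\\end\\"):
            section = None
            out.append(line)
            continue

        fields = stripped.split()
        new_prob = None
        if section == 1 and len(fields) >= 2 and fields[1] == x:
            # (2) zero unigram probability, so x gets no mass via backoff
            new_prob = ZERO_LOGPROB
        elif section == 2 and len(fields) >= 3 and fields[1] == bos:
            # (1) log10(1) = 0 for "<s> x", zero probability for other "<s> w"
            new_prob = "0" if fields[2] == x else ZERO_LOGPROB

        if new_prob is None:
            out.append(line)
        else:
            # Replace only the leading log probability; keep the rest as-is
            _, rest = stripped.split(None, 1)
            out.append(new_prob + "\t" + rest)
    return out


# Afterwards, recompute the backoff weights with SRILM, for example
# (file names are placeholders):
#   ngram -order 2 -lm edited.lm -renorm -write-lm final.lm
```

The edited lines can be written back to a file and passed to ngram -renorm
as described above.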
> Is it possible to enforce this? Is this within the purview of what the
> -nonevents option does? I have been having a hard time understanding
> how this option works, and specifically how you specify the associated
> non-events file.
>
Non-events are tags like <s> that are not predicted by the LM but that
can occur in the history (context) portion of an N-gram to condition the
next word.
It doesn't sound like that's what you want here.
Andreas