[SRILM User List] FW: question about SRILM non-events feature

Andreas Stolcke stolcke at icsi.berkeley.edu
Mon Jun 9 14:38:08 PDT 2014


> *From:* K. Richardson [mailto:kazimir.richardson at gmail.com]
>
> *Sent:* Monday, June 9, 2014 3:56 AM
> *To:* Andreas Stolcke
> *Subject:* question about SRILM non-events feature
>
> Hi Andreas,
>
> I apologize if there is some other official channel for asking SRILM 
> technical questions (I tried writing to the srilm mailing list, but it 
> bounced).
>
You need to join the mailing list to be able to post questions.

> I am using SRILM as a black box in an MT system. I am trying to build 
> a LM that enforces that every sequence start with some default value, 
> e.g. <s> X, such that X never occurs elsewhere in some other n-gram.
>
So do you want to (1) force X to occur always after <s>, or do you want 
to (2) prevent it from occurring elsewhere, or both?

You can do (1) by manipulating the conditional probability of bigram <s> 
X to be 1, and 0 for all other bigrams starting with <s>.

You can do (2) by giving X a unigram probability of 0 and have it not 
occur in any other ngrams (other than those starting with <s>).  The 
zero probability prevents X from getting probability via backoff.

After you manipulate the probabilities you should use ngram -renorm to 
recompute backoff weights.
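
As a rough illustration of the manual editing step, the sketch below rewrites the log10 probabilities in an ARPA-format model held as a list of text lines. It rests on assumptions that are not stated above: the function name pin_start_token is hypothetical, the bigram <s> X is assumed to already be listed in the model, and "-99" is assumed to be the value SRILM reads as log10(0). It touches only the 1-gram and 2-gram sections. Higher-order N-grams containing X would need separate handling.

```python
LOG_ZERO = "-99"  # assumed ARPA stand-in for log10(0)


def pin_start_token(arpa_lines, token, bos="<s>"):
    """Return edited ARPA lines; only 1-gram and 2-gram entries are changed."""
    out = []
    order = None   # n-gram order of the section being read, or None
    found = False  # whether the bigram "<s> token" was present
    for line in arpa_lines:
        text = line.strip()
        if text.startswith("\\") and text.endswith("-grams:"):
            order = int(text[1:].split("-")[0])
        elif text == "\\end\\":
            order = None
        fields = text.split()
        new_prob = None
        if order in (1, 2) and len(fields) > order:
            words = fields[1:1 + order]
            if order == 1 and words[0] == token:
                new_prob = LOG_ZERO          # (2): unigram probability 0
            elif order == 2 and words[0] == bos:
                if words[1] == token:
                    new_prob = "0"           # (1): log10(1) for "<s> X"
                    found = True
                else:
                    new_prob = LOG_ZERO      # (1): 0 for other "<s> w"
        if new_prob is None:
            out.append(line)                 # untouched lines pass through
        else:
            tail = fields[1 + order:]        # optional backoff weight
            out.append("\t".join([new_prob, " ".join(words)] + tail))
    if not found:
        raise ValueError("bigram '%s %s' is not listed in the model" % (bos, token))
    return out
```

The edited lines would then be written to a file and renormalized, for example with ngram -lm edited.lm -renorm -write-lm pinned.lm (both file names are placeholders).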



> Is it possible to enforce this? Is this within the purview of what the 
> -nonevents option does? I have been having a hard time understanding 
> how this option works, and specifically how you specify the associated 
> non-events file.
>
Non-events are tags like <s> that are not predicted by the LM but that 
can occur in the history (context) portion of an N-gram to condition the 
next word.
It doesn't sound like that's what you want here.

Andreas
