SRILM 1.4

Andreas Stolcke stolcke at speech.sri.com
Tue Mar 2 16:15:46 PST 2004


ngram -prune-lowprobs implicitly does a -renorm AFTER eliminating the pruned
N-gram probabilities.  However, if you specify both options, the
renormalization is done FIRST, then the pruning.

The difference you are seeing could mean that your original model is not
properly normalized (so the -renorm operation changes the backoff weights
before pruning).  Even if the model is normalized (as it should be if produced
by SRILM) you might see small differences due to rounding or loss of
precision when writing/reading the log probabilities, or other numerical
inaccuracies.  Note that even small differences in values might affect the
pruning decisions in some cases, so you probably will end up with
slightly different sets of N-grams.  Again, the differences would be
small and the resulting models should perform equivalently in
practice.
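
To make the normalization point concrete, here is a minimal Python sketch of
what recomputing a backoff weight involves for a single history in a toy
bigram model. It is not SRILM code: the function names, the three-word
vocabulary, and the probabilities are all made up for illustration, and the
two-digit rounding only stands in for whatever precision is lost when log
probabilities are written and read back.

```python
import math

def backoff_weight(explicit_bigram_probs, unigram_probs):
    """Backoff weight that makes p(. | h) sum to 1 for one history h.

    explicit_bigram_probs: {word: p(word | h)} for bigrams stored in the model
    unigram_probs: {word: p(word)} over the whole vocabulary
    """
    seen_mass = sum(explicit_bigram_probs.values())
    lower_order_mass = sum(unigram_probs[w] for w in explicit_bigram_probs)
    return (1.0 - seen_mass) / (1.0 - lower_order_mass)

def total_mass(explicit_bigram_probs, unigram_probs, bow):
    """Sum of p(w | h) over the vocabulary, backing off for unseen words."""
    return sum(
        explicit_bigram_probs.get(w, bow * p) for w, p in unigram_probs.items()
    )

# Hypothetical toy model: one history h with a single explicit bigram.
unigrams = {"a": 0.5, "b": 0.3, "c": 0.2}
bigrams_after_h = {"a": 0.6}

bow = backoff_weight(bigrams_after_h, unigrams)

# Simulate precision loss: store the weight as a log10 value with 2 decimals.
stored_bow = 10 ** round(math.log10(bow), 2)

print("exact weight :", bow, "mass:", total_mass(bigrams_after_h, unigrams, bow))
print("stored weight:", stored_bow,
      "mass:", total_mass(bigrams_after_h, unigrams, stored_bow))
```

With the stored weight the distribution no longer sums exactly to one, so a
renormalization pass would adjust it slightly, and that adjustment can in
turn nudge a borderline pruning decision.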

As a sanity check, compute perplexity of the two models.  They should be 
essentially identical.
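
One way to do that comparison is sketched below in Python. It assumes you
have already run ngram -ppl on the same test set with each model and copied
the reported totals by hand. The perplexity formula is the usual one over
scored tokens (words minus OOVs and zeroprobs, plus one end-of-sentence token
per sentence), as I understand SRILM to compute it. The counts, log
probabilities, and the 1% tolerance are made-up values for illustration, not
output from a real run.

```python
def perplexity(logprob, words, sentences, oovs=0, zeroprobs=0):
    """Perplexity from a total log10 probability over a test set.

    Scored tokens are the words that received a nonzero probability plus
    one end-of-sentence token per sentence.
    """
    tokens = words - oovs - zeroprobs + sentences
    return 10 ** (-logprob / tokens)

def essentially_identical(ppl_a, ppl_b, rel_tol=0.01):
    """True if the two perplexities differ by at most rel_tol (relative)."""
    return abs(ppl_a - ppl_b) <= rel_tol * max(ppl_a, ppl_b)

# Hypothetical totals for the two models on the same test set.
ppl_prune_only = perplexity(logprob=-2000.0, words=900, sentences=100)
ppl_renorm_prune = perplexity(logprob=-2001.0, words=900, sentences=100)

print(ppl_prune_only, ppl_renorm_prune)
print("equivalent in practice:",
      essentially_identical(ppl_prune_only, ppl_renorm_prune))
```

The tolerance is a judgment call; tighten or loosen rel_tol to match what you
consider a negligible difference for your task.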

--Andreas

In message <OF42132E19.478316A4-ON88256E4B.0081849E-88256E4B.0081A4E8 at mohomine.com> you wrote:
> Hello,
> 
> If I want to export a LM to an FSM, such as the AT&T FSM library, then I 
> need to do -prune-lowprobs... but what about -renorm?  I notice that if I 
> do/don't add this flag on the command line... it makes a different LM... 
> but I'm not sure which one is right.  I was assuming I needed both 
> -prune-lowprobs and -renorm, but the LM looks a little funny... so now I'm 
> not sure.
> 
> Thanks,
> Chris
> 


