question about warning message

Sarah E. Schwarm sarahs at cs.washington.edu
Wed May 23 13:44:09 PDT 2001


hi all,

I am running SRILM 1.0.1 on two different platforms (linux and
solaris) and got different results using the same data with exactly the
same commands.  I'm hoping that someone else might have some insight...  

I'm not doing anything fancy - in this case, I just used ngram-count to
build a trigram lm using the default settings for GT discounting, etc.  
Still, I get noticeably different results (ppl= 18.0975 ppl1= 40.7525 on
linux and ppl= 17.2411 ppl1= 38.3 on solaris).

The solaris version gives the following warning, but the linux version
does not:
 warning: discount coeff 1 is out of range: 0.900585

I turned on the -debug 3 flag to get more information, and the output of
the two versions is nearly identical.  The differences are the warning
above and the number of discarded 1-gram probs predicting a pseudo-event
(one version discards 1, the other discards 2).  In the end, they have
very different left-over probability masses (0.00388768 vs. 4.55956e-06,
where the second number comes from the version that gives the warning
quoted above), although they distribute these over the same number of
unseen events and write the same number of n-grams.  The GT-count numbers
are also all the same in both versions.

I found the warning message in the code (in lm/src/Discount.cc) but I
don't really understand what's causing it, and I certainly don't
understand why I get it on one installation and not the other.  If anyone
has any insight to offer, I'd greatly appreciate it. 
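
For reference, here is a minimal sketch of the textbook Katz form of the
Good-Turing discount coefficient, as far as I understand it.  This is not
copied from Discount.cc: the function names, the toy count-of-counts
tables, and the (0, 1] range test are all my own assumptions for
illustration, and the exact condition SRILM checks may differ.

```python
def katz_discounts(count_of_counts, max_count):
    """Return {r: d_r} for r = 1..max_count (textbook Katz formula).

    count_of_counts[r] is n_r, the number of distinct n-grams seen
    exactly r times.  max_count is the cutoff k above which counts
    are left undiscounted.
    """
    n = count_of_counts
    # Common term A = (k + 1) * n_{k+1} / n_1
    common = (max_count + 1) * n[max_count + 1] / n[1]
    discounts = {}
    for r in range(1, max_count + 1):
        # Raw Good-Turing ratio r*/r = (r + 1) * n_{r+1} / (r * n_r)
        ratio = (r + 1) * n[r + 1] / (r * n[r])
        discounts[r] = (ratio - common) / (1.0 - common)
    return discounts


def out_of_range(discounts):
    """Return the coefficients outside (0, 1].

    The range used here is an assumption, not SRILM's actual test.
    """
    return {r: d for r, d in discounts.items() if not (0.0 < d <= 1.0)}


# Hypothetical count-of-counts tables, not taken from my data.
smooth = {1: 100, 2: 40, 3: 20, 4: 12}
skewed = {1: 10, 2: 30, 3: 5, 4: 1}

print(katz_discounts(smooth, 3))
print(out_of_range(katz_discounts(skewed, 3)))
```

With count-of-counts that fall off smoothly, every coefficient stays
between 0 and 1; with a skewed table the ratio can leave that range.
Since the GT-count numbers are identical on both platforms, though, I
would expect both versions to compute the same coefficients, which is
what puzzles me.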

thanks much,
Sarah

________________________
Sarah Schwarm
sarahs at cs.washington.edu



