Test failures on RHEL 5

Andreas Stolcke stolcke at speech.sri.com
Thu Jul 19 11:36:47 PDT 2007


David Brodbeck wrote:
> I'm trying to build SRILM 1.5.2 on Red Hat Enterprise Linux Server 5.
> The machine type is i686-m64.  Everything builds all right, but the
> tests fail for make-ngram-pfsg, ngram-class, and 
> ngram-count-lm-limit-vocab.
>
> make-ngram-pfsg is the most obvious one, so I'll tackle that one 
> first.  I get the following in the stderr file:
> gawk: /opt/srilm/bin/i686-m64/add-pauses-to-pfsg:22: fatal: Invalid 
> collation character: /[[:lower:]-ÿ]/
>
> Has anyone else run into this?  I'm using GNU Awk 3.1.5, and the 
> locale is set to en_US.UTF-8.

This is odd: we are also using gawk 3.1.5, and I cannot reproduce the
problem even with LANG set to en_US.UTF-8.
The interpretation of gawk regular expressions should not depend on the
OS release, but of course there may always be bugs.
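
One way to narrow this down is to compare how awk handles a bracket
expression under the C locale and under en_US.UTF-8, and to check which
locale variables are actually in effect (LC_ALL, if set, overrides LANG).
This is only a diagnostic sketch: the helper name try_lower and the AWK
variable are hypothetical, and whether the C locale avoids the error on
the affected machine is an assumption to be verified, not a confirmed fix.

```shell
# Assumption: use gawk if it is installed, otherwise fall back to awk.
AWK=${AWK:-$(command -v gawk || command -v awk)}

# Hypothetical helper: test a POSIX character class under a given locale.
# $1 is the value assigned to LC_ALL for this one awk invocation.
try_lower() {
    echo abc | LC_ALL="$1" "$AWK" '/[[:lower:]]/ { print "match" }'
}

# Show the variables that determine the effective locale.
echo "LANG=${LANG:-} LC_ALL=${LC_ALL:-} LC_COLLATE=${LC_COLLATE:-}"

try_lower C
try_lower en_US.UTF-8
```

If the two calls behave differently, the locale setup rather than the gawk
version is the more likely cause.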

ngram-class is very fickle.  Small changes in the implementation of math
library functions or in machine arithmetic can cause small numerical
differences, which in turn lead to different clustering decisions.  In
fact, I get different results with 32-bit and 64-bit Linux binaries, so
don't worry about that one.
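
As a toy illustration of the effect (not ngram-class itself), the sketch
below computes the same sum in two different orders and then makes a
greedy choice between the two results.  The names pick_merge, score_a and
score_b are hypothetical, and the example assumes awk uses ordinary
double-precision arithmetic.

```shell
# Assumption: use gawk if it is installed, otherwise fall back to awk.
AWK=${AWK:-$(command -v gawk || command -v awk)}

# Hypothetical helper: choose between two candidates whose scores are
# mathematically equal but are summed in a different order.
pick_merge() {
    "$AWK" 'BEGIN {
        score_a = (0.1 + 0.2) + 0.3   # left-to-right summation
        score_b = 0.1 + (0.2 + 0.3)   # same terms, different grouping
        if (score_a > score_b)      print "A"
        else if (score_b > score_a) print "B"
        else                        print "tie"
    }'
}

pick_merge
```

With exact arithmetic the result would be a tie.  With rounding, one
candidate can come out ahead, and a greedy clustering procedure that
follows such comparisons can then diverge from the reference output.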

ngram-count-lm-limit-vocab should work. You can send me more details on 
how the output differs.

Andreas




