Test failures on RHEL 5
Andreas Stolcke
stolcke at speech.sri.com
Thu Jul 19 11:36:47 PDT 2007
David Brodbeck wrote:
> I'm trying to build SRILM 1.5.2 on Red Hat Enterprise Linux Server 5.
> The machine type is i686-m64. Everything builds all right, but the
> tests fail for make-ngram-pfsg, ngram-class, and
> ngram-count-lm-limit-vocab.
>
> make-ngram-pfsg is the most obvious one, so I'll tackle that one
> first. I get the following in the stderr file:
> gawk: /opt/srilm/bin/i686-m64/add-pauses-to-pfsg:22: fatal: Invalid
> collation character: /[[:lower:]-ÿ]/
>
> Has anyone else run into this? I'm using GNU Awk 3.1.5, and the
> locale is set to en_US.UTF-8.
This is odd since we're also using gawk 3.1.5 and I cannot replicate the
problem even when setting LANG to en_US.UTF-8.
It seems that the interpretation of gawk regular expressions should not
depend on the OS release version, but of course there may always be bugs.
ngram-class is very fickle. Small changes in the implementation of math
library functions or machine arithmetic can cause small numerical
differences, which in turn lead to different clustering decisions. In
fact, I get different results with 32-bit and 64-bit Linux binaries, so
don't worry about that one.
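To illustrate the kind of effect involved, here is a minimal sketch in
Python. It is not SRILM code: the candidate names "A" and "B", the scores,
and the best_candidate helper are all made up for the example. It only
shows that summing the same numbers in a different order can change the
last bit of the result, and that a greedy "pick the best" step can then
choose differently.

```python
# Illustration only, not SRILM code. Candidate names and scores are
# hypothetical; the point is that rounding can flip a greedy choice.

def best_candidate(scores):
    """Return the highest-scoring candidate; on a tie, the first one listed wins."""
    return max(scores, key=scores.get)

# The same three terms, summed in two different orders.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)

# "B" has a fixed score; "A" gets a score that depends on summation order,
# standing in for two builds that evaluate the same formula differently.
scores_build_1 = {"B": 0.6, "A": left_to_right}
scores_build_2 = {"B": 0.6, "A": right_to_left}

choice_1 = best_candidate(scores_build_1)
choice_2 = best_candidate(scores_build_2)

print("sums equal:", left_to_right == right_to_left)
print("build 1 picks", choice_1, "| build 2 picks", choice_2)
```

In a clustering procedure that makes many such decisions in sequence, one
flipped choice early on changes everything that follows, so the final
classes can differ even though neither result is wrong.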
ngram-count-lm-limit-vocab should work. You can send me more details on
how the output differs.
Andreas