[Srilm-announce] [SRILM-Announce] SRILM 1.7.2 released
Wen Wang
wen.wang at sri.com
Thu Nov 10 17:32:08 PST 2016
The latest version of SRILM, 1.7.2, is now available from
http://www.speech.sri.com/projects/srilm/download.html
A list of changes appears below.
Functionality:
* Added interfaces to Lattice and WordMesh that allow external programs
to map sausage nodes to their original lattice nodes.
* New VocabDistance subclass StemDistance, comparing words only based on
their stems.
* New lattice-tool option -stem-dist triggers StemDistance use in
confusion network alignments, including -add-hyps and -add-refs
processing.
* Add optional support for keyword spotting (in Lattice.h and
LatticeIndex.cc) when writing a 1-gram index.
* Added new File field NBestOptions::nbestRttm2; if it exists, then
(an approximation to) the NBestList2.0 format output is written.
* Added simple Trellis pruning based on relative thresholding of forward
probabilities (Trellis::prune()).
* make-big-lm now understands the -ukndiscount option. The make-kn-discounts
helper script has an option to compute unmodified KN discounts.
* The -version option now reports the compiler version used.
* Added ngram-count -write-text option to test conversion of UTF-16 files
to ASCII/UTF-8.
* Added ngram -text-has-weights option to allow weighting sentences in ppl
computation.
* Added scripts nbest-words and compute-sclite-nbest for conveniently
computing nbest-optimize -errors information using sclite.
* Added the nbest-optimize -xval-files option to support cross-validation.
* Added script search-rover-combo for searching for the best combination
among a list of systems.
* Added confidence value fields to NBestWordInfo class.
* Added check to compute-best-mix to warn about word label
mismatches between input files.
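
As an illustration of the relative-threshold pruning mentioned for
Trellis::prune() above, here is a minimal Python sketch. It is not the SRILM
implementation: the function name, the dictionary representation of a trellis
column, and the use of natural-log probabilities are all assumptions made for
the example.

```python
import math


def prune_by_relative_threshold(forward_logprobs, threshold):
    """Keep only states whose forward probability is at least
    `threshold` times the best forward probability in the column.

    forward_logprobs: dict mapping state -> natural-log forward probability
                      (hypothetical representation, not SRILM's Trellis)
    threshold:        relative threshold in (0, 1]
    """
    if not forward_logprobs:
        return {}
    if not 0.0 < threshold <= 1.0:
        raise ValueError("threshold must be in (0, 1]")
    best = max(forward_logprobs.values())
    # Work in the log domain: p >= threshold * best
    # is equivalent to log p >= log best + log threshold.
    cutoff = best + math.log(threshold)
    return {s: lp for s, lp in forward_logprobs.items() if lp >= cutoff}


column = {"a": math.log(0.5), "b": math.log(0.3), "c": math.log(0.001)}
survivors = prune_by_relative_threshold(column, 0.01)
```

With a threshold of 0.01, state "c" falls below one hundredth of the best
state's probability and is dropped, while "a" and "b" survive.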
Portability:
* Honor TMPDIR environment variable in various scripts.
* Miscellaneous Mac OS X fixes.
* Include BSD rand48 functions so that random sentence generation gives
the same result on all platforms.
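
To show why bundling the rand48 functions makes random output reproducible
across platforms, the following Python sketch reimplements the standard
48-bit linear congruential generator behind srand48/drand48/lrand48. The
class name Rand48 is made up for this example, and the code is not taken
from SRILM; it only follows the usual definition of the generator.

```python
class Rand48:
    """Sketch of the rand48 generator: X' = (A * X + C) mod 2**48."""

    A = 0x5DEECE66D          # standard rand48 multiplier
    C = 0xB                  # standard rand48 addend
    MASK = (1 << 48) - 1     # keep the low 48 bits

    def __init__(self, seed):
        # srand48 semantics: the seed fills the high 32 bits of the
        # state and the low 16 bits are set to 0x330E.
        self.state = ((seed & 0xFFFFFFFF) << 16) | 0x330E

    def _step(self):
        self.state = (self.A * self.state + self.C) & self.MASK
        return self.state

    def drand48(self):
        """Return a float uniformly distributed in [0.0, 1.0)."""
        return self._step() / float(1 << 48)

    def lrand48(self):
        """Return a non-negative integer below 2**31 (high 31 bits)."""
        return self._step() >> 17


gen1 = Rand48(42)
gen2 = Rand48(42)
seq1 = [gen1.drand48() for _ in range(5)]
seq2 = [gen2.drand48() for _ in range(5)]
```

Because the state update uses only integer arithmetic, the same seed yields
the same sequence wherever the code runs, which is the property the bundled
functions are meant to guarantee.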
Bug fixes:
* Avoid leaky backoff by mapping very small probability sums to 0 in BOW
computation. Otherwise unseen ngrams may end up with nonzero probabilities
in unsmoothed LMs.
* Fixed compare-ppls, compute-best-mix, compute-best-sentence-mix, and
ppl-from-log to recognize the MSVC representation of -infinity.
* Fixed a bug in the handling of zero prefix probabilities in ClassNgram,
HiddenNgram and HMMofNgrams.
* Fixed a memory allocation bug that caused the ngram-count-maxent test
to crash.
* Fixes to lattice-tool rttm nbest output.
* Fix for possible endless loop in lattice-tool -posterior-prune due to
limited float precision (from Seppo Enarvi).
* Fixed a problem with declarations of Map_nokeyP() that take reference
arguments and were missing "const"; this was causing a crash in the
segment tool.
* Workaround for what looks like an optimizer bug in gcc >= 4.9 that can
cause ngram -prune to core dump.
* Output TextStats quantities (sentence/word counts, log probs, perplexities),
model parameters, nbest and lattice scores, and other quantities with full
precision so as to avoid loss of information.
* nbest-optimize -1best now outputs a rover-control file that simulates
Viterbi decoding (by using a small posterior scale).
* nbest-optimize -errors now tolerates a varying number of reference words
for the same sentence. This can arise from sclite references with alternate
word strings.
* Fixed a stupid bug in uniform-classes.gawk script.
* Allow combine-rover-controls to merge control files with the same systems
in them, adding their weights.
* Updated zlib to version 1.2.8. This fixes a bug whereby gzipped output files
could end up with zero size (instead of a legal gzipped file that results in a
zero-length file when decompressed).
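
The leaky-backoff fix in the first item of this list can be pictured with a
small Python sketch. This is only an illustration under assumptions, not the
SRILM code: the function name backoff_weight, the EPSILON value, and the
list-based inputs are hypothetical. The idea is that in an unsmoothed LM the
explicit probabilities for a context sum to one, so the leftover mass should
be exactly zero, but floating-point rounding can leave a tiny positive
remainder that would give unseen ngrams a nonzero probability.

```python
EPSILON = 1e-6  # assumed cutoff for "very small"; SRILM's value may differ


def backoff_weight(seen_probs, lower_order_probs, eps=EPSILON):
    """Sketch of a backoff weight (BOW) computation for one context.

    seen_probs:        p(w | context) for the words listed explicitly
    lower_order_probs: p(w | shortened context) for the same words
    """
    numerator = 1.0 - sum(seen_probs)
    # Map a very small leftover mass to exactly zero so that no
    # probability leaks to unseen ngrams because of rounding error.
    if numerator < eps:
        return 0.0
    denominator = 1.0 - sum(lower_order_probs)
    if denominator < eps:
        raise ValueError("lower-order distribution leaves no mass to share")
    return numerator / denominator


# Ten probabilities of 0.1 sum to slightly less than 1.0 in floating
# point, so the raw leftover mass is tiny but nonzero.
leaky_mass = 1.0 - sum([0.1] * 10)
bow = backoff_weight([0.1] * 10, [0.05] * 10)
```

Here leaky_mass is a small positive number caused purely by rounding, while
the thresholded computation returns a backoff weight of exactly zero.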
Cheers,
Wen