[SRILM-Announce] SRILM 1.7.1 released

Wed Jun 4 15:53:25 PDT 2014

All,

The latest version of SRILM, 1.7.1, is now available from
http://www.speech.sri.com/projects/srilm/download.html .

A list of changes appears below.

Enjoy,

Wen

1.7.1   4 June 2014

         * Updated INSTALL, Copyright.  Added ACKNOWLEDGEMENTS.

         Functionality:

         * Integrated the maximum entropy extension by Tanel Alumae,
         described at http://www.phon.ioc.ee/~tanela/srilm-me/ .
         Please cite Tanel's paper (copied here in doc/is2010-maxent.pdf)
         if you use this functionality in your research.
         * Enable LM server to process multiple commands in a single message
         (separated by newlines).  This capability was never documented; it
         existed in the first implementation, which used read/write system
         calls, but was lost when we switched to recv/send calls.
         * Generalized the BayesMix LM class to allow an arbitrary number of
         mixture components, similar to LoglinearMix.
         * Added the ngram -context-priors option to read context-dependent
         mixture weight priors from a file.
         * Added the ngram -read-mix-lms option to read the list of
         interpolated LMs, weights and options from a file, specified by
         the -lm option.
         * Use zlib for I/O from/to gzipped files. Benefits are: (a) works
         with native Windows binaries, (b) avoids subprocess, (c) allows
         reading (though still not writing) of gzipped binary LM and count
         files.
         * ngram-count -gtNmin options accept floating point values for more
         flexibility with LM estimation from fractional counts.
         * Added lattice-tool -set-lattice-names option to preserve input
         filenames inside lattices.
         * New script replace-unk-words, for replacing OOV words relative to
         a vocabulary with <unk> tag.
         * Added new lattice-tool options -hyp-list -hyp-file -hyp2-list
         -hyp2-file -add-hyps to add ASR hypotheses into word mesh
         (confusion network). The added options are similar to -ref-list
         -ref-file -add-refs, except that the added hypothesized words will
         not be indicated as reference words in the word mesh.
         * Added a function in WordMesh to compute slot-to-slot alignment
         between two confusion networks.
         * Added ngram-class option to limit number of words per class (from
         seppo.enarvi at aalto.fi).
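
The OOV replacement performed by replace-unk-words can be sketched in a
few lines of Python. This is only an illustration of the idea, not the
script shipped with SRILM: the function name, its arguments, and the toy
vocabulary below are assumptions made for the example.

```python
def replace_unk_words(line, vocab, unk="<unk>"):
    """Return `line` with every token not found in `vocab` replaced by `unk`.

    Hypothetical helper for illustration; tokens are split on whitespace.
    """
    return " ".join(w if w in vocab else unk for w in line.split())


# Toy vocabulary (assumed for the example).
vocab = {"the", "cat", "sat"}
print(replace_unk_words("the cat sat on the mat", vocab))
# prints: the cat sat <unk> the <unk>
```

Keeping the vocabulary in a set makes each lookup constant-time, which
matters when the same vocabulary is applied to a large text.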

         Portability:

         * Added support for 64-bit Cygwin builds (MACHINE_TYPE=cygwin64).

         Bug fixes:

         * ngram -rescore-ngram was not setting the handling of special word
         tokens (<s>, </s>) if the rescored LM was being evaluated in the
         same run.
         * ngram-count -skip needs to read counts one order higher than
         specified by -order .
         * SkipNgram will now try to reestimate the discounting parameters
         from expected counts on each EM iteration (but fall back on initial
         parameters if that fails, e.g., for discounting methods that cannot
         handle float counts).
         * SubVocab instances' handling of metatags and nonevent words is
         now tied to the base Vocab instance.
         * Avoid anomalies in random word generation due to nonzero
         probabilities for nonwords.
         * Cleaned-up select-vocab script from Anand Venkataraman. Now works
         with perl 5.12 and gives consistent results on different platforms.
         Added a test case.
         * Fixed removeTrie() bug that was leading to memory leak in Ngram
         destructor.
         * Fixed bug in LHash iterator that led to potential double
         enumeration of items after deletions, and could affect Ngram
         pruning results.
         * Allow number of ngrams in ARPA LM to exceed 2^31. (Vocabulary
         size is still limited to 2^32.)
         * Initialize key and data objects in SArray and LHash containers
         after allocation.
         * Pass Trellis state parameters by reference to avoid copying of
         potentially complex objects.
         * Fixed memory access error in Ngram::clear() for order-1 models.
         * Fixed a problem handling null string states in Trellis.
         * Fix to preserve double precision in NBest acoustic and LM scores.
         * Fixed an error concerning the use of -gtNmin options in the
         srilm-faq(7) man page pointed out by dugast at systran.fr.
         * If a lattice-tool input lattice is a word mesh, avoid calling
         alignLattice() since the input is already a word mesh.
         * Fixes to reading/writing of quantization codebook files.
         * Fixed header comment and test program for Map2::remove().
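
To give a feel for why preserving double precision in NBest scores
matters, the sketch below round-trips a score through single precision.
The notes above do not say where precision was being lost, so this is
only an illustration of the size of the effect; the helper name and the
score value are hypothetical.

```python
import struct


def to_float32(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


score = -12345.6789          # hypothetical log score, not real SRILM output
single = to_float32(score)   # what survives a pass through a 32-bit float

# The two values differ; for magnitudes in this range the rounding error
# is below 1e-3, which can still add up across many hypotheses.
print(repr(score), repr(single), abs(score - single))
```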