unicode & many files

Alexy Khrabrov deliverable at gmail.com
Wed Sep 12 08:50:50 PDT 2007

How good is the Unicode support, e.g. for UTF-8?  I fed it some UTF-8
Cyrillic text and it did fine.  How does it know whether we're using
multibyte or single-byte characters?
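
For illustration only, here is a small Python sketch of one property of
UTF-8 that may explain why the Cyrillic test worked. It assumes, without
confirmation from the toolkit itself, that words are split on ASCII
whitespace at the byte level. The function names byte_tokens and
all_high_bytes are hypothetical helpers, not part of any toolkit. In
UTF-8, every byte of a multibyte character has its high bit set, so an
ASCII space or newline byte can never occur inside such a character.

```python
from typing import List


def byte_tokens(line: str) -> List[str]:
    """Encode to UTF-8, split on ASCII whitespace as raw bytes, decode each token."""
    raw = line.encode("utf-8")
    # bytes.split() with no argument splits on ASCII whitespace only.
    return [tok.decode("utf-8") for tok in raw.split()]


def all_high_bytes(word: str) -> bool:
    """True if every UTF-8 byte of the word is >= 0x80 (i.e. no ASCII bytes)."""
    return all(b >= 0x80 for b in word.encode("utf-8"))


if __name__ == "__main__":
    sample = "привет мир"
    print(byte_tokens(sample))
    print(all_high_bytes("привет"))
```

Under that assumption, a byte-oriented tokenizer would not need to know
the encoding to recover whole words from UTF-8 text.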

Another question: how do I feed it many text files from a directory?
Should I pass multiple -text options after preprocessing the files
somehow, or use -read on an accumulating count file?
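
As a rough sketch of the preprocessing route, assuming the goal is one
combined text file that can be given to a single -text option, the
Python below concatenates every matching file in a directory. The
function name concat_text_files, the "*.txt" pattern, and the file names
are illustrative assumptions; the -read alternative is not covered here.

```python
from pathlib import Path


def concat_text_files(src_dir, out_path, pattern="*.txt"):
    """Concatenate matching files from src_dir into out_path; return the file count."""
    out_path = Path(out_path)
    # Sort for a reproducible order, and never read the output file itself.
    files = sorted(
        p for p in Path(src_dir).glob(pattern)
        if p.is_file() and p.resolve() != out_path.resolve()
    )
    with open(out_path, "wb") as out:
        for p in files:
            data = p.read_bytes()  # copy raw bytes, so the encoding is untouched
            out.write(data)
            # Make sure the last line of one file does not merge with the
            # first line of the next.
            if data and not data.endswith(b"\n"):
                out.write(b"\n")
    return len(files)
```

Copying bytes rather than decoded text keeps UTF-8 input unchanged, and
the newline check keeps the last line of each file separate from the
first line of the next.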

Cheers,
Alexy

More information about the SRILM-User mailing list