[SRILM User List] Using SRILM for text classification
Andreas Stolcke
stolcke at icsi.berkeley.edu
Fri Jun 1 15:09:08 PDT 2012
On 6/1/2012 6:04 AM, Ali Asghar Toraby Parizy wrote:
> Hi
> I want to use SRILM for text classification. I've successfully compiled
> srilm and I could reach the classes and utilities in my own project by
> including header files in include folder and adding libraries in lib
> folder.
> I'm also familiar with concepts of language modeling and text
> categorization but I don't know where to start for using srilm in this
> regard.
> I need to create some language models from the corpus that I have and
> then guess the best model for a new text file using perplexity.
> Can anybody give me an overview of the classes and utilities, or possibly a
> document that explains the class hierarchies? I don't have enough time
> to explore all the code to find out how to use it!
You probably don't need to link into the C++ API to do what you want.
Instead, you can operate at the command line, train your LMs, and
postprocess the output of
    ngram -debug 1 -ppl ...
to obtain the model likelihoods on your test data.
The file $SRILM/doc/lm-intro should contain all the info you need to
get that going.
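
As a rough illustration of the postprocessing step, here is a minimal Python sketch. It assumes one LM per class has already been trained (for example with ngram-count -text ... -lm ...), and that ngram -ppl prints a statistics line of the form "0 zeroprobs, logprob= -45.12 ppl= 91.2 ppl1= 180.3"; check this against the output of your SRILM version. The function names parse_ppl, score, and classify are hypothetical helpers, not part of SRILM.

```python
import re
import subprocess

# Assumed shape of the statistics line printed by `ngram -ppl`, e.g.
#   0 zeroprobs, logprob= -45.12 ppl= 91.2 ppl1= 180.3
# With -debug 1 there is one such line per sentence plus a final
# whole-file summary, so the last match is the one we keep.
_STATS = re.compile(r"logprob=\s*(\S+)\s+ppl=\s*(\S+)")


def parse_ppl(output):
    """Return (logprob, ppl) from the last statistics line in the output."""
    matches = _STATS.findall(output)
    if not matches:
        raise ValueError("no 'logprob= ... ppl= ...' line found")
    logprob, ppl = matches[-1]
    return float(logprob), float(ppl)


def score(lm_path, text_path):
    """Run ngram on one LM and one test file, then parse the result.

    Assumes the `ngram` binary is on PATH.
    """
    result = subprocess.run(
        ["ngram", "-lm", lm_path, "-ppl", text_path, "-debug", "1"],
        check=True, capture_output=True, text=True,
    )
    return parse_ppl(result.stdout)


def classify(outputs):
    """Given {label: ngram output text}, return the label whose model
    assigns the highest log probability to the test data."""
    return max(outputs, key=lambda label: parse_ppl(outputs[label])[0])
```

To classify a new file, one would call score() once per class LM and pick the class with the highest logprob (equivalently, the lowest perplexity when all models score the same words). SRILM leaves OOV words out of these totals, so the comparison is cleanest when all class LMs share a vocabulary.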
Andreas