[SRILM User List] Using SRILM for text classification

Andreas Stolcke stolcke at icsi.berkeley.edu
Fri Jun 1 15:09:08 PDT 2012


On 6/1/2012 6:04 AM, Ali Asghar Toraby Parizy wrote:
> Hi
> I want to use SRILM for text classification. I've successfully compiled 
> SRILM and can access its classes and utilities from my own project by 
> including the header files in the include folder and linking the 
> libraries in the lib folder.
> I'm also familiar with concepts of language modeling and text 
> categorization but I don't know where to start for using srilm in this 
> regard.
> I need to create some language models from the corpus that I have and 
> then guess the best model for a new text file using perplexity.
> Can anybody give me an overview of the classes and utilities, or possibly a 
> document that explains the class hierarchy? I don't have enough time 
> to explore all the code to find out how to use it!
You probably don't need to link into the C++ API to do what you want.
Instead, you can work at the command line: train one LM per class, then 
postprocess the output of

ngram -debug 1 -ppl ...

to obtain the model likelihoods on your test data.

The file $SRILM/doc/lm-intro should contain all the info you need to 
get that going.
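
As a rough sketch of the postprocessing step (not part of SRILM itself), 
the Python below picks the class whose LM gives the lowest perplexity on 
the test file. It assumes you have already trained one LM per class (for 
example with "ngram-count -text ... -lm ...") and captured the text that 
"ngram -lm ... -ppl ... -debug 1" prints for each one. The regular 
expression assumes the usual summary line of the form 
"0 zeroprobs, logprob= ... ppl= ... ppl1= ..."; check it against the 
output of your SRILM version. The function names, class labels, and the 
numbers in SAMPLE are made up for illustration.

```python
import re

# Assumed shape of the summary line printed by "ngram -ppl", e.g.
#   0 zeroprobs, logprob= -45.1 ppl= 91.5 ppl1= 180.3
PPL_LINE = re.compile(r"logprob=\s*(\S+)\s+ppl=\s*(\S+)")


def final_ppl(ngram_output):
    """Return (logprob, ppl) from the last summary line in the output.

    With -debug 1 there is one summary per sentence followed by the
    file-level totals, so the last match is taken as the overall score.
    """
    matches = PPL_LINE.findall(ngram_output)
    if not matches:
        raise ValueError("no 'logprob= ... ppl= ...' line found")
    logprob, ppl = matches[-1]
    return float(logprob), float(ppl)


def best_model(outputs):
    """Pick the label whose LM has the lowest perplexity.

    outputs maps a class label to the captured ngram output obtained
    by scoring the same test file with that class's LM.
    """
    return min(outputs, key=lambda label: final_ppl(outputs[label])[1])


# Illustrative, invented outputs for two hypothetical class LMs.
SAMPLE = {
    "sports": (
        "file test.txt: 3 sentences, 20 words, 0 OOVs\n"
        "0 zeroprobs, logprob= -45.1 ppl= 91.5 ppl1= 180.3\n"
    ),
    "politics": (
        "file test.txt: 3 sentences, 20 words, 0 OOVs\n"
        "0 zeroprobs, logprob= -52.7 ppl= 195.6 ppl1= 431.9\n"
    ),
}

if __name__ == "__main__":
    print(best_model(SAMPLE))
```

Since perplexity is computed over the words the model can actually score, 
the comparison is fairest when all class LMs share the same vocabulary; 
otherwise the OOV counts in the output are worth checking as well.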

Andreas


