[SRILM User List] c++ sample for building language model

yasser hifny yhifny at yahoo.com
Fri Mar 8 12:49:53 PST 2013


Hi Mohsen,

The code sample that I use in my work is:

// global variables
Vocab* g_srilm_vocab;
Ngram* g_model;

double GetLogWordProb(const std::string& strCurrWord, const std::vector<std::string>& vstrHistory)
{
    float fResult;

    // Array layout: [current word, history words..., Vocab_None terminator]
    size_t len = 1 + vstrHistory.size() + 1;
    VocabIndex* WordIDs = new VocabIndex[len];
    WordIDs[0] = g_srilm_vocab->getIndex((char*)(strCurrWord.c_str()), g_srilm_vocab->unkIndex());
    for (size_t i = 0; i < vstrHistory.size(); i++)
        // "<s>" is mapped to Vocab_None, which ends the context at that position
        WordIDs[i+1] = vstrHistory[i] != "<s>" ?
            g_srilm_vocab->getIndex((char*)(vstrHistory[i].c_str()), g_srilm_vocab->unkIndex()) : Vocab_None;
    WordIDs[vstrHistory.size()+1] = Vocab_None;

    for (size_t k = 0; k < len; k++)
    {
        DEBUG("k=%d wordindex:%d wordstring:%s", k, WordIDs[k], g_srilm_vocab->getWord(WordIDs[k]));
    }

    // WordIDs[0] is the word to score, &WordIDs[1] is its context
    fResult = g_model->wordProb(WordIDs[0], &WordIDs[1]);
    // a zero probability (LogP_Zero) is reported as a log probability of 0.0
    if (fResult == LogP_Zero) fResult = 0.0;
    DEBUG("prob=%f", fResult);

    //g_model->sentenceProb(words, stats);
    delete[] WordIDs;

    return fResult;
}
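
A note on the context array passed to wordProb in GetLogWordProb: as far as I know, SRILM expects the context with the most recent word first, ended by Vocab_None. If your history is stored left to right, it has to be reversed first. Below is a minimal sketch of that step that compiles without SRILM. WordId, kNone, BuildContext and the unordered_map are hypothetical stand-ins for VocabIndex, Vocab_None and the Vocab lookup, not part of the SRILM API.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-ins for SRILM's VocabIndex and Vocab_None,
// so that this sketch compiles without the SRILM headers.
using WordId = unsigned int;
const WordId kNone = static_cast<WordId>(-1);

// Build a context array from a left-to-right history.
// The result holds the most recent word first, keeps at most
// maxContext words, and always ends with the kNone terminator.
std::vector<WordId> BuildContext(
    const std::vector<std::string>& history,
    const std::unordered_map<std::string, WordId>& vocab,
    WordId unkId,
    std::size_t maxContext)
{
    assert(unkId != kNone);  // the unknown-word id must be a real index

    std::vector<WordId> context;
    // Walk the history backwards so the newest word comes first.
    for (std::size_t i = history.size(); i > 0 && context.size() < maxContext; --i) {
        auto it = vocab.find(history[i - 1]);
        context.push_back(it != vocab.end() ? it->second : unkId);
    }
    context.push_back(kNone);  // terminator
    return context;
}
```

For an n-gram model of order N, maxContext would be N-1. With real SRILM types the result could be passed as the context pointer (context.data()), assuming WordId is replaced by VocabIndex.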


// in the main function

//--------------------------------
// Load LM
//--------------------------------
g_srilm_vocab = new Vocab;
g_model = new Ngram(*g_srilm_vocab, nOrder);
File file(strLangModelFile.c_str(), "r");
if (!file)
{
    ERROR("Could not open file %s", strLangModelFile.c_str());
}
g_model->read(file, 0);
for (size_t i = 0; i < nOrder; i++)
    TRACE("Num of ngram in model order %d:%d", i+1, g_model->numNgrams(i+1));
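
Once the model is loaded, the per-word values returned by a function like GetLogWordProb can be combined into a sentence score. SRILM log probabilities (LogP) are base 10, so the values add up and the perplexity is 10^(-total/N). Here is a rough sketch of that arithmetic. SentenceLogProb and Perplexity are hypothetical helper names, not SRILM functions, and the input is assumed to be one log10 probability per scored word.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Sum per-word log10 probabilities into a sentence log10 probability.
double SentenceLogProb(const std::vector<double>& wordLogProbs)
{
    double total = 0.0;
    for (double lp : wordLogProbs)
        total += lp;
    return total;
}

// Perplexity over N scored words: 10^(-total / N).
double Perplexity(const std::vector<double>& wordLogProbs)
{
    assert(!wordLogProbs.empty());  // undefined for zero words
    const double n = static_cast<double>(wordLogProbs.size());
    return std::pow(10.0, -SentenceLogProb(wordLogProbs) / n);
}
```

Note that GetLogWordProb reports zero-probability words as 0.0, so in this sketch such words add nothing to the total but still count towards N. You may prefer to skip them or handle them separately.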


Best regards,
Yasser


________________________________
 From: Andreas Stolcke <stolcke at icsi.berkeley.edu>
To: Yi Yang <yangyiycc at gmail.com> 
Cc: SRILM-User at speech.sri.com 
Sent: Friday, March 8, 2013 10:37 PM
Subject: Re: [SRILM User List] c++ sample for building language model
 

On 3/8/2013 8:06 AM, Yi Yang wrote:

>Hi Mohsen,
>
>Hope the following code can be helpful:

You forgot to create the Vocab object. In your case you could create it globally, so that the rest of your code works as is:

Vocab vocab;

Andreas



>
>void SrilmTest::srilm_init(const char* fname, int order) {
>  File file(fname, "r", 0);
>  assert(file);
>  ngram = new Ngram(vocab, order);
>  ngram->read(file, false);
>  cerr << "Done\n";
>}
>
>
>int SrilmTest::srilm_getvoc(const char* word) {
>  return vocab.getIndex((VocabString)word);
>}
>
>
>float SrilmTest::srilm_wordprob(int w, int* context) {
>  return (float)ngram->wordProb(w, (VocabIndex*)context);
>}
>
>
>
>On Thu, Mar 7, 2013 at 3:23 PM, mohsen jadidi <mohsen.jadidi at gmail.com> wrote:
>
>>Hey,
>>
>>I need to use SRILM in my C++ code to build an LM. All the examples and slides on the internet explain it using the ngram-count command, not code. I know I should use <Ngram.h> and <Vocab.h>. Can you point me to a starting point?
>>
>>
>>cheers,
>>
>>
>>
>>-- 
>>Mohsen
>>
>>_______________________________________________
>>SRILM-User site list
>>SRILM-User at speech.sri.com
>>http://www.speech.sri.com/mailman/listinfo/srilm-user
>>
>
>
>
>
>-- 
>Sincerely,
>Yi Yang
>http://www.cc.gatech.edu/%7Eyyang319/
>
>
>