[SRILM User List] c++ sample for building language model

mohsen jadidi mohsen.jadidi at gmail.com
Fri Mar 8 13:10:52 PST 2013


Thanks for your reply, it was very helpful. But don't we need to use ssIndex and
seIndex to build the lexicon language model and then use it for the next step?
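
To make the question concrete, here is a minimal sketch of how the sentence-boundary tokens usually enter n-gram scoring. It does not call SRILM at all: ScoringEvent and MakeScoringEvents are hypothetical names, and plain strings stand in for the indices that Vocab::ssIndex() and Vocab::seIndex() would return for <s> and </s>. It assumes each context is listed most recent word first, which as far as I know is the order wordProb() expects.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper type, not part of SRILM: one scoring event is the word
// to predict plus its history, most recent word first.
struct ScoringEvent {
    std::string word;
    std::vector<std::string> context;  // context[0] is the previous word
};

// Pads the sentence with <s> and </s>, then lists every (word, context) pair
// an n-gram model of the given order would score.
std::vector<ScoringEvent> MakeScoringEvents(
        const std::vector<std::string>& sentence, std::size_t order) {
    std::vector<std::string> padded;
    padded.push_back("<s>");
    padded.insert(padded.end(), sentence.begin(), sentence.end());
    padded.push_back("</s>");

    std::vector<ScoringEvent> events;
    // Position 0 is <s>: it only ever serves as context and is never predicted.
    for (std::size_t i = 1; i < padded.size(); ++i) {
        ScoringEvent ev;
        ev.word = padded[i];
        // Walk backwards from the previous word, keeping at most order-1 words.
        for (std::size_t j = i; j > 0 && ev.context.size() + 1 < order; --j) {
            ev.context.push_back(padded[j - 1]);
        }
        events.push_back(ev);
    }
    return events;
}
```

For the sentence "a b" and order 3 this gives three events: a given (<s>), b given (a, <s>), and </s> given (b, a). With SRILM, each string would be replaced by its VocabIndex and each context terminated with Vocab_None before the call to wordProb().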




On Fri, Mar 8, 2013 at 9:49 PM, yasser hifny <yhifny at yahoo.com> wrote:

> Hi Mohsen,
>
> The code sample that I use in my work is:
>
> // global variables
> Vocab* g_srilm_vocab;
> Ngram* g_model;
>
> // Returns the log probability of strCurrWord given the words in vstrHistory.
> double GetLogWordProb(const std::string& strCurrWord,
>                       const vector<std::string>& vstrHistory)
> {
>     float fResult;
>
>     // Layout: [current word, history words..., Vocab_None terminator]
>     size_t len = 1 + vstrHistory.size() + 1;
>     VocabIndex* WordIDs = new VocabIndex[len];
>     WordIDs[0] = g_srilm_vocab->getIndex((char*)(strCurrWord.c_str()),
>                                          g_srilm_vocab->unkIndex());
>     // "<s>" in the history is mapped to Vocab_None, which ends the context there.
>     for (size_t i = 0; i < vstrHistory.size(); i++)
>         WordIDs[i+1] = vstrHistory[i] != "<s>" ?
>             g_srilm_vocab->getIndex((char*)(vstrHistory[i].c_str()),
>                                     g_srilm_vocab->unkIndex()) : Vocab_None;
>     WordIDs[vstrHistory.size()+1] = Vocab_None;
>
>     for (size_t k = 0; k < len; k++)
>     {
>         DEBUG("k=%d wordindex:%d wordstring:%s", k, WordIDs[k], g_srilm_vocab->getWord(WordIDs[k]));
>     }
>     fResult = g_model->wordProb(WordIDs[0], &WordIDs[1]);
>     // A zero-probability result (LogP_Zero) is reported as 0.0.
>     if (fResult == LogP_Zero) fResult = 0.0;
>     DEBUG("prob=%f", fResult);
>
>     //g_model->sentenceProb(words, stats);
>     delete[] WordIDs;
>
>     return fResult;
> }
>
>
> //in the  main function
>
>
> //--------------------------------
> // Load LM
> //--------------------------------
> g_srilm_vocab = new Vocab;
> g_model = new Ngram(*g_srilm_vocab, nOrder);
> File file(strLangModelFile.c_str(), "r");
> if (!file)
> {
>     ERROR("Could not open file %s", strLangModelFile.c_str());
> }
> g_model->read(file, 0);
> for (size_t i = 0; i < nOrder; i++)
>     TRACE("Num of ngram in model order %d:%d", i+1, g_model->numNgrams(i+1));
>
>
> Best regards,
> Yasser
>
>   ------------------------------
> *From:* Andreas Stolcke <stolcke at icsi.berkeley.edu>
> *To:* Yi Yang <yangyiycc at gmail.com>
> *Cc:* SRILM-User at speech.sri.com
> *Sent:* Friday, March 8, 2013 10:37 PM
> *Subject:* Re: [SRILM User List] c++ sample for building language model
>
>  On 3/8/2013 8:06 AM, Yi Yang wrote:
>
> Hi Mohsen,
>
>  Hope the following code is helpful:
>
> You forgot to create the Vocab object.  In your case you could create it
> globally so that your code works as written:
>
> Vocab vocab;
>
> Andreas
>
>
> void SrilmTest::srilm_init(const char* fname, int order) {
>   File file(fname, "r", 0);
>   assert(file);
>
>   ngram = new Ngram(vocab, order);
>   ngram->read(file, false);
>   cerr << "Done\n";
> }
>
> int SrilmTest::srilm_getvoc(const char* word) {
>   return vocab.getIndex((VocabString)word);
> }
>
> float SrilmTest::srilm_wordprob(int w, int* context) {
>   return (float)ngram->wordProb(w, (VocabIndex*)context);
> }
>
>
> On Thu, Mar 7, 2013 at 3:23 PM, mohsen jadidi <mohsen.jadidi at gmail.com> wrote:
>
> Hey,
>
>  I need to use SRILM in my C++ code to build an LM. All the examples and
> slides on the internet explain it using the ngram-count command, not code. I
> know I should use <Ngram.h> and <Vocab.h>. Can you point me to a starting point?
>
>  cheers,
>
>
>  --
> Mohsen
>
> _______________________________________________
> SRILM-User site list
> SRILM-User at speech.sri.com
> http://www.speech.sri.com/mailman/listinfo/srilm-user
>
>
>
>
>  --
> Sincerely,
> Yi Yang
> http://www.cc.gatech.edu/%7Eyyang319/
>
>
>



-- 
Mohsen Jadidi