[SRILM User List] c++ sample for building language model

Anand Venkataraman venkataraman.anand at gmail.com
Sat Mar 9 14:50:42 PST 2013


Solomon

I'm not 100% sure exactly what you're trying to do, but with most
pronunciation modeling, you should be able to get what you want by
inserting the alternate pronunciations in the ASR dictionary. They will be
incorporated as multiple paths within each word in the ngram PFSG prior to
decoding.
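
As a rough illustration of what "alternate pronunciations in the dictionary" means in data-structure terms, here is a minimal sketch. It assumes a plain-text lexicon with one "WORD phone1 phone2 ..." entry per line, where a word repeated on several lines has several pronunciations. The names `Lexicon`, `Pronunciation`, and `parseLexicon` are hypothetical and not part of SRILM or of any particular decoder; your ASR toolkit will have its own dictionary format and loader.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical in-memory lexicon: each word maps to one or more
// pronunciations, each pronunciation being a sequence of phone symbols.
using Pronunciation = std::vector<std::string>;
using Lexicon = std::map<std::string, std::vector<Pronunciation>>;

// Parses lines of the assumed form "WORD phone1 phone2 ...".
// A word that appears on several lines gets several pronunciations.
Lexicon parseLexicon(const std::string& text) {
    Lexicon lex;
    std::istringstream lines(text);
    std::string line;
    while (std::getline(lines, line)) {
        std::istringstream fields(line);
        std::string word;
        if (!(fields >> word)) continue;  // skip blank lines
        Pronunciation phones;
        std::string phone;
        while (fields >> phone) phones.push_back(phone);
        if (!phones.empty()) lex[word].push_back(phones);  // skip words with no phones
    }
    return lex;
}
```

With entries such as "either iy dh er" and "either ay dh er" (illustrative phone symbols only), the word "either" ends up with two pronunciation variants, which a decoder would expand into two parallel paths for that word.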

&

On Sat, Mar 9, 2013 at 2:33 PM, Solomon Getachew <sgetachew92 at yahoo.com> wrote:

> Dear All,
> I would like to develop knowledge-based multiple-pronunciation modeling in
> ASR for the Amharic language. I need sample code; can anyone help me?
> Thanks in advance.
>
>    ------------------------------
> *From:* mohsen jadidi <mohsen.jadidi at gmail.com>
> *To:* yasser hifny <yhifny at yahoo.com>
> *Cc:* "SRILM-User at speech.sri.com" <SRILM-User at speech.sri.com>
> *Sent:* Friday, March 8, 2013 1:10 PM
>
> *Subject:* Re: [SRILM User List] c++ sample for building language model
>
> Thanks for your reply. Very helpful. But don't we need to use ssIndex and
> seIndex to build the lexicon language model and then use it for the next step?
>
>
>
>
> On Fri, Mar 8, 2013 at 9:49 PM, yasser hifny <yhifny at yahoo.com> wrote:
>
> Hi Mohsen,
>
> The code sample that I use in my work is:
>
> // global variables
> Vocab* g_srilm_vocab;
> Ngram* g_model;
>
> // Returns the log probability of strCurrWord given vstrHistory.
> double GetLogWordProb(const std::string& strCurrWord,
>                       const vector<std::string>& vstrHistory)
> {
>     float fResult;
>
>     // One slot for the current word, one per history word, one terminator.
>     size_t len = 1 + vstrHistory.size() + 1;
>     VocabIndex* WordIDs = new VocabIndex[len];
>     WordIDs[0] = g_srilm_vocab->getIndex((char*)(strCurrWord.c_str()),
>                                          g_srilm_vocab->unkIndex());
>     // "<s>" in the history is mapped to Vocab_None, which ends the context.
>     for (size_t i = 0; i < vstrHistory.size(); i++)
>         WordIDs[i+1] = vstrHistory[i] != "<s>" ?
>             g_srilm_vocab->getIndex((char*)(vstrHistory[i].c_str()),
>                                     g_srilm_vocab->unkIndex()) : Vocab_None;
>     WordIDs[vstrHistory.size()+1] = Vocab_None;
>
>     for (size_t k = 0; k < len; k++)
>     {
>         DEBUG("k=%d wordindex:%d wordstring:%s", k, WordIDs[k], g_srilm_vocab->getWord(WordIDs[k]));
>     }
>     fResult = g_model->wordProb(WordIDs[0], &WordIDs[1]);
>     // Replace LogP_Zero with 0.0 so the caller gets a finite value.
>     if (fResult == LogP_Zero) fResult = 0.0;
>     DEBUG("prob=%f", fResult);
>
>     //g_model->sentenceProb(words, stats);
>     delete[] WordIDs;
>
>     return fResult;
> }
>
>
> // in the main function
>
> //--------------------------------
> // Load LM
> //--------------------------------
> g_srilm_vocab = new Vocab;
> g_model = new Ngram(*g_srilm_vocab, nOrder);
> File file(strLangModelFile.c_str(), "r");
> if (!file)
> {
>     ERROR("Could not open file %s", strLangModelFile.c_str());
> }
> g_model->read(file, 0);
> for (size_t i = 0; i < nOrder; i++)
>     TRACE("Num of ngram in model order %d:%d", i+1, g_model->numNgrams(i+1));
>
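
A note on calling a function like GetLogWordProb above: SRILM's wordProb takes the context in reverse order (most recent word first), terminated by Vocab_None, so the caller has to supply the history in that order. The following is a minimal sketch of one way to build such a history from a tokenized sentence. The helper `reversedHistory` is hypothetical and uses only the standard library, not SRILM; it assumes the sentence already contains "<s>" as its first token if you want a sentence-start context.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Builds the history for the word at `pos` in `sentence`, most recent word
// first, keeping at most `order - 1` words (the n-gram context length).
// Hypothetical helper; not part of SRILM.
std::vector<std::string> reversedHistory(const std::vector<std::string>& sentence,
                                         std::size_t pos, std::size_t order) {
    assert(pos < sentence.size() && order >= 1);
    std::vector<std::string> history;
    // Walk backwards from the word just before `pos` until the sentence
    // start is reached or the context is full.
    for (std::size_t i = pos; i > 0 && history.size() + 1 < order; --i)
        history.push_back(sentence[i - 1]);
    return history;
}
```

For the tokens "<s> the cat sat" with a trigram model, the history for "sat" comes out as "cat", "the", which is the order a reversed-context API expects.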
>
> Best regards,
> Yasser
>
>   ------------------------------
> *From:* Andreas Stolcke <stolcke at icsi.berkeley.edu>
> *To:* Yi Yang <yangyiycc at gmail.com>
> *Cc:* SRILM-User at speech.sri.com
> *Sent:* Friday, March 8, 2013 10:37 PM
> *Subject:* Re: [SRILM User List] c++ sample for building language model
>
>  On 3/8/2013 8:06 AM, Yi Yang wrote:
>
> Hi Mohsen,
>
>  Hope the following code is helpful:
>
> You forgot to create the Vocab object.  In your case you could create it
> globally so that your code works otherwise:
>
> Vocab vocab;
>
> Andreas
>
>
> void SrilmTest::srilm_init(const char* fname, int order) {
>   File file(fname, "r", 0);
>   assert(file);
>
>   ngram = new Ngram(vocab, order);
>   ngram->read(file, false);
>   cerr << "Done\n";
> }
>
> int SrilmTest::srilm_getvoc(const char* word) {
>   return vocab.getIndex((VocabString)word);
> }
>
> float SrilmTest::srilm_wordprob(int w, int* context) {
>   return (float)ngram->wordProb(w, (VocabIndex*)context);
> }
>
>
> On Thu, Mar 7, 2013 at 3:23 PM, mohsen jadidi <mohsen.jadidi at gmail.com> wrote:
>
> Hey,
>
>  I need to use SRILM in my C++ code to build an LM. All the examples and
> slides on the internet explain it using the ngram-count command, not code. I
> know I should use <Ngram.h> and <Vocab.h>. Can you point me to a starting point?
>
>  cheers,
>
>
>  --
> Mohsen
>
> _______________________________________________
> SRILM-User site list
> SRILM-User at speech.sri.com
> http://www.speech.sri.com/mailman/listinfo/srilm-user
>
>
>
>
>  --
> Sincerely,
> Yi Yang
> http://www.cc.gatech.edu/%7Eyyang319/
>
> --
> Mohsen Jadidi
>