[SRILM User List] c++ sample for building language model

mohsen jadidi mohsen.jadidi at gmail.com
Sun Mar 10 06:02:45 PDT 2013


Hi Solomon,

I think it would be better to open a new topic for this problem.


On Sat, Mar 9, 2013 at 11:50 PM, Anand Venkataraman <
venkataraman.anand at gmail.com> wrote:

> Solomon
>
> I'm not 100% sure exactly what you're trying to do, but with most
> pronunciation modeling, you should be able to get what you want by
> inserting the alternate pronunciations in the ASR dictionary. They will be
> incorporated as multiple paths within each word in the ngram PFSG prior to
> decoding.
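
As a rough illustration of what "alternate pronunciations in the dictionary" can look like in code, here is a minimal C++ sketch that reads a lexicon with one pronunciation per line and groups the variants by word. The line format (a word followed by whitespace-separated phones), the names Lexicon and parseLexicon, and the sample entries are assumptions made for illustration; they are not taken from SRILM or from any particular decoder.

```cpp
#include <map>
#include <sstream>
#include <string>
#include <vector>

// One pronunciation is a sequence of phone symbols.
using Pronunciation = std::vector<std::string>;
// Each word maps to all of its pronunciation variants, in file order.
using Lexicon = std::map<std::string, std::vector<Pronunciation>>;

// Parse lexicon text where each non-blank line is "WORD phone phone ...".
// A word that appears on several lines gets several pronunciations.
// (Hypothetical format, assumed for this sketch.)
Lexicon parseLexicon(const std::string& text) {
    Lexicon lexicon;
    std::istringstream lines(text);
    std::string line;
    while (std::getline(lines, line)) {
        std::istringstream fields(line);
        std::string word;
        if (!(fields >> word)) continue;   // skip blank lines
        Pronunciation phones;
        std::string phone;
        while (fields >> phone) phones.push_back(phone);
        if (!phones.empty()) lexicon[word].push_back(phones);
    }
    return lexicon;
}
```

With the two lines "either IY DH ER" and "either AY DH ER" as input, the result holds a single entry for "either" with two pronunciations, which a decoder could then expand into parallel paths for that word.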
>
> &
>
> On Sat, Mar 9, 2013 at 2:33 PM, Solomon Getachew <sgetachew92 at yahoo.com> wrote:
>
>> Dear all,
>> I would like to develop knowledge-based multiple-pronunciation modeling
>> for Amharic ASR. I need some sample code. Can anyone help me?
>> Thanks in advance.
>>
>>    ------------------------------
>> *From:* mohsen jadidi <mohsen.jadidi at gmail.com>
>> *To:* yasser hifny <yhifny at yahoo.com>
>> *Cc:* "SRILM-User at speech.sri.com" <SRILM-User at speech.sri.com>
>> *Sent:* Friday, March 8, 2013 1:10 PM
>>
>> *Subject:* Re: [SRILM User List] c++ sample for building language model
>>
>> Thanks for your reply. Very helpful. But don't we need to use ssIndex and
>> seIndex to build the lexicon language model and then use it for the next
>> step?
>>
>>
>>
>>
>> On Fri, Mar 8, 2013 at 9:49 PM, yasser hifny <yhifny at yahoo.com> wrote:
>>
>> Hi Mohsen,
>>
>> The code sample that I use in my work is:
>>
>> // global variables
>> Vocab* g_srilm_vocab;
>> Ngram* g_model;
>>
>> // Returns the log probability of strCurrWord given vstrHistory.
>> // The history is copied in the order given; "<s>" is mapped to
>> // Vocab_None, which ends the context at that position.
>> double GetLogWordProb(const std::string& strCurrWord,
>>                       const std::vector<std::string>& vstrHistory)
>> {
>>     float fResult;
>>
>>     // current word + history + Vocab_None terminator
>>     size_t len = 1 + vstrHistory.size() + 1;
>>     VocabIndex* WordIDs = new VocabIndex[len];
>>     WordIDs[0] = g_srilm_vocab->getIndex((char*)(strCurrWord.c_str()),
>>                                          g_srilm_vocab->unkIndex());
>>     for (size_t i = 0; i < vstrHistory.size(); i++)
>>         WordIDs[i+1] = vstrHistory[i] != "<s>"
>>             ? g_srilm_vocab->getIndex((char*)(vstrHistory[i].c_str()),
>>                                       g_srilm_vocab->unkIndex())
>>             : Vocab_None;
>>     WordIDs[vstrHistory.size()+1] = Vocab_None;
>>
>>     for (size_t k = 0; k < len; k++)
>>     {
>>         DEBUG("k=%d wordindex:%d wordstring:%s",
>>               k, WordIDs[k], g_srilm_vocab->getWord(WordIDs[k]));
>>     }
>>     fResult = g_model->wordProb(WordIDs[0], &WordIDs[1]);
>>     if (fResult == LogP_Zero) fResult = 0.0;
>>     DEBUG("prob=%f", fResult);
>>
>>     //g_model->sentenceProb(words, stats);
>>     delete[] WordIDs;
>>
>>     return fResult;
>> }
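
One detail worth spelling out: wordProb expects the context with the most recent word first, terminated by Vocab_None, so the function above relies on vstrHistory already being in that order. The sketch below builds such an array from a history given in natural reading order. It does not link against SRILM: WordId, kNone, ToyVocab, lookup and makeContext are hypothetical stand-ins (for VocabIndex, Vocab_None and Vocab::getIndex) so that the example compiles on its own.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Stand-ins for SRILM's VocabIndex and Vocab_None (assumed here so the
// sketch builds without the SRILM headers).
using WordId = unsigned int;
const WordId kNone = static_cast<WordId>(-1);

// Hypothetical toy vocabulary: word string -> index.
using ToyVocab = std::map<std::string, WordId>;

// Return the index of `word`, or `unk` if the word is not in the vocabulary.
WordId lookup(const ToyVocab& vocab, const std::string& word, WordId unk) {
    ToyVocab::const_iterator it = vocab.find(word);
    return it == vocab.end() ? unk : it->second;
}

// Build a context array from a history in reading order (oldest word first).
// The result is reversed (most recent word first) and ends with kNone.
std::vector<WordId> makeContext(const ToyVocab& vocab,
                                const std::vector<std::string>& history,
                                WordId unk) {
    std::vector<WordId> context;
    context.reserve(history.size() + 1);
    for (std::vector<std::string>::const_reverse_iterator it = history.rbegin();
         it != history.rend(); ++it) {
        context.push_back(lookup(vocab, *it, unk));
    }
    context.push_back(kNone);   // terminator
    return context;
}
```

Using a std::vector here also avoids the manual new[]/delete[] pair; with real SRILM types the pointer passed to wordProb would be the vector's data() pointer.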
>>
>>
>> // in the main function
>>
>> //--------------------------------
>> // Load LM
>> //--------------------------------
>> g_srilm_vocab = new Vocab;
>> g_model = new Ngram(*g_srilm_vocab, nOrder);
>> File file(strLangModelFile.c_str(), "r");
>> if (!file)
>> {
>>     ERROR("Could not open file %s", strLangModelFile.c_str());
>> }
>> g_model->read(file, 0);
>> for (size_t i = 0; i < nOrder; i++)
>>     TRACE("Num of ngram in model order %d:%d", i+1, g_model->numNgrams(i+1));
>>
>>
>> Best regards,
>> Yasser
>>
>>   ------------------------------
>> *From:* Andreas Stolcke <stolcke at icsi.berkeley.edu>
>> *To:* Yi Yang <yangyiycc at gmail.com>
>> *Cc:* SRILM-User at speech.sri.com
>> *Sent:* Friday, March 8, 2013 10:37 PM
>> *Subject:* Re: [SRILM User List] c++ sample for building language model
>>
>>  On 3/8/2013 8:06 AM, Yi Yang wrote:
>>
>> Hi Mohsen,
>>
>> I hope the following code is helpful:
>>
>> You forgot to create the Vocab object. In your case you could create it
>> globally so that your code otherwise works:
>>
>> Vocab vocab;
>>
>> Andreas
>>
>>
>> void SrilmTest::srilm_init(const char* fname, int order) {
>>   File file(fname, "r", 0);
>>   assert(file);
>>
>>   ngram = new Ngram(vocab, order);
>>   ngram->read(file, false);
>>   cerr << "Done\n";
>> }
>>
>> int SrilmTest::srilm_getvoc(const char* word) {
>>   return vocab.getIndex((VocabString)word);
>> }
>>
>> float SrilmTest::srilm_wordprob(int w, int* context) {
>>   return (float)ngram->wordProb(w, (VocabIndex*)context);
>> }
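
The value returned by wordProb is a base-10 log probability, and a word the model considers impossible comes back as LogP_Zero (negative infinity). Below is a minimal sketch of how a caller might combine a list of such scores. The names ScoreSummary, summarize and perplexity are hypothetical, and the result is a simplified per-word perplexity that is not claimed to reproduce what ngram -ppl reports.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Running totals over a sequence of log10 word probabilities.
struct ScoreSummary {
    double totalLog10;      // sum of finite log10 probabilities
    std::size_t scored;     // number of words with finite score
    std::size_t zeroProbs;  // number of words scored as -infinity
    ScoreSummary() : totalLog10(0.0), scored(0), zeroProbs(0) {}
};

// Sum the finite scores and count zero-probability words separately,
// so that a single -infinity does not wipe out the whole total.
ScoreSummary summarize(const std::vector<double>& log10Probs) {
    ScoreSummary s;
    for (std::size_t i = 0; i < log10Probs.size(); ++i) {
        const double lp = log10Probs[i];
        if (std::isinf(lp) && lp < 0) {
            ++s.zeroProbs;
            continue;
        }
        s.totalLog10 += lp;
        ++s.scored;
    }
    return s;
}

// Per-word perplexity over the finite scores: 10^(-total / count).
double perplexity(const ScoreSummary& s) {
    if (s.scored == 0) return std::numeric_limits<double>::infinity();
    return std::pow(10.0, -s.totalLog10 / static_cast<double>(s.scored));
}
```

Keeping the zero-probability count separate makes it visible to the caller instead of silently folding those words into the total.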
>>
>>
>> On Thu, Mar 7, 2013 at 3:23 PM, mohsen jadidi <mohsen.jadidi at gmail.com> wrote:
>>
>> Hey,
>>
>> I need to use SRILM in my C++ code to build an LM. All the examples and
>> slides on the internet explain it using the ngram-count command, not code.
>> I know I should use <Ngram.h> and <Vocab.h>. Can you point me to a
>> starting point?
>>
>>  cheers,
>>
>>
>>  --
>> Mohsen
>>
>> _______________________________________________
>> SRILM-User site list
>> SRILM-User at speech.sri.com
>> http://www.speech.sri.com/mailman/listinfo/srilm-user
>>
>>
>>
>>
>>  --
>> Sincerely,
>> Yi Yang
>> http://www.cc.gatech.edu/%7Eyyang319/
>>
>>
>>
>>
>>
>>
>> --
>> Mohsen Jadidi
>>
>>
>
>
>



-- 
Mohsen Jadidi