From mrfox321 at gmail.com  Wed May 16 13:35:58 2018
From: mrfox321 at gmail.com (Jonathan Mendoza)
Date: Wed, 16 May 2018 16:35:58 -0400
Subject: [SRILM User List] Class-based probability using -expand-classes

SRILM community,

If I build a class-based LM via

    replace-words-with-classes -> ngram

and then rebuild the LM using -expand-classes, will the rebuilt LM follow the class-based probabilities,

    P(w_n | w_n-1 ... w_1) ?= P(w_n | c_n) * P(c_n | c_n-1 ... c_1)

Or is the mapping an approximate inverse to the original language model?

From stolcke at icsi.berkeley.edu  Wed May 16 14:33:23 2018
From: stolcke at icsi.berkeley.edu (Andreas Stolcke)
Date: Wed, 16 May 2018 14:33:23 -0700
Subject: [SRILM User List] Class-based probability using -expand-classes

I'm not sure what "an approximate inverse to the original language model" means.

But the purpose of ngram -expand-classes is to approximate the class LM probabilities (the equation you give) using only word ngram probabilities.  It does so by inserting all expanded word ngrams into the LM and giving them probabilities according to

    P(w_n | w_n-1 ... w_1) = P(w_n w_n-1 ... w_1) / P(w_n-1 ... w_1)

where the joint probabilities on the right-hand side are computed by the class-based LM.

Andreas

On 5/16/2018 1:35 PM, Jonathan Mendoza wrote:
> SRILM community,
>
> If I build a class-based LM via
>
>     replace-words-with-classes -> ngram
>
> and then rebuild the LM using -expand-classes, will the rebuilt LM follow the class-based probabilities,
>
>     P(w_n | w_n-1 ... w_1) ?= P(w_n | c_n) * P(c_n | c_n-1 ... c_1)
>
> Or is the mapping an approximate inverse to the original language model?
>
> _______________________________________________
> SRILM-User site list
> SRILM-User at speech.sri.com
> http://mailman.speech.sri.com/cgi-bin/mailman/listinfo/srilm-user

From mrfox321 at gmail.com  Wed May 16 15:37:26 2018
From: mrfox321 at gmail.com (Jonathan Mendoza)
Date: Wed, 16 May 2018 18:37:26 -0400
Subject: [SRILM User List] Class-based probability using -expand-classes

That answers my question!

I am just looking to play around with default counts for rare in-class words, so that new words that are semantically similar get a properly biased probability.

Thanks so much for your time!

~jon
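For reference, a minimal sketch of the build-and-expand pipeline discussed in the thread above, assuming a trigram model and hypothetical file names (train.txt, classes.defs, and the LM files); exact options will depend on the setup:

    # replace words in the training text with their class labels
    replace-words-with-classes classes=classes.defs train.txt > train.classes.txt

    # estimate a class-level trigram LM from the class-tagged text
    ngram-count -text train.classes.txt -order 3 -lm class.3bo.lm

    # approximate the class LM with a word-level ngram LM of order 3
    ngram -lm class.3bo.lm -classes classes.defs -expand-classes 3 -write-lm expanded.3bo.lm

The argument to -expand-classes bounds the order of the word ngrams used in the approximation; the expanded LM can then be compared against the original class LM with ngram -ppl on held-out text.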
From mrfox321 at gmail.com  Wed May 23 10:35:12 2018
From: mrfox321 at gmail.com (Jonathan Mendoza)
Date: Wed, 23 May 2018 13:35:12 -0400
Subject: [SRILM User List] Constraining class building

SRILM community,

I am trying to work with ngram-classes.  More specifically, I want to connect new vocabulary to semantically similar vocabulary that is already in the corpus.

e.g. my language model has the class

    $organization = {ibm, intel}

and I know that {google}, which is not in the training corpus, will show up in the same contexts in some test corpus.

The corpus / language model I am working with is much simpler than ordinary natural language; it is very much like a template (or mad libs).  Because of that structure, I am only concerned with a few (2-5) multi-word clusters, while retaining single-element classes for the rest of the vocabulary.  This means that numclasses is going to be on the order of V - O(|C|), where |C| is the expected cardinality of the set of multi-word classes.  I also plan on defining the initial clusters that words would be appended to during the merging done by ngram-classes.

Does ngram-classes support a way of constraining the class merging so that it only happens between single-word classes and the predefined multi-word classes?

My initial attempt at a solution would be to iterate over a range of numclasses values with the aforementioned base classes and see how classes are formed from those initial conditions.  My worry is that words not in the initial multi-word classes will merge with each other, leading to a null result.  For the time being, I am going to use the -full flag to glean intuition about word clusters, then plan my class initialization accordingly.

Best,

Jon
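A minimal sketch of the kind of numclasses sweep described above, with hypothetical file names (train.txt, classes.N.defs); whether the merging can be seeded with, and restricted to, predefined multi-word classes is exactly the open question, so this only shows the unconstrained sweep:

    # induce n classes from the training text and write the class definitions
    for n in 50 100 200; do
        ngram-classes -text train.txt -numclasses $n -full -classes classes.$n.defs
    done

The resulting class definition files are in classes-format(5) and can be inspected to see which words end up merged for each setting of numclasses.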
From stolcke at icsi.berkeley.edu  Wed Jun 6 13:44:09 2018
From: stolcke at icsi.berkeley.edu (Andreas Stolcke)
Date: Wed, 6 Jun 2018 13:44:09 -0700
Subject: [SRILM User List] FW: question about lattice-tool

From: Andreas Stolcke
Sent: Wednesday, June 6, 2018 1:27 PM
To: 'Michael Campbell'
Cc: 'srilm-user at speech.sri.com'
Subject: RE: question about lattice-tool

Michael,

Lattice-tool does not require a language model.  If none is given, the scores contained in the lattice will be used for decoding (-viterbi-decode, -nbest-decode, -posterior-decode) and confusion network building (-write-mesh).

Andreas

From: Michael Campbell
Sent: Wednesday, June 6, 2018 1:10 PM
To: Andreas Stolcke
Subject: question about lattice-tool

Hello Andreas,

I am using the SRILM "lattice-tool" utility, for which you are listed as an author.  I am new to this, and the documentation does not say whether or not "lattice-tool" requires a language model as input in order to run the Viterbi or posterior algorithms on a lattice of words.

  * I created a lattice of words and would like to see the most probable sentence.
  * If I use Viterbi, I get a result, without using any language model options.

Does "lattice-tool" use a built-in language model to give that result, or is the result 'nonsense' since I am not inputting a language model into "lattice-tool"?

Thank you very much for any feedback.

All the best,

Mike

--
Michael Campbell
mcampbell at veritone.com
Veritone, Inc.
575 Anton Blvd. Suite 100, Costa Mesa, CA 92626
www.veritone.com

From stolcke at icsi.berkeley.edu  Wed Jun 6 15:12:15 2018
From: stolcke at icsi.berkeley.edu (Andreas Stolcke)
Date: Wed, 6 Jun 2018 15:12:15 -0700
Subject: [SRILM User List] question about lattice-tool
Message-ID: <52725218-2bac-f42c-fe72-f5250a8d48c7@icsi.berkeley.edu>

What I call "scores" are usually log likelihoods or log probabilities, sometimes scaled in some fashion.  lattice-tool does not care about the probabilistic interpretation of such scores; it just combines them according to their weights and finds the path with the highest overall score.

In the case of HTK lattices, the scores typically encoded in the lattices are acoustic model (a=), ngram model (n=), general language model (l=), and sometimes pronunciation weights (r=).  The HTK lattice format has been generalized to allow up to 9 additional scores to be encoded (x1= through x9=) on nodes or links.

The header of the lattice file can define the weights for these scores (acscale=, ngscale=, lmscale=, prscale=).  There is also a word insertion penalty (wdpenalty=) that implies a constant additional score for each word hypothesis.  Score weights can be overridden on the command line (-htk-acscale, -htk-lmscale, etc.).  If no score weights are given, they default to 1 (or 0 for the word penalty).

If you specify an external language model, it will override the l= (general LM) scores in the lattice, but that is optional.

lattice-tool will generate an aggregate score from the weighted combination of all scores, and decode the lattice path with the highest overall score (from both nodes and links).

Andreas
On 6/6/2018 2:44 PM, Andreas Stolcke wrote:
>
> From: Michael Campbell
> Sent: Wednesday, June 6, 2018 2:31 PM
> To: Andreas Stolcke
> Subject: Re: question about lattice-tool
>
> By scores, you mean probabilities of words?
>
> The lattice I input consists of edges representing multiple word
> candidates per time step (= node).  I use the HTK lattice format and do
> not assign any probabilities to edges in my lattice.
>
> For example,
>
>            __their_____
>           /            \
>          *----there----*---going---*
>           \__they're__/
>          t=0          t=1          t=2
>
> I thought lattice-tool would choose the path of highest probability
> based on a built-in language model.  For example, that it would produce
> "they're going" as the output of the above, since it would have the
> highest probability.
>
> It is interesting that lattice-tool *does* produce output for such a
> lattice (without probabilities or scores).  How does it compute that
> output?
>
> best,
> Mike
>
> On Wed, Jun 6, 2018 at 1:27 PM, Andreas Stolcke wrote:
>
>     Michael,
>
>     Lattice-tool does not require a language model.  If none is given,
>     the scores contained in the lattice will be used for decoding
>     (-viterbi-decode, -nbest-decode, -posterior-decode) and confusion
>     network building (-write-mesh).
>
>     Andreas
>
>     From: Michael Campbell
>     Sent: Wednesday, June 6, 2018 1:10 PM
>     To: Andreas Stolcke
>     Subject: question about lattice-tool
>
>     Hello Andreas,
>
>     I am using the SRILM "lattice-tool" utility, for which you are
>     listed as an author.  I am new to this, and the documentation does
>     not say whether or not "lattice-tool" requires a language model as
>     input in order to run the Viterbi or posterior algorithms on a
>     lattice of words.
>
>       * I created a lattice of words and would like to see the most
>         probable sentence.
>       * If I use Viterbi, I get a result, without using any language
>         model options.
>
>     Does "lattice-tool" use a built-in language model to give that
>     result, or is the result 'nonsense' since I am not inputting a
>     language model into "lattice-tool"?
>
>     Thank you very much for any feedback.
>
>     All the best,
>
>     Mike
>
>     --
>     Michael Campbell
>     mcampbell at veritone.com
>     Veritone, Inc.
>     575 Anton Blvd. Suite 100, Costa Mesa, CA 92626
>     www.veritone.com
>
> --
> Michael Campbell
> mcampbell at veritone.com
> Veritone, Inc.
> 575 Anton Blvd. Suite 100, Costa Mesa, CA 92626
> www.veritone.com
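To make the exchange concrete, a minimal sketch with a hypothetical toy lattice in HTK SLF format, loosely following the their/there/they're example quoted above; the file names and the a= and l= scores are made up for illustration, and lattices from a real recognizer will carry more fields than this:

    # toy.lat -- hypothetical HTK SLF lattice: 3 nodes, 4 links, made-up scores
    VERSION=1.0
    UTTERANCE=toy
    acscale=1.0 lmscale=8.0 wdpenalty=0.0
    N=3 L=4
    I=0 t=0.00
    I=1 t=0.50
    I=2 t=1.00
    J=0 S=0 E=1 W=their   a=-210.3 l=-2.02
    J=1 S=0 E=1 W=there   a=-208.7 l=-1.85
    J=2 S=0 E=1 W=they're a=-205.1 l=-2.40
    J=3 S=1 E=2 W=going   a=-150.6 l=-1.10

Decoding with only the scores stored in the lattice, and then again with the score weights overridden and an external LM supplied (word.3bo.lm is a placeholder):

    lattice-tool -read-htk -in-lattice toy.lat -viterbi-decode

    lattice-tool -read-htk -in-lattice toy.lat \
        -htk-acscale 1.0 -htk-lmscale 10.0 \
        -lm word.3bo.lm -viterbi-decode

In both cases the best path is the one maximizing the weighted combination of the link scores, as described in the reply above.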