From mrfox321 at gmail.com  Wed May 16 13:35:58 2018
From: mrfox321 at gmail.com (Jonathan Mendoza)
Date: Wed, 16 May 2018 16:35:58 -0400
Subject: [SRILM User List] Class-based probability using -expand-classes

SRILM community,

If I build a class-based LM via

    replace-words-with-classes -> ngram

and then rebuild the LM using -expand-classes, will the rebuilt LM follow the class-based probabilities,

    P(w_n | w_n-1 ... w_1) ?= P(w_n | c_n) * P(c_n | c_n-1 ... c_1)

Or is the mapping an approximate inverse to the original language model?

From stolcke at icsi.berkeley.edu  Wed May 16 14:33:23 2018
From: stolcke at icsi.berkeley.edu (Andreas Stolcke)
Date: Wed, 16 May 2018 14:33:23 -0700
Subject: [SRILM User List] Class-based probability using -expand-classes

I'm not sure what "an approximate inverse to the original language model" means.

But the purpose of ngram -expand-classes is to approximate the class LM probabilities (the equation you give) using only word ngram probabilities.  It does so by inserting all expanded word ngrams into the LM and giving them probabilities according to

    P(w_n | w_n-1 ... w_1) = P(w_n w_n-1 ... w_1) / P(w_n-1 ... w_1)

where the joint probabilities on the right-hand side are computed by the class-based LM.

Andreas

On 5/16/2018 1:35 PM, Jonathan Mendoza wrote:
> SRILM community,
>
> If I build a class-based LM via
>
>     replace-words-with-classes -> ngram
>
> and then rebuild the LM using -expand-classes, will the rebuilt LM follow the class-based probabilities,
>
>     P(w_n | w_n-1 ... w_1) ?= P(w_n | c_n) * P(c_n | c_n-1 ... c_1)
>
> Or is the mapping an approximate inverse to the original language model?
>
> _______________________________________________
> SRILM-User site list
> SRILM-User at speech.sri.com
> http://mailman.speech.sri.com/cgi-bin/mailman/listinfo/srilm-user

From mrfox321 at gmail.com  Wed May 16 15:37:26 2018
From: mrfox321 at gmail.com (Jonathan Mendoza)
Date: Wed, 16 May 2018 18:37:26 -0400
Subject: [SRILM User List] Class-based probability using -expand-classes

That answers my question!

I am just looking to play around with default counts for rare in-class words, so that new words that are semantically similar get a properly biased probability.

Thanks so much for your time!

~jon
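For reference, a minimal sketch of the build-and-expand pipeline discussed in the thread above, assuming a trigram model and hypothetical file names (train.txt, classes.defs, and the LM files); exact options will depend on the setup:

    # replace words in the training text with their class labels
    replace-words-with-classes classes=classes.defs train.txt > train.classes.txt

    # estimate a class-level trigram LM from the class-tagged text
    ngram-count -text train.classes.txt -order 3 -lm class.3bo.lm

    # approximate the class LM with a word-level ngram LM of order 3
    ngram -lm class.3bo.lm -classes classes.defs -expand-classes 3 -write-lm expanded.3bo.lm

The argument to -expand-classes bounds the order of the word ngrams used in the approximation; the expanded LM can then be compared against the original class LM with ngram -ppl on held-out text.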
From mrfox321 at gmail.com  Wed May 23 10:35:12 2018
From: mrfox321 at gmail.com (Jonathan Mendoza)
Date: Wed, 23 May 2018 13:35:12 -0400
Subject: [SRILM User List] Constraining class building

SRILM community,

I am trying to work with ngram-classes.  More specifically, I want to connect new vocabulary to semantically similar vocabulary that is already in the corpus.

e.g. my language model has the class

    $organization = {ibm, intel}

and I know that {google}, which is not in the training corpus, will show up in the same contexts in some test corpus.

The corpus / language model I am working with is much simpler than ordinary natural language; it is very much like a template (or mad libs).  Because of that structure, I am only concerned with a few (2-5) multi-word clusters, while retaining single-element classes for the rest of the vocabulary.  This means that numclasses is going to be on the order of V - O(|C|), where |C| is the expected cardinality of the set of multi-word classes.  I also plan on defining the initial clusters that words would be appended to during the merging done by ngram-classes.

Does ngram-classes support a way of constraining the class merging so that it only happens between single-word classes and the predefined multi-word classes?

My initial attempt at a solution would be to iterate over a range of numclasses values with the aforementioned base classes and see how classes are formed from those initial conditions.  My worry is that words not in the initial multi-word classes will merge with each other, leading to a null result.  For the time being, I am going to use the -full flag to glean intuition about word clusters, then plan my class initialization accordingly.

Best,

Jon
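A minimal sketch of the kind of numclasses sweep described above, with hypothetical file names (train.txt, classes.N.defs); whether the merging can be seeded with, and restricted to, predefined multi-word classes is exactly the open question, so this only shows the unconstrained sweep:

    # induce n classes from the training text and write the class definitions
    for n in 50 100 200; do
        ngram-classes -text train.txt -numclasses $n -full -classes classes.$n.defs
    done

The resulting class definition files are in classes-format(5) and can be inspected to see which words end up merged for each setting of numclasses.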
From stolcke at icsi.berkeley.edu  Wed Jun 6 13:44:09 2018
From: stolcke at icsi.berkeley.edu (Andreas Stolcke)
Date: Wed, 6 Jun 2018 13:44:09 -0700
Subject: [SRILM User List] FW: question about lattice-tool

From: Andreas Stolcke
Sent: Wednesday, June 6, 2018 1:27 PM
To: 'Michael Campbell'
Cc: 'srilm-user at speech.sri.com'
Subject: RE: question about lattice-tool

Michael,

Lattice-tool does not require a language model.  If none is given, the scores contained in the lattice will be used for decoding (-viterbi-decode, -nbest-decode, -posterior-decode) and confusion network building (-write-mesh).

Andreas

From: Michael Campbell
Sent: Wednesday, June 6, 2018 1:10 PM
To: Andreas Stolcke
Subject: question about lattice-tool

Hello Andreas,

I am using the SRILM "lattice-tool" utility, for which you are listed as an author.  I am new to this, and the documentation does not say whether or not "lattice-tool" requires a language model as input in order to run the Viterbi or posterior algorithms on a lattice of words.

  * I created a lattice of words and would like to see the most probable sentence.
  * If I use Viterbi, I get a result, without using any language model options.

Does "lattice-tool" use a built-in language model to give that result, or is the result 'nonsense' since I am not inputting a language model into "lattice-tool"?

Thank you very much for any feedback.

All the best,

Mike

--
Michael Campbell
mcampbell at veritone.com
Veritone, Inc.
575 Anton Blvd. Suite 100, Costa Mesa, CA 92626
www.veritone.com

From stolcke at icsi.berkeley.edu  Wed Jun 6 15:12:15 2018
From: stolcke at icsi.berkeley.edu (Andreas Stolcke)
Date: Wed, 6 Jun 2018 15:12:15 -0700
Subject: [SRILM User List] question about lattice-tool
Message-ID: <52725218-2bac-f42c-fe72-f5250a8d48c7@icsi.berkeley.edu>

What I call "scores" are usually log likelihoods or log probabilities, sometimes scaled in some fashion.  lattice-tool does not care about the probabilistic interpretation of such scores; it just combines them according to their weights and finds the path with the highest overall score.

In the case of HTK lattices, the scores typically encoded in the lattices are acoustic model (a=), ngram model (n=), general language model (l=), and sometimes pronunciation weights (r=).  The HTK lattice format has been generalized to allow up to 9 additional scores to be encoded (x1= through x9=) on nodes or links.

The header of the lattice file can define the weights for these scores (acscale=, ngscale=, lmscale=, prscale=).  There is also a word insertion penalty (wdpenalty=) that implies a constant additional score for each word hypothesis.  Score weights can be overridden on the command line (-htk-acscale, -htk-lmscale, etc.).  If no score weights are given, they default to 1 (or 0 for the word penalty).

If you specify an external language model, it will override the l= (general LM) scores in the lattice, but that is optional.

lattice-tool will generate an aggregate score from the weighted combination of all scores, and decode the lattice path with the highest overall score (from both nodes and links).

Andreas
On 6/6/2018 2:44 PM, Andreas Stolcke wrote:
>
> From: Michael Campbell
> Sent: Wednesday, June 6, 2018 2:31 PM
> To: Andreas Stolcke
> Subject: Re: question about lattice-tool
>
> By scores, you mean probabilities of words?
>
> The lattice I input consists of edges representing multiple word
> candidates per time step (= node).  I use the HTK lattice format and do
> not assign any probabilities to edges in my lattice.
>
> For example,
>
>            __their_____
>           /            \
>          *----there----*---going---*
>           \__they're__/
>          t=0          t=1          t=2
>
> I thought lattice-tool would choose the path of highest probability
> based on a built-in language model.  For example, that it would produce
> "they're going" as the output of the above, since it would have the
> highest probability.
>
> It is interesting that lattice-tool *does* produce output for such a
> lattice (without probabilities or scores).  How does it compute that
> output?
>
> best,
> Mike
>
> On Wed, Jun 6, 2018 at 1:27 PM, Andreas Stolcke wrote:
>
>     Michael,
>
>     Lattice-tool does not require a language model.  If none is given,
>     the scores contained in the lattice will be used for decoding
>     (-viterbi-decode, -nbest-decode, -posterior-decode) and confusion
>     network building (-write-mesh).
>
>     Andreas
>
>     From: Michael Campbell
>     Sent: Wednesday, June 6, 2018 1:10 PM
>     To: Andreas Stolcke
>     Subject: question about lattice-tool
>
>     Hello Andreas,
>
>     I am using the SRILM "lattice-tool" utility, for which you are
>     listed as an author.  I am new to this, and the documentation does
>     not say whether or not "lattice-tool" requires a language model as
>     input in order to run the Viterbi or posterior algorithms on a
>     lattice of words.
>
>       * I created a lattice of words and would like to see the most
>         probable sentence.
>       * If I use Viterbi, I get a result, without using any language
>         model options.
>
>     Does "lattice-tool" use a built-in language model to give that
>     result, or is the result 'nonsense' since I am not inputting a
>     language model into "lattice-tool"?
>
>     Thank you very much for any feedback.
>
>     All the best,
>
>     Mike
>
>     --
>     Michael Campbell
>     mcampbell at veritone.com
>     Veritone, Inc.
>     575 Anton Blvd. Suite 100, Costa Mesa, CA 92626
>     www.veritone.com
>
> --
> Michael Campbell
> mcampbell at veritone.com
> Veritone, Inc.
> 575 Anton Blvd. Suite 100, Costa Mesa, CA 92626
> www.veritone.com
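To make the exchange concrete, a minimal sketch with a hypothetical toy lattice in HTK SLF format, loosely following the their/there/they're example quoted above; the file names and the a= and l= scores are made up for illustration, and lattices from a real recognizer will carry more fields than this:

    # toy.lat -- hypothetical HTK SLF lattice: 3 nodes, 4 links, made-up scores
    VERSION=1.0
    UTTERANCE=toy
    acscale=1.0 lmscale=8.0 wdpenalty=0.0
    N=3 L=4
    I=0 t=0.00
    I=1 t=0.50
    I=2 t=1.00
    J=0 S=0 E=1 W=their   a=-210.3 l=-2.02
    J=1 S=0 E=1 W=there   a=-208.7 l=-1.85
    J=2 S=0 E=1 W=they're a=-205.1 l=-2.40
    J=3 S=1 E=2 W=going   a=-150.6 l=-1.10

Decoding with only the scores stored in the lattice, and then again with the score weights overridden and an external LM supplied (word.3bo.lm is a placeholder):

    lattice-tool -read-htk -in-lattice toy.lat -viterbi-decode

    lattice-tool -read-htk -in-lattice toy.lat \
        -htk-acscale 1.0 -htk-lmscale 10.0 \
        -lm word.3bo.lm -viterbi-decode

In both cases the best path is the one maximizing the weighted combination of the link scores, as described in the reply above.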