class LM
Mirjam Sepesy Maucec
mirjam.sepesy at uni-mb.si
Fri Sep 20 05:01:56 PDT 2002
Hi all,
I have a question about class-based models, which I have just started to
use.
First I want to understand the test example in the toolkit.
I have trouble understanding the probability computation for
devtest.text.
Could you please explain which 1grams, 2grams, 3grams, ... are meant,
for example, in this sentence:
kaybeck and lost ok
p( kaybeck | <s> ) = [1gram][2gram] 0.000845361 [ -3.07296 ] / 1
p( and | kaybeck ...) = [1gram][3gram] 0.443827 [ -0.352786 ] / 1
p( lost | and ...) = [2gram][2gram][4gram][4gram] 0.0305452 [ -1.51506 ] / 1
p( ok | lost ...) = [3gram][3gram][4gram][4gram] 0.0703371 [ -1.15282 ] / 0.999999
p( </s> | ok ...) = [3gram][4gram] 0.401395 [ -0.396428 ] / 1
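As a sanity check on the output above, the bracketed value on each line appears to be the base-10 logarithm of the probability printed before it. The short Python sketch below tests that reading and sums the per-word values into a sentence score. The variable and function names are my own, not from the toolkit, and counting </s> as a token in the perplexity is an assumption.

```python
import math

# (word, probability, bracketed value) copied from the output above
lines = [
    ("kaybeck", 0.000845361, -3.07296),
    ("and", 0.443827, -0.352786),
    ("lost", 0.0305452, -1.51506),
    ("ok", 0.0703371, -1.15282),
    ("</s>", 0.401395, -0.396428),
]


def check_line(prob, logprob, tol=1e-4):
    """Return True if logprob equals log10(prob) within tol."""
    return abs(math.log10(prob) - logprob) < tol


# True only if every line satisfies bracketed value == log10(probability)
all_consistent = all(check_line(p, lp) for _, p, lp in lines)

# Sentence log probability: sum of the per-word log10 values
sentence_logprob = sum(lp for _, _, lp in lines)

# Perplexity over all predicted tokens, counting </s>
# (assumption: the toolkit may normalize differently)
perplexity = 10 ** (-sentence_logprob / len(lines))

print(all_consistent, sentence_logprob, perplexity)
```

This only checks the arithmetic of the printed numbers. It does not explain which n-grams and class expansions produce each probability, which is the question below.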
I am familiar with the class model in which all words are mapped to
classes.
In this example there are only two classes (GRIDLABEL and
SPELLED_GRIDLABEL), and
the model contains n-grams of words as well as n-grams that mix words and classes.
I understand the idea that if an n-gram of words exists, it is better to
use it,
and if not, the classes should help.
But what are the steps in probability computation?
Please help!
Have a nice weekend!
Mirjam