[SRILM User List] class based model

Andreas Stolcke stolcke at icsi.berkeley.edu
Tue Dec 17 17:22:45 PST 2013


On 12/17/2013 2:58 AM, Laatar Rim wrote:
> Dear Andreas,
>
> First, sorry to disturb you with my stupid questions, but I am still
> unclear about class-based models, and I would be very grateful if you
> could help me.
>
> Here are my questions:
>
> 1- The class file format (class p word1 word2 ...): does it support
> only single words, or can it also support expressions such as
> good-morning, thank-you ...?

The expansion of a class can be one or more words, e.g.,

CITY    0.123        New York
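
To make the layout of such a line concrete, here is a minimal Python sketch that splits one class definition into its label, probability, and expansion words. It is only an illustration, not SRILM code: the function name parse_class_line is hypothetical, and it assumes every line carries a probability in the second field, as in the example above.

```python
def parse_class_line(line):
    """Split one 'CLASS p word1 word2 ...' line into (label, prob, words)."""
    fields = line.split()
    if len(fields) < 3:
        raise ValueError("expected a class label, a probability and at least one word")
    label = fields[0]
    # Assumption of this sketch: the second field is always the probability.
    prob = float(fields[1])
    # Everything after the probability is the (possibly multiword) expansion.
    words = tuple(fields[2:])
    return label, prob, words


entry = parse_class_line("CITY    0.123        New York")
print(entry)  # ('CITY', 0.123, ('New', 'York'))
```

The expansion comes back as a two-word tuple, which is the point of the answer: one class member can span several whitespace-separated words.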

>
> 2- Can the class model contain a mix of words and class definitions?

Yes. The LM could have an ngram "the CITY" (see above).

>
> 3- You say that a word label simply represents a class consisting only
> of the word itself, but I don't have any class that contains only one
> word. Does that mean my model is wrong?

What is meant is that a class ngram model with a mix of words and class
labels is equivalent to a class ngram model that has only class-based
ngrams, where the word labels are replaced by classes that have only
that one word as their member.
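
The equivalence can be illustrated with a short Python sketch. This is not how SRILM represents classes internally. The helper add_singleton_classes and the dictionary layout are assumptions made for illustration: each token in an ngram that has no class definition gets a class of its own whose single expansion is the word itself, with probability 1.

```python
def add_singleton_classes(classes, ngram_tokens):
    """Return a copy of `classes` in which every plain word has its own class."""
    completed = dict(classes)
    for token in ngram_tokens:
        if token not in completed:
            # The token is a plain word: give it a class whose only
            # expansion is the word itself, with probability 1.
            completed[token] = [(1.0, (token,))]
    return completed


# Class definitions: label -> list of (probability, expansion words).
classes = {"CITY": [(0.123, ("New", "York"))]}

# The mixed ngram "the CITY" contains one plain word and one class label.
completed = add_singleton_classes(classes, ["the", "CITY"])
print(completed["the"])   # [(1.0, ('the',))]
print(completed["CITY"])  # [(0.123, ('New', 'York'))]
```

Since the singleton class expands to its word with probability 1, it does not change any probability the model assigns, so you do not need to define such classes yourself.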

>
> 4- To execute this command:
>
> replace-words-with-classes
> classes='/home/hp/Documents/SRILM/Replace_word_with_class_SRILM'
> '/home/hp/Documents/SRILM/trainingData.txt' >
> output_text_with_classes_2
>
> must trainingData.txt contain punctuation marks, or only phrases?

It depends on whether your ngram model is supposed to include
punctuation or not. The software doesn't care whether you have
punctuation; it treats period, comma, etc. as word strings just like any
other. It depends on your application (the program that uses the LM)
whether punctuation is appropriate or not.

Andreas
