[SRILM User List] class based language model

Andreas Stolcke stolcke at icsi.berkeley.edu
Tue Jun 5 15:26:49 PDT 2012


You can build class-based LMs using your own class assignments.

Step 2 works with a classfile with or without probabilities (the probs 
are optional in the format).

For step 3, you need a probability distribution over the words in each 
class (the class expansion probabilities) to obtain a proper language model.
For example, use the "uniform-classes" script to insert uniform 
probabilities for those class assignments that don't have any.
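
To illustrate what inserting uniform probabilities means, here is a 
minimal Python sketch. It is not the "uniform-classes" script itself, 
only an approximation of the idea. It assumes one expansion per line in 
the form "CLASS [p] word ...", and it assumes that a numeric second field 
followed by at least one more word is a probability. The function name 
add_uniform_probs is made up for this example.

```python
from collections import defaultdict


def _is_prob(token):
    """Return True if the token parses as a number (treated as a probability)."""
    try:
        float(token)
        return True
    except ValueError:
        return False


def add_uniform_probs(lines):
    """Give each expansion without a probability the value 1/N,
    where N is the number of expansions listed for its class."""
    entries = []                   # (class, probability or None, expansion words)
    per_class = defaultdict(int)   # class -> number of expansions seen
    for line in lines:
        fields = line.split()
        if not fields:
            continue               # skip blank lines
        cls, rest = fields[0], fields[1:]
        prob = None
        # Assumption: a numeric field followed by at least one word is a probability.
        if len(rest) >= 2 and _is_prob(rest[0]):
            prob, rest = rest[0], rest[1:]
        entries.append((cls, prob, rest))
        per_class[cls] += 1
    out = []
    for cls, prob, words in entries:
        if prob is None:
            prob = "%g" % (1.0 / per_class[cls])
        out.append(" ".join([cls, prob] + words))
    return out
```

Existing probabilities are left untouched and nothing is renormalized, 
so a class file that mixes explicit and missing probabilities may not 
sum to one per class after this step.
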
If you have a large training set, you can run

     replace-words-with-classes classes=<classfile> addone=1 normalize=1 \
         outfile=OUTFILE TEXTFILE

to count the number of times each word occurs and estimate class 
expansion probabilities (written to OUTFILE).
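
As a rough illustration of the estimate, the following Python sketch 
computes P(word | class) = (count(word) + 1) / (sum of (count + 1) over 
all words in the class). That this matches the script's behavior with 
addone=1 normalize=1 is an assumption, not something checked against its 
source. The names estimate_expansion_probs and word_to_class are 
hypothetical, and each word is assumed to belong to exactly one class.

```python
from collections import Counter


def estimate_expansion_probs(word_to_class, tokens, addone=1):
    """Estimate P(word | class) from token counts with additive smoothing."""
    # Count only tokens that are members of some class.
    counts = Counter(t for t in tokens if t in word_to_class)
    # Smoothed total count per class, used as the normalizer.
    class_totals = Counter()
    for word, cls in word_to_class.items():
        class_totals[cls] += counts[word] + addone
    return {
        (cls, word): (counts[word] + addone) / class_totals[cls]
        for word, cls in word_to_class.items()
    }


if __name__ == "__main__":
    word_to_class = {"london": "CITY", "paris": "CITY", "monday": "DAY"}
    tokens = "i flew to london on monday and london again".split()
    for key, p in sorted(estimate_expansion_probs(word_to_class, tokens).items()):
        print(key, p)
```

With addone=0, a class whose words never occur in the text would cause a 
division by zero, which is one reason to keep the add-one smoothing on.
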

Andreas

On 6/5/2012 1:37 AM, Shammur Absar Chowdhury wrote:
> Hello
>
> I am new to SRILM and have only recently started learning about 
> language models. My aim is to build a class-based language model with 
> a given class definition.
>
> So far I have used the three commands below, from 
> http://www.speech.sri.com/pipermail/srilm-user/2010q1/000843.html
>
>
> 1. ngram-class -vocab vocab.txt \
>             -text LM.txt \
>             -numclasses 16 \
>             -classes classfile
> 2. replace-words-with-classes classes=classfile LM.txt \
>             > Output_text_with_classes
> 3. ngram-count  -text Output_text_with_classes   -lm Class_based_model
>
>
> As far as I understand, the first command here induces the classes. 
> Now, what if I want SRILM to use my own class tags and their word 
> lists to build the class model? How would I do that? I mean, I can try 
> formatting my class tags in the class format and then run the second 
> step, but in that format I am supposed to assign a probability p, 
> which I can't assign in my manually created class file.
>
> Could anyone please help me by pointing me in the right direction or 
> suggesting some reading?
> Thank you.
>
> Shammur Absar Chowdhury
