[SRILM User List] how to generate character lm model

Nutthamon dcherubangel at gmail.com
Mon Mar 11 07:09:44 PDT 2013


Hello,

I am new to language modeling and the SRILM toolkit.

Can this toolkit generate a language model at the character level? If it
can, what is the command to do that? I can't find it. Please give me an
example.


I am using this tool via the Cygwin terminal.

Example in training1.txt:

s i m p l y
g o o d
t h a n k y o u
c l o u n d

or in training2.txt:
s i m p l y g o o d t h a n k y o u c l o u n d


Which training text is correct for an LM built at the character level?
The first one, right? If it is the first, can I simply press Enter to add
more lines, or do I need to add some symbol to mark a new line?
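
For illustration, here is a minimal Python sketch of how text in the first layout (one word or phrase per line, characters separated by spaces) could be produced. It relies on ngram-count -text reading one sentence per line and treating each whitespace-separated token as one unit. The helper name to_char_line and the sample word list are hypothetical and only mirror the example above.

```python
def to_char_line(text):
    """Return the characters of `text` separated by single spaces.

    Whitespace inside `text` is dropped, so "thank you" becomes
    "t h a n k y o u", matching the sample training lines.
    """
    return " ".join(ch for ch in text if not ch.isspace())


# Hypothetical input: one word or phrase per training line.
words = ["simply", "good", "thank you", "clound"]

# One output line per input entry; a plain newline separates the lines,
# no extra symbol is written by this sketch.
lines = [to_char_line(w) for w in words]
print("\n".join(lines))
```

Writing the joined lines to a file gives one character sequence per line, with an ordinary line break as the only separator.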

I'm not sure what <s> and </s> mean.
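
As background, <s> and </s> are the sentence-start and sentence-end markers that SRILM places around each input line before counting. The Python sketch below imitates that counting for illustration only. It is not SRILM's implementation, and the function name count_ngrams is hypothetical.

```python
from collections import Counter


def count_ngrams(lines, order=3):
    """Count all n-grams of length 1..order, padding each line with <s> and </s>."""
    counts = Counter()
    for line in lines:
        tokens = line.split()
        if not tokens:
            continue  # blank lines are skipped in this sketch
        padded = ["<s>"] + tokens + ["</s>"]
        for n in range(1, order + 1):
            for i in range(len(padded) - n + 1):
                counts[" ".join(padded[i:i + n])] += 1
    return counts


# One character-level training line, counted up to trigrams.
counts = count_ngrams(["g o o d"], order=3)
for ngram, c in sorted(counts.items()):
    print(f"{ngram}\t{c}")
```

With this padding, a line that contains a single token X yields exactly six entries: <s>, X, </s>, "<s> X", "X </s>" and "<s> X </s>".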

Is this the right command to train a character-level LM (trigram)?
ngram-count -order 4 -text /srilm/sences.txt -write /srilm/corpus

I tried it on training1.txt and the result is:

<s>     1
<s> ▒▒S 1
<s> ▒▒S </s>    1
▒▒S     1
▒▒S </s>        1
</s>
I don't know what this is or why it is not counting characters. When I
try it at the word level, the result is the count for each word.
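
One possible explanation for the ▒▒ bytes in the output above, offered as an assumption and not a confirmed diagnosis, is that the text file was saved with a byte-order mark (for example as UTF-16 by a Windows editor), so the tool sees only one strange token. The Python sketch below checks for a BOM and re-encodes the data as plain UTF-8 with Unix line endings. The function names are hypothetical, and UTF-32 input is not handled.

```python
import codecs

# BOM prefixes checked by this sketch, with the codec used to decode each.
BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def detect_bom(raw):
    """Return a codec name if `raw` starts with a known BOM, else None."""
    for bom, name in BOMS:
        if raw.startswith(bom):
            return name
    return None


def to_plain_utf8(raw):
    """Decode `raw`, drop any BOM, and return UTF-8 bytes with \\n line endings."""
    enc = detect_bom(raw)
    text = raw.decode(enc) if enc else raw.decode("utf-8")
    return text.replace("\r\n", "\n").encode("utf-8")


# Simulated file contents: UTF-16 with BOM and Windows line endings.
sample = "s i m p l y\r\ng o o d\r\n".encode("utf-16")
print(detect_bom(sample))
print(to_plain_utf8(sample))
```

To apply this to a real file, the bytes could be read with open(path, "rb").read() and the converted result written back with open(path, "wb"), where path is whatever training file is being used.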

Many thanks in advance.

-- 
Best Regards,
Nutthamon Moknarong
dcherubangel at gmail.com