[SRILM User List] Data preparation for building language model using ngram-count

Abbas Malik abbas.malik at gmail.com
Fri Jan 15 20:30:56 PST 2010


Dear All,

Do we really need to add <s> at the start of each sentence and </s> at the
end of each sentence when preparing the training data for a language model
built with ngram-count?

My data looks like:

=============
<s> sentence1 </s>
<s> sentence2 </s>
and so on...
=============
=============
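
For illustration, here is a minimal sketch of how a file in this layout could
be produced, assuming the raw text has one sentence per line. The function
names wrap_sentence and wrap_corpus are hypothetical helpers, not part of
SRILM, and the sketch says nothing about whether ngram-count requires the tags.

```python
def wrap_sentence(line, bos="<s>", eos="</s>"):
    """Wrap one sentence in boundary tags without doubling existing tags."""
    tokens = line.split()
    if not tokens:
        # Blank or whitespace-only lines carry no sentence.
        return ""
    if tokens[0] != bos:
        tokens.insert(0, bos)
    if tokens[-1] != eos:
        tokens.append(eos)
    return " ".join(tokens)


def wrap_corpus(lines):
    """Wrap every non-empty line of a one-sentence-per-line corpus."""
    wrapped = (wrap_sentence(line) for line in lines)
    return [sentence for sentence in wrapped if sentence]


if __name__ == "__main__":
    for sentence in wrap_corpus(["sentence1", "<s> sentence2 </s>"]):
        print(sentence)
```

Lines that already carry the tags are left unchanged, so the same script can
be run over mixed input.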

Do we really need the <s> and </s> tags?

Thank you in advance,

-- 
---
M G Abbas Malik
Doctorant (PhD Student)
Université Joseph Fourier,
Groupe d'Etude pour la Traduction Automatique et le Traitement Automatisé
des Langues et de la Parole (GETALP)
Laboratoire d'Informatique de Grenoble (LIG) / Grenoble Informatics
Laboratory

GETALP, LIG-Campus, BP53
385 Rue de la Bibliothèque,
38041 Grenoble Cedex 9, France
Off:      +33 (0)4 76 51 48 17
Mob:    +33 (0)6 74 50 46 01
e-mail: abbas.malik at imag.fr abbas.malik at gmail.com
URL:    www.puran.info