strange symbols

marco turchi marco.turchi at gmail.com
Tue Jul 29 18:03:40 PDT 2008


Dear all,
I'm using SRILM on some data crawled from the Web. The resulting LM contains
some strange symbols, such as these:
\1-grams:
-6.774207       ^A      0
-6.774207       ^C
-6.774207       ^D
-6.774207       ^E      0
-6.774207       ^F      0
-6.774207       ^G      0
-6.774207       ^H      0
-6.774207       ^K      0
-6.774207       ^N      0
-6.774207       ^O
-6.774207       ^P
-6.774207       ^T      0
-6.774207       ^X
-6.774207       ^Y      0
-6.774207       ^\
-6.774207       ^]
-6.774207       ^^      0
-6.774207       ^_

These symbols are not simple combinations of ^ and a letter. Each one seems
to be something different, such as a character that has been truncated or
something similar.
Do you have any idea what they are and how to remove them?

Thanks a lot,
Marco
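
A possible reading of the listing above, not confirmed anywhere in this
message: tokens such as ^A, ^H or ^_ are how many editors and pagers display
single ASCII control characters (0x01, 0x08, 0x1F) in caret notation, which
would explain why they are not a literal "^" followed by a letter. Under that
assumption, a minimal sketch for cleaning the training text before building
the LM could look like the following. The function names and file paths are
hypothetical, and the input is assumed to be in an ASCII-compatible encoding
such as UTF-8 or Latin-1.

```python
import re

# Assumption: the odd tokens are ASCII control bytes (0x00-0x1F and 0x7F).
# Tab (0x09), newline (0x0A) and carriage return (0x0D) are left alone,
# since they act as ordinary whitespace between tokens and lines.
CONTROL_BYTES = re.compile(rb"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")


def strip_control_bytes(line: bytes) -> bytes:
    """Replace each control byte with a space so neighbouring words are not fused."""
    return CONTROL_BYTES.sub(b" ", line)


def clean_corpus(src_path: str, dst_path: str) -> None:
    """Write a cleaned copy of the training text (both paths are hypothetical)."""
    # Binary mode avoids guessing the text encoding of the crawled data.
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for line in src:
            dst.write(strip_control_bytes(line))
```

The cleaned file would then be used as the training text in place of the raw
crawl. Replacing with a space rather than deleting is a design choice: it
keeps two words from being merged when a control character was the only thing
between them.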

