strange symbols
marco turchi
marco.turchi at gmail.com
Tue Jul 29 18:03:40 PDT 2008
Dear all,
I'm using SRILM on some data crawled from the Web. The LM contains some
strange symbols such as these:
\1-grams:
-6.774207 ^A 0
-6.774207 ^C
-6.774207 ^D
-6.774207 ^E 0
-6.774207 ^F 0
-6.774207 ^G 0
-6.774207 ^H 0
-6.774207 ^K 0
-6.774207 ^N 0
-6.774207 ^O
-6.774207 ^P
-6.774207 ^T 0
-6.774207 ^X
-6.774207 ^Y 0
-6.774207 ^\
-6.774207 ^]
-6.774207 ^^ 0
-6.774207 ^_
These symbols are not the simple combination of ^ and a letter. They seem
to be something different, such as a character that has been truncated or
something similar.
Do you have any idea what they are and how to remove them?
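
One way to narrow this down is to check the training text itself. The sketch
below assumes (this is not confirmed) that the symbols are ASCII control
characters, which many editors and pagers display in caret notation, so that
^A would be byte 0x01 and ^_ byte 0x1F. The function names and the choice to
replace each such character with a space are illustrative only, not part of
SRILM.

```python
import re

# Assumption: the strange symbols are ASCII control characters
# (0x00-0x1F and 0x7F). Tab, newline and carriage return are left alone
# because they act as ordinary whitespace in the training text.
CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")


def find_control_chars(line):
    """Return the sorted code points of the control characters found in line."""
    return sorted({ord(ch) for ch in CONTROL_CHARS.findall(line)})


def strip_control_chars(line):
    """Replace each control character with a space, then collapse runs of spaces."""
    cleaned = CONTROL_CHARS.sub(" ", line)
    return re.sub(r" {2,}", " ", cleaned)
```

If the assumption holds, running find_control_chars over the crawled text
should report code points between 1 and 31, and the text could be passed
through strip_control_chars before the counts are collected.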
Thanks a lot,
Marco