[SRILM User List] ngram: sentence boundary markers in text file used with -ppl?
Andreas Stolcke
stolcke at icsi.berkeley.edu
Sat May 25 15:11:40 PDT 2013
On 5/25/2013 1:37 PM, Sander Maijers wrote:
> Hi,
>
> Should one surround the sentences in the sentences file for ngram's
> '-ppl' with <s> sos and </s> eos tokens? They are in the LM.
>
> I have tested it just now, and it seems that the sentence boundary
> markers are inferred by ngram when left out, and adopted when put in.
> Where is this documented?
In the man page
<http://www.speech.sri.com/projects/srilm/manpages/ngram.1.html>. The
relevant options are:

-no-sos
    Disable the automatic insertion of start-of-sentence tokens for
    sentence probability computation. The probability of the initial
    word is thus computed with an empty context.
-no-eos
    Disable the automatic insertion of end-of-sentence tokens for
    sentence probability computation. End-of-sentence is thus
    excluded from the total probability.
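
To make the effect of the two options concrete, here is a small Python sketch of which (context, word) events a bigram model would end up scoring for one sentence. It is only an illustration of the behaviour described above, not SRILM's code: the function names and the no_sos/no_eos parameters are made up for this example, and it assumes that markers already present in the text are kept as they are and simply not duplicated.

```python
SOS, EOS = "<s>", "</s>"


def with_boundaries(words, no_sos=False, no_eos=False):
    """Add sentence boundary markers unless disabled or already present.

    Assumption: explicit markers in the input are left untouched; the
    flags only control automatic insertion.
    """
    words = list(words)
    if not no_sos and (not words or words[0] != SOS):
        words.insert(0, SOS)
    if not no_eos and (not words or words[-1] != EOS):
        words.append(EOS)
    return words


def scored_events(words, no_sos=False, no_eos=False):
    """Return the (context, word) pairs a bigram model would score.

    A context of None stands for the empty context.
    """
    events = []
    prev = None
    for tok in with_boundaries(words, no_sos, no_eos):
        if tok == SOS:
            # <s> only serves as context; it is never predicted itself.
            prev = SOS
            continue
        events.append((prev, tok))
        prev = tok
    return events


if __name__ == "__main__":
    sentence = "the cat sleeps".split()
    print(scored_events(sentence))               # markers inserted
    print(scored_events(sentence, no_sos=True))  # first word has empty context
    print(scored_events(sentence, no_eos=True))  # </s> is not scored
```

Under these assumptions a sentence written with explicit <s> and </s> yields the same events as the bare sentence with default settings, which matches the observation in the question. On the command line the options would be added to the usual call, e.g. "ngram -lm LM -ppl FILE -no-sos", where LM and FILE are placeholders.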
Andreas
>
> Best,
> Sander
> _______________________________________________
> SRILM-User site list
> SRILM-User at speech.sri.com
> http://www.speech.sri.com/mailman/listinfo/srilm-user