[SRILM User List] Difference between segment and hidden-ngram

Eeva Nikkari eevanikkari at gmail.com
Tue Nov 22 23:26:02 PST 2016


I'm getting different results from the segment tool and from the
hidden-ngram tool with a hidden vocabulary of "</s>" when segmenting text
into sentences.
In both cases I used the same Witten-Bell (wb) discounted 3-gram model, but
I get different segmentations depending on which of the two tools I use
(similarly with other models I've tried).

It seems segment assigns more sentence boundaries (and performs better).
What's the difference between using segment and using hidden-ngram with
hidden-vocab "</s>"?
I would like to use hidden-ngram since I want to test higher-order
models as well, but it's strange that segment works better.
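
To make "more sentence boundaries" concrete, the two outputs can be compared
boundary by boundary. The following is only a minimal sketch: it assumes that
both outputs contain the same words in the same order, that the segment output
marks boundaries with "<s>", and that the hidden-ngram output marks them with
"</s>". The function names and the toy token lists are hypothetical, not part
of SRILM.

```python
def boundary_positions(tokens, marker):
    """Return (positions, n_words) for one tokenized output.

    A position k means "a boundary occurs after the k-th real word",
    so outputs that use different marker tokens stay comparable.
    """
    positions = set()
    words_seen = 0
    for tok in tokens:
        if tok == marker:
            positions.add(words_seen)
        else:
            words_seen += 1
    return positions, words_seen


def interior(positions, n_words):
    """Drop boundaries at the very start or end of the text."""
    return {k for k in positions if 0 < k < n_words}


def compare(seg_tokens, hid_tokens, seg_marker="<s>", hid_marker="</s>"):
    """Compare interior boundaries of two segmentations of the same words."""
    # Marker defaults are assumptions; adjust them to the actual outputs.
    seg_pos, seg_n = boundary_positions(seg_tokens, seg_marker)
    hid_pos, hid_n = boundary_positions(hid_tokens, hid_marker)
    if seg_n != hid_n:
        raise ValueError("outputs do not contain the same number of words")
    seg = interior(seg_pos, seg_n)
    hid = interior(hid_pos, hid_n)
    return {
        "segment_only": sorted(seg - hid),
        "hidden_only": sorted(hid - seg),
        "both": sorted(seg & hid),
    }


if __name__ == "__main__":
    # Toy data standing in for the two tools' outputs.
    seg_out = "<s> a b <s> c <s> d e".split()
    hid_out = "a b </s> c d e </s>".split()
    print(compare(seg_out, hid_out))
```

Boundaries listed under "segment_only" are the ones that segment places and
hidden-ngram does not, which is where the two tools disagree.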

Thank you,
