[SRILM User List] Assertion failed: (!Map_noKeyP(key)) in LHash.cc -Error when using segment
Eeva Nikkari
eevanikkari at gmail.com
Wed Nov 2 06:22:47 PDT 2016
Hi,
I'm trying out the SRILM toolkit to build a language model for sentence
segmentation. The bigram model I built to test the functions is
minicorpus.lm:
\data\
ngram 1=10
ngram 2=18
\1-grams:
-0.5862657 </s>
-99 <s> -99
-1.431364 bark -99
-0.9542425 birds -99
-0.8293038 cats -7.050447
-0.8293038 chase -7.129629
-1.431364 chirp -99
-0.8293038 dogs -7.84033
-1.431364 meow -99
-1.130334 the -7.351478
\2-grams:
-0.544068 <s> cats
-0.243038 <s> dogs
-0.845098 <s> the
0 bark </s>
-0.1760913 birds </s>
-0.4771213 birds chirp
-0.30103 cats </s>
-0.60206 cats chase
-0.60206 cats meow
-0.30103 chase birds
-0.60206 chase cats
-0.60206 chase the
0 chirp </s>
-0.60206 dogs bark
-0.1249387 dogs chase
0 meow </s>
-0.30103 the birds
-0.30103 the cats
\end\
It was built from the following text (the same text I use to test the
segment function), minicorpus.txt:
dogs chase cats
dogs bark
cats meow
dogs chase birds
cats chase birds
dogs chase the cats
the birds chirp
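As a cross-check on the model itself: as far as I can tell, the 2-gram entries are plain relative frequencies (no smoothing) over this corpus, with <s> and </s> added around each line. Below is a minimal Python sketch that recomputes them. It is independent of SRILM, and the function name mle_bigram_log10 is my own. Under those assumptions it should list the same 18 bigrams with the same log10 values as the \2-grams: section above.

```python
import math
from collections import Counter

# The seven sentences of minicorpus.txt, copied from the post.
corpus = """\
dogs chase cats
dogs bark
cats meow
dogs chase birds
cats chase birds
dogs chase the cats
the birds chirp
"""

def mle_bigram_log10(text):
    """Return {(w1, w2): log10 P(w2 | w1)} from unsmoothed relative frequencies."""
    bigrams = Counter()   # count of each (w1, w2) pair
    history = Counter()   # count of w1 as a bigram history
    for line in text.splitlines():
        words = line.split()
        if not words:
            continue
        # Wrap every sentence in the boundary tags, as an ARPA model expects.
        tokens = ["<s>"] + words + ["</s>"]
        for w1, w2 in zip(tokens, tokens[1:]):
            bigrams[(w1, w2)] += 1
            history[w1] += 1
    return {bg: math.log10(c / history[bg[0]]) for bg, c in bigrams.items()}

logp = mle_bigram_log10(corpus)
for (w1, w2), lp in sorted(logp.items()):
    print(f"{lp:.7f}\t{w1} {w2}")
```

This only confirms that the bigram probabilities are consistent with the text. It says nothing about the backoff weights or about what segment does with the model.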
When I try using the segment function, I get the following error:
$ segment -order 2 -lm minicorpus.lm -text minicorpus.txt -continuous -debug 5
reading 10 1-grams
reading 18 2-grams
warning: p(w1) < p(<s> w1))
0: p(NOS) = 0, P(S) = 0.148148
1: p(NOS) = 0.111111, P(S) = 0
2: p(NOS) = 0.0277778, P(S) = 6.10653e-10
3: p(NOS) = 3.66393e-10, P(S) = 0.00793651
4: p(NOS) = 0.00198413, P(S) = 0
5: p(NOS) = 0, P(S) = 0.000566893
6: p(NOS) = 0.000141723, P(S) = 0
7: p(NOS) = 0, P(S) = 8.09848e-05
8: p(NOS) = 6.07386e-05, P(S) = 0
9: p(NOS) = 3.03693e-05, P(S) = 0
10: p(NOS) = 0, P(S) = 5.78463e-06
11: p(NOS) = 1.44616e-06, P(S) = 0
12: p(NOS) = 7.23079e-07, P(S) = 0
13: p(NOS) = 0, P(S) = 2.75459e-07
14: p(NOS) = 2.06594e-07, P(S) = 0
15: p(NOS) = 5.16485e-08, P(S) = 5.67708e-16
16: p(NOS) = 2.58243e-08, P(S) = 1.70313e-16
17: p(NOS) = 1.70313e-16, P(S) = 1.84459e-09
18: p(NOS) = 9.22294e-10, P(S) = 0
19: p(NOS) = 3.07431e-10, P(S) = 0
Assertion failed: (!Map_noKeyP(key)), function locate, file ../../include/LHash.cc, line 275.
Abort trap: 6
I get the "Assertion failed: (!Map_noKeyP(key)), function locate, file
../../include/LHash.cc, line 275." error (followed by "Abort trap: 6")
every time I use the segment function. I've tried different texts and
language models (different orders, smoothing methods, and corpora). Is my
model missing something? The man page says to use a "standard backoff N-gram
model in ARPA ngram-format(5)
<http://www.speech.sri.com/projects/srilm/manpages/ngram-format.5.html>,
modeling segmentation using the boundary tags <s> and </s>", which, to my
understanding, minicorpus.lm is. I am using macOS Sierra, version 10.12.1.
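One structural check I can think of is to compare the counts declared in the \data\ header with the entries actually listed, and to confirm that both boundary tags are in the vocabulary. Below is a minimal Python sketch of that check. It is independent of SRILM, and check_arpa is a hypothetical helper name. It only looks at the file layout and cannot tell whether segment will accept the model.

```python
import re

# minicorpus.lm, copied from the post (raw string because of the backslashes).
ARPA_TEXT = r"""
\data\
ngram 1=10
ngram 2=18
\1-grams:
-0.5862657 </s>
-99 <s> -99
-1.431364 bark -99
-0.9542425 birds -99
-0.8293038 cats -7.050447
-0.8293038 chase -7.129629
-1.431364 chirp -99
-0.8293038 dogs -7.84033
-1.431364 meow -99
-1.130334 the -7.351478
\2-grams:
-0.544068 <s> cats
-0.243038 <s> dogs
-0.845098 <s> the
0 bark </s>
-0.1760913 birds </s>
-0.4771213 birds chirp
-0.30103 cats </s>
-0.60206 cats chase
-0.60206 cats meow
-0.30103 chase birds
-0.60206 chase cats
-0.60206 chase the
0 chirp </s>
-0.60206 dogs bark
-0.1249387 dogs chase
0 meow </s>
-0.30103 the birds
-0.30103 the cats
\end\
"""

def check_arpa(text):
    """Return (declared, found, vocab) for an ARPA-format model given as a string."""
    declared, found, vocab = {}, {}, set()
    order = None  # None while still in the \data\ header
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line == "\\data\\":
            continue
        if line == "\\end\\":
            break
        header = re.fullmatch(r"ngram (\d+)=(\d+)", line)
        if header and order is None:
            declared[int(header.group(1))] = int(header.group(2))
            continue
        section = re.fullmatch(r"\\(\d+)-grams:", line)
        if section:
            order = int(section.group(1))
            found[order] = 0
            continue
        if order is not None:
            # Entry layout: log10 prob, `order` words, optional backoff weight.
            words = line.split()[1:1 + order]
            if len(words) != order:
                raise ValueError(f"malformed {order}-gram entry: {line!r}")
            found[order] += 1
            if order == 1:
                vocab.add(words[0])
    return declared, found, vocab

declared, found, vocab = check_arpa(ARPA_TEXT)
print("declared:", declared)
print("found:   ", found)
print("boundary tags present:", {"<s>", "</s>"} <= vocab)
```

For this model the declared and found counts agree (10 unigrams, 18 bigrams) and both tags are present, so the file at least looks well-formed by this check.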
When I run 'make test', the only test that fails is:
*** Running test make-ngram-pfsg ***
real 0m0.056s
user 0m0.048s
sys 0m0.016s
sed: RE error: illegal byte sequence
make-ngram-pfsg: stdout output DIFFERS.
make-ngram-pfsg: stderr output IDENTICAL.
I'd be thankful for any advice you can provide,
Eeva