[SRILM User List] FLM problem
Dimitris Babaniotis
dimbabaniotis at gmail.com
Mon Aug 15 03:31:31 PDT 2011
Hello, I have a problem with the fngram-count and fngram commands. I create
a language model, but the fngram command gives me an error.
This is the factor file:
## Best perplexity found
## logprob= -84709 ppl= 166.097 ppl1= 431.488
W : 4 W(-1) W(-2) M(0) S(0) /home/dimbaba/test.counts
/home/dimbaba/M0S0.txt 4
W1,W2, W2 ndiscount
M0 M0 ndiscount
S0 S0 ndiscount
0 0 ndiscount
This is an example of the training input:
W-επαvάληψη:S-επα:M-ηψη W-της:S-της:M-της W-συvσδoυ:S-συv:M-δoυ
W-κηρύσσω:S-κηρ:M-σσω W-την:S-την:M-την W-επανάληψη:S-επα:M-ηψη
W-της:S-της:M-της W-συνόδου:S-συν:M-δου W-του:S-του:M-του
W-ευρωπαϊκού:S-ευρ:M-κού W-κοινοβουλίου:S-κοι:M-ίου W-η:S-η:M-η
W-οποία:S-οπο:M-οία W-είχε:S-είχ:M-ίχε W-διακοπεί:S-δια:M-πεί
W-την:S-την:M-την W-παρασκευή:S-παρ:M-ευή W-17:S-17:M-17
W-δεκεμβρίου:S-δεκ:M-ίου W-και:S-και:M-και W-σας:S-σας:M-σας
W-απευθύνω:S-απε:M-ύνω W-ξανά:S-ξαν:M-ανά W-τις:S-τις:M-τις
W-θερμές:S-θερ:M-μές W-ευχές:S-ευχ:M-χές W-μου:S-μου:M-μου W-,:S-,:M-,
W-ελπίζοντας:S-ελπ:M-τας W-να:S-να:M-να W-περάσατε:S-περ:M-ατε
W-καλά:S-καλ:M-αλά W-στις:S-στι:M-τις W-διακοπές:S-δια:M-πές W-.:S-.:M-.
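A quick sanity check of the input format may help narrow this down. The following is a minimal sketch, not part of SRILM: it assumes every token is a colon-separated list of TAG-value factors and that each token should carry exactly the W, S and M tags used above. The helper names parse_token and check_line are hypothetical.

```python
import re

# Assumed token shape: colon-separated factors, each written as TAG-value.
# Limitation: a value that itself contains ":" is not handled by this sketch.
FACTOR = re.compile(r"^([A-Za-z]+)-(.+)$")

def parse_token(token):
    """Split one factored token into a {tag: value} dict."""
    factors = {}
    for part in token.split(":"):
        match = FACTOR.match(part)
        if match is None:
            raise ValueError(f"bad factor {part!r} in token {token!r}")
        factors[match.group(1)] = match.group(2)
    return factors

def check_line(line, expected_tags=("W", "S", "M")):
    """Return the tokens on a line whose tags differ from the expected set."""
    bad = []
    for token in line.split():
        try:
            tags = parse_token(token)
        except ValueError:
            bad.append(token)
            continue
        if set(tags) != set(expected_tags):
            bad.append(token)
    return bad

sample = "W-της:S-της:M-της W-συνόδου:S-συν:M-δου W-,:S-,:M-,"
print(check_line(sample))
```

Running check_line over each line of the training and test files would list any token that does not follow the W/S/M pattern.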
These are the commands:
fngram -factor-file /home/dimbaba/test.ff -ppl aligned/el-de/el-test.txt -nonull -no-virtual-begin-sentence
fngram-count -factor-file /home/dimbaba/test.ff -text /home/dimbaba/factoredExample.txt -nonull -no-virtual-begin-sentence -lm ghu
This is the output of the fngram command:
/home/dimbaba/test.counts: line 643: malformed N-gram count or more than 100 words per line
warning: no singleton counts
GT discounting disabled
warning: no singleton counts
GT discounting disabled
warning: no singleton counts
GT discounting disabled
warning: no singleton counts
GT discounting disabled
warning: no singleton counts
GT discounting disabled
warning: no singleton counts
GT discounting disabled
warning: no singleton counts
GT discounting disabled
warning: no singleton counts
GT discounting disabled
warning: no singleton counts
GT discounting disabled
warning: no singleton counts
GT discounting disabled
warning: no singleton counts
GT discounting disabled
warning: no singleton counts
GT discounting disabled
/home/dimbaba/M0S0.txt: line 21: error, ngram line has invalid number (1) of fields, expecting either 2 or 3
format error in lm file
Where is the problem?