[SRILM User List] FLM problem

Dimitris Babaniotis dimbabaniotis at gmail.com
Mon Aug 15 03:31:31 PDT 2011


Hello, I have a problem with the fngram-count and fngram commands. I create 
a language model, but the fngram command gives me an error.

This is the factor file:

1
## Best perplexity found
## logprob= -84709 ppl= 166.097 ppl1= 431.488
W : 4 W(-1) W(-2) M(0) S(0) /home/dimbaba/test.counts /home/dimbaba/M0S0.txt 4
W1,W2, W2 ndiscount
M0 M0 ndiscount
S0 S0 ndiscount
0 0 ndiscount

This is an example of the training input:

W-επαvάληψη:S-επα:M-ηψη W-της:S-της:M-της W-συvσδoυ:S-συv:M-δoυ
W-κηρύσσω:S-κηρ:M-σσω W-την:S-την:M-την W-επανάληψη:S-επα:M-ηψη 
W-της:S-της:M-της W-συνόδου:S-συν:M-δου W-του:S-του:M-του 
W-ευρωπαϊκού:S-ευρ:M-κού W-κοινοβουλίου:S-κοι:M-ίου W-η:S-η:M-η 
W-οποία:S-οπο:M-οία W-είχε:S-είχ:M-ίχε W-διακοπεί:S-δια:M-πεί 
W-την:S-την:M-την W-παρασκευή:S-παρ:M-ευή W-17:S-17:M-17 
W-δεκεμβρίου:S-δεκ:M-ίου W-και:S-και:M-και W-σας:S-σας:M-σας 
W-απευθύνω:S-απε:M-ύνω W-ξανά:S-ξαν:M-ανά W-τις:S-τις:M-τις 
W-θερμές:S-θερ:M-μές W-ευχές:S-ευχ:M-χές W-μου:S-μου:M-μου W-,:S-,:M-, 
W-ελπίζοντας:S-ελπ:M-τας W-να:S-να:M-να W-περάσατε:S-περ:M-ατε 
W-καλά:S-καλ:M-αλά W-στις:S-στι:M-τις W-διακοπές:S-δια:M-πές W-.:S-.:M-.
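
The token layout shown above (factors joined by ":", each written as 
tag-value) can be checked mechanically before training. The following is 
only a minimal sketch: the function names parse_factored_token and 
check_line are hypothetical helpers, not part of SRILM, and the expected 
tags W, S and M are taken from the sample above.

```python
def parse_factored_token(token):
    """Split 'W-word:S-stem:M-suffix' into {'W': 'word', 'S': 'stem', 'M': 'suffix'}."""
    factors = {}
    for part in token.split(":"):
        # Split on the first hyphen only, so values may themselves contain '-'.
        tag, sep, value = part.partition("-")
        if not sep or not tag or not value:
            raise ValueError(f"malformed factor {part!r} in token {token!r}")
        factors[tag] = value
    return factors


def check_line(line, expected_tags=("W", "S", "M")):
    """Return (position, token, problem) tuples for tokens that do not parse.

    expected_tags is an assumption based on the sample data, not an SRILM rule.
    """
    problems = []
    for pos, token in enumerate(line.split(), start=1):
        try:
            factors = parse_factored_token(token)
        except ValueError as exc:
            problems.append((pos, token, str(exc)))
            continue
        missing = [t for t in expected_tags if t not in factors]
        if missing:
            problems.append((pos, token, f"missing factors: {missing}"))
    return problems
```

Calling check_line on each line of the training file lists the tokens 
that do not follow this layout. A word that itself contains ":" would be 
reported, because the sketch treats every colon as a factor separator.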

These are the commands:

fngram -factor-file /home/dimbaba/test.ff -ppl aligned/el-de/el-test.txt -nonull -no-virtual-begin-sentence

fngram-count -factor-file /home/dimbaba/test.ff -text /home/dimbaba/factoredExample.txt -nonull -no-virtual-begin-sentence -lm ghu

This is the output of the fngram command:

/home/dimbaba/test.counts: line 643: malformed N-gram count or more than 100 words per line
warning: no singleton counts
GT discounting disabled
warning: no singleton counts
GT discounting disabled
warning: no singleton counts
GT discounting disabled
warning: no singleton counts
GT discounting disabled
warning: no singleton counts
GT discounting disabled
warning: no singleton counts
GT discounting disabled
warning: no singleton counts
GT discounting disabled
warning: no singleton counts
GT discounting disabled
warning: no singleton counts
GT discounting disabled
warning: no singleton counts
GT discounting disabled
warning: no singleton counts
GT discounting disabled
warning: no singleton counts
GT discounting disabled
/home/dimbaba/M0S0.txt: line 21: error, ngram line has invalid number (1) of fields, expecting either 2 or 3
format error in lm file
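
Both error messages name a file and a line number (line 643 of 
test.counts and line 21 of M0S0.txt), so a first step is to look at 
exactly those lines and count their whitespace-separated fields. A 
minimal sketch, assuming the files are readable as UTF-8 (undecodable 
bytes are replaced rather than raising); show_line is a hypothetical 
helper, not an SRILM tool:

```python
def show_line(path, lineno, encoding="utf-8"):
    """Return (text, fields) for the 1-based line `lineno` of the file at `path`.

    `fields` is the line split on whitespace, which makes it easy to compare
    the field count with the number mentioned in an error message.
    """
    with open(path, encoding=encoding, errors="replace") as fh:
        for current, raw in enumerate(fh, start=1):
            if current == lineno:
                text = raw.rstrip("\n")
                return text, text.split()
    raise IndexError(f"{path} has fewer than {lineno} lines")
```

For example, show_line("/home/dimbaba/M0S0.txt", 21) returns the raw 
line together with its fields, and len() of the second value is the 
field count to compare against the "expecting either 2 or 3" message.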

Where is the problem?

Thanks

Dimitris


