Class Language Modelling
geetu at clsp.jhu.edu
Tue Nov 19 08:18:45 PST 2002
Suppose I wish to build a language model P(w0 | CW0, CW1, CW2), where CW0, CW1
and CW2 are the equivalence classes of the predicted word and the 2
preceding words respectively, and I wish to use absolute discounting with a
fixed D. The input files I have available are (1) a trigram count file
(format: w0 w1 w2 count), (2) a vocab file, and (3) 3 class files (format:
classno word1 word2 ....) for the w0, w1 and w2 positions.
Can someone please tell me the syntax of the ngram-count command needed to
build an LM of this kind? I am not sure I understand it clearly.
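
To make the setup concrete, here is a rough Python sketch of one possible reading of the model above. It is not SRILM code and not the ngram-count syntax being asked about. It assumes that each word belongs to exactly one class per position, that the count file lists the predicted word first (w0 w1 w2 count), and that every count is at least D. All function names are made up for illustration.

```python
from collections import defaultdict


def load_classes(lines):
    """Map word -> class label from lines of 'classno word1 word2 ...'."""
    word_to_class = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        for word in fields[1:]:
            word_to_class[word] = fields[0]
    return word_to_class


def collect_counts(trigram_lines, cls0, cls1, cls2):
    """Accumulate c(w0, CW0, CW1, CW2) and c(CW0, CW1, CW2).

    Assumption: each line is 'w0 w1 w2 count' with w0 the predicted word.
    Trigrams containing a word with no class are skipped.
    """
    num = defaultdict(int)  # keyed by (w0, (CW0, CW1, CW2))
    den = defaultdict(int)  # keyed by (CW0, CW1, CW2)
    for line in trigram_lines:
        fields = line.split()
        if len(fields) != 4:
            continue
        w0, w1, w2, count = fields
        if w0 not in cls0 or w1 not in cls1 or w2 not in cls2:
            continue
        ctx = (cls0[w0], cls1[w1], cls2[w2])
        num[(w0, ctx)] += int(count)
        den[ctx] += int(count)
    return num, den


def discounted_prob(w0, ctx, num, den, D):
    """Absolute discounting: max(c(w0, ctx) - D, 0) / c(ctx)."""
    total = den.get(ctx, 0)
    if total == 0:
        return 0.0
    return max(num.get((w0, ctx), 0) - D, 0.0) / total


def reserved_mass(ctx, num, den, D):
    """Probability mass freed by discounting, left over for unseen words."""
    seen = sum(discounted_prob(w, ctx, num, den, D)
               for (w, c) in num if c == ctx)
    return 1.0 - seen
```

How the reserved mass is redistributed (for example by backing off to a lower-order class model) is left open here, since that depends on the toolkit.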