some tech queries

Sun Apr 7 17:43:16 PDT 2002

Hi,

I am new to SRILM, and quite new to language modelling in general
(I come from other domains where n-gram models are used).

I have run some preliminary experiments with SRILM (on Linux; the install went smoothly)
and have the following questions:

1. in ngram-count:
   when using -lm with the default -order 3, i had expected -text <textfile>
   to yield the same model as -read <order1> -read <order2> -read <order3>
   where order{1-3} have been obtained through ngram-count -write{1-3}
   (all other parameters being equal). and yet the two LM files differ.
   how come?

2. in ngram-count:
   i'm not quite clear about the multiple -cdiscount flags.
   suppose i want a default -order 3 LM.
   mustn't i give all three D's and have the model interpolate over all
   of these, as eq. (18) in Chen & Goodman (p. 15) implies?
   in practice it seems one can specify any subset of the 3 and get
   different models. (are there default Ds?)

3. in ngram-count:
   probably closely related to question 2.
   (and prob. due to some confusion i have between backoff & interpolation)
   why are there multiple -interpolate flags?
   again, eq. (18) in C&G appears to imply a recursive interpolation over
   all levels. and yet ngram-count appears to take any subset of
   -interpolate{1-3} (in the above example) and yield different LMs.

4. combining 2+3:
   if i want an absolute discount model of order, say 3,
   "by the book" C&G eq. (18), what is the proper way to run it?
   assume i have run ngram-count => get-gt-counts => make-abs-discount
   and obtained <D1> <D2> <D3>.
   a command line example would be highly appreciated.

5. ngram-count vs. ngram:
   if i build a model with ngram-count using some combination of -prune
   and -minprune and then run ngram -ppl on it, will the result be identical
   to building the model without the pruning flags and then running
   ngram -ppl on that model with the same -prune and -minprune settings?

6. for ngram -ppl:
   in -debug 1, i believe, two measures are given per sentence, ppl and ppl1.
   how are they defined?
   is one of them C&G's $PP_p(T)$ (p. 9, top)? if so, what is the other?
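
To make question 6 concrete, here is a minimal sketch of two normalizations that would give two different per-sentence figures from the same total log probability. It assumes base-10 log probabilities, that ppl counts one end-of-sentence token per sentence, and that ppl1 counts words only. The function and parameter names are hypothetical and are not part of SRILM, and whether these are the definitions ngram actually uses is what the question asks.

```python
def perplexities(logprob10, num_words, num_sentences, num_oovs=0):
    """Return (ppl, ppl1) from a total base-10 log probability.

    Assumed definitions (not taken from the SRILM sources):
      ppl  normalizes by scored words plus one end-of-sentence
           token per sentence;
      ppl1 normalizes by scored words only.
    OOV words are assumed to be excluded from both counts.
    """
    scored = num_words - num_oovs
    ppl = 10 ** (-logprob10 / (scored + num_sentences))
    ppl1 = 10 ** (-logprob10 / scored)
    return ppl, ppl1


# Hypothetical numbers: one 4-word sentence with total log10 prob -10.
ppl, ppl1 = perplexities(-10.0, num_words=4, num_sentences=1)
print(f"ppl={ppl:.2f} ppl1={ppl1:.2f}")  # ppl=100.00 ppl1=316.23
```

Under these assumptions ppl1 is never smaller than ppl, because the same log probability is spread over fewer tokens.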

Help would be highly appreciated,
-Gill