[SRILM User List] Interpreting ngram output with -debug 2 , -cache and -cache-lambda options

Wed Apr 27 17:26:06 PDT 2011

zeeshan khan wrote:
> Hi all,
> I want to understand the -debug 2 output given by the ngram tool with
> (and without) the -cache and -cache-lambda options.
>
> Here are the two commands, with and without the -cache and
> -cache-lambda options:
> ngram -unk "UNKNOWN" -order 4 -lm <LM> -ppl <text-file> -debug 2 -cache 350 -cache-lambda 0.1
> AND
> ngram -unk "UNKNOWN" -order 4 -lm <LM> -ppl <text-file> -debug 2
>
> I have the following questions:
> 1. What is the meaning of [cache=xxxx] in each line, and how is it
> calculated?
The xxxx part is the conditional probability due to the cache LM alone
(i.e., the number of occurrences of the word in the cache window,
divided by the total number of words in the window).
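
As a rough illustration of that count-over-window calculation, here is a
minimal Python sketch. It is not SRILM code: the function name cache_probs
is made up for this example, and it assumes the window simply holds the
most recent words of the text, without sentence-boundary tokens. Under
those assumptions it gives the nonzero [cache=...] values seen in the
posted sentence (1/20, 1/22 and 2/23).

```python
from collections import deque


def cache_probs(words, cache_size=350):
    """Cache-only probability of each word given the words before it.

    Hypothetical helper for illustration; assumes a plain sliding window
    of at most cache_size previous words.
    """
    window = deque(maxlen=cache_size)  # oldest words drop out automatically
    probs = []
    for w in words:
        # occurrences of w in the window / number of words in the window
        p = window.count(w) / len(window) if window else 0.0
        probs.append(p)
        window.append(w)
    return probs


sentence = (
    "this is a podcast of the highlights from today's woman's hour "
    "copyright issues mean that we can't always include all the items "
    "from the programme"
).split()

for word, p in zip(sentence, cache_probs(sentence)):
    if p > 0:
        print(f"{word}: cache={p:.6g}")
```

Only the second "the", the second "from" and the third "the" get a nonzero
cache probability, because every other word is new to the window at that
point.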
> 2. I cannot understand why the two probabilities differ in those
> lines of the output where the cache probability is zero, e.g., in the
> first 5 lines of both outputs.
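Because you're interpolating the standard ngram probability with the
cache LM probability.  With -cache-lambda 0.1 the ngram probability gets
weight 0.9 and the cache probability gets weight 0.1, so a cache
probability of 0 will "drag down" the overall probability (e.g.,
0.9 * 0.0155235 = 0.0139712 in your first line).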
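
The interpolation can be checked against the posted numbers with a short
Python sketch. The function name interpolate is invented for this example,
and the linear form (1 - lambda) * ngram + lambda * cache is inferred from
the two outputs in this thread rather than taken from the SRILM source.
The input values are copied from the two -debug 2 listings.

```python
import math


def interpolate(p_ngram, p_cache, cache_lambda=0.1):
    """Linear interpolation of ngram and cache probabilities (assumed form)."""
    return (1.0 - cache_lambda) * p_ngram + cache_lambda * p_cache


# (word, ngram probability without cache, [cache=...] value), from the post
rows = [
    ("this", 0.0155235, 0.0),
    ("the", 0.314584, 0.05),
    ("from", 0.0124186, 0.0454545),
    ("the", 0.415841, 0.0869565),
]

for word, p_ngram, p_cache in rows:
    p = interpolate(p_ngram, p_cache)
    # print the probability and its log10, in the layout of the ngram output
    print(f"p( {word} | ...) = {p:.6g} [ {math.log10(p):.6g} ]")
```

Up to rounding, the results should agree with the corresponding lines of
the output produced with -cache 350 -cache-lambda 0.1. A word that is
frequent in the cache (such as "from" here) can end up with a higher
probability than under the ngram LM alone.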
> 3. Can there be any case where the first entry in each line, i.e.
> [ngram], differs between the two outputs? If yes, how can that happen?
The [Ngram]  part of the output should always be the same, because it is 
generated by the ngram LM alone.

Andreas

>
> and here are the first few lines of the outputs of each command:
>
>
> ------------------------------------------------------------------------------------------------------------------------
> WITHOUT the -cache and -cache-lambda options:
> ------------------------------------------------------------------------------------------------------------------------
> <s> this is a podcast of the highlights from today's woman's hour 
> copyright issues mean that we can't always include all the items from 
> the programme </s>
>         p( this | <s> )         = [2gram] 0.0155235 [ -1.80901 ]
>         p( is | this ...)       = [3gram] 0.384267 [ -0.415367 ]
>         p( a | is ...)  = [4gram] 0.171555 [ -0.765597 ]
>         p( podcast | a ...)     = [4gram] 7.7717e-06 [ -5.10948 ]
>         p( of | podcast ...)    = [4gram] 0.108064 [ -0.966317 ]
>         p( the | of ...)        = [4gram] 0.366697 [ -0.435692 ]
>         p( highlights | the ...)        = [3gram] 4.88751e-05 [ -4.31091 ]
>         p( from | highlights ...)       = [4gram] 0.077328 [ -1.11166 ]
>         p( today's | from ...)  = [4gram] 0.00790939 [ -2.10186 ]
>         p( woman's | today's ...)       = [2gram] 9.67272e-06 [ -5.01445 ]
>         p( hour | woman's ...)  = [3gram] 0.218998 [ -0.659561 ]
>         p( copyright | hour ...)        = [1gram] 3.56089e-06 [ -5.44844 ]
>         p( issues | copyright ...)      = [2gram] 0.0196718 [ -1.70615 ]
>         p( mean | issues ...)   = [2gram] 0.00024042 [ -3.61903 ]
>         p( that | mean ...)     = [3gram] 0.211744 [ -0.674189 ]
>         p( we | that ...)       = [3gram] 0.0179052 [ -1.74702 ]
>         p( can't | we ...)      = [4gram] 0.0186763 [ -1.72871 ]
>         p( always | can't ...)  = [4gram] 0.00198593 [ -2.70204 ]
>         p( include | always ...)        = [3gram] 0.000752505 [ -3.12349 ]
>         p( all | include ...)   = [3gram] 0.00575442 [ -2.24 ]
>         p( the | all ...)       = [4gram] 0.314584 [ -0.502263 ]
>         p( items | the ...)     = [4gram] 0.00158827 [ -2.79908 ]
>         p( from | items ...)    = [4gram] 0.0124186 [ -1.90593 ]
>         p( the | from ...)      = [4gram] 0.415841 [ -0.381072 ]
>         p( programme | the ...)         = [3gram] 0.000297532 [ -3.52647 ]
>         p( </s> | programme ...)        = [4gram] 0.288492 [ -0.539866 ]
> 1 sentences, 25 words, 0 OOVs
> 0 zeroprobs, logprob= -55.3437 ppl= 134.463 ppl1= 163.586
>
>
>
> -----------------------------------------------------------------------------------------------------------------------
> WITH the -cache and -cache-lambda options:
> -----------------------------------------------------------------------------------------------------------------------
> <s> this is a podcast of the highlights from today's woman's hour 
> copyright issues mean that we can't always include all the items from 
> the programme </s>
>         p( this | <s> )         = [2gram][cache=0] 0.0139712 [ -1.85477 ]
>         p( is | this ...)       = [3gram][cache=0] 0.34584 [ -0.461124 ]
>         p( a | is ...)  = [4gram][cache=0] 0.154399 [ -0.811355 ]
>         p( podcast | a ...)     = [4gram][cache=0] 6.99453e-06 [ -5.15524 ]
>         p( of | podcast ...)    = [4gram][cache=0] 0.0972579 [ -1.01207 ]
>         p( the | of ...)        = [4gram][cache=0] 0.330028 [ -0.48145 ]
>         p( highlights | the ...)        = [3gram][cache=0] 4.39876e-05 [ -4.35667 ]
>         p( from | highlights ...)       = [4gram][cache=0] 0.0695952 [ -1.15742 ]
>         p( today's | from ...)  = [4gram][cache=0] 0.00711845 [ -2.14761 ]
>         p( woman's | today's ...)       = [2gram][cache=0] 8.70545e-06 [ -5.06021 ]
>         p( hour | woman's ...)  = [3gram][cache=0] 0.197098 [ -0.705318 ]
>         p( copyright | hour ...)        = [1gram][cache=0] 3.2048e-06 [ -5.4942 ]
>         p( issues | copyright ...)      = [2gram][cache=0] 0.0177047 [ -1.75191 ]
>         p( mean | issues ...)   = [2gram][cache=0] 0.000216378 [ -3.66479 ]
>         p( that | mean ...)     = [3gram][cache=0] 0.190569 [ -0.719947 ]
>         p( we | that ...)       = [3gram][cache=0] 0.0161147 [ -1.79278 ]
>         p( can't | we ...)      = [4gram][cache=0] 0.0168087 [ -1.77447 ]
>         p( always | can't ...)  = [4gram][cache=0] 0.00178733 [ -2.74779 ]
>         p( include | always ...)        = [3gram][cache=0] 0.000677254 [ -3.16925 ]
>         p( all | include ...)   = [3gram][cache=0] 0.00517898 [ -2.28576 ]
>         p( the | all ...)       = [4gram][cache=0.05] 0.288126 [ -0.540418 ]
>         p( items | the ...)     = [4gram][cache=0] 0.00142944 [ -2.84483 ]
>         p( from | items ...)    = [4gram][cache=0.0454545] 0.0157222 [ -1.80349 ]
>         p( the | from ...)      = [4gram][cache=0.0869565] 0.382953 [ -0.416855 ]
>         p( programme | the ...)         = [3gram][cache=0] 0.000267779 [ -3.57222 ]
>         p( </s> | programme ...)        = [4gram][cache=0] 0.259643 [ -0.585623 ]
> 1 sentences, 25 words, 0 OOVs
> 0 zeroprobs, logprob= -56.3676 ppl= 147.226 ppl1= 179.764
>
> -----------------------------------------------------------------------------------------------------------------------
>
>
> best regards,
> Zeeshan Khan