cache-based models

Wed Mar 7 20:28:32 PST 2007

Hi there,
Thank you for your answers,
I would like to compare several language models, including the cache
model defined in the Kuhn and De Mori paper.
I was using the CMU SLM toolkit, and recently moved to SRILM because
of the richness of the implemented algorithms. The only obstacle I
found is the sparse documentation of the project.

I can infer from your answers that to use the cache model, I can either:
1- Use the subclass CacheLM through the programming API.
2- Use the option -cache with the ngram command.
I still prefer to master the existing commands before using any API.
So now, suppose I want to use ngram -cache 10
and I would also like to define word classes.
The PDF paper says that "Word classes may be defined manually". I
would like to know how to do that, and how to pass the classes file to
ngram.

Finally, I have a comment for the maintainers of this wonderful
project. Why don't you provide a tutorial on using SRILM? This could help
many newcomers, given that the documentation is not complete.

Thanks
Looking forward to hearing from you
regards
Hani

On 3/7/07, Andreas Stolcke <stolcke at speech.sri.com> wrote:
>
> In message <3BE78265-2376-4D96-8AB4-547D82E15E92 at gmail.com> you wrote:
> > Hi Hani,
> >
> > if I'm correctly interpreting your question, the LM subclass CacheLM
> > provides a simple cache component implementation.
> >
> > Word probability is boosted if the very same word occurred in a window
> > of the last N words (more occurrences yield higher probability). You
> > get ngram to interpolate whatever model you're using with a cache
> > component using -cache. The source code of this one is very
> > straightforward if you're interested in the details.
> >
> > If you're looking for the original papers, Kuhn and De Mori published
> > on this in 1990 (to my knowledge at least).
> >
> > Hope this helps.
> >
> > Cheers from Aachen,
> >
> > Juri
>
> Thanks for this dead-on response!
>
> At risk of stating the obvious, the code for CacheLM is in
> $SRILM/lm/src/CacheLM.cc, and is quite short and easy to follow.
>
> Best,
>
> Andreas
>
> >
> > On 8. Mar, 2007, at 01:17, Hani Safadi wrote:
> >
> > > Hi,
> > > I would like to get more information on the cache-based models
> > > implemented in SRILM. and how to use them.
> > > The paper briefly mentions them, and there is no information in the
> > > man pages.
> > > Thanks
> > > --
> > > Looking forward to hearing from you.
> > > Best wishes,
> > > Hani Safadi
> >
>
>

-- 
Looking forward to hearing from you.
Best wishes,
Hani Safadi
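
For readers following the thread, here is a minimal sketch of the cache
component as Juri describes it in the quoted reply: a word's probability
is boosted according to how often it occurred in the last N words, and
that cache estimate is interpolated with the base model. This is not the
SRILM code (see $SRILM/lm/src/CacheLM.cc for that). The names
UnigramCache, interpolated_prob and the weight lam are hypothetical,
chosen only for illustration, and the base model probability is assumed
to be supplied by the caller.

```python
from collections import Counter, deque


class UnigramCache:
    """Sliding window over the last `size` words, with per-word counts."""

    def __init__(self, size):
        self.size = size
        self.window = deque()
        self.counts = Counter()

    def add(self, word):
        # Record the new word, then evict the oldest one if the window is full.
        self.window.append(word)
        self.counts[word] += 1
        if len(self.window) > self.size:
            old = self.window.popleft()
            self.counts[old] -= 1
            if self.counts[old] == 0:
                del self.counts[old]

    def prob(self, word):
        # Relative frequency of `word` in the window; 0.0 for an empty cache.
        if not self.window:
            return 0.0
        return self.counts[word] / len(self.window)


def interpolated_prob(word, base_prob, cache, lam=0.1):
    # Linear interpolation of the base model with the cache estimate.
    # `lam` is an assumed weight, not a value taken from SRILM.
    return (1.0 - lam) * base_prob + lam * cache.prob(word)
```

With a window of 10 words (as in ngram -cache 10), a word seen twice in
the window gets a cache estimate of 2 divided by the number of words
currently held, so repeated words score higher than under the base model
alone, while unseen words are scaled down by (1 - lam).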