Class n-grams

Andreas Stolcke stolcke at
Thu Jul 3 22:17:59 PDT 2008

Basiou Nikoletta wrote:
> Dear Andreas,
> thanks a lot for your answer. Actually, I want to build the classes 
> from trigram statistics/counts. Is there any provision for such an 
> implementation in the near future, or are there restrictions due to 
> higher memory and processing requirements?
It would take a lot longer and is currently not implemented.
I vaguely recall a paper by Hermann Ney and colleagues many years ago
showing that inducing classes based on higher-order statistics doesn't
buy that much (i.e., it is sufficient to learn the classes using bigram
statistics, and then use them in higher-order class-based models).
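
To make the second half of that concrete, here is a minimal Python sketch of
applying an existing word-to-class mapping to text and collecting class
trigram counts. It is only an illustration, not how SRILM does it internally:
the mapping, the toy corpus, and the function names (to_classes, count_ngrams)
are all made up for the example. With SRILM itself you would normally use the
replace-words-with-classes script and then run ngram-count with -order 3 on
the result.

```python
from collections import Counter

# Hypothetical word-to-class map; in practice this would come from the
# classes learned on bigram statistics.
word2class = {
    "monday": "DAY", "tuesday": "DAY",
    "paris": "CITY", "london": "CITY",
}

def to_classes(sentence, mapping):
    """Replace each word that has a class with its class label."""
    return [mapping.get(w, w) for w in sentence.split()]

def count_ngrams(sentences, mapping, order=3):
    """Count n-grams of the given order over class-mapped sentences."""
    counts = Counter()
    for s in sentences:
        toks = ["<s>"] + to_classes(s, mapping) + ["</s>"]
        for i in range(len(toks) - order + 1):
            counts[tuple(toks[i:i + order])] += 1
    return counts

# Toy corpus (assumed data, for illustration only).
corpus = ["fly to paris monday", "fly to london tuesday"]
trigrams = count_ngrams(corpus, word2class, order=3)
```

Both toy sentences collapse to the same class sequence, so the class
trigram counts pool evidence that word trigram counts would keep apart.
Nothing in the counting step depends on the order used to learn the classes.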


> Looking forward to your answer,
> Nikoletta
>     ------------------------------------------------------------------------
>     *From:* Andreas Stolcke [mailto:stolcke at]
>     *To:* Nikoletta Bassiou [mailto:nbassiou at]
>     *Cc:* srilm-user at
>     *Sent:* Tue, 01 Jul 2008 19:25:22 +0300
>     *Subject:* Re: Class n-grams
>     Nikoletta Bassiou wrote:
>     > I would like to build a class trigram using ngram-class, but
>     > according to the documentation only class bigrams are implemented.
>     > If this is true, do you know any other way I can build a class
>     > trigram? Is there any provision for extending ngram-class to
>     > higher-order n-grams (n>3)?
>     >
>     > Nikoletta
>     The bigram restriction only applies to the statistics used to
>     learn the word classes. Once you have the classes you can apply
>     them to your text and build an n-gram model of any order.
>     Andreas
