Class n-grams

> Thanks a lot for your answer. Actually, I want to build the classes
> from trigram statistics/counts. Is there any provision for such an
> implementation in the near future, or are there restrictions due to
> higher memory and processing requirements?
It would take a lot longer and is currently not implemented.
I vaguely recall a paper by Hermann Ney and colleagues many years ago
showing that inducing classes based on higher-order statistics doesn't
buy that much (i.e., it is sufficient to learn the classes using bigram
statistics and then use them in higher-order class-based models).
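
To illustrate the second half of that recipe, here is a minimal sketch in Python of applying an existing word-to-class mapping to text and then collecting class n-gram counts of an arbitrary order. It is not the toolkit's implementation: the mapping `word2class`, the class labels, the `<unk>` fallback, and the helper names `map_to_classes` and `count_ngrams` are all made up for the example, and in practice the mapping would come from the bigram-based class induction step.

```python
from collections import Counter

def map_to_classes(tokens, word2class, unk="<unk>"):
    """Replace each word by its class label; unknown words map to `unk`."""
    return [word2class.get(w, unk) for w in tokens]

def count_ngrams(tokens, order):
    """Count all n-grams of the given order in one token sequence."""
    return Counter(
        tuple(tokens[i:i + order]) for i in range(len(tokens) - order + 1)
    )

# Hypothetical classes; assume they were induced from bigram statistics.
word2class = {
    "the": "C1", "a": "C1",
    "cat": "C2", "dog": "C2",
    "sat": "C3", "ran": "C3",
}

sentences = [["the", "cat", "sat"], ["a", "dog", "ran"]]

# The order used here (3) is independent of the order used to learn classes.
class_trigram_counts = Counter()
for sentence in sentences:
    class_tokens = map_to_classes(sentence, word2class)
    class_trigram_counts.update(count_ngrams(class_tokens, 3))

print(class_trigram_counts)
```

Both sentences collapse to the same class sequence, so the single class trigram is counted twice. Changing the `3` to any other order needs no change to the classes themselves.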


>     The bigram restriction only applies to the statistics used to
>     learn the word classes. Once you have the classes you can apply
>     them to your text and build an n-gram of any order.
