I'd like to use SRILM to generate bigram counts from the British National
Corpus in XML format. The paper
"SRILM - An Extensible Language Modeling Toolkit", in Proc. Intl. Conf.
Spoken Language Processing, Denver, Colorado, September 2002,
mentions that support for SGML-tagged formats is regarded as desirable. Has
this support since been implemented in the toolkit?

