[SRILM User List] unknown Word found in LM

Tony Robinson tonyr at cantabresearch.com
Fri Jun 6 04:47:19 PDT 2014

On 06/06/2014 12:41 PM, Thipe Modipa wrote:
> Hi,
> I am decoding utterances with a dictionary containing about 800 unique 
> words with a language model containing about 63K unique words and I 
> get the following warning:
> WARNING [-9999]  ReadARPAunigram: unknown Word 'h' found in LM -- 
> ignored  in HDecode
> Will this warning have a negative impact on the word recognition 
> accuracy, or what is the general effect?
> Thanks
> Thipe

This is far more likely to be an HTK problem than an SRILM problem.

You need to check whether you really do have a word 'h' in your 
language model and whether it has an associated entry in your 
pronunciation dictionary.   Chances are that it is in the language model 
but not in the pronunciation dictionary, so HDecode is simply ignoring 
it.    This won't have a big impact; you probably didn't want to 
recognise the word anyway.
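
If it helps, the check can be scripted.  The following is only a rough
sketch in Python, not anything that ships with HTK or SRILM.  The
function names and the sample data are made up for illustration.  It
assumes the LM is a plain-text ARPA file with a \1-grams: section, and
that each dictionary line starts with the word followed by whitespace,
with no HTK quoting or backslash escaping of the word.

```python
def arpa_unigrams(lines):
    """Return the set of words listed in the \\1-grams: section of an ARPA LM."""
    words = set()
    in_unigrams = False
    for line in lines:
        line = line.strip()
        if line == "\\1-grams:":
            in_unigrams = True
            continue
        if in_unigrams:
            # The next section header (e.g. \2-grams: or \end\) ends the unigrams.
            if line.startswith("\\"):
                break
            fields = line.split()
            # Unigram lines look like: logprob word [backoff]
            if len(fields) >= 2:
                words.add(fields[1])
    return words


def dict_words(lines):
    """Return the set of head words (first token per line) of a dictionary.

    Assumption: words are not quoted or escaped.
    """
    words = set()
    for line in lines:
        fields = line.split()
        if fields:
            words.add(fields[0])
    return words


def missing_from_dict(lm_lines, dict_lines):
    """Return the sorted LM unigrams that have no dictionary entry."""
    return sorted(arpa_unigrams(lm_lines) - dict_words(dict_lines))


# Tiny made-up example; real input would be the lines of the two files.
SAMPLE_LM = [
    "\\data\\",
    "ngram 1=4",
    "ngram 2=1",
    "",
    "\\1-grams:",
    "-1.0\t<s>\t-0.3",
    "-1.2\th\t-0.2",
    "-0.9\thello",
    "-1.5\t</s>",
    "",
    "\\2-grams:",
    "-0.5\t<s> hello",
    "",
    "\\end\\",
]
SAMPLE_DICT = [
    "<s> sil",
    "</s> sil",
    "hello hh ah l ow",
]

print(missing_from_dict(SAMPLE_LM, SAMPLE_DICT))  # prints ['h']
```

For real files, pass the open file objects instead of the sample
lists, e.g. missing_from_dict(open(lm_path), open(dict_path)), where
lm_path and dict_path are whatever your own file names are.  Every
word it reports is one for which you should expect this warning.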


** Cantab is hiring: www.cantabResearch.com/openings **
Dr A J Robinson, Founder, Cantab Research Ltd
Phone direct: 01223 778240 office: 01223 794497
Company reg no GB 05697423, VAT reg no 925606030
51 Canterbury Street, Cambridge, CB4 3QG, UK
