[SRILM User List] unknown Word found in LM
tonyr at cantabresearch.com
Fri Jun 6 04:47:19 PDT 2014
On 06/06/2014 12:41 PM, Thipe Modipa wrote:
> I am decoding utterances with a dictionary containing about 800 unique
> words with a language model containing about 63K unique words and I
> get the following warning:
> WARNING [-9999] ReadARPAunigram: unknown Word 'h' found in LM -- ignored in HDecode
> Will this warning have a negative impact on the word recognition
> accuracy, or what is the general effect?
This is far more likely to be an HTK problem than an SRILM problem.
You need to check whether you really do have a word 'h' in your
language model and whether it has an associated entry in your
pronunciation dictionary. Chances are that it is in the language model
but not in the pronunciation dictionary, so HDecode is simply ignoring
it. This won't have a big impact; you probably didn't want to
recognise the word anyway.
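
One way to run that check is to list every unigram in the LM that has no
entry in the pronunciation dictionary. The following is only a rough
sketch, not part of HTK or SRILM. It assumes the LM is a plain-text ARPA
file with a "\1-grams:" section, and that each dictionary line starts
with the word followed by whitespace (HTK quoting and escaping are not
handled). The function names are made up for illustration.

```python
def read_arpa_unigrams(lines):
    """Collect the words listed in the \\1-grams: section of an ARPA LM."""
    words = set()
    in_unigrams = False
    for line in lines:
        line = line.strip()
        if line == "\\1-grams:":
            in_unigrams = True
            continue
        if in_unigrams:
            # The next section header (e.g. \2-grams: or \end\) ends the unigrams.
            if line.startswith("\\"):
                break
            fields = line.split()
            # Expected layout: logprob, word, optional backoff weight.
            if len(fields) >= 2:
                words.add(fields[1])
    return words


def read_dict_words(lines):
    """Collect head words, assuming the word is the first field on each line."""
    words = set()
    for line in lines:
        fields = line.split()
        if fields:
            words.add(fields[0])
    return words


def words_missing_from_dict(lm_lines, dict_lines):
    """Return LM unigrams that have no dictionary entry, sorted."""
    return sorted(read_arpa_unigrams(lm_lines) - read_dict_words(dict_lines))


if __name__ == "__main__":
    # Tiny made-up LM and dictionary, just to show the call.
    lm = [
        "\\data\\", "ngram 1=3", "",
        "\\1-grams:",
        "-1.0\th\t-0.3",
        "-0.5\thello\t-0.2",
        "-0.7\tworld",
        "",
        "\\end\\",
    ]
    dictionary = ["hello  hh ax l ow", "world  w er l d"]
    print(words_missing_from_dict(lm, dictionary))
```

With real files you would pass open file objects instead of the lists,
for example open("lm.arpa") and open("dict.txt") (both names are
placeholders). With a 63K-word LM and an 800-word dictionary, expect a
long list of missing words, and tokens such as <s> and </s> will also be
reported if the dictionary has no entries for them.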
** Cantab is hiring: www.cantabResearch.com/openings **
Dr A J Robinson, Founder, Cantab Research Ltd
Phone direct: 01223 778240 office: 01223 794497
Company reg no GB 05697423, VAT reg no 925606030
51 Canterbury Street, Cambridge, CB4 3QG, UK