disambig with "open vocabulary" LM
Andreas Stolcke
stolcke at speech.sri.com
Tue Jan 28 09:34:06 PST 2003
In message <3E36AB98.3070405 at ira.uka.de> you wrote:
> Hi,
> I would like to use the disambig program with an open-vocabulary LM
> (built with ngram-count and -unk option).
> I get the following error message: "warning: non-zero probability for
> <unk> in closed-vocabulary LM" (the LM read by disambig is not
> recognized as an open-vocabulary LM).
> What is the matter? Are we supposed to use only closed-vocabulary LMs
> with disambig?
> Can anyone help?
> Thanks,
>
> Amélie
>
> PS: is there anywhere I can find an archive of the mailing-list?
>
Amélie,
this is an omission in disambig: it never tells the vocabulary object that
<unk> is to be treated as a regular word, so the LM is not recognized as
open-vocabulary. Please try the following patch:
===================================================================
RCS file: RCS/disambig.cc,v
retrieving revision 1.34
diff -c -r1.34 disambig.cc
*** /tmp/T00M2saV Tue Jan 28 09:30:49 2003
--- disambig.cc Tue Jan 28 09:23:02 2003
***************
*** 709,714 ****
--- 709,715 ----
vocab.toLower = tolower1? true : false;
hiddenVocab.toLower = tolower2 ? true : false;
+ hiddenVocab.unkIsWord = keepUnk ? true : false;
if (mapFile) {
File file(mapFile, "r");
===================================================================
A similar patch belongs in hidden-ngram.cc:
===================================================================
RCS file: RCS/hidden-ngram.cc,v
retrieving revision 1.37
diff -c -r1.37 hidden-ngram.cc
*** /tmp/T0aSC8P_ Tue Jan 28 09:32:03 2003
--- hidden-ngram.cc Tue Jan 28 09:24:59 2003
***************
*** 1007,1012 ****
--- 1007,1013 ----
*/
Vocab vocab;
vocab.toLower = toLower? true : false;
+ vocab.unkIsWord = keepUnk ? true : false;
SubVocab hiddenVocab(vocab);
SubVocab *classVocab = 0;
===================================================================
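To make the effect of the one-line change concrete, here is a toy sketch of the idea, not SRILM code. It assumes a vocabulary object carries a boolean that says whether <unk> counts as an ordinary word, and that a reader complains when it sees a probability for <unk> while that boolean is false. The names ToyVocab, makeVocab, warnsOnEntry and selfCheck are hypothetical and exist only for this illustration; only the unkIsWord and keepUnk names are taken from the patches above.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a vocabulary object; NOT SRILM's Vocab class.
struct ToyVocab {
    bool unkIsWord = false;  // true: "<unk>" is an ordinary word (open vocabulary)
};

// Mirrors the shape of the patch: copy the command-line switch into the flag.
inline ToyVocab makeVocab(int keepUnk) {
    ToyVocab v;
    v.unkIsWord = keepUnk ? true : false;
    return v;
}

// Assumed behavior of this toy reader: warn when "<unk>" carries a
// probability but the vocabulary has not been marked as open.
inline bool warnsOnEntry(const ToyVocab &vocab, const std::string &word,
                         bool hasProb) {
    return word == "<unk>" && hasProb && !vocab.unkIsWord;
}

// Small usage example exercising both settings of the switch.
inline void selfCheck() {
    assert(warnsOnEntry(makeVocab(0), "<unk>", true));   // flag unset: warning
    assert(!warnsOnEntry(makeVocab(1), "<unk>", true));  // flag set: no warning
    assert(!warnsOnEntry(makeVocab(0), "hello", true));  // ordinary word: no warning
}
```

In this toy model the warning disappears once the switch is copied into the flag, which is the behavior the patches are meant to restore in the real programs.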
As to the mailing list archives: send a message to majordomo at speech.sri.com
with "help" in the body. You will receive instructions on how to retrieve
the archives of this mailing list. (Unfortunately there is no web interface.)
--Andreas