[SRILM User List] Predicting words
Andreas Stolcke
stolcke at icsi.berkeley.edu
Wed Aug 8 22:09:35 PDT 2012
On 7/20/2012 5:04 AM, Nouf Al-Harbi wrote:
> Hello,
>
> I am new to language modeling and was hoping that someone can help me
> with the following.
>
> I try to predict a word given an input sentence. For example, I would
> like to get a word replacing the ... that has the
> highest probability in sentences such as ' A man is ...' (e.g. sitting).
>
> I tried to use the disambig tool, but I couldn't find any example that
> illustrates how to use it, especially how to create the map file and
> what type of file it is (e.g. txt, arpa, ...).
Indeed, you can use disambig, at least in theory, to solve this problem.
1. prepare a map file of the form:
a a
man man
... [for all words occurring in your data]
UNKNOWN_WORD word1 word2 .... [list all words in the vocabulary here]
2. train an LM of word sequences.
3. prepare disambig input of the form
a man is sitting UNKNOWN_WORD
You can also add known words to the right of UNKNOWN_WORD if you have
that information (see the note about -fw-only below).
4. run disambig
disambig -map MAPFILE -lm LMFILE -text INPUTFILE
If you want to use only the left context of UNKNOWN_WORD, use the
-fw-only option.
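As a rough illustration of steps 1, 3 and 4, here is a minimal Python
sketch that writes the map file and the input file and prints the
disambig command line. The helper names (build_map_lines,
disambig_command) and the toy vocabulary are made up for this example;
the file names MAPFILE, LMFILE and INPUTFILE are the placeholders used
above, and only the options already mentioned (-map, -lm, -text,
-fw-only) appear. It does not train the LM or run disambig itself.

```python
import shlex

UNKNOWN = "UNKNOWN_WORD"  # placeholder token, as in the steps above


def build_map_lines(vocab, unknown=UNKNOWN):
    """Return map-file lines: one identity line per word, plus one line
    that maps the placeholder to every word in the vocabulary."""
    words = sorted(set(vocab))
    lines = [f"{w} {w}" for w in words]
    lines.append(" ".join([unknown] + words))
    return lines


def disambig_command(map_path, lm_path, text_path, left_context_only=False):
    """Build the disambig argument list from the options named above."""
    cmd = ["disambig", "-map", map_path, "-lm", lm_path, "-text", text_path]
    if left_context_only:
        cmd.append("-fw-only")  # use only the words left of the placeholder
    return cmd


if __name__ == "__main__":
    # Toy vocabulary (assumption); in practice, collect it from your data.
    vocab = "a man is sitting standing".split()

    with open("MAPFILE", "w") as f:
        f.write("\n".join(build_map_lines(vocab)) + "\n")

    with open("INPUTFILE", "w") as f:
        f.write(f"a man is sitting {UNKNOWN}\n")

    # LMFILE must already exist (step 2); this only prints the command.
    print(shlex.join(disambig_command("MAPFILE", "LMFILE", "INPUTFILE", True)))
```

The last map line grows with the vocabulary, so every vocabulary word
becomes a candidate at the UNKNOWN_WORD position.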
That is the theory. If your vocabulary is large, it may be very slow and
take too much memory, since every vocabulary word is a candidate for
UNKNOWN_WORD. I haven't tried it, so let me know if it works for you.
Andreas