[SRILM User List] Predicting words
    Andreas Stolcke 
    stolcke at icsi.berkeley.edu
       
    Wed Aug  8 22:09:35 PDT 2012
    
    
  
On 7/20/2012 5:04 AM, Nouf Al-Harbi wrote:
> Hello,
>
> I am new to language modeling and was hoping that someone can help me 
> with the following.
>
> I am trying to predict a word given an input sentence. For example, I would 
> like to get the word with the highest probability to replace the ... in 
> sentences such as 'A man is ...' (e.g. sitting).
>
> I tried to use the disambig tool, but I couldn't find any example illustrating 
> how to use it, especially how to create the map file and what type of 
> file it is (e.g. txt, arpa, ...).
Indeed, you can use disambig to solve this problem, at least in theory.
1. Prepare a plain-text map file, one mapping per line, of the form:
     a       a
     man     man
     ...   [for all words occurring in your data]
     UNKNOWN_WORD  word1 word2  ....  [list all words in the vocabulary here]
2. Train an LM on word sequences.
3. Prepare disambig input of the form
                 a man is sitting UNKNOWN_WORD
    You can also add known words to the right of UNKNOWN_WORD if you have 
that information (see the note about -fw-only below).
4. Run disambig:
             disambig -map MAPFILE -lm LMFILE -text INPUTFILE
If you want to use only the left context of UNKNOWN_WORD, use the 
-fw-only option.
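For concreteness, steps 1 to 4 might be scripted roughly as follows. This is an untested sketch, not a verified recipe: the file names (train.txt, vocab.txt, words.map, words.lm, input.txt) and the three-sentence toy corpus are made up for illustration, and ngram-count is assumed to be the SRILM tool used to train the LM. -order 2 is passed to both tools so that the LM order used for training and decoding match. The SRILM commands are skipped if the tools are not on your PATH.

```shell
#!/bin/sh
# Sketch only: file names and the toy corpus are illustrative assumptions.
set -eu

# Toy training corpus, one sentence per line (replace with your own data).
cat > train.txt <<'EOF'
a man is sitting
a man is walking
a dog is sitting
EOF

# Collect the vocabulary: one distinct word per line, sorted.
tr -s ' \t' '\n\n' < train.txt | grep -v '^$' | LC_ALL=C sort -u > vocab.txt

# Step 1: map file. Every known word maps to itself ...
awk '{ print $1 "\t" $1 }' vocab.txt > words.map
# ... and UNKNOWN_WORD maps to every word in the vocabulary.
printf 'UNKNOWN_WORD' >> words.map
awk '{ printf " %s", $1 } END { printf "\n" }' vocab.txt >> words.map

# Step 3: disambig input, with UNKNOWN_WORD in the slot to be predicted.
echo 'a man is UNKNOWN_WORD' > input.txt

# Steps 2 and 4: train the LM and run disambig, if SRILM is installed.
if command -v ngram-count >/dev/null 2>&1 && command -v disambig >/dev/null 2>&1; then
    ngram-count -order 2 -text train.txt -lm words.lm
    disambig -order 2 -map words.map -lm words.lm -text input.txt -fw-only
else
    echo "SRILM tools not found on PATH; skipping steps 2 and 4" >&2
fi
```

With a realistic vocabulary the UNKNOWN_WORD line becomes very long, since it lists every word as a candidate.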
That is the theory.  If your vocabulary is large, it may be very slow and 
take too much memory.  I haven't tried it, so let me know if it works 
for you.
Andreas