Problems about srilm

Andreas Stolcke stolcke at speech.sri.com
Thu Apr 19 16:09:37 PDT 2007


洪大弘 wrote:
> Hello!
> I am a student from Taiwan.
> I ran into some difficulties using srilm. The problem is shown in the
> attachment. I encountered the same problem when I built google n-gram
> models. Could you please tell me what mistake I made? Thank you!
>   
It is impossible to read the entire Google 5-gram corpus into memory,
which is what you are trying to do.
You have to use the count-based LM and estimate deleted-interpolation
weights from a small amount of data, so that only a small portion of
the n-grams needs to be kept in memory.
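
To illustrate the idea of estimating interpolation weights from a small
held-out set, here is a minimal Python sketch. It is not SRILM code and
does not reflect how SRILM implements its count-based LM; the toy
corpus, the function names (build_counts, component_probs,
estimate_weights) and the choice of three components (uniform, unigram,
bigram) are assumptions made purely for illustration. The weights are
fitted with a standard EM loop on the held-out text.

```python
from collections import Counter


def build_counts(tokens):
    """Collect unigram, bigram and bigram-context counts from training text."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    contexts = Counter(tokens[:-1])  # how often each word starts a bigram
    return unigrams, bigrams, contexts


def component_probs(prev, word, unigrams, bigrams, contexts, vocab_size):
    """Return (uniform, unigram, bigram) relative-frequency estimates."""
    total = sum(unigrams.values())
    p_uniform = 1.0 / vocab_size
    p_unigram = unigrams[word] / total
    p_bigram = bigrams[(prev, word)] / contexts[prev] if contexts[prev] else 0.0
    return p_uniform, p_unigram, p_bigram


def estimate_weights(heldout, unigrams, bigrams, contexts, vocab_size,
                     iterations=20):
    """Fit interpolation weights by EM on held-out data (hypothetical helper)."""
    pairs = list(zip(heldout, heldout[1:]))
    weights = [1.0 / 3] * 3
    for _ in range(iterations):
        expected = [0.0] * 3
        for prev, word in pairs:
            probs = component_probs(prev, word, unigrams, bigrams,
                                    contexts, vocab_size)
            mix = sum(w * p for w, p in zip(weights, probs))
            # E-step: share of this event attributed to each component
            for i in range(3):
                expected[i] += weights[i] * probs[i] / mix
        # M-step: renormalise the expected shares into new weights
        norm = sum(expected)
        weights = [e / norm for e in expected]
    return weights


# Toy data, invented for this example.
train = "the cat sat on the mat the cat ate the fish".split()
heldout = "the cat sat on the fish".split()

unigrams, bigrams, contexts = build_counts(train)
vocab_size = len(set(train) | set(heldout))
weights = estimate_weights(heldout, unigrams, bigrams, contexts, vocab_size)
print("uniform, unigram, bigram weights:", weights)
```

In this toy all counts sit in memory. The point of the approach
described above is that the weights are fitted only on the small
held-out text, so only the n-grams that occur in that text have to be
looked up.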

I'm sorry there is no good documentation of this process at this point.
You can piece it together by reading the manual pages for ngram-count
and ngram, and by looking at the example in

$SRILM/test/tests/ngram-count-lm-limit-vocab/run-test

We will make complete instructions for google ngram usage available in
the future.

Andreas


> --
> Chaoyang University of Technology
> WebMail http://webmail.cyut.edu.tw

More information about the SRILM-User mailing list