[SRILM User List] ngram-count large count file

王秋锋 wqfengnlpr at gmail.com
Tue Dec 15 07:59:07 PST 2009


Dear SRILM users,
  I want to build a bigram model from a file of word-pair counts,
so I ran:
  ngram-count -read CountFile -lm -BiGrams -order 2
But after several minutes the process was killed.

I suspect my CountFile is too large (3.5 GB), since my machine has only 2.0 GB of memory.
If the whole CountFile is read into memory, it will run out of memory.

So my questions are:
1: Does SRILM read the whole CountFile into memory,
 or does it read some lines, train on them, and then repeat with the next lines?

2: How can I build the bigram model from such a large CountFile?
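
To make question 1 concrete, here is a rough Python sketch of the batch-by-batch reading I have in mind. It is only an illustration of the idea, not how SRILM works internally. The function name read_count_batches, the sample data, and the line format (the words followed by a count, separated by whitespace) are my own assumptions.

```python
import io
from itertools import islice


def read_count_batches(handle, batch_size):
    """Yield lists of (words, count), reading at most batch_size lines at a time.

    Assumes each non-empty line holds the n-gram words followed by an
    integer count, separated by whitespace (an assumption for this sketch).
    """
    while True:
        # Pull only the next batch_size lines; the rest of the file stays on disk.
        lines = list(islice(handle, batch_size))
        if not lines:
            return
        batch = []
        for line in lines:
            line = line.strip()
            if not line:
                continue
            *words, count = line.split()
            batch.append((tuple(words), int(count)))
        yield batch


# Hypothetical three-line count file, processed two lines at a time.
sample = io.StringIO("a b\t3\nb c\t2\na c\t1\n")
total = 0
for batch in read_count_batches(sample, batch_size=2):
    total += sum(count for _, count in batch)
```

With this kind of loop only one batch is held in memory at a time, which is why I am asking whether ngram-count behaves like this or loads everything at once.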

Thanks,
   Wang