limitations in ngram-merge

Andreas Stolcke stolcke at speech.sri.com
Wed Feb 8 23:18:03 PST 2006


In message <159323335F97074D9A594D676652B06754A028 at VS3.hdi.tvcabo> you wrote:
> Hi
> 
> We are currently having a problem merging count files with ngram-merge.
> It seems that the size of the resulting file is limited to 2 GB.
> Can you tell us whether this limitation is due to the program
> or to the configuration of our system? We are running
> ngram-merge on a PIV 2.66 GHz with 1 GB RAM under SuSE 10.0.

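It's probably an OS limitation.  SRILM uses level-2 (stdio) I/O functions
(see fopen(3)), so any file size limit would come from the C library
and OS rather than from SRILM's own code.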
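
One way to check whether a stdio-based build can address files beyond
2 GB is sketched below.  It assumes glibc on Linux, where defining
_FILE_OFFSET_BITS=64 before any system header requests a 64-bit off_t
even on a 32-bit machine.  Whether a given SRILM build sets this macro
is not stated here, and the function names are hypothetical helpers
for illustration only.

```c
/* Must come before any system header: asks glibc for a 64-bit off_t
   even on 32-bit Linux, so fopen() maps to its large-file variant. */
#define _FILE_OFFSET_BITS 64

#include <stdio.h>
#include <sys/types.h>

/* Returns 1 if this build can represent file offsets beyond
   2^31 - 1 bytes (about 2 GB), 0 otherwise. */
int supports_large_files(void)
{
    return sizeof(off_t) >= 8;
}

/* Prints a short report (hypothetical helper, not part of SRILM). */
void report_large_file_support(void)
{
    printf("off_t is %u bits: files over 2 GB %s\n",
           (unsigned)(sizeof(off_t) * 8),
           supports_large_files() ? "can be addressed"
                                  : "cannot be addressed");
}
```

Calling report_large_file_support() from a small test program built
with the same compiler flags as the tool in question should show
whether the 2 GB ceiling comes from the build rather than the kernel
or filesystem.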

We have certainly handled files larger than 2 GB on our Linux machines.
But those files are usually gzipped (ending in .gz).  SRILM
doesn't read or write those directly; the I/O goes through a pipe
to the gzip program.  Maybe you can try using gzipped files
in your case too.
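
The pipe arrangement described above can be sketched roughly as
follows.  This is not SRILM's actual code: open_gzip_writer is a
hypothetical name, and the sketch assumes a POSIX system with gzip on
the PATH.  The point is that the writing program only ever sees a
pipe, while the gzip process is the one that creates the file on disk.

```c
#define _POSIX_C_SOURCE 200809L  /* for popen() and pclose() */

#include <stdio.h>

/* Opens a stream whose output is compressed by an external gzip
   process and stored at `path`.  Close it with pclose(), not fclose().
   Sketch only: `path` is pasted into a shell command, so it must not
   contain single quotes. */
FILE *open_gzip_writer(const char *path)
{
    char cmd[4096];
    int n = snprintf(cmd, sizeof cmd, "gzip -c > '%s'", path);

    if (n < 0 || (size_t)n >= sizeof cmd)
        return NULL;             /* path too long for the buffer */
    return popen(cmd, "w");      /* caller writes plain text here */
}
```

Data written to the returned stream with fputs() or fprintf() ends up
compressed in the named file once pclose() returns.  Since the
compressed file is also typically much smaller than the raw counts, it
is less likely to reach the 2 GB mark in the first place.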

--Andreas 



