parallel ngram-count

Andreas Stolcke stolcke at speech.sri.com
Thu Nov 1 14:01:47 PDT 2007


In message <E9F47D5E-8138-49AA-9A7A-1101C276347F at gmail.com> you wrote:
> I see one quick way to parallelize ngram-count on an N-core box:
> 
> -- split file list into N sublists
> -- launch N ngram-count instances, giving each its own sublist
> -- merge counts
> 
> Is there any better way?

That's what I would do.  Make sure you are not I/O-bound when running
many ngram-count instances in parallel, and watch memory usage, since
each instance holds its own counts in memory.

Andreas 
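
[Not part of the original thread.] The split / count / merge steps
discussed above could be sketched roughly as follows. This is only an
illustration under stated assumptions: ngram-count and ngram-merge are on
the PATH, every input file ends with a newline, and the helper names
(split_file_list, count_command, merge_command, count_sublist,
parallel_count) and the output file names are made up for this example.
The round-robin split balances the number of files, not their sizes.

```python
import shutil
import subprocess
from concurrent.futures import ThreadPoolExecutor


def split_file_list(files, n):
    """Deal files round-robin into at most n non-empty sublists."""
    return [part for part in (files[i::n] for i in range(n)) if part]


def count_command(order, counts_path):
    # "-text -" reads the corpus from stdin; "-sort" writes counts in the
    # sorted order that ngram-merge expects.
    return ["ngram-count", "-order", str(order), "-text", "-",
            "-sort", "-write", counts_path]


def merge_command(count_paths, merged_path):
    # ngram-merge takes the sorted count files as positional arguments.
    return ["ngram-merge", "-write", merged_path, *count_paths]


def count_sublist(files, order, counts_path):
    """Stream one sublist of text files into a single ngram-count process."""
    proc = subprocess.Popen(count_command(order, counts_path),
                            stdin=subprocess.PIPE)
    with proc.stdin:
        for path in files:
            # Assumption: each file ends with a newline, so sentences do
            # not run together across file boundaries.
            with open(path, "rb") as src:
                shutil.copyfileobj(src, proc.stdin)
    if proc.wait() != 0:
        raise RuntimeError(f"ngram-count failed for {counts_path}")
    return counts_path


def parallel_count(files, n, order=3, prefix="part", merged="merged.counts"):
    """Run up to n ngram-count instances at once, then merge their counts."""
    parts = split_file_list(files, n)
    if not parts:
        raise ValueError("no input files")
    # Threads are enough here: the real work happens in the child processes.
    with ThreadPoolExecutor(max_workers=len(parts)) as pool:
        jobs = [pool.submit(count_sublist, part, order, f"{prefix}{i}.counts")
                for i, part in enumerate(parts)]
        count_paths = [job.result() for job in jobs]
    subprocess.run(merge_command(count_paths, merged), check=True)
    return merged
```

A call such as parallel_count(list_of_paths, 4) would then leave the
merged counts in merged.counts. Choosing n below the core count may be
sensible if the disks or memory, rather than the CPUs, turn out to be the
bottleneck. SRILM also ships make-batch-counts and merge-batch-counts
scripts that cover a similar workflow.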



