[SRILM User List] rescoring with fngram and separate probabilities
Andreas Stolcke
stolcke at icsi.berkeley.edu
Fri Jun 22 20:17:40 PDT 2012
On 6/22/2012 5:50 PM, Gregor Donaj wrote:
> Hi,
>
> I have two questions about the fngram tool. I used it to re-score
> n-best lists of factored sentences. I took a look at the man pages,
> but I couldn't find my answers.
> 1)
> After taking a close look at the probabilities, I realized that the
> scores already seem to be weighted by a factor of 8. Is there any option
> to change this factor? How about the ngram tool?
I would not use fngram for nbest rescoring, lack of documentation being
one problem. Also, this program has not been updated in a while.
The better approach is to use fngram-count to train FLMs, but then use
ngram -factored to apply the LM to data. So you would use ngram
-factored -nbest or -nbest-files or -rescore (see ngram(1) man page).
> 2)
> Can fngram also output the original language score? I mean not
> just replace the original language score with the re-scored
> probability, but write both to the output file?
Well, you can always save the original nbest lists and use their LM scores
as an additional input to the score combination.
Using more than the standard three scores (AM, LM, and word count)
requires extra work, some of which is supported by the wrapper scripts
described in the nbest-scripts(5) man page.
The typical way to do this would be:
1) Use the rescore-decipher wrapper script with the -lm-only option (in
addition to -factored -lm ...) to produce score files that contain only
the FLM scores.
2) Use nbest-optimize (on a held-out tuning set) to determine the
optimal score weightings (see man page).
3) Use rescore-reweight to combine all scores and output new 1-best hyps.
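To make the idea of combining more than the standard three scores concrete, here is a minimal Python sketch of a weighted (log-linear) combination over an nbest list. It is not SRILM code and does not reproduce what rescore-reweight does internally; the Hyp class, the field names, and the weights lmw, wtw and flmw are hypothetical names chosen for illustration, and the scores are assumed to be log scores where larger is better.

```python
from dataclasses import dataclass


@dataclass
class Hyp:
    """One nbest hypothesis with its per-knowledge-source log scores (illustrative)."""
    words: list   # hypothesis words
    am: float     # acoustic model score
    lm: float     # original LM score, kept from the saved nbest list
    flm: float    # additional FLM score from the rescoring pass


def combined_score(h, lmw, wtw, flmw):
    # Weighted sum: AM score, weighted original LM score,
    # word-count term, and the weighted extra FLM score.
    return h.am + lmw * h.lm + wtw * len(h.words) + flmw * h.flm


def best_hyp(nbest, lmw, wtw, flmw):
    # The new 1-best is the hypothesis with the highest combined score.
    return max(nbest, key=lambda h: combined_score(h, lmw, wtw, flmw))


# Made-up scores for two competing hypotheses.
nbest = [
    Hyp(words=["a", "b"], am=-100.0, lm=-10.0, flm=-12.0),
    Hyp(words=["a", "c"], am=-101.0, lm=-9.0, flm=-8.0),
]
winner = best_hyp(nbest, lmw=8.0, wtw=0.0, flmw=4.0)
```

In practice the weights would not be set by hand as in this sketch but tuned on held-out data, which is the role of step 2 above.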
Andreas