Modifying ngram.cc

lambert mathias lambert at jhu.edu
Fri Jul 8 08:36:50 PDT 2005


Hi,

This is kind of a C++ question.
I wrote the following copy constructor in Ngram.h:

Ngram(const Ngram &ng)
    : LM(ng.vocab), contexts(ng.contexts), order(ng.order),
      _skipOOVs(ng._skipOOVs), _trustTotals(ng._trustTotals) {}

I have the following declarations:

Ngram ngramLM(vocab*,order);

while (/* some condition */) {
    Ngram useLM(ngramLM);

    // Do some stuff with useLM
    .........
    ........
}

The problem is that constructing useLM from ngramLM doesn't give me an
independent copy of the original ngramLM. Any changes I make to useLM show
up in ngramLM too.
I just want to make a copy (useLM) of the original ngramLM, work on that
copy, and then on the next iteration initialize a fresh useLM from the
original ngramLM.

Any thoughts?


Lambert
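
The symptom described above (changes to the copy showing up in the original) is what a member-wise copy typically produces when a member holds its data through a pointer, handle, or reference. Whether that is what happens inside Ngram is an assumption here, not something the post confirms. The sketch below uses two hypothetical classes, SharedCounts and OwnedCounts, which are not part of SRILM, to contrast a shallow copy with a deep copy:

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-ins for a model whose data lives behind a pointer.
// These are illustrative only and are NOT the SRILM Ngram class.
typedef std::map<std::string, int> CountMap;

// Shallow copy: the compiler-generated copy constructor copies the
// shared_ptr, so the original and the copy point at the same map.
class SharedCounts {
public:
    SharedCounts() : counts_(std::make_shared<CountMap>()) {}
    void add(const std::string &w) { ++(*counts_)[w]; }
    int get(const std::string &w) const {
        CountMap::const_iterator it = counts_->find(w);
        return it == counts_->end() ? 0 : it->second;
    }
private:
    std::shared_ptr<CountMap> counts_;
};

// Deep copy: the copy constructor allocates a new map and copies the
// contents, so later changes to the copy do not reach the original.
class OwnedCounts {
public:
    OwnedCounts() : counts_(new CountMap()) {}
    OwnedCounts(const OwnedCounts &other)
        : counts_(new CountMap(*other.counts_)) {}
    OwnedCounts &operator=(const OwnedCounts &other) {
        if (this != &other) *counts_ = *other.counts_;
        return *this;
    }
    void add(const std::string &w) { ++(*counts_)[w]; }
    int get(const std::string &w) const {
        CountMap::const_iterator it = counts_->find(w);
        return it == counts_->end() ? 0 : it->second;
    }
private:
    std::unique_ptr<CountMap> counts_;
};

// Exercises both classes; the asserts state the expected behaviour.
void demo() {
    SharedCounts original;
    original.add("the");
    SharedCounts alias(original);      // shallow: shares the map
    alias.add("the");
    assert(original.get("the") == 2);  // the change leaked into the original

    OwnedCounts base;
    base.add("the");
    OwnedCounts work(base);            // deep: independent map
    work.add("the");
    assert(base.get("the") == 1);      // the original is untouched
    assert(work.get("the") == 2);
}
```

If the same pattern applies to Ngram, each member copied in the initializer list (and the Vocab handed to LM) would need to be duplicated rather than shared for useLM to be independent of ngramLM.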


>> . </s> 1
>> /PT </s> 1
>> 
>> Why is the slash considered as part of the tag?
> 
> The / in front of a token signifies that it's a tag, as opposed to a
> word.  It's just a way to encode word/tags, as well as
> word and tags individually, without ambiguity.
> 
>> 
>> b) as can be seen in the example, the n-grams with tags are only built
>> left-to-right, e.g. there is no bigram "la /N5", as I would have expected
>> (and needed).
> 
> The program collects only those N-gram statistics that are required
> by the underlying model.  Since the goal is to use the tags in backoff
> the statistics needed are asymmetrical.
> 
> If you want a different set of N-grams you can probably write a simple
> perl script to do the job.
> 
> --Andreas 
> 



