Modifying ngram.cc
lambert mathias
lambert at jhu.edu
Fri Jul 8 08:36:50 PDT 2005
Hi,
This is kind of a C++ question.
I wrote the following copy constructor in Ngram.h:
Ngram(const Ngram &ng)
    : LM(ng.vocab), contexts(ng.contexts), order(ng.order),
      _skipOOVs(ng._skipOOVs), _trustTotals(ng._trustTotals) {}
I have the following declarations:
Ngram ngramLM(*vocab, order);
while (/* some condition */) {
    Ngram useLM(ngramLM);
    // Do some stuff with useLM
    ...
}
The problem is that constructing useLM from ngramLM doesn't give me an
independent copy of the original ngramLM. Any changes I make to useLM show
up in ngramLM too.
I just want to make a copy (useLM) of the original ngramLM, work on that
copy, and then initialize a fresh useLM from the original ngramLM on the
next iteration.
Any thoughts?
Lambert
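
The symptom described above (changes to useLM showing up in ngramLM) is what one would expect if some member is copied shallowly, so that both objects end up referring to the same underlying storage. Whether that is what happens inside the SRILM classes is an assumption here, not something this post confirms. The sketch below uses a hypothetical ToyModel class (not SRILM's Ngram, and not its API) to show the difference between a shallow copy and a deep copy of a pointer-held table.

```cpp
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for a model whose table lives behind a pointer.
// This is NOT SRILM's Ngram class; all names are illustrative only.
class ToyModel {
public:
    ToyModel() : counts_(std::make_shared<std::map<std::string, int>>()) {}

    // Shallow copy: the new object points at the same table as `other`.
    static ToyModel shallowCopy(const ToyModel &other) {
        ToyModel m;
        m.counts_ = other.counts_;
        return m;
    }

    // Deep copy: allocate a new table and copy the entries into it.
    static ToyModel deepCopy(const ToyModel &other) {
        ToyModel m;
        m.counts_ =
            std::make_shared<std::map<std::string, int>>(*other.counts_);
        return m;
    }

    void set(const std::string &key, int value) { (*counts_)[key] = value; }

    // Returns 0 for keys that were never set.
    int get(const std::string &key) const {
        auto it = counts_->find(key);
        return it == counts_->end() ? 0 : it->second;
    }

private:
    std::shared_ptr<std::map<std::string, int>> counts_;
};
```

With the shallow copy, a set() on the copy is visible through the original, which matches the behaviour reported above. With the deep copy, the original stays unchanged and can be used to build a fresh working copy on each loop iteration.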
>> . </s> 1
>> /PT </s> 1
>>
>> Why is the slash considered as part of the tag?
>
> The / in front of a token signifies that it's a tag, as opposed to a
> word. It's just a way to encode word/tags, as well as
> word and tags individually, without ambiguity.
>
>>
>> b) as can be seen in the example, the n-grams with tags are only built
>> left-to-right, e.g. there is no bigram "la /N5", as I would have expected
>> (and needed).
>
> The program collects only those N-gram statistics that are required
> by the underlying model. Since the goal is to use the tags in backoff
> the statistics needed are asymmetrical.
>
> If you want a different set of N-grams you can probably write a simple
> perl script to do the job.
>
> --Andreas
>