Customizing GAlib
version 2.4

This document describes how to extend GAlib's capabilities by defining your own genomes and genetic operators. The best way to customize the behavior of an object is to derive a new class. If you do not want to do that much work, GAlib is designed to let you replace behaviors of existing objects by defining new functions.

see also: library overview, class hierarchy, programming interface

Deriving your own genome class

You can create your own genome class by multiply inheriting from the base genome class and your own data type. For example, if you already have an object defined, say MyObject, then you would derive a new genome class called MyGenome, whose class definition looks like this:
// Class definition for the new genome object, including statically defined
// declarations for default evaluation, initialization, mutation, and 
// comparison methods for this genome class.
class MyGenome : public MyObject, public GAGenome {
public:
  GADefineIdentity("MyGenome", 201);
  static void Init(GAGenome&);
  static int Mutate(GAGenome&, float);
  static float Compare(const GAGenome&, const GAGenome&);
  static float Evaluate(GAGenome&);
  static int Cross(const GAGenome&, const GAGenome&, GAGenome*, GAGenome*);

public:
  MyGenome() : GAGenome(Init, Mutate, Compare) { 
    evaluator(Evaluate); 
    crossover(Cross); 
  }
  MyGenome(const MyGenome& orig) { copy(orig); }
  virtual ~MyGenome() {}
  MyGenome& operator=(const GAGenome& orig){
    if(&orig != this) copy(orig);
    return *this;
  }

  virtual GAGenome* clone(CloneMethod) const {return new MyGenome(*this);}
  virtual void copy(const GAGenome& orig) {
    GAGenome::copy(orig);  // this copies all of the base genome parts
    // copy any parts of MyObject here
    // copy any parts of MyGenome here
  }

  // any data/member functions specific to this new class
};

void 
MyGenome::Init(GAGenome&){
  // your initializer here
}

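int 
MyGenome::Mutate(GAGenome&, float){
  // your mutator here
  return 0;     // placeholder: the number of mutations that occurred
}

float 
MyGenome::Compare(const GAGenome&, const GAGenome&){
  // your comparison here
  return -1;    // placeholder: -1 means the comparison could not be made
}

float 
MyGenome::Evaluate(GAGenome&){
  // your evaluation here
  return 0;     // placeholder: the objective score of the genome
}

int
MyGenome::Cross(const GAGenome& mom, const GAGenome& dad,
                GAGenome* sis, GAGenome* bro){
  // your crossover here
  return 0;     // placeholder: the number of crossovers that occurred
}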

By convention, one of the arguments to a derived genome constructor is the objective function. Alternatively (as illustrated in this example), you can hard code a default objective function into your genome - just call the evaluator member somewhere in your constructor and pass the function you want used as the default.

Once you have defined your genome class, you should define the initialization, mutation, comparison, and crossover operators for it. The comparison operator is optional, but if you do not define it you will not be able to use the diversity measures in the genetic algorithms and/or populations.

Note that the genetic comparator is not necessarily the same as the boolean operator== and operator!= comparators. The genetic comparator returns 0 if the two individuals are the same, -1 if the comparison fails for some reason, and a real number greater than 0 indicating the degree of difference if the individuals are not identical but can be compared. It may be based on genotype or phenotype. The boolean comparators, on the other hand, indicate only whether or not two individuals are identical. In most cases, the boolean comparator can simply call the genetic comparator, but in some cases it is more efficient to define different operators (the boolean comparators are called much more often than the genetic comparators, especially if no diversity is being measured).
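
The distinction can be illustrated with a minimal, self-contained sketch. The Bits type and the geneticCompare function are hypothetical stand-ins, not GAlib classes. The sketch only shows that the genetic comparator computes a degree of difference, while the boolean operators can stop at the first mismatch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a genome's data; not a GAlib class.
struct Bits {
  std::vector<int> gene;
};

// Genetic comparator: 0 if identical, -1 if the two cannot be compared,
// otherwise the fraction of positions that differ (greater than 0).
float geneticCompare(const Bits& a, const Bits& b) {
  if (a.gene.size() != b.gene.size()) return -1.0f;
  if (a.gene.empty()) return 0.0f;
  std::size_t diff = 0;
  for (std::size_t i = 0; i < a.gene.size(); ++i)
    if (a.gene[i] != b.gene[i]) ++diff;
  return static_cast<float>(diff) / a.gene.size();
}

// Boolean comparators: only answer "identical or not", so they can
// return as soon as the first differing element is found.
bool operator==(const Bits& a, const Bits& b) { return a.gene == b.gene; }
bool operator!=(const Bits& a, const Bits& b) { return !(a == b); }
```

Here the boolean operators never visit more elements than they need to, which matters when they are called far more often than the genetic comparator.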

To work properly with GAlib, you must define the following:

       MyGenome( -default-args-for-your-genome-constructor )
       MyGenome(const MyGenome&)
       virtual GAGenome* clone(GAGenome::CloneMethod) const
If your genome adds any non-trivial member data, you must define these:
       virtual ~MyGenome()
       virtual void copy(const GAGenome&)
       virtual int equal(const GAGenome&) const
To enable streams-based reading and writing of your genome, you should define these:
       virtual int read(istream&)
       virtual int write(ostream&) const
When you derive a genome, don't forget to use the _evaluated flag to indicate when the state of the genome has changed and an evaluation is needed. If a member function changes the state of your genome, that member function should set the _evaluated flag to gaFalse.
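
The role of the flag can be sketched without GAlib. The CachedGenome class below is a hypothetical stand-in (its names and its sum-of-elements score are assumptions for illustration). It shows the pattern: every state-changing member marks the cached score as stale, and the score is recomputed only when needed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in, not a GAlib class: it mimics the caching role that
// the _evaluated flag plays in a genome.
class CachedGenome {
public:
  explicit CachedGenome(const std::vector<float>& v)
    : values_(v), evaluated_(false), score_(0.0f), evalCount_(0) {}

  // Any member that changes the state must mark the cached score as stale.
  void set(std::size_t i, float x) {
    assert(i < values_.size());
    values_[i] = x;
    evaluated_ = false;
  }

  // Re-evaluate only when the state changed since the last evaluation.
  float score() {
    if (!evaluated_) {
      score_ = 0.0f;
      for (std::size_t i = 0; i < values_.size(); ++i) score_ += values_[i];
      evaluated_ = true;
      ++evalCount_;
    }
    return score_;
  }

  // Number of times the (possibly expensive) evaluation actually ran.
  int evaluations() const { return evalCount_; }

private:
  std::vector<float> values_;
  bool evaluated_;
  float score_;
  int evalCount_;
};
```

If set() forgot to clear the flag, score() would keep returning the old value after the genome had changed.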

Assign a default crossover, mutation, initialization, and comparison method so that users don't have to assign one unless they want to.

It is a good idea to define an identity for your genome (especially if you will be using it in an environment with multiple genome types running around). Use the GADefineIdentity macro (defined in gaid.h) to do this in your class definition. The GADefineIdentity macro sets a class ID number and the name that will be used in error messages for the class. You can use any number above 200 for the ID, but be sure to use a different number for each of your classes.

When run-time type information (RTTI) has stabilized across compilers, GAlib will probably use that instead of the Define/Declare identity macros.





Genome Testing

Use the following program to test your genome. The basic idea here is to test incrementally in order to isolate problems as they arise. If your genome works with a small test program such as this one, it should function properly with the genetic algorithms in GAlib. (This is no guarantee, however, that your genome will help you find the solution to your problem. That is another matter entirely.)
#include <iostream>
// also include the header that declares MyGenome
using std::cout;
using std::endl;

int
main() {
  MyGenome genome;      // test default constructor (if we have one)
  cout << "genome after creation:\n" << genome << endl;

  genome.initialize();  // test the initializer
  cout << "genome after initialization:\n" << genome << endl;

  genome.mutate(0.1);   // test the mutator (argument is the mutation probability)
  cout << "genome after mutation:\n" << genome << endl;

  MyGenome* a = new MyGenome(genome);   // test copy constructor
  MyGenome* b = new MyGenome(genome);

  // clone() returns a GAGenome*, so cast the result back to MyGenome*
  MyGenome* c = (MyGenome*)genome.clone(GAGenome::CONTENTS);
  cout << "clone contents:\n" << *c << "\n";
  MyGenome* d = (MyGenome*)genome.clone(GAGenome::ATTRIBUTES);
  cout << "clone attributes:\n" << *d << "\n";

  a->initialize();
  b->initialize();
  cout << "parents:\n" << *a << "\n" << *b << "\n";

  MyGenome::Cross(*a, *b, c, d);   // test two child crossover
  cout << "children of crossover:\n" << *c << "\n" << *d << "\n";
  MyGenome::Cross(*a, *b, c, 0);   // test single child crossover
  cout << "child of crossover:\n" << *c << "\n";

  // test the comparator
  cout << "comparison of parents: " << a->compare(*b) << "\n";

  delete a;
  delete b;
  delete c;
  delete d;

  return 0;
}




Genome Initialization

The initializer takes a single argument: the genome to be initialized. The genome has already been allocated; the initializer only needs to populate it with appropriate contents.

Here is the implementation of an initializer for the GATreeGenome<int> class.

void
TreeInitializer(GAGenome & c)
{
  GATreeGenome<int> &child=(GATreeGenome<int> &)c;

// destroy any pre-existing tree
  child.root();
  child.destroy();

// Create a new tree with depth of 'depth' and each eldest node containing
// 'n' children (the other siblings have none).
  int depth=2, n=3, count=0;
  child.insert(count++,GATreeBASE::ROOT);

  for(int i=0; i<depth; i++){
    child.eldest();
    child.insert(count++);
    for(int j=0; j<n; j++)
      child.insert(count++,GATreeBASE::AFTER);
  }
}




Genome Mutation

The genome mutator takes two arguments: the genome that will receive the mutation(s) and a mutation probability. The exact meaning of the mutation probability is up to the designer of the mutation operator. The mutator should return the number of mutations that occurred.

Most genetic algorithms invoke the mutation method on each newly generated offspring. So your mutation operator should base its actions on the value of the mutation probability. For example, an array of floats could flip a pmut-biased coin for each element in the array. If the coin toss returns true, the element gets a Gaussian mutation. If it returns false, the element is left unchanged. Alternatively, a single biased coin toss could be used to determine whether or not the entire genome should be mutated.

Here is an implementation of the flip mutator for the GA1DBinaryStringGenome class. This mutator flips a biased coin for each bit in the string.

int 
GA1DBinStrFlipMutator(GAGenome & c, float pmut)
{
  GA1DBinaryStringGenome &child=(GA1DBinaryStringGenome &)c;
  if(pmut <= 0.0) return(0);

  int nMut=0;
  for(int i=child.length()-1; i>=0; i--){
    if(GAFlipCoin(pmut)){
      child.gene(i, ((child.gene(i) == 0) ? 1 : 0));
      nMut++;
    }
  }
  return nMut;
}
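
The float-array variant described earlier (a pmut-biased coin per element, with a Gaussian perturbation on heads) can be sketched the same way. This is a minimal sketch, not GAlib code: it operates on a plain std::vector<float>, uses the standard <random> facilities in place of GAFlipCoin, and the function name gaussianMutate and the sigma parameter are assumptions made for illustration.

```cpp
#include <cassert>
#include <random>
#include <vector>

// Hypothetical helper, not part of GAlib.
// Flips a pmut-biased coin for each element; on heads, adds zero-mean
// Gaussian noise with standard deviation sigma (sigma must be > 0).
// Returns the number of elements that were mutated.
int gaussianMutate(std::vector<float>& genes, float pmut, float sigma,
                   std::mt19937& rng) {
  assert(sigma > 0.0f);
  if (pmut <= 0.0f) return 0;
  std::bernoulli_distribution coin(pmut < 1.0f ? pmut : 1.0f);
  std::normal_distribution<float> noise(0.0f, sigma);

  int nMut = 0;
  for (float& g : genes) {
    if (coin(rng)) {
      g += noise(rng);
      ++nMut;
    }
  }
  return nMut;
}
```

As in the bit-flip mutator, a non-positive probability returns immediately without touching the genome, and the return value is the count of mutations.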




Genome Crossover

The crossover method is used by the genetic algorithm to mate individuals from the population to form new offspring. Each genome should define a default crossover method for the genetic algorithms to use. The sexual and asexual member functions return a pointer to the preferred sexual and asexual mating methods, respectively. The crossover member function is used to change the preferred mating method. The genome does not have a member function to invoke the crossover; only the genetic algorithm can actually perform the crossover.

Some genetic algorithms use sexual mating, others use asexual mating. If possible, define both so that your genome will work with either kind of genetic algorithm. If your derived class does not define a cross method, an error message will be posted whenever crossover is attempted.

Sexual crossover takes four arguments: two parents and two children. The genomes have already been allocated, so the crossover operator should simply modify the contents of the child genomes as appropriate. Either child pointer may be nil, so test the child pointers to see whether the genetic algorithm is asking for one or two children, and generate only the children requested. The crossover function should return the number of crossovers that occurred.

Here is an implementation of the two-parent/one-or-two-child single point crossover operator for fixed-length genomes of the GA1DBinaryStringGenome class.

int
SinglePointCrossover(const GAGenome& p1, const GAGenome& p2, GAGenome* c1, GAGenome* c2){
  GA1DBinaryStringGenome &mom=(GA1DBinaryStringGenome &)p1;
  GA1DBinaryStringGenome &dad=(GA1DBinaryStringGenome &)p2;

  int n=0;
  unsigned int site = GARandomInt(0, mom.length());
  unsigned int len = mom.length() - site;

  if(c1){
    GA1DBinaryStringGenome &sis=(GA1DBinaryStringGenome &)*c1;
    sis.copy(mom, 0, 0, site);
    sis.copy(dad, site, site, len);
    n++;
  }
  if(c2){
    GA1DBinaryStringGenome &bro=(GA1DBinaryStringGenome &)*c2;
    bro.copy(dad, 0, 0, site);
    bro.copy(mom, site, site, len);
    n++;
  }

  return n;
}
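
To see what the operator produces, here is a self-contained sketch of the same single-point scheme on plain vectors. It is not GAlib code: the Bits typedef and the singlePointCross function are hypothetical, and the crossover site is passed in as an argument (rather than drawn with GARandomInt) so that the result is reproducible.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<int> Bits;   // hypothetical stand-in for a bit-string genome

// Single-point crossover at 'site'. Either child pointer may be null, in
// which case that child is skipped. The children must not alias the parents.
// Returns the number of children produced.
int singlePointCross(const Bits& mom, const Bits& dad, std::size_t site,
                     Bits* sis, Bits* bro) {
  assert(mom.size() == dad.size() && site <= mom.size());
  int n = 0;
  if (sis) {   // head from mom, tail from dad
    sis->assign(mom.begin(), mom.begin() + site);
    sis->insert(sis->end(), dad.begin() + site, dad.end());
    ++n;
  }
  if (bro) {   // head from dad, tail from mom
    bro->assign(dad.begin(), dad.begin() + site);
    bro->insert(bro->end(), mom.begin() + site, mom.end());
    ++n;
  }
  return n;
}
```

With mom = 1111, dad = 0000 and site = 1, the sister becomes 1000 and the brother becomes 0111. Passing a null pointer for the brother produces only the sister.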




Genome Comparison

The comparison method is used for diversity calculations. It compares two genomes and normally returns a number greater than or equal to zero. A value of 0 means that the two genomes are identical (no diversity); larger values indicate greater difference, and there is no maximum value. A return value of -1 indicates that the diversity could not be calculated.

Here is the comparator for the binary string genomes. It counts the number of bits that differ between the two genomes and divides by the genome length, so the result is the fraction of bits that differ. In this example, we return -1 if the genomes are not the same length.

float
GA1DBinStrComparator(const GAGenome& a, const GAGenome& b){
  GA1DBinaryStringGenome &sis=(GA1DBinaryStringGenome &)a;
  GA1DBinaryStringGenome &bro=(GA1DBinaryStringGenome &)b;
  if(sis.length() != bro.length()) return -1;
  float count = 0.0;
  for(int i=sis.length()-1; i>=0; i--)
    count += ((sis.gene(i) == bro.gene(i)) ? 0 : 1);
  return count/sis.length();
}




Genome Evaluation

The genome evaluator is the objective function for your problem. It takes a single genome as its argument. The evaluator returns a number that indicates how good or bad the genome is. You must cast the generic genome to the genome type that you are using. If your objective function works with different genome types, then use the genome object's className and/or classID member functions to determine the genome class before you do the casts.

Here is a simple evaluation function for a real number genome with a single element. The function tries to maximize a sinusoid.

float
Objective(GAGenome& g){
  GARealGenome& genome = (GARealGenome &)g;
  return 1 + sin(genome.gene(0)*2*M_PI);
}
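
When one objective function must serve several genome types, the class check comes before the cast. The sketch below illustrates that pattern with hypothetical stand-in classes (Genome, RealGenome, BitGenome and their ID values are assumptions, not GAlib types). With GAlib you would query the genome's classID member function in the same position.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins; GAlib's own genome classes are not used here.
struct Genome {
  virtual ~Genome() {}
  virtual int classID() const = 0;
};
struct RealGenome : Genome {
  enum { ID = 201 };
  std::vector<float> gene;
  int classID() const { return ID; }
};
struct BitGenome : Genome {
  enum { ID = 202 };
  std::vector<int> gene;
  int classID() const { return ID; }
};

// One objective for both types: check the class ID, then downcast.
float objective(const Genome& g) {
  float score = 0.0f;
  if (g.classID() == RealGenome::ID) {
    const RealGenome& r = static_cast<const RealGenome&>(g);
    for (std::size_t i = 0; i < r.gene.size(); ++i) score += r.gene[i];
  } else if (g.classID() == BitGenome::ID) {
    const BitGenome& b = static_cast<const BitGenome&>(g);
    for (std::size_t i = 0; i < b.gene.size(); ++i)
      if (b.gene[i] != 0) score += 1.0f;
  }
  return score;   // unknown classes score 0 in this sketch
}
```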




Population Initialization

This method is invoked when the population is initialized.

Here is an implementation that invokes the initializer for each genome in the population.

void 
PopInitializer(GAPopulation & p){
  for(int i=0; i<p.size(); i++)
    p.individual(i).initialize();
}




Population Evaluation

This method is invoked when the population is evaluated. If your objective is population-based, you can use this method to set the score for each genome rather than invoking an evaluator for each genome.

Here is an implementation that invokes the evaluation method for each genome in the population.

void 
PopEvaluator(GAPopulation & p){
  for(int i=0; i<p.size(); i++)
    p.individual(i).evaluate();
}
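
A population-based objective, by contrast, scores each individual relative to the others. The following is a minimal sketch under stated assumptions: Individual and populationEvaluate are hypothetical names, not GAlib types, and the score is simply the number of other individuals that an individual beats on some raw value.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a population member; not a GAlib class.
struct Individual {
  float value;   // raw quantity measured for this individual
  float score;   // score assigned by the population evaluator
};

// Population-based evaluation: each score depends on the whole population.
// Here the score is the number of individuals with a strictly lower value.
void populationEvaluate(std::vector<Individual>& pop) {
  for (std::size_t i = 0; i < pop.size(); ++i) {
    int beaten = 0;
    for (std::size_t j = 0; j < pop.size(); ++j)
      if (pop[j].value < pop[i].value) ++beaten;
    pop[i].score = static_cast<float>(beaten);
  }
}
```

A score of this kind cannot be computed by a per-genome evaluator, because it is meaningless without the rest of the population.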




Scaling Scheme

The scaling object does the transformation from raw (objective) scores to scaled (fitness) scores. The most important member function you will have to define for a new scaling object is the evaluate member function. This function calculates the fitness scores based on the objective scores in the population that is passed to it.

The GAScalingScheme class is a pure virtual (abstract) class and cannot be instantiated. To make your derived class concrete (so that it can be instantiated), you must define the clone and evaluate functions. You should also define the copy method if your derived class introduces any additional data members that require non-trivial copy.

The scaling class is polymorphic, so you should define the object's identity using the GADefineIdentity macro. This macro sets a class ID number and the name that will be used in error messages for the class. You can use any number above 200 for the ID, but be sure to use a different number for each of your objects.

Here is an implementation of sigma truncation scaling.

class SigmaTruncationScaling : public GAScalingScheme {
public:
  GADefineIdentity("SigmaTruncationScaling", 286);

  SigmaTruncationScaling(float m=gaDefSigmaTruncationMultiplier) : c(m) {}
  SigmaTruncationScaling(const SigmaTruncationScaling & arg){copy(arg);}
  SigmaTruncationScaling & operator=(const GAScalingScheme & arg)
    { copy(arg); return *this; }
  virtual ~SigmaTruncationScaling() {}
  virtual GAScalingScheme * clone() const 
    { return new SigmaTruncationScaling(*this); }
  virtual void evaluate(const GAPopulation & p);
  virtual void copy(const GAScalingScheme & arg){
    if(&arg != this && sameClass(arg)){
      GAScalingScheme::copy(arg);
      c=((SigmaTruncationScaling&)arg).c;
    }
  }
  float multiplier(float fm) { return c=fm; }
  float multiplier() const { return c; }
protected:
  float c;			// std deviation multiplier
};


void 
SigmaTruncationScaling::evaluate(const GAPopulation & p) {
  float f;
  for(int i=0; i<p.size(); i++){
    f = p.individual(i).score() - p.ave() + c * p.dev();
    if(f < 0) f = 0;
    p.individual(i).fitness(f);
  }
}
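
The arithmetic can be checked in isolation with a self-contained sketch. The sigmaTruncate function is hypothetical and works on a plain vector of scores. It assumes the sample standard deviation (dividing by n-1), which may or may not match what the population's dev member computes.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of GAlib.
// fitness = score - average + c * deviation, clamped at zero.
// Assumes the sample standard deviation (n-1 in the denominator).
std::vector<float> sigmaTruncate(const std::vector<float>& scores, float c) {
  const std::size_t n = scores.size();
  std::vector<float> fitness(n, 0.0f);
  if (n < 2) return fitness;   // deviation is undefined; leave fitness at 0

  float ave = 0.0f;
  for (std::size_t i = 0; i < n; ++i) ave += scores[i];
  ave /= n;

  float var = 0.0f;
  for (std::size_t i = 0; i < n; ++i)
    var += (scores[i] - ave) * (scores[i] - ave);
  const float dev = std::sqrt(var / (n - 1));

  for (std::size_t i = 0; i < n; ++i) {
    const float f = scores[i] - ave + c * dev;
    fitness[i] = (f < 0.0f) ? 0.0f : f;
  }
  return fitness;
}
```

For the scores 1, 2, 3, 4, 5 with c = 1, the average is 3 and the deviation is about 1.58, so the lowest score is truncated to a fitness of 0 while the highest receives about 3.58.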




Selection Scheme

The selection object is used to pick individuals from the population. Before a selection occurs, the update method is called. You can use this method to do any pre-selection data transformations for your selection scheme. When a selection is requested, the select method is called. The select method should return a reference to a single individual from the population.

A selector may make its selections based either on the scaled (fitness) scores or on the raw (objective) scores of the individuals in the population. Note also that a population may be sorted either low-to-high or high-to-low, depending on which sort order was chosen. Your selector should be able to handle either order (this way it will work with genetic algorithms that maximize or minimize).

The selection scheme class is polymorphic, so you should define the object's identity using the GADefineIdentity macro. This macro sets a class ID number and the name that will be used in error messages for the class. You can use any number above 200 for the ID, but be sure to use a different number for each of your objects.

Here is an implementation of a tournament selector. It is based on the roulette wheel selector and shares some of the roulette wheel selector's functionality. In particular, this tournament selector uses the roulette wheel selector's update method, so it does not define its own. The select method does two fitness-proportionate selections and then returns the individual with the better score.

class TournamentSelector : public GARouletteWheelSelector {
public:
  GADefineIdentity("TournamentSelector", 255);

  TournamentSelector(int w=GASelectionScheme::FITNESS) : 
  GARouletteWheelSelector(w) {}
  TournamentSelector(const TournamentSelector& orig) { copy(orig); }
  TournamentSelector& operator=(const GASelectionScheme& orig) 
    { if(&orig != this) copy(orig); return *this; }
  virtual ~TournamentSelector() {}
  virtual GASelectionScheme* clone() const
    { return new TournamentSelector; }
  virtual GAGenome& select() const;
};


GAGenome &
TournamentSelector::select() const {
  int picked=0;
  float cutoff;
  int i, upper, lower;

  cutoff = GARandomFloat();
  lower = 0; upper = pop->size()-1;
  while(upper >= lower){
    i = lower + (upper-lower)/2;
    if(psum[i] > cutoff)
      upper = i-1;
    else
      lower = i+1;
  }
  lower = Min(pop->size()-1, lower);
  lower = Max(0, lower);
  picked = lower;

  cutoff = GARandomFloat();
  lower = 0; upper = pop->size()-1;
  while(upper >= lower){
    i = lower + (upper-lower)/2;
    if(psum[i] > cutoff)
      upper = i-1;
    else
      lower = i+1;
  }
  lower = Min(pop->size()-1, lower);
  lower = Max(0, lower);

  GAPopulation::SortBasis basis = 
    (which == FITNESS ? GAPopulation::SCALED : GAPopulation::RAW);
  if(pop->order() == GAPopulation::LOW_IS_BEST){
    if(pop->individual(lower,basis).score() <
       pop->individual(picked,basis).score())
      picked = lower;
  }
  else{
    if(pop->individual(lower,basis).score() >
       pop->individual(picked,basis).score())
      picked = lower;
  }

  return pop->individual(picked,basis);
}
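
The two binary searches in select are the same operation: find the first entry of the cumulative sum array that is greater than the cutoff. The sketch below isolates that step using std::upper_bound and then runs the two-way tournament. The function names and the explicit cutoff arguments are assumptions for illustration (the selector above draws the cutoffs with GARandomFloat and reads psum from its base class).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of GAlib.
// Index of the first cumulative sum strictly greater than cutoff,
// clamped to the last valid index. psum must be non-empty and ascending.
std::size_t roulettePick(const std::vector<float>& psum, float cutoff) {
  assert(!psum.empty());
  const std::size_t i =
      std::upper_bound(psum.begin(), psum.end(), cutoff) - psum.begin();
  return (i < psum.size()) ? i : psum.size() - 1;
}

// Two roulette picks, then keep whichever has the better score.
std::size_t tournamentPick(const std::vector<float>& scores,
                           const std::vector<float>& psum,
                           float cutoffA, float cutoffB, bool lowIsBest) {
  const std::size_t a = roulettePick(psum, cutoffA);
  const std::size_t b = roulettePick(psum, cutoffB);
  const bool bIsBetter =
      lowIsBest ? (scores[b] < scores[a]) : (scores[b] > scores[a]);
  return bIsBetter ? b : a;
}
```

With scores 1, 2, 3, 4 and cumulative sums 0.1, 0.3, 0.6, 1.0, the cutoffs 0.05 and 0.5 pick indices 0 and 2. The tournament then returns index 2 when high scores are best and index 0 when low scores are best.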




Genetic Algorithm

Here is a sample derived class that does restricted mating. In this example, one of the parents is selected as usual. The second individual is selected in the same way, but it is used only if the genome comparator reports that it differs from the first individual by at least THRESHOLD. If not, we make another selection. If enough selections fail, we take what we can get.
class RestrictedMatingGA : public GASteadyStateGA {
public:
  GADefineIdentity("RestrictedMatingGA", 288);
  RestrictedMatingGA(const GAGenome& g) : GASteadyStateGA(g) {}
  virtual ~RestrictedMatingGA() {}
  virtual void step();
  RestrictedMatingGA & operator++() { step(); return *this; }
};

void
RestrictedMatingGA::step()
{ 
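  int i, k;
  GAGenome *mom, *dad;          // parents chosen for each mating
  for(i=0; i<tmpPop->size(); i++){
    mom = &(pop->select()); 
    k=0;
    // THRESHOLD is a constant that you must define for your problem
    do { k++; dad = &(pop->select()); }
    while(mom->compare(*dad) < THRESHOLD && k<pop->size());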
    stats.numsel += 2;
    if(GAFlipCoin(pCrossover()))
      stats.numcro += (*scross)(*mom, *dad, &tmpPop->individual(i), 0);
    else
      tmpPop->individual(i).copy(*mom);
    stats.nummut += tmpPop->individual(i).mutate(pMutation());
  }

  for(i=0; i<tmpPop->size(); i++)
    pop->add(tmpPop->individual(i));

  pop->evaluate();		// get info about current pop for next time
  pop->scale();			// remind the population to do its scaling

  for(i=0; i<tmpPop->size(); i++)
    pop->destroy(GAPopulation::WORST, GAPopulation::SCALED);

  stats.update(*pop);		// update the statistics by one generation
}




Termination Function

The termination function determines when the genetic algorithm should stop evolving. It takes a genetic algorithm as its argument and returns gaTrue if the genetic algorithm should stop or gaFalse if the algorithm should continue.

Here are three examples of termination functions. The first compares the current generation to the desired number of generations. If the current generation is less than the desired number of generations, it returns gaFalse to signify that the GA is not yet complete.

GABoolean
GATerminateUponGeneration(GAGeneticAlgorithm & ga){
  return(ga.generation() < ga.nGenerations() ? gaFalse : gaTrue);
}
The second example compares the average score in the current population with the score of the best individual in the current population. If the ratio of these exceeds a specified threshold, it returns gaTrue to signify that the GA should stop. Basically this means that the entire population has converged to a 'good' score.
const float desiredRatio = 0.95;    // stop when pop average is 95% of best

GABoolean
GATerminateUponScoreConvergence(GAGeneticAlgorithm & ga){
  if(ga.statistics().current(GAStatistics::Mean) /
     ga.statistics().current(GAStatistics::Maximum) > desiredRatio)
    return gaTrue;
  else
    return gaFalse;
}
The third uses the population diversity as the criterion for stopping. If the diversity drops below a specified threshold, the genetic algorithm will stop.
const float thresh = 0.01;     // stop when population diversity is below this

GABoolean
StopWhenNoDiversity(GAGeneticAlgorithm & ga){
  if(ga.statistics().current(GAStatistics::Diversity) < thresh)
    return gaTrue;
  else
    return gaFalse;
}
A faster method of doing a nearly equivalent termination is to use the population's standard deviation as the stopping criterion (this method does not require comparisons of each individual). Notice that this judges diversity based upon the genome scores rather than their actual genetic diversity.
const float thresh = 0.01;     // stop when population deviation is below this

GABoolean
StopWhenNoDeviation(GAGeneticAlgorithm & ga){
  if(ga.statistics().current(GAStatistics::Deviation) < thresh)
    return gaTrue;
  else
    return gaFalse;
}
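
Criteria such as these can also be combined. The following sketch is self-contained and does not use GAlib: RunState and shouldTerminate are hypothetical names, and the inputs stand in for the values a real termination function would read from the genetic algorithm and its statistics. It stops at the generation limit or when the mean-to-best ratio exceeds the desired ratio, and it assumes positive scores that are being maximized, skipping the ratio test when the best score is not positive.

```cpp
#include <cassert>

// Hypothetical snapshot of a run; in GAlib these values would come from
// the genetic algorithm object and its statistics.
struct RunState {
  int generation;       // current generation
  int maxGenerations;   // desired number of generations
  float mean;           // mean score of the current population
  float best;           // best score of the current population
};

// Stop at the generation limit, or when the population mean has come
// within desiredRatio of the best score. Assumes positive scores and a
// maximizing objective; the ratio is not evaluated if best <= 0.
bool shouldTerminate(const RunState& s, float desiredRatio) {
  if (s.generation >= s.maxGenerations) return true;
  if (s.best <= 0.0f) return false;
  return (s.mean / s.best) > desiredRatio;
}
```

Guarding the division keeps the convergence test from misbehaving early in a run, when the best score may still be zero.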

Matthew Wall, 23 March 1996