www.gp-field-guide.org.uk
Contents Top Previous Next

4.1 Preparatory Steps

The purpose of the first two preparatory steps is to specify the ingredients the evolutionary process can use to construct potential solutions. Because the problem is to find a mathematical function of one independent variable, x, the terminal set (the inputs of the to-be-evolved programs) must include this variable. The terminal set also includes ephemeral random constants drawn from some reasonable range,1 say from -5.0 to +5.0, as described in Section† 3.1 . Thus the terminal set, T, is

T = {x,¬}.

The statement of the problem does not specify which functions may be employed in the to-be-evolved program. One simple choice for the function set is the four ordinary arithmetic functions: addition, subtraction, multiplication and division. Most numeric regression problems will require at least these operations, sometimes with additional functions such as sin() and log(). We will use the simple function set

F = {+,- ,*,%},

where % is protected division as discussed in Section† 3.2.1 . Note that the target polynomial can be expressed exactly using the terminal and function sets we have chosen, so these primitives are sufficient (cf.†page† 56 ) for the quadratic polynomial problem.

The third preparatory step involves constructing the fitness measure that specifies what the user wants. The high-level goal of this problem is to find a program whose output is equal to the values of the quadratic polynomial x2+x+1. Therefore, the fitness assigned to a particular individual in the population must reflect how closely the output of an individual program comes to the target polynomial x2 + x + 1.

In principle, the fitness measure could be defined in terms of the mathematical integral of the difference between the evolved function and the target function. However, for most symbolic regression problems, it is not practical or even possible to compute the value of the integral analytically. Thus, it is common to define the fitness to be the sum of absolute errors measured at different values of the independent variable x in the range [-1.0 ,+1.0]. In particular, we will measure the errors for x ő{-1.0 ,-0.9 ,∑∑∑,0.9 ,1.0}. A smaller value of fitness (error) is better; a fitness (error) of zero would indicate a perfect fit. With this definition, our fitness is (approximately) proportional to the area between the parabola x2 + x + 1 and the curve representing the candidate individual (see Figure† 4.2 for examples).

The fourth step is where we set our run parameters. The population size in this small illustrative example will be just four. The population size for a run of GP typically consists of thousands of individuals, but we will use this tiny population size to keep the example manageable. The crossover operation is commonly used to generate about 90% of the individuals in the population; the reproduction operation (where a fit individual is simply copied from one generation to the next) is used to generate about 8% of the population; the mutation operation is used to generate about 1% of the population; and the architecture-altering operations (see Section† 6.1.2 ) are used to generate perhaps 1% of the population. However, because this example involves an abnormally small population of only four individuals, the crossover operation will be used twice (each time generating one individual), which corresponds to a crossover rate of 50%, while the mutation and reproduction operations will each be used to generate one individual. These are therefore applied with a rate of 25% each. For simplicity, the architecture-altering operations are not used for this problem.

In the fifth and final step we need to specify a termination condition. A reasonable termination criterion for this problem is that the run will continue from generation to generation until the fitness (or error) of some individual is less than 0.1. In this contrived example, our example run will (atypically) yield an algebraically perfect solution with a fitness of zero after just one generation.


www.gp-field-guide.org.uk
Contents Top Previous Next