/*
 * Java Genetic Algorithm Library (jenetics-7.1.1).
 * Copyright (c) 2007-2022 Franz Wilhelmstötter
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *      http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 *
 * Author:
 *    Franz Wilhelmstötter (franz.wilhelmstoetter@gmail.com)
 */

/**
 * <h2>Example</h2>
 *
 * The following example shows how to solve a GP problem with <em>Jenetics</em>.
 * We are trying to find a polynomial (or an arbitrary mathematical function)
 * which approximates a given data set.
 *
 * <table>
 * <caption>Sample points</caption>
 *     <tr><th>x</th><th>y</th></tr>
 *     <tr><td>0.00</td><td>0.0000</td></tr>
 *     <tr><td>0.10</td><td>0.0740</td></tr>
 *     <tr><td>0.20</td><td>0.1120</td></tr>
 *     <tr><td>0.30</td><td>0.1380</td></tr>
 *     <tr><td>...</td><td>...</td></tr>
 * </table>
 *
 * The sample points have been created with the function
 * <em>f(x) = 4*x^3 - 3*x^2 + x</em>. Knowing the generating function
 * makes it easier to judge the quality of the evolved function. For the
 * example we created 21 data points.
 *
 * <p>
 * <b>NOTE</b>: <em>The function which created the sample points is not
 * needed in the error function we have to define for the GP. It just lets
 * us verify the final, evolved result.</em>
 * </p>
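 *
 * The following is a minimal sketch of how such a sample table could be
 * generated. It is plain Java and not part of the <em>Jenetics</em> API: the
 * class name {@code SamplePoints} and its methods are hypothetical, and we
 * assume that the 21 points are spaced evenly over the interval [-1, 1],
 * which is consistent with the sample values shown in this example.
 *
```java
import java.util.Locale;

// Hypothetical helper, not part of Jenetics: tabulates the generating polynomial.
class SamplePoints {

    // f(x) = 4*x^3 - 3*x^2 + x, the function named in the text.
    static double f(final double x) {
        return 4*x*x*x - 3*x*x + x;
    }

    // Creates 'count' equidistant points in [min, max] as {x, f(x)} rows.
    static double[][] samples(final double min, final double max, final int count) {
        final double[][] result = new double[count][];
        for (int i = 0; i < count; ++i) {
            // Deriving x from the index avoids accumulating rounding errors.
            final double x = min + i*(max - min)/(count - 1);
            result[i] = new double[] {x, f(x)};
        }
        return result;
    }

    public static void main(final String[] args) {
        // Assumption: 21 points on [-1, 1], i.e. a step width of 0.1.
        for (double[] row : samples(-1.0, 1.0, 21)) {
            System.out.println(
                String.format(Locale.ROOT, "%5.2f  %8.4f", row[0], row[1])
            );
        }
    }
}
```
 *
 * Computing every <em>x</em> from its index, instead of repeatedly adding the
 * step width, keeps the grid points as close as possible to the intended
 * decimal values.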
048 *
049 * As first step, we have to define the set of allowed <em>non-terminal</em>
050 * and the <em>terminal</em> operations the GP is working with. Selecting the
051 * right set of operation has a big influence on the performance of the GP. If
052 * the operation set is bigger than necessary, we will expand the potential
053 * search space, and the execution time for finding a solution. For our
054 * <em>polynomial</em> example we will chose the following <em>operations</em>
055 * and <em>terminals</em>.
056 *
 * <pre>{@code
 * static final ISeq<Op<Double>> OPERATIONS = ISeq.of(
 *     MathOp.ADD,
 *     MathOp.SUB,
 *     MathOp.MUL
 * );
 *
 * static final ISeq<Op<Double>> TERMINALS = ISeq.of(
 *     Var.of("x", 0),
 *     EphemeralConst.of(() -> (double)RandomRegistry.random().nextInt(10))
 * );
 * }</pre>
 *
 * The chosen non-terminal operation set is sufficient to create any polynomial.
 * For the terminal operations, we added a variable "x", with index zero, and
 * an ephemeral integer constant. The purpose of the ephemeral constant is to
 * create constant values, which will differ for every tree, but stay constant
 * within a tree.
 * <p>
 * In the next step, we define the fitness function for the GP, which is an
 * error function we will minimize.
 *
 * <pre>{@code
 * // The lookup table where the data points are stored.
 * static final double[][] SAMPLES = new double[][] {
 *     {-1.0, -8.0000},
 *     {-0.9, -6.2460},
 *     ...
 * };
 *
 * static double error(final ProgramGene<Double> program) {
 *     return Arrays.stream(SAMPLES)
 *         .mapToDouble(sample -> {
 *             final double x = sample[0];
 *             final double y = sample[1];
 *             final double result = program.eval(x);
 *             // Absolute deviation plus a small penalty for the program size.
 *             return Math.abs(y - result) + program.size()*0.00005;
 *         })
 *         .sum();
 * }
 * }</pre>
 *
 * The error function calculates the sum of the (absolute) differences between
 * the sample values and the values calculated by the evolved <em>program</em>
 * ({@code ProgramGene}). Since we prefer compact programs over complex ones, we
 * add a penalty for the program size (the number of nodes of the program
 * tree).
 *
 * <p>
 * <b>CAUTION</b>: <em>The penalty for the tree size must be small enough to
 * not dominate the error function. We still want to find an approximating
 * function and not the smallest possible one.</em>
 * </p>
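 *
 * To get a feeling for how the deviation and the size penalty interact, the
 * same calculation can be sketched without any <em>Jenetics</em> types. The
 * class {@code ErrorSketch} below is hypothetical: the candidate is a plain
 * {@code DoubleUnaryOperator}, and its node count is an assumed value which
 * is passed in explicitly instead of being read from a program tree.
 *
```java
import java.util.Arrays;
import java.util.function.DoubleUnaryOperator;

// Hypothetical stand-in for the Jenetics-based error function.
class ErrorSketch {

    // Penalty per tree node, the same constant as used in the text.
    static final double SIZE_PENALTY = 0.00005;

    static double error(
        final double[][] samples,
        final DoubleUnaryOperator candidate,
        final int size
    ) {
        return Arrays.stream(samples)
            .mapToDouble(sample -> {
                final double x = sample[0];
                final double y = sample[1];
                // Absolute deviation plus the size penalty, added once per sample.
                return Math.abs(y - candidate.applyAsDouble(x)) + size*SIZE_PENALTY;
            })
            .sum();
    }

    public static void main(final String[] args) {
        // Three sample points of f(x) = 4*x^3 - 3*x^2 + x.
        final double[][] samples = {{0.0, 0.0}, {0.1, 0.074}, {0.2, 0.112}};

        // The exact polynomial (assumed tree size: 17) only pays the penalty ...
        final double exact = error(samples, x -> 4*x*x*x - 3*x*x + x, 17);
        // ... while the one-node candidate "x" pays for its deviation.
        final double linear = error(samples, x -> x, 1);

        System.out.println("exact:  " + exact);
        System.out.println("linear: " + linear);
    }
}
```
 *
 * With a penalty of this magnitude, the exact but larger candidate still
 * gets a clearly lower error than the tiny, inaccurate one, which is the
 * behavior the caution above asks for.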
 *
 * After we have defined the error function, we need to define the proper
 * {@code Codec}.
 *
 * <pre>{@code
 * static final Codec<ProgramGene<Double>, ProgramGene<Double>> CODEC =
 *     Codec.of(
 *         Genotype.of(ProgramChromosome.of(
 *             // Program tree depth.
 *             5,
 *             // Chromosome validator.
 *             ch -> ch.root().size() <= 50,
 *             OPERATIONS,
 *             TERMINALS
 *         )),
 *         Genotype::gene
 *     );
 * }</pre>
 *
 * There are two particularities in the definition of the
 * {@code ProgramChromosome}:
 *
 * <ol>
 *     <li>Since we want to narrow the search space, we are limiting the depth
 *     of newly created program trees to 5.</li>
 *     <li>Because of crossover operations performed during evolution, the
 *     resulting programs can grow quite big. To prevent unlimited growth of
 *     the program trees, we mark programs with more than <em>50</em> nodes as
 *     invalid.</li>
 * </ol>
 *
 * Now we are ready to put everything together:
 *
 * <pre>{@code
 * public static void main(final String[] args) {
 *     final Engine<ProgramGene<Double>, Double> engine = Engine
 *         .builder(Polynomial::error, CODEC)
 *         .minimizing()
 *         .alterers(
 *             new SingleNodeCrossover<>(),
 *             new Mutator<>())
 *         .build();
 *
 *     final ProgramGene<Double> program = engine.stream()
 *         .limit(500)
 *         .collect(EvolutionResult.toBestGenotype())
 *         .gene();
 *
 *     System.out.println(Tree.toString(program));
 * }
 * }</pre>
 *
 * The GP is capable of finding the polynomial which created the sample data.
 * After a few tries, we got the following (correct) output program:
 *
 * <pre>
 * add
 * ├── mul
 * │   ├── x
 * │   └── sub
 * │       ├── 0.0
 * │       └── mul
 * │           ├── x
 * │           └── sub
 * │               ├── sub
 * │               │   ├── sub
 * │               │   │   ├── sub
 * │               │   │   │   ├── 3.0
 * │               │   │   │   └── x
 * │               │   │   └── x
 * │               │   └── x
 * │               └── x
 * └── x
 * </pre>
 *
 * This program can be reduced to <em>4*x^3 - 3*x^2 + x</em>, which is exactly
 * the polynomial that created the sample data.
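 *
 * The reduction can be followed step by step: the innermost chain of
 * subtractions is <em>3 - 4*x</em>; multiplied by <em>x</em> it becomes
 * <em>3*x - 4*x^2</em>; subtracted from zero it becomes <em>4*x^2 - 3*x</em>;
 * multiplied by <em>x</em> again and with the final <em>x</em> added, we get
 * <em>4*x^3 - 3*x^2 + x</em>. As a quick numerical cross-check, the tree can
 * also be written down as nested arithmetic in plain Java. The class
 * {@code TreeCheck} is hypothetical and does not use the <em>Jenetics</em>
 * API; the grid of 21 points on [-1, 1] is an assumption.
 *
```java
// Hypothetical check, independent of Jenetics: the evolved tree written as
// nested arithmetic and compared against the generating polynomial.
class TreeCheck {

    // add(mul(x, sub(0.0, mul(x, sub(sub(sub(sub(3.0, x), x), x), x)))), x)
    static double evolved(final double x) {
        // The chain of four subtractions, equal to 3 - 4*x.
        final double inner = (((3.0 - x) - x) - x) - x;
        return x*(0.0 - x*inner) + x;
    }

    // The generating function f(x) = 4*x^3 - 3*x^2 + x.
    static double polynomial(final double x) {
        return 4*x*x*x - 3*x*x + x;
    }

    public static void main(final String[] args) {
        double maxDiff = 0.0;
        // Assumption: 21 evenly spaced points on [-1, 1].
        for (int i = 0; i <= 20; ++i) {
            final double x = -1.0 + i*0.1;
            maxDiff = Math.max(maxDiff, Math.abs(evolved(x) - polynomial(x)));
        }
        // Any remaining difference is floating-point rounding only.
        System.out.println("max. difference: " + maxDiff);
    }
}
```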
 *
 * @author <a href="mailto:franz.wilhelmstoetter@gmail.com">Franz Wilhelmstötter</a>
 * @version 3.9
 * @since 3.9
 */
package io.jenetics.prog;