Overview

Genetic programming (GP) is perhaps the most general of local search algorithms. It is particularly useful in solving design and optimization problems. Strengths of GP include the ability to work with heterogeneous data, and that a relatively low amount of information needs to be specified to achieve success. A number of patents now exist that were achieved using genetic programming. For this reason GP has been touted as having the ability to develop solutions to problems at a level comparable to humans — or, to exhibit human-competitive intelligence. It has been applied to a wide range of real-world problems, including antenna design, analog and digital circuit design, financial analysis, bioinformatics, robotics and games. A general-purpose genetic programming framework could therefore be called a general-purpose problem-solver.

Machine learning (ML) is defined by Mitchell as “the study of computer algorithms that improve automatically through experience”. A much more romantic and elusive definition is due to Samuel, who used the phrase to mean computers programming themselves. Genetic programming aspires to do exactly that by inducing a population of computer programs that improve automatically as they experience the data on which they are trained. Accordingly, GP is part of the very large body of research called machine-learning.
The goal of this project is to learn about machine learning (problem formulation, search, and knowledge representation) by building a basic genetic programming framework and using it to solve problems. The framework will be build piece-by-piece, as concepts are introduced. Once the framework is fully functional, students will formulate a machine learning problem of interest, cast it as a GP problem, and use the framework to solve it. Suggestions and pointers are given in the relevant section. The learning objectives of the project are:
  • Understanding the basics of evolutionary search.
  • Understanding search operators.
  • Understanding the capabilities of genetic programming.
  • Understanding the importance of proper knowledge representation.
  • Learning to formulate problems for machine-learning and cast them as GP problems.
The project consists of deliverables that help incrementally build a GP framework, and finally solve a problem using it.
Students should have basic knowledge of algebra and statistics. Proficiency in data structures and a high-level programming language, such as C/C++, Java, or Lisp, is required.
For an introduction to evolutionary search/genetic algorithms, students can read the corresponding chapter in any good AI book. For example, one might assign chapter 4 of:
More in-depth readings:
  • Banzhaf, W., P. Nordin, R. E. Keller, F. D. Francone. Genetic Programming: An introduction. Morgan Kaufmann Publishers, 1998. ISBN-13: 978-1-55860-510-7.
  • Koza, John R. Genetic Programming: On the Programming of Computers by Means of Natural Selection. Cambridge, MA: The MIT Press 1992.
  • Langdon, W.B. Genetic Programming and Data Structures: Genetic Programming + Data Structures = Automatic Programming! Springer, 1998.
  • Langdon, W.B., and P. Poli. Foundations of Genetic Programming. Springer, 2002.
  • Mitchell, T. Machine Learning. McGraw-Hill, New York, 1996.
  • Goldberg, D. E. Genetic Algorithms in Search, Optimization, and Machine Learning. Reading, MA: Addison-Wesley 1989.
The detailed project description is available in the PDF file GPPSProject.pdf. You will need the free Adobe Acrobat Reader to view this file.
This project is customizable to accommodate different approaches to teaching and different implementations. Additional exercises are also included for students seeking more extended challenges.
A sample syllabus used at Oklahoma State University when this project was assigned is available at:
Syllabus for AI Course at Oklahoma State University

Additional readings are included in the Background section above.