Most approaches to machine learning assume the knowledge to be learned can be expressed as a set of attribute-value pairs, i.e., an entity and its properties. However, much richer knowledge can be found in the relationships between multiple entities. For example, computational chemistry applications look for patterns in the relationships between atoms and molecules, not just patterns in the properties of a set of atoms. Such analysis can discover chemical structures (e.g., a benzene ring), whereas attribute-value-based learning methods cannot express such knowledge. Relational learning is an area within machine learning that specifically addresses the challenge of learning relational knowledge.
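To make the limitation concrete, the following is a minimal sketch, not taken from the project materials. It assumes two hypothetical toy molecules reduced to their carbon skeletons (hydrogens omitted): a six-carbon ring and a six-carbon chain. The names `attribute_value_view` and `degree_sequence` are illustrative only. When each molecule is flattened into attribute-value pairs (a count per element), the two become indistinguishable; a feature computed from the bond relation separates them.

```python
from collections import Counter

# Hypothetical toy data: carbon skeletons only, hydrogens omitted.
ring_atoms = {1: "C", 2: "C", 3: "C", 4: "C", 5: "C", 6: "C"}
ring_bonds = {(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 1)}

chain_atoms = dict(ring_atoms)
chain_bonds = {(1, 2), (2, 3), (3, 4), (4, 5), (5, 6)}


def attribute_value_view(atoms):
    """Flatten a molecule into attribute-value pairs: one count per element."""
    return dict(Counter(atoms.values()))


def degree_sequence(atoms, bonds):
    """A simple relational feature: the sorted number of bonds per atom."""
    degree = {atom_id: 0 for atom_id in atoms}
    for a, b in bonds:
        degree[a] += 1
        degree[b] += 1
    return sorted(degree.values())


# Identical under the attribute-value view, different under the relational one.
same_attributes = attribute_value_view(ring_atoms) == attribute_value_view(chain_atoms)
same_relations = degree_sequence(ring_atoms, ring_bonds) == degree_sequence(chain_atoms, chain_bonds)
```

The degree sequence is only one hand-picked relational feature. A relational learner is meant to discover such structure itself rather than have it supplied.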

There are two main approaches to learning relational knowledge, each centered on one of the two main representations for relational knowledge: first-order logic and graphs. First-order logic is far more expressive than propositional logic, which is the representation used by attribute-value-based learning methods. Several methods exist for learning relational knowledge in the form of first-order logic, and these methods fall under the area of Inductive Logic Programming (ILP). Graphs (i.e., collections of nodes and links between nodes) are not quite as expressive as first-order logic, but they are an efficient and intuitive representation for relational knowledge. Several methods exist for learning relational knowledge in the form of graphs, and these methods fall under the area of Graph-based Data Mining or Graph-based Relational Learning.
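The two representations can be sketched side by side. The example below is illustrative only: the tuples loosely mirror ground first-order atoms such as atom(m1, a1, c) and bond(m1, a1, a2), but they are not the input format of PROGOL or SUBDUE, and the names `facts_to_graph` and `has_carbon_ring` are hypothetical. It converts the facts into a labeled graph and then checks, by brute-force path extension, whether the graph contains a ring of six carbon atoms.

```python
# Hypothetical facts for one molecule "m1": six carbon atoms bonded in a ring.
facts = [("atom", "m1", f"a{i}", "c") for i in range(1, 7)]
facts += [("bond", "m1", f"a{i}", f"a{i % 6 + 1}") for i in range(1, 7)]


def facts_to_graph(facts):
    """Build node labels and an undirected adjacency map from the facts."""
    labels = {}
    adjacency = {}
    for fact in facts:
        if fact[0] == "atom":
            _, _, atom_id, element = fact
            labels[atom_id] = element
            adjacency.setdefault(atom_id, set())
        elif fact[0] == "bond":
            _, _, a, b = fact
            adjacency.setdefault(a, set()).add(b)
            adjacency.setdefault(b, set()).add(a)
    return labels, adjacency


def has_carbon_ring(labels, adjacency, size=6):
    """Return True if a simple cycle of `size` carbon atoms exists (size >= 3)."""

    def extend(path):
        current = path[-1]
        if len(path) == size:
            # The path closes into a ring if its last atom bonds to its first.
            return path[0] in adjacency[current]
        for neighbor in adjacency[current]:
            if labels.get(neighbor) == "c" and neighbor not in path:
                if extend(path + [neighbor]):
                    return True
        return False

    return any(extend([start]) for start, element in labels.items() if element == "c")


labels, adjacency = facts_to_graph(facts)
ring_found = has_carbon_ring(labels, adjacency)
```

Here the ring pattern is supplied by hand. The point of the relational learners used in this project is to induce such patterns from examples, as logical clauses in the ILP case or as recurring subgraphs in the graph-based case.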

This project will explore the challenge of learning relational knowledge by experimenting with two relational learning methods, one logic-based and one graph-based. In addition to simple experiments with these learners, the project will apply the methods to an important problem in computational chemistry: learning structures common to chemical carcinogens.
This project has the following objectives:
  • Study the area of relational learning, specifically its two main approaches, which are based on logic-based and graph-based representations of knowledge.
  • Gain experience with two relational learning systems.
  • Apply relational learning to a real-world problem in computational chemistry.
Students should have a basic knowledge of discrete structures and logic. They will also need to be able to compile and run C code in a UNIX environment. Prior knowledge of first-order logical inference and of propositional logic-based (attribute-value-based) learning methods is recommended, but not essential to completing this project. The project will require students to download, compile, and execute the PROGOL ILP system and the SUBDUE graph-based relational learning system, which are available at the following URLs:
Most of the background necessary for completing this project can be found in the following text. Specifically, chapters 8 and 9 cover first-order logic and inference. Chapter 18 covers propositional learning approaches, and chapter 19 covers approaches to relational learning:
Specific background on the two relational learning systems (PROGOL and SUBDUE) can be found in the following references:
The detailed project description is available in the PDF file LearningRelationalKnowledge.pdf. You will need the free Adobe Acrobat Reader to view this file.
This project is customizable to accommodate different approaches to teaching and different implementations. Additional exercises are also included for students seeking more extended challenges.
A sample syllabus used at Washington State University when this project was assigned is available at:
Syllabus for AI Course at Washington State University

Additional readings are included in the Background section above.