









Overview
Most approaches to machine learning assume that the knowledge to be learned can be expressed as a set of attribute-value pairs, i.e., an entity and its properties. However, much richer knowledge can be found in the relationships among multiple entities. For example, computational chemistry applications look for patterns in the relationships between atoms and molecules, not just patterns in the properties of a set of atoms. Such analysis can discover chemical structures (e.g., a benzene ring), whereas attribute-value-based learning methods cannot express such knowledge. Relational learning is an area within machine learning that specifically addresses the challenge of learning relational knowledge.
There are two main approaches to learning relational knowledge, each centered on one of the two main representations for relational knowledge: first-order logic and graphs. First-order logic is far more expressive than propositional logic, which is the representation used by attribute-value-based learning methods. Several methods exist for learning relational knowledge in the form of first-order logic, and these methods fall under the area of Inductive Logic Programming (ILP). Graphs (i.e., a collection of nodes and links between nodes) are not quite as expressive as first-order logic, but they are an efficient and intuitive representation for relational knowledge. Several methods exist for learning relational knowledge in the form of graphs, and these methods fall under the area of Graph-based Data Mining or Graph-based Relational Learning.
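To make the contrast between the representations concrete, the following minimal Python sketch encodes a benzene-like ring of six carbon atoms in three ways: as flat attribute-value counts, as first-order-style ground facts, and as a graph. It is only an illustration. The names `atoms`, `bonds`, `facts`, `build_graph` and `has_carbon_ring` are hypothetical and do not reflect the input formats or algorithms of PROGOL or SUBDUE.

```python
from collections import Counter

# Hypothetical toy molecule: six carbon atoms bonded in a ring.
atoms = {f"c{i}": "C" for i in range(1, 7)}
bonds = [("c1", "c2"), ("c2", "c3"), ("c3", "c4"),
         ("c4", "c5"), ("c5", "c6"), ("c6", "c1")]

# Attribute-value view: element counts only; the ring structure is lost.
element_counts = Counter(atoms.values())

# First-order-style view: ground facts such as atom(c1, C) and bond(c1, c2).
facts = [("atom", name, element) for name, element in atoms.items()]
facts += [("bond", a, b) for a, b in bonds]


def build_graph(bond_list):
    """Graph view: undirected adjacency sets built from the bond list."""
    graph = {}
    for a, b in bond_list:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def has_carbon_ring(atom_elements, graph, size=6):
    """Return True if the graph contains a simple cycle of `size` carbon atoms."""
    def extend(path):
        last = path[-1]
        if len(path) == size:
            # The cycle closes if the last atom is bonded to the first one.
            return path[0] in graph[last]
        return any(
            extend(path + [neighbor])
            for neighbor in graph[last]
            if neighbor not in path and atom_elements[neighbor] == "C"
        )
    return any(extend([start]) for start in graph if atom_elements[start] == "C")


ring = build_graph(bonds)
chain = build_graph(bonds[:-1])  # same atoms, but the ring is opened into a chain

print(element_counts)                  # identical for the ring and the chain
print(has_carbon_ring(atoms, ring))    # relational structure: ring present
print(has_carbon_ring(atoms, chain))   # relational structure: no ring
```

The ring and the chain have identical element counts, so the attribute-value view cannot tell them apart. The facts and the graph both retain the bond relationships that distinguish them.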
This project will explore the challenge of learning relational knowledge by experimenting with two relational learning methods, one logic-based and one graph-based. In addition to simple experiments with these learners, the project will apply the methods to an important problem in computational chemistry, namely, learning structures common to chemical carcinogens.













This project has the following objectives:
 Study the area of relational learning, specifically the two main approaches based on logic-based and graph-based representations of knowledge.
 Gain experience with two relational learning systems.
 Apply relational learning to a real-world problem in computational chemistry.













Students should have a basic knowledge of discrete structures and logic. They will also need the ability to compile and run C code in a UNIX environment. Prior knowledge of first-order logical inference and propositional logic-based (attribute-value-based) learning methods is recommended, but not essential to completing this project. The project will require students to download, compile and execute the PROGOL ILP system and the SUBDUE graph-based relational learning system, which are available at the following URLs:













Most of the background necessary for completing this project can be found in the following text. Specifically, Chapters 8 and 9 cover first-order logic and inference, Chapter 18 covers propositional learning approaches, and Chapter 19 covers approaches to relational learning:
Specific background on the two relational learning systems (PROGOL and SUBDUE) can be found in the following references:













The detailed project description is available in the PDF file LearningRelationalKnowledge.pdf. You will need the free Adobe Acrobat Reader to view this file.


This project is customizable to accommodate different approaches to teaching and different implementations. Additional exercises are also included for students seeking more extended challenges.













