











The aim of this project is to investigate approaches and algorithms needed to develop
an intelligent web browser that is able to adjust automatically to user preferences.
The project focuses on the use of Decision Tree learning to create models of web
users.
The learning objectives of the project are:
- Learning the basics of Information Retrieval and Machine Learning;
- Gaining experience in using recent software applications in these areas; and
- Better understanding of fundamental AI concepts such as Knowledge Representation and Search.
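
To make the idea of Decision Tree learning for user modeling more concrete, the following is a minimal sketch of the textbook information-gain criterion that ID3-style learners use to pick a splitting attribute. It is an illustration only and not necessarily what Weka computes internally. The class name `InfoGainSketch`, the toy data, and the attribute meanings (whether a page contains a given term, and whether the user liked the page) are all hypothetical.

```java
/** Toy information-gain computation; the data and all names are hypothetical. */
final class InfoGainSketch {

    /** Shannon entropy, in bits, of a two-class sample with the given counts. */
    static double entropy(int positives, int negatives) {
        int total = positives + negatives;
        if (total == 0) {
            return 0.0;
        }
        double h = 0.0;
        for (int count : new int[] {positives, negatives}) {
            if (count > 0) {
                double p = (double) count / total;
                h -= p * (Math.log(p) / Math.log(2));
            }
        }
        return h;
    }

    /**
     * Information gain from splitting the rows on the boolean attribute at
     * index attr. The class label is assumed to be the last column of each row.
     */
    static double informationGain(boolean[][] rows, int attr) {
        int label = rows[0].length - 1;
        // counts[v][c]: v = attribute value (0 = false, 1 = true),
        //               c = class label   (0 = negative, 1 = positive)
        int[][] counts = new int[2][2];
        for (boolean[] row : rows) {
            counts[row[attr] ? 1 : 0][row[label] ? 1 : 0]++;
        }
        int positives = counts[0][1] + counts[1][1];
        int negatives = counts[0][0] + counts[1][0];
        // Weighted average entropy of the two subsets produced by the split.
        double remainder = 0.0;
        for (int v = 0; v < 2; v++) {
            int subsetSize = counts[v][0] + counts[v][1];
            remainder += (double) subsetSize / rows.length
                    * entropy(counts[v][1], counts[v][0]);
        }
        return entropy(positives, negatives) - remainder;
    }

    public static void main(String[] args) {
        // Each row: {pageContainsTermA, pageContainsTermB, userLikedPage}
        boolean[][] pages = {
            {true, false, true},
            {true, true, true},
            {false, true, false},
            {false, false, false},
        };
        System.out.println("gain(term A) = " + informationGain(pages, 0));
        System.out.println("gain(term B) = " + informationGain(pages, 1));
    }
}
```

In this toy data the first term separates liked from disliked pages perfectly, while the second carries no information about the label, so a learner using this criterion would split on the first term.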













Students should have a basic knowledge of algebra, discrete mathematics, and
statistics. A data structures course is also a prerequisite. Experience with Java is
not required but would help, because the basic tool needed for this project, the
Weka Machine Learning system, is implemented in Java. Before starting the
project, students may want to cover the recommended reading to better understand
the fundamental concepts of Information Retrieval and Machine Learning.
In support of the exercises and project, students should download Weka, which is
available at
http://www.cs.waikato.ac.nz/~ml/weka/index.html, and a text corpus
analysis package such as TextSTAT, freeware available from
http://www.niederlandistik.fu-berlin.de/textstat/software-en.html.
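
To give a sense of the kind of output a text corpus analysis package produces, the sketch below counts word frequencies in a piece of text. It is a simplified stand-in, not TextSTAT's actual implementation. The class name `TermFrequencySketch`, the sample sentence, and the tokenization rule (lowercasing and splitting on anything that is not a letter) are assumptions made for illustration.

```java
import java.util.Map;
import java.util.TreeMap;

/** Toy word-frequency counter; names and tokenization rule are hypothetical. */
final class TermFrequencySketch {

    /** Counts occurrences of each lowercase word; words are maximal runs of letters. */
    static Map<String, Integer> termFrequencies(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String token : text.toLowerCase().split("[^\\p{L}]+")) {
            // split() can yield an empty first token when the text starts
            // with a non-letter, so skip empty strings.
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        String page = "Machine learning and web search: learning models of web users.";
        System.out.println(termFrequencies(page));
    }
}
```

Counts like these could then be turned into per-document attributes, for example whether a term occurs in a page, which is the form of input a decision tree learner expects.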













For an introduction to machine learning and to decision tree learning, students
can read the corresponding chapter in any good AI book. For example, one might assign
chapter 18 of:
To understand the basic concepts of Information Retrieval, Web Search, and Web document
classification, students are encouraged to read chapters 3 and 5 of:
- Soumen Chakrabarti, Mining the Web: Discovering Knowledge from Hypertext Data, Morgan Kaufmann Publishers, 2002.
The following book is recommended for the basic principles of Machine Learning and
practical information about the Weka Machine Learning system:
- Ian H. Witten and Eibe Frank, Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations, Morgan Kaufmann, 2000.
Chapters 2, 3, and 4 give important background information. Chapter 8 discusses the
use of the Weka system and is also available online at
http://weka.sourceforge.net/wekadoc/index.php/en%3APrimer
Students should read chapter 8 and are required to install and use the Weka system.
They are also encouraged to experiment with the examples provided in the book and in
the software package.













The detailed project description is available in the PDF file UserProfiling.pdf. You will need the free Adobe Acrobat Reader to view this file.


This project is customizable to accommodate different approaches to teaching and different implementations. Additional exercises are also included for students seeking more extended challenges.













