As computers and the Internet have become increasingly popular, malicious activity in cyberspace has increased significantly. Intrusion detection is an area of computer security that focuses on detecting these attacks reliably. Intrusion detection systems (IDS) usually have a knowledge base containing rules that characterize attacks. Building such a knowledge base manually can be time-consuming. Machine learning can help build it more efficiently.

In order to detect attacks, we need to differentiate between instances of normal and attack behavior. Based on previous instances of normal and attack behavior, a machine learning algorithm can learn how to differentiate between the two types of behavior and represent that knowledge in a form that can be used to predict whether current instances are malicious.

This project aims to investigate machine learning techniques for detecting attacks/intrusions.
More specifically, the objectives are:
  • understanding that machine learning can be achieved from historical data (experience)
  • understanding that machine learning algorithms can be applied to computer security
  • understanding the learning task of trying to detect attacks
  • understanding a decision-tree learning algorithm
  • gaining a better understanding of search and knowledge representation
  • evaluating machine learning algorithms

Students should understand fundamental computer science concepts in Data Structures and Algorithms (CSE 2010) before taking Introduction to Artificial Intelligence (CSE 4301).

In the AI course, students should have learned the fundamental concepts in search and knowledge representation, as well as a decision-tree learning algorithm such as the one in Russell & Norvig (2003, pp. 653-660).

The decision-tree learning algorithm in Russell & Norvig (2003, pp. 653-660) is based on Quinlan's (1986) ID3 algorithm. Given a dataset in which each data instance is labeled with a class, the algorithm recursively finds the attribute that can "best" split the instances into homogeneous subsets with respect to the class labels. The learned tree can then be used to predict the class labels of instances that were not used during the learning process.
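
As a rough illustration of the recursive splitting described above, the following Python sketch builds a small ID3-style tree. It assumes that "best" means highest information gain, as in Quinlan's ID3, and that all attributes are categorical. The function names, the dictionary-based tree representation, and the toy connection records (attributes "protocol" and "failed_logins") are hypothetical and are not taken from the project description.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(rows, labels, attr):
    """Entropy reduction obtained by splitting the instances on attr."""
    subsets = {}
    for row, label in zip(rows, labels):
        subsets.setdefault(row[attr], []).append(label)
    remainder = sum(len(s) / len(labels) * entropy(s)
                    for s in subsets.values())
    return entropy(labels) - remainder

def learn_tree(rows, labels, attrs):
    """Return a nested-dict decision tree, or a class label at a leaf."""
    if len(set(labels)) == 1:          # homogeneous subset: make a leaf
        return labels[0]
    majority = Counter(labels).most_common(1)[0][0]
    if not attrs:                      # nothing left to split on
        return majority
    best = max(attrs, key=lambda a: information_gain(rows, labels, a))
    remaining = [a for a in attrs if a != best]
    node = {"attr": best, "default": majority, "children": {}}
    for value in sorted({row[best] for row in rows}):
        pairs = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
        sub_rows = [r for r, _ in pairs]
        sub_labels = [l for _, l in pairs]
        node["children"][value] = learn_tree(sub_rows, sub_labels, remaining)
    return node

def predict(tree, row):
    """Follow the tree; fall back to the majority label for unseen values."""
    while isinstance(tree, dict):
        tree = tree["children"].get(row[tree["attr"]], tree["default"])
    return tree

# Hypothetical toy data: each record is one labeled instance.
rows = [
    {"protocol": "tcp",  "failed_logins": "high"},
    {"protocol": "tcp",  "failed_logins": "low"},
    {"protocol": "udp",  "failed_logins": "low"},
    {"protocol": "udp",  "failed_logins": "high"},
    {"protocol": "icmp", "failed_logins": "low"},
    {"protocol": "tcp",  "failed_logins": "high"},
]
labels = ["attack", "normal", "normal", "attack", "normal", "attack"]

tree = learn_tree(rows, labels, ["protocol", "failed_logins"])
unseen = {"protocol": "icmp", "failed_logins": "high"}
print(tree["attr"], predict(tree, unseen))
```

In this toy dataset the "failed_logins" attribute separates the two classes perfectly, so it is chosen at the root and the tree has a single level. Real intrusion data would also need handling for numeric attributes and noisy labels, which this sketch omits.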

The detailed project description is available in the PDF file Machine Learning for Computer Seciruty.pdf. You will need the free Adobe Acrobat Reader to view this file.

Russell, S. & Norvig, P. (2003). Artificial Intelligence: A Modern Approach. Second Edition. Prentice Hall, Upper Saddle River, NJ.

Quinlan, J. R. (1986). Induction of Decision Trees. Machine Learning, 1:81-106.