COSC 285: Data Mining

Course Description:

This course covers concepts and techniques in data mining, including both supervised and unsupervised algorithms. It also addresses issues in data preprocessing. Students learn the material by building data mining models, applying data preprocessing techniques, running experiments, and analyzing the results.

Prerequisite:

Data Structures, and a comfortable working knowledge of programming.

Textbook:

You should have one of these textbooks; the choice is yours. Both are very good books:

  • Jiawei Han and Micheline Kamber, Data Mining: Concepts and Techniques, Morgan Kaufmann
  • Pang-Ning Tan, Michael Steinbach, Vipin Kumar, Introduction to Data Mining, Addison Wesley

Teaching Assistant:

TBD

Office hours: see Blackboard.

Grading & Due Dates (tentative; will be finalized by the first day of class):

  • Projects (individual): 40%. 3-4 projects; the outcome of each project is a data mining engine.
  • Research paper presentation (TBD): 10% (TBD)
  • Exams (2-3 exams): 50%-60% (TBD)

Tentative Course Outline:

Introduction to Data Mining: Knowledge Discovery, Data Warehousing, Data Mining

Data preprocessing

Intro to Classification

Evaluation

Naive Bayes

Neural Networks

Decision Tree

Rule Based Classification

K-Nearest Neighbor

Support Vector Machine

Ensemble Methods

Association rules

Cluster analysis

Text Categorization

Student Presentations

Late Assignment Policy:

Will be posted on the class syllabus by the first week of each semester.

Academic Integrity:

Visit the Honor System Website at http://gervaseprograms.georgetown.edu/honor/