Data Mining and Knowledge Discovery (KSE525)

Spring 2019


  1. Instructor:
    Jae-Gil Lee (Office: E2 2203, Phone: x 1617, E-mail: jaegil(at)kaist.ac.kr)
  2. Time and Place:
    9:00 a.m. to 10:30 a.m., Monday and Wednesday, E11 103
  3. KakaoTalk Open Chat Room: 2019S KSE525/IE646 
  4. Course Summary:
    Data mining plays an important role in discovering useful knowledge from huge amounts of data. This course teaches the basic concepts and methods of data mining. Specifically, it covers frequent patterns and associations, classification and prediction, and cluster analysis. The main goal of the course is to give students broad knowledge of various data mining methods without confining them to a specific domain. The course is intended as a prerequisite for advanced data mining courses and is thus suitable for both undergraduate and graduate students. Students will understand how data mining can be exploited to discover useful knowledge.
  5. Textbooks:
    • Main textbook: Jiawei Han, Micheline Kamber, and Jian Pei, Data Mining: Concepts and Techniques, 3rd edition, Morgan Kaufmann, 2011.
    • Auxiliary textbook: Yuxi Liu, Python Machine Learning By Example: The Easiest Way to Get into Machine Learning, Packt Publishing, 2017.
    • Auxiliary textbook: John D. Kelleher, Brian Mac Namee, and Aoife D'Arcy, Fundamentals of Machine Learning for Predictive Data Analytics, MIT Press, 2015.
  6. Grading Policy:
    • Midterm exam: 30%
    • Final exam: 40%
    • Assignments: 20% (late-submission penalty: 20%)
    • Project: 10%
    • Class participation: optional (1 point is deducted for each absence beyond the third)
  7. Teaching Materials:
    • Introduction (February 25, 27): download
    • Getting to Know Your Data (March 4, 6): download
    • Data Preprocessing (March 11, 13): download
    • Association Analysis (Basic) (March 18, 20, 25): download
    • Association Analysis (Advanced) (March 27): download
    • Data Mining in Python (April 1, 3): download
    • Classification (Decision Tree) (April 8, 10, 22): download
    • Classification (Bayes, Lazy) (April 24): download
    • Classification (SVM, Ensemble) (April 29, May 1, 8): download
    • Clustering (Basic 1) (May 13, 15): download
    • Clustering (Basic 2) (May 20, 22, 27): download
    • Case Studies (May 29, June 3): TBD
    • Conclusion (June 5): TBD
  8. Additional Materials: full list
  9. Online Lectures:
    Students enrolled in this course can watch the recorded video lectures, which are available here.
  10. Assignments:
    • Assignment #1 (released, due: 11:59 p.m. on March 27): Available on KLMS
    • Assignment #2 (released, due: 11:59 p.m. on April 10): Available on KLMS
    • Assignment #3 (released, due: 11:59 p.m. on May 15): Available on KLMS
    • Assignment #4 (released, due: 11:59 p.m. on May 29): Available on KLMS
  11. Project: TBD (release: May 29)
  12. Teaching Assistants:
    • Susik Yoon (E-mail: susikyoon(at)kaist.ac.kr)
    • Sejin Kim (E-mail: ksj614(at)kaist.ac.kr)
    • Minseok Kim (E-mail: minseokkim(at)kaist.ac.kr)
  13. Syllabus: download