Analytical Methodologies for Big Data (KSE526)

Fall 2018

  1. Instructor:
    Jae-Gil Lee (Office: E2 2203, Phone: x 1617, E-mail: jaegil(at)
  2. Time and Place:
    10:30 a.m. ~ 12:00 p.m. Monday and Wednesday, E2 1122
  3. Facebook Group: 2018 KSE526 
  4. Course Summary:
    This course covers basic analytical methodologies for big data that are essential for data scientists. Big data analytics requires extending existing algorithms so that they can handle big data. The instructor will first teach MapReduce, a representative framework for processing big data, and then methodologies for adapting data mining algorithms to MapReduce. Students will also learn how to implement those algorithms using Apache Hadoop. As a result, students will acquire the basic skills needed to design algorithms for big data analytics.
  5. Prerequisites:
    • Data Mining and Knowledge Discovery (KSE525) or equivalent course
    • Java programming skills (this is a programming-intensive course)
  6. Textbooks:
    • Main textbook: Tom White, Hadoop: The Definitive Guide, 4th edition, O'Reilly, 2015.
    • Main textbook: Mahmoud Parsian, Data Algorithms: Recipes for Scaling Up with Hadoop and Spark, O'Reilly, 2015.
    • Auxiliary textbook: Donald Miner and Adam Shook, MapReduce Design Patterns: Building Effective Algorithms and Analytics for Hadoop and Other Systems, O'Reilly, 2013.
  7. Course Requirements:
    • Three programming assignments
  8. Grading Policy:
    • Midterm exam: 30% (class time on October 17)
    • Final exam: 30% (class time on December 12)
    • Programming assignments: 30% (late-submission penalty: 20%)
    • Class activity (quizzes and/or impromptu questions): 10%
    • Class participation: optional (1 point is deducted for each absence beyond the third)
  9. Teaching Materials:
  10. Programming Assignments:
  11. Video Lectures:
    Students enrolled in this course can watch the recorded video lectures, which are available here.
  12. Teaching Assistants:
    • Hwanjun Song (E-mail: songhwanjun(at)
    • Sejin Kim (E-mail: ksj614(at)
    • Minseok Kim (E-mail: minseokkim(at)
  13. Syllabus: download