CS780 / CS880: Introduction to Machine Learning

When and Where

Tue & Thu, 12:40 pm - 2:00 pm Kingsbury N133

See class overview for more information on textbooks, syllabus, assignments, office hours, and grading.

Assignments

Please use Piazza for questions about assignments.

Assignment Due Date
Assignment 1 2/14/17 at 12:40PM
Assignment 2 2/21/17 at 12:40PM
Assignment 3 3/09/17 at 12:40PM
Assignment 4 4/06/17 at 12:40PM
Assignment 5 4/20/17 at 12:40PM

Syllabus

Date Slides Reading Notebooks
1/26 Statistical learning ISL 1,2 (html) (RMD)
1/31 Linear regression I ISL 3.1-2 (html) (RMD)
2/02 No class
2/07 Linear regression II ISL 3.3-6
2/09 No class
2/14 Logistic regression ISL 4.1-3 (html)(RMD)
2/16 LDA, QDA, Bayes ISL 4.4-6
2/21 Cross-validation ISL 5
2/23 Model selection ISL 6.1-6.2
2/28 Dimensionality ISL 6.3-6.4
3/02 PCA ML/MAP ISL 10.1-2 ML PCA
3/06 Clustering and EM ISL 10.3-5 kmeans
3/09 Midterm Review ISL 1-6, 10
3/21 ** Midterm **
3/23 Linear algebra LAO 1.1-2,2,3
3/28 LA in ML LAR linear algebra
3/30 LA in ML LAR linear algebra
4/04 SVM ISL 9
4/06 Decision trees and boosting ISL 8
4/11 Nonlinear methods ISL 7
4/13 Recommender systems
4/18 Bayes nets MLP 10
4/20 Reinforcement learning RL
4/25 Final exam review
4/27 Project presentations (Graduate)
5/02 Deep learning and big data DL
5/04 Project presentations (Undergraduate)

Project

See the project overview for details on the deliverables. Each deliverable is due by the end of the day (midnight) on the date listed.

Date Deliverable Page Limit
2/24 Project description and data sources 1
3/07 Evaluation methodology 1
3/23 Method and literature overview 2
4/06 Preliminary results 3
4/27 Final report 7

Exams

See the practice questions for examples of what you should be able to answer to be ready for the midterm and final exams.

Date Exam
3/21 Midterm (take home)

Textbooks

Main reference:

ISL: James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). An Introduction to Statistical Learning. Springer Texts in Statistics.

More in-depth material:

ESL: Hastie, T., Tibshirani, R., & Friedman, J. (2009). The Elements of Statistical Learning. Springer Series in Statistics (2nd ed.)

See class overview for more information on the textbook.

Class Content

The goal of this class is to teach you how to use machine learning to understand data and make predictions in practice. The class will cover the fundamental concepts and algorithms in machine learning and data science as well as a wide variety of practical algorithms. The main topics we will cover are:

  • The maximum likelihood principle
  • Regression: Linear regression
  • Classification: Logistic regression and linear discriminant analysis
  • Cross-validation, bootstrap, and over-fitting
  • Model selection: Regularization, Lasso
  • Nonlinear models: Decision trees, Support vector machines
  • Unsupervised: Principal component analysis, k-means
  • Advanced topics: Bayes nets and deep learning

The graduate version of the class will cover the same topics in greater depth.

Programming Language

The class will involve hands-on data analysis using machine learning methods. The recommended language for programming assignments is R, which is an excellent tool for statistical analysis and machine learning. No prior knowledge of R is needed or expected; the book and lectures will provide a gentle introduction to the language. Experienced students may also choose alternatives such as Python or Matlab.

Pre-requisites

Basic programming skills (scripting languages like Python are OK) and some familiarity with statistics and calculus. If in doubt, please email me.