Prospectus

Statistical learning - old curriculum

Course
2022-2023

Admission Requirements

  • Familiarity with the generalized linear model

  • Ability to program in R (preferred) or in Python

  • Basic knowledge of university-level probability theory, calculus, and linear algebra

Description

Supervised statistical learning involves building a model for predicting an output (response, dependent) variable based on one or more input (predictor) variables. There are many areas where such a predictive question is of interest, for example Netflix recommendations, self-driving cars, predicting disease status or vulnerability, and finding early markers of diseases.

In unsupervised statistical learning, there are only input variables but no supervising output (dependent) variable; nevertheless, we can learn relationships and structures from such data using cluster analysis and methods for dimension reduction.

This course provides a firm theoretical basis for understanding and evaluating statistical learning techniques and teaches the skills to apply and evaluate them.
Statistical learning is closely related to the area of computer science called "machine learning", since many of its methods originated in computer science (pattern recognition, artificial intelligence).
The supervised learning methods discussed will include various classical and state-of-the-art methods: regularized regression (ridge, as well as the lasso and other L1 methods), naive Bayes, decision trees, logistic regression, splines, random forests, support vector machines, and deep learning. We explain interrelations between these methods and analyze their behavior.
We will also discuss model selection, where we consider both classical and state-of-the-art methods, including various forms of cross-validation and permutation tests.
With regard to unsupervised learning, we consider methods for clustering (e.g., the classic k-means, but also more advanced methods) and methods for dimension reduction (such as PCA, ICA, and others).
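
As a small illustration of two ideas named above, regularized regression and model selection by cross-validation, the sketch below fits a one-predictor ridge regression and picks the penalty by 5-fold cross-validation. It is not course material: the simulated data, the function names fit_ridge and cv_error, and the grid of penalty values are all assumptions made for the example.

```python
import random

def fit_ridge(xs, ys, lam):
    """One-predictor ridge regression without intercept.

    Returns the b that minimizes sum((y - b*x)^2) + lam * b^2.
    """
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

def cv_error(xs, ys, lam, k=5):
    """Mean squared prediction error estimated by k-fold cross-validation."""
    n = len(xs)
    total = 0.0
    for fold in range(k):
        # Every k-th observation, starting at `fold`, is held out as test data.
        test_idx = set(range(fold, n, k))
        train_x = [xs[i] for i in range(n) if i not in test_idx]
        train_y = [ys[i] for i in range(n) if i not in test_idx]
        b = fit_ridge(train_x, train_y, lam)
        total += sum((ys[i] - b * xs[i]) ** 2 for i in test_idx)
    return total / n

# Simulated data (an assumption for illustration): y = 2x + noise.
rng = random.Random(0)
xs = [rng.gauss(0, 1) for _ in range(50)]
ys = [2.0 * x + rng.gauss(0, 1) for x in xs]

# Hypothetical grid of penalty values; pick the one with the lowest CV error.
lambdas = [0.0, 0.1, 1.0, 10.0, 100.0]
errors = {lam: cv_error(xs, ys, lam) for lam in lambdas}
best_lam = min(errors, key=errors.get)
print(best_lam, errors[best_lam])
```

The held-out folds play the role of new data: a penalty that is too large shrinks the coefficient toward zero and predicts poorly on them, which is what the cross-validated error detects.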

Course objectives

An introduction to Statistical Learning.

Timetable

You will find the timetables for all courses and degree programmes of Leiden University in the tool MyTimetable (login). Any teaching activities that you have successfully registered for in MyStudyMap will automatically be displayed in MyTimetable. Any timetables that you add manually will be saved and automatically displayed the next time you sign in.

MyTimetable allows you to integrate your timetable with calendar apps on your smartphone, such as Outlook, Google Calendar, and Apple Calendar. Any timetable changes will be automatically synced with your calendar. If you wish, you can also receive an email notification of the change. You can turn notifications on in ‘Settings’ (after login).

For more information, watch the video or go to the 'help page' in MyTimetable. Please note: Joint Degree students Leiden/Delft have to merge their two different timetables into one. This video explains how to do this.

Mode of Instruction

Lectures and computer practicals. We will use Brightspace to share all course material.

Assessment method

The final grade is based on three components, each with a weight of 1/3:

1) a written structured assignment (individual, halfway through the course)
2) a written structured assignment (individual, at the end of the course)
3) an oral presentation on the analysis of a data set of the students' own choice (in groups, at the end of the course)

Students receive feedback on the assignments and the oral presentation during the lectures.

Reading list

Main book:

Hastie, T., Tibshirani, R., & Friedman, J. (2009). The elements of statistical learning (2nd edition). Springer. (available for free at https://web.stanford.edu/~hastie/Papers/ESLII.pdf)

Additional resources:

  • Bishop, C. M. (2006). Pattern recognition and machine learning (1st edition). Springer.

  • Murphy, K. P. (2012). Machine learning: A probabilistic perspective. MIT Press.

  • Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep learning. MIT Press. (available for free at http://www.deeplearningbook.org/)

  • Schölkopf, B., & Smola, A. J. (2002). Learning with kernels: Support vector machines, regularization, optimization, and beyond. MIT Press.

  • Vapnik, V. N. (1998). Statistical learning theory. Wiley.

Registration

From the academic year 2022-2023 onwards, every student has to register for courses with the new enrollment tool MyStudyMap. There are two registration periods per year: registration for the fall semester opens in July and registration for the spring semester opens in December. Please see this page for more information.

Please note that it is compulsory to both preregister and confirm your participation for every exam and retake. Not being registered for a course means that you are not allowed to participate in the final exam of the course. Confirming your exam participation is possible until ten days before the exam.

Extensive FAQs on MyStudyMap can be found here.

Contact

Julian Karch: j.d.karch@fsw.leidenuniv.nl

Remarks