Data Science emerged at the crossroads of many different fields, including statistics, machine learning, natural language processing, and databases. This course serves a dual purpose. On the one hand, several speakers will introduce students to topics that they will encounter during the Data Science master's specialisation or later as professional data scientists. Company visits are included in the schedule, familiarising students with what they can expect in practice. On the other hand, the course teaches basic technical skills, namely basic programming in Python and the use of popular data analysis libraries.
The goal is to gain a better understanding of the very diverse field of data science, and to be able to write small but readable and robust Python programs to solve statistical problems.
Mode of Instruction
There are two kinds of lectures: talks by invited speakers, together with excursions, are interwoven with technical lectures. Each technical lecture is followed by a practical session. During the practicals, homework in the form of a Jupyter notebook is distributed via Brightspace. The homework may include some additional theory, and can contain both theoretical questions and practical exercises that have to be completed in the form of Python programs.
See the Leiden University students' website for the Statistical Science programme -> Schedules 2020-2021
Completion of the course depends on three factors: (1) attending the company visits and guest lectures, (2) scoring at least 5/10 on the homework problems and (3) scoring at least 5/10 on the written exam. On meeting these requirements, the final grade is determined as the average of the homework and exam grades. The written exam and homework problems require the use of a laptop.
If the exam does not take place in the Snellius building, then an announcement will be sent via Brightspace.
For successful completion of the course, neither the average homework grade nor the exam grade may be below 5.
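As an illustration only, the grading rule described above can be sketched as a small Python function. The function name `final_grade` and the parameter `attended_all` are hypothetical, and the sketch assumes a 0-10 grading scale, an unweighted average, and no rounding, none of which is specified in this description.

```python
from typing import Optional


def final_grade(homework: float, exam: float, attended_all: bool) -> Optional[float]:
    """Return the final grade, or None if a completion requirement is not met.

    Assumptions (not stated in the course description): grades are on a
    0-10 scale, the average is unweighted, and no rounding is applied.
    """
    passing = 5.0
    # Requirement 1: attendance at the company visits and guest lectures.
    if not attended_all:
        return None
    # Requirements 2 and 3: homework and exam must each be at least 5.
    if homework < passing or exam < passing:
        return None
    # All requirements met: average the homework and exam grades.
    return (homework + exam) / 2


print(final_grade(7.0, 6.0, attended_all=True))  # 6.5
print(final_grade(8.0, 4.5, attended_all=True))  # None (exam below 5)
```

Returning `None` rather than a number makes it explicit that no final grade is determined when a requirement is not met.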
There is no compulsory literature. The course involves programming in Python and some statistical topics. For students who desire backup material, the following textbooks are recommended. (Note that some of these textbooks may be compulsory reading for other courses.)
John A. Rice. Mathematical Statistics and Data Analysis. Brooks/Cole
Norman Matloff. The Art of R Programming. No Starch Press
Allen B. Downey. Think Stats. O'Reilly. (Freely available online.)
Enroll in Brightspace for the course materials and course updates.
To obtain a grade and the ECTS for the course, sign up for the (re-)exam in uSis no later than ten calendar days before the (re-)exam takes place. Note that students are expected to participate actively in all activities of the programme and therefore to register for and take the first exam opportunity.
Exchange and Study Abroad students, please see the Prospective students website for information on how to apply.
Laura Zwep: l [dot] b [dot] zwep [at] lacdr [dot] leidenuniv [dot] nl
This is a compulsory course of the Master Statistical Science with the specialisation Data Science.