The student is expected to be familiar with key concepts of data mining or machine learning at the Bachelor’s level, and to have been exposed to basic statistics. Necessary key concepts in data mining include classification, regression, clustering, cross-validation, overfitting, sampling, feature construction, and ROC curves. The Leiden CS Bachelor’s course on Data Mining provides such a basic understanding, but many similar courses exist elsewhere.
The course aims to outline the (growing) role of data science in sports. Thanks to various technological advances in recent years, it is becoming easier to collect substantial datasets in sports, and practitioners are beginning to see the added benefit of collecting and analysing such data to achieve their goals. These goals might include optimizing one’s training efforts, personalising training schedules to individual needs, understanding injury incidence and its prevention, optimizing the strategy in team sports, and making data-driven tactical decisions. The course offers an introduction to the key concepts of sports science (including physiology and testing methods), an understanding of the basic questions to be answered in a range of sports disciplines, the types of data to expect or collect, and the specific data science techniques that are applied in these settings. The data science aspect of the course will focus on time series techniques, feature construction, aggregation, and subgroup discovery.
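To give a flavour of what feature construction and aggregation on sports time series can look like, here is a minimal sketch in Python. It is not course material: the training log, the load values, and the function name `weekly_features` are all hypothetical and chosen only for illustration. The sketch groups per-session training loads by ISO week and derives a few summary features per week.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Hypothetical training log: (session date, training load in arbitrary units).
sessions = [
    (date(2020, 2, 3), 320.0),
    (date(2020, 2, 5), 410.0),
    (date(2020, 2, 8), 150.0),
    (date(2020, 2, 10), 500.0),
    (date(2020, 2, 13), 280.0),
]


def weekly_features(log):
    """Aggregate per-session loads into one feature row per ISO week."""
    by_week = defaultdict(list)
    for day, load in log:
        iso = day.isocalendar()  # (ISO year, ISO week number, weekday)
        by_week[(iso[0], iso[1])].append(load)
    # One row of constructed features per (year, week) key, in time order.
    return {
        week: {
            "n_sessions": len(loads),
            "total_load": sum(loads),
            "mean_load": mean(loads),
            "max_load": max(loads),
        }
        for week, loads in sorted(by_week.items())
    }


features = weekly_features(sessions)
for week, row in features.items():
    print(week, row)
```

Rows of this kind (one per athlete per week) are a typical starting point for later modelling steps; which aggregates are actually useful depends on the sport and the question being asked.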
The course aims to provide the student with sufficient knowledge of both sports science and specific data science techniques to start a career in the field of sports, for example at a specific sports federation, at a commercial club (e.g. football, cycling, speed skating) or as a consultant across sports disciplines. The student should be able to execute a data science project in a given sports discipline independently.
The course will run for the entire Spring semester, from February to June. In a weekly timeslot, lectures will be given, papers will be discussed, training facilities will be visited, and you will work on a hands-on data science project in a specific sport.
Mode of instruction
The course will offer a rich mixture of instruction modes, including:
introductory lectures on sports science and specific data science techniques
seminar-like presentation and discussion of selected papers
guest lectures from invited experts from the fields of football, tennis, speed skating, volleyball, …
field trips to various training locations, including Thialf (largest Dutch ice stadium), KNVB campus (national football facility) and the Amsterdam Human Performance Lab
data science exercise based on real data from elite sports (in teams)
The grading of the course will be based on the following components:
written exam (multiple choice), 40%
paper presentation and participation, 20%
data science exercise execution, 40%
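As a worked illustration of how these weights combine, the following Python sketch computes a weighted average. Only the three weights come from the list above. The grading scale (assumed here to be 1 to 10), the example scores, and the function name `final_grade` are assumptions; any rounding rules or minimum grades per component are not stated here and are therefore not modelled.

```python
# Component weights as listed above (they sum to 1.0).
WEIGHTS = {
    "written_exam": 0.40,
    "presentation_and_participation": 0.20,
    "data_science_exercise": 0.40,
}


def final_grade(scores):
    """Return the weighted average of the component scores."""
    if set(scores) != set(WEIGHTS):
        raise ValueError("scores must contain exactly the three components")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical component scores on an assumed 1-10 scale.
example = {
    "written_exam": 7.0,
    "presentation_and_participation": 8.0,
    "data_science_exercise": 6.5,
}
print(round(final_grade(example), 2))
```

With these example scores the components contribute 2.8, 1.6 and 2.6 respectively.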
To be determined.
Students must register in advance for this course. At most 25 students can participate each year. If more than 25 students register for the course, a ranking based on prior education and current programme will determine who is admitted. The following elements (in order of importance) will be taken into consideration:
Enrolled in Leiden MSc CS with Data Science specialisation as primary programme.
Leiden BSc CS/I&E/BioInf completed cum laude.
Leiden BSc CS/I&E/BioInf completed.
Leiden Minor Data Science completed.
Enrolled in a Data Science programme at another university.
dr. Arno Knobbe (primary contact)
dr. Rens Meerhoff
dr. Arie-Willem de Leeuw
Further details will follow in Fall 2020.