Data mining tries to extract interesting patterns from large amounts of data. This course first covers a number of basic concepts from statistics and then discusses the following topics:
Underlying principles of data mining algorithms and their application: data collection, visualization, data analysis and uncertainty.
Data mining: algorithms, models and patterns, scoring functions, search and optimization methods, descriptive modelling, classification, regression.
Data mining problems from practice.
Programming assignments are part of the lecture.
The aim is to gain insight into the data mining process and to be able to apply data mining algorithms and tools.
You will find the timetables for all courses and degree programmes of Leiden University in the tool MyTimetable (login). Any teaching activities that you have successfully registered for in MyStudymap will automatically be displayed in MyTimetable. Any timetables that you add manually will be saved and automatically displayed the next time you sign in.
MyTimetable allows you to integrate your timetable with your calendar apps such as Outlook, Google Calendar, Apple Calendar and other calendar apps on your smartphone. Any timetable changes will be automatically synced with your calendar. If you wish, you can also receive an email notification of the change. You can turn notifications on in ‘Settings’ (after login).
For more information, watch the video or go to the 'help-page' in MyTimetable. Please note: Joint Degree students Leiden/Delft have to merge their two different timetables into one. This video explains how to do this.
Mode of Instruction
The final grade is determined by three components:
a final multiple-choice exam (70% of the course grade).
an individual practical assignment in the form of a machine learning competition (15% of the course grade). The grade for the assignment is determined based on the student’s success in the competition, with participation leading to at least a 6.
an individual practical assignment (15%) using the RapidMiner platform.
In order to pass the course, each of the three components needs a grade of at least 5 and the weighted average needs to be at least 5.5. Graded components from the previous year can be carried over to this year upon request. No grade will be registered in uSis until all three components are completed with at least a 5. For each component, a regular re-sit opportunity exists.
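As an illustration of the pass rule above, the weighted average and the two pass conditions can be sketched in a few lines of Python. This is only a sketch: the function name and argument names are hypothetical, and it assumes that grades are on a 1 to 10 scale and that no rounding is applied before the thresholds are checked, which the course description does not specify.

```python
def final_grade(exam, competition, rapidminer):
    """Return (weighted_average, passed) for the three graded components.

    Assumptions (not stated in the course description): grades are on a
    1-10 scale and are compared against the thresholds without rounding.
    """
    components = (exam, competition, rapidminer)

    # Weights from the course description: 70% exam, 15% + 15% assignments.
    weighted = (70 * exam + 15 * competition + 15 * rapidminer) / 100

    # Every component must be at least 5, and the weighted average at least 5.5.
    passed = all(grade >= 5 for grade in components) and weighted >= 5.5
    return weighted, passed


if __name__ == "__main__":
    average, passed = final_grade(exam=7, competition=6, rapidminer=8)
    print(f"weighted average: {average}, passed: {passed}")
```

Note that a high weighted average alone is not sufficient: an exam grade below 5 fails the course even if both assignments are graded 10.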
- Data Mining. Practical Machine Learning Tools and Techniques (Third Edition), Morgan Kaufmann, January 2011, ISBN 978-0-12-374856-0.
From the academic year 2022-2023 on every student has to register for courses with the new enrollment tool MyStudymap. There are two registration periods per year: registration for the fall semester opens in July and registration for the spring semester opens in December. Please see this page for more information.
Please note that it is compulsory to both preregister and confirm your participation for every exam and retake. Not being registered for a course means that you are not allowed to participate in the final exam of the course. Confirming your exam participation is possible until ten days before the exam.
Extensive FAQ on MyStudymap can be found here.