This course is about a large variety of methods for multivariate analysis and multidimensional data analysis. The first part (six course-days) deals with the analysis of measurements for N objects (persons) on P variables (attributes), and we typically wish to understand the relationships between those objects and variables. The data are usually given in one or more multivariate data matrices. The course extends classical approaches to multivariate analysis in various ways. We will not only deal with numeric, but also with categorical (both nominal and ordinal) multivariate data. In addition, we will be able to deal with nonlinear relationships between variables. Both extensions are part of the same optimal quantification/nonlinear transformation framework. Key concepts are dimension reduction and visualization (in principal components and correspondence analysis), and prediction and regularization (in multiple regression analysis).
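As a small illustration of what regularization does in regression, the sketch below computes a ridge-penalized slope for a single centered predictor. The function name `ridge_slope`, the penalty values, and the toy data are all illustrative assumptions and are not taken from the course material.

```python
def ridge_slope(x, y, penalty):
    """Slope of a one-predictor ridge regression (intercept left unpenalized).

    Minimizes sum((y_i - a - b * x_i)^2) + penalty * b^2, which gives
    b = Sxy / (Sxx + penalty) on the centered data.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / (sxx + penalty)


# Hypothetical toy data, roughly following y = 2x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

# A penalty of 0 gives the ordinary least-squares slope;
# larger penalties shrink the slope towards zero.
slopes = [ridge_slope(x, y, lam) for lam in (0.0, 1.0, 10.0)]
```

With a penalty of zero the result equals the least-squares slope, and the estimate shrinks as the penalty grows, which is the basic trade-off behind regularized prediction.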
The second part of the course (two course-days) is about a very important group of multidimensional techniques for the analysis of proximity data between objects (given in one or more N by N matrices) and preference data between row objects and column objects (in one or more N by M matrices). For the analysis of proximities and preferences, we use the terms multidimensional scaling and multidimensional unfolding, respectively. Here dimension reduction and visualization are of utmost importance by definition, while nonlinear transformations also play an important part.
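To make the idea of scaling proximities concrete, here is a minimal sketch of classical (Torgerson) multidimensional scaling for a single N by N distance matrix. It is one simple variant, not necessarily the method taught in the course: it assumes Euclidean distances (so the double-centered matrix has no negative eigenvalues) and uses plain power iteration with deflation instead of a library eigensolver. The function name `classical_mds` and the example points are illustrative assumptions.

```python
import math
import random


def classical_mds(dist, dims=2, iters=500):
    """Classical scaling of a symmetric matrix of Euclidean distances."""
    n = len(dist)
    # Square the distances, then double-center: B = -1/2 * J * D^2 * J.
    d2 = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row = [sum(r) / n for r in d2]
    grand = sum(row) / n
    b = [[-0.5 * (d2[i][j] - row[i] - row[j] + grand) for j in range(n)]
         for i in range(n)]

    rng = random.Random(0)  # fixed seed so the sketch is reproducible
    coords = [[0.0] * dims for _ in range(n)]
    for k in range(dims):
        # Power iteration for the dominant eigenvector of B.
        v = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        norm = math.sqrt(sum(c * c for c in v))
        v = [c / norm for c in v]
        for _ in range(iters):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(c * c for c in w))
            if norm < 1e-12:
                break
            v = [c / norm for c in w]
        # Rayleigh quotient gives the matching eigenvalue.
        bv = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = sum(v[i] * bv[i] for i in range(n))
        # Coordinates on dimension k: eigenvector scaled by sqrt(eigenvalue).
        scale = math.sqrt(max(lam, 0.0))
        for i in range(n):
            coords[i][k] = scale * v[i]
        # Deflate B so the next pass finds the next eigenvector.
        for i in range(n):
            for j in range(n):
                b[i][j] -= lam * v[i] * v[j]
    return coords


# Hypothetical example: distances between the corners of a 4 by 3 rectangle.
points = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]
dist = [[math.dist(p, q) for q in points] for p in points]
config = classical_mds(dist, dims=2)
```

The recovered configuration reproduces the input distances up to rotation, reflection, and translation, which is why only the distances between the recovered points, not the coordinates themselves, are meaningful.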
The third part of the course (six course-days) will focus on classification methods. Here the primary question is whether we can predict, from a set of explanatory variables, which of a predefined set of classes an object (subject, person) belongs to. Two methods will be presented in detail: discriminant analysis and multinomial logistic regression. For both, dimension reduction will be discussed. Dimension reduction can be performed in a distance framework or in an inner product framework. These methods will be presented, and students will also learn how to program them in R.
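The prediction step of multinomial logistic regression can be sketched as follows: each class gets a linear score, and the scores are turned into class probabilities with the softmax function. This is only a sketch of the prediction rule with made-up coefficients; it does not estimate the coefficients and does not include the dimension reduction discussed in the course. The names `class_probabilities` and `predict_class` and all numbers are illustrative assumptions.

```python
import math


def class_probabilities(x, intercepts, weights):
    """Softmax probabilities for one observation x, one score per class."""
    scores = [a + sum(w * xi for w, xi in zip(ws, x))
              for a, ws in zip(intercepts, weights)]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_class(x, intercepts, weights):
    """Index of the class with the highest probability."""
    probs = class_probabilities(x, intercepts, weights)
    return max(range(len(probs)), key=probs.__getitem__)


# Hypothetical coefficients for three classes and two predictors.
# Class 0 is the reference class, with all coefficients fixed at zero.
intercepts = [0.0, 0.0, 0.0]
weights = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]

probs = class_probabilities([2.0, 0.0], intercepts, weights)
label = predict_class([2.0, 0.0], intercepts, weights)
```

Fixing the coefficients of one reference class at zero is a common way to make the model identifiable, since adding the same constant to every class score leaves the probabilities unchanged.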
In addition to R, the first two parts of the course will also use the IBM-SPSS package CATEGORIES, which was developed in Leiden.
For the course days, course location, and class hours, check the Time Table 2013-14 under the tab “Masters Programme” at http://www.math.leidenuniv.nl/statscience
Mode of Instruction
The course consists of 2 course-days per week. Each course-day contains a two-hour lecture and a two-hour practical.
Assessment will be based on a written exam (50%) and 5 assignments (50%).
There will be 5 take-home assignments. For each assignment you can earn either 2 points or 0 points. When an assignment is handed in one week late, only 1 or 0 points can be earned.
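The point rules above can be sketched in a few lines. Note that the way points are combined with the exam result is an assumption made purely for illustration: the sketch assumes the exam is graded on a 0 to 10 scale and that the sum of the assignment points (at most 10) counts directly as the assignment grade. The function names are hypothetical, and the official calculation may differ.

```python
def assignment_points(passed, late=False):
    """Points for one assignment: 2 if passed on time, 1 if passed
    one week late, 0 if not passed."""
    if not passed:
        return 0
    return 1 if late else 2


def final_grade(exam_grade, assignments):
    """Assumed weighting: 50% exam grade (0-10 scale) plus 50% of the
    total assignment points (0-10). This combination rule is an assumption."""
    total = sum(assignment_points(passed, late) for passed, late in assignments)
    return 0.5 * exam_grade + 0.5 * total


# Hypothetical student: four assignments passed on time, one passed a week late.
results = [(True, False)] * 4 + [(True, True)]
grade = final_grade(7.0, results)
```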
The written exam is scheduled for June 6, 2014, from 14.00 to 17.00 (room to be announced). The resit is scheduled for July 8, 2014, from 10.00 to 13.00 (room to be announced).
The written exam is a closed-book exam. Books, laptops, internet access, and any other external sources of information are not allowed during the exam.
Reading material will be announced at the start of the course.
Apart from registration for the (re-)exams in uSis, course registration via Blackboard is obligatory.
Exchange and Study Abroad students, please see the Prospective students website for information on how to apply.
jmeulman [at] math [dot] leidenuniv [dot] nl
- This is a compulsory course in the Master’s programme of the specialisation Statistical Science for the Life & Behavioural sciences.