## Admission requirements

Students are expected to be familiar with linear and generalized linear models, such as logistic regression for binary data. Students should also be familiar with matrix algebra and programming in R. Within this master's programme, this prerequisite knowledge can be acquired from the courses 'Linear and generalized linear models', 'Mathematics for statisticians' and 'Statistical Computing with R'. It is therefore strongly recommended to have taken these courses first.

## Description

Linear regression models and generalized linear models, such as the logistic regression model for binary data or the log-linear model for count data, are widely used to analyze data in a variety of applications. However, these models are only appropriate for independent data. In many fields of application dependent data may occur, for instance when individuals belong to the same family or when data are collected repeatedly over time on the same subjects.

Introducing random effects into the linear or generalized linear model is a simple and constructive way to generate feasible dependence structures. The extended classes of models are referred to as linear mixed models (LMMs) and generalized linear mixed models (GLMMs). The use of such models is the subject of this course. Competing models, in which dependence is not modeled through extra random effects, will be discussed as well. Part of this course will focus on the analysis of repeated measurements or longitudinal data.
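As a minimal illustration of how a random effect generates dependence, consider a random-intercept linear mixed model. The notation is assumed here for illustration and is not taken from the course material: $y_{ij}$ is the $j$-th measurement on subject (or cluster) $i$, $x_{ij}$ is a covariate vector and $b_i$ is a subject-specific random intercept.

$$
y_{ij} = x_{ij}^{\top}\beta + b_i + \varepsilon_{ij}, \qquad b_i \sim N(0, \sigma_b^2), \quad \varepsilon_{ij} \sim N(0, \sigma^2), \quad b_i \text{ independent of } \varepsilon_{ij}.
$$

Because measurements on the same subject share $b_i$, they are correlated, while measurements on different subjects remain independent:

$$
\operatorname{Var}(y_{ij}) = \sigma_b^2 + \sigma^2, \qquad \operatorname{Cov}(y_{ij}, y_{ik}) = \sigma_b^2 \ (j \neq k), \qquad \operatorname{Corr}(y_{ij}, y_{ik}) = \frac{\sigma_b^2}{\sigma_b^2 + \sigma^2}.
$$

This is the compound-symmetry (exchangeable) structure; richer random-effects specifications, such as random slopes, lead to other dependence structures.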

Inferential techniques comprise restricted (or residual) maximum likelihood (REML), a modified version of maximum likelihood, but also generalized estimating equations (GEE), which require less stringent model assumptions.

In particular, the course consists of five main sections:

Marginal models

Linear mixed models

Generalized estimating equations

Missing data

Generalized linear mixed models

In this course, emphasis will be on gaining an understanding of the models and the kind of data that can be analyzed with these models. Different inferential techniques will be discussed, but without undue emphasis on mathematical rigor.

## Course Objectives

In general, when students are confronted with practical data they should be able: (1) to decide whether there is a need to model dependence between the data, (2) to decide upon a model with an appropriate dependence structure and (3) to perform a proper analysis.

At the end of the course, the MSc student can:

Distinguish which of the methods presented can be used for the analysis of normally distributed longitudinal data and which for non-normally distributed measurements.

Identify and explain the limitations of simplistic analysis methods that ignore the correlations in repeatedly measured data.

Recommend which of the methods are appropriate in studies with unbalanced measurements and long follow-up or missing data.

Identify the different mechanisms that generate missing data and which of the methods discussed in this course give valid inference under each mechanism.

Explain the pros and cons of a population-averaged approach versus a subject-specific approach in modelling dependent discrete data.

List the strengths and limitations of various estimation procedures for generalized linear mixed models.

Identify the hypotheses of interest, which model parameters are involved in these hypotheses, and which tests are appropriate.

Identify, for a practical problem, which factors and variables should be in the model and whether they should be represented by fixed or random effects.

Determine a proper strategy for model building.

Apply tools to evaluate the validity of the model assumptions.

Use statistical software, e.g., R, to perform an analysis with multivariate models and generalized linear mixed models, using the (Restricted) Maximum Likelihood or Generalized Estimating Equations method.

Interpret the output from the software in terms of the practical problem.

Interpret fixed and random effects in terms of population means and dependence structures.

Decide what kind of test is required for testing fixed effects (Kenward-Roger approximate F-test) or dispersion parameters (likelihood ratio test) for unbalanced data.

Recognize possible boundary problems, and their remedies, when testing dispersion parameters with the likelihood ratio test.
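The boundary problem in the last objective can be illustrated with a small sketch. It assumes the simplest case only: a likelihood ratio test of a single variance component (for example, the random-intercept variance) being zero, for which the usual reference distribution is a 50:50 mixture of chi-square distributions with 0 and 1 degrees of freedom rather than a chi-square with 1 degree of freedom. The function names are hypothetical and the code is not part of the course material.

```python
import math


def chi2_sf_1df(x: float) -> float:
    """Upper-tail probability P(X > x) for a chi-square variable with 1 df.

    Uses the identity P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(x / 2.0))


def boundary_lrt_pvalue(lrt_stat: float) -> float:
    """p-value under a 50:50 mixture of chi2(0) and chi2(1).

    Assumption: exactly one variance component is tested against zero.
    chi2(0) is a point mass at zero, so it contributes nothing for lrt_stat > 0.
    """
    if lrt_stat <= 0:
        return 1.0
    return 0.5 * chi2_sf_1df(lrt_stat)


if __name__ == "__main__":
    stat = 2.9  # hypothetical likelihood ratio statistic
    print("naive chi2(1) p-value:", chi2_sf_1df(stat))
    print("mixture p-value:      ", boundary_lrt_pvalue(stat))
```

In this setting the naive chi-square p-value is exactly twice the mixture p-value, so ignoring the boundary makes the test conservative. Tests involving several variance components need other mixtures or simulation-based approaches.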

## Timetable

You will find the timetables for all courses and degree programmes of Leiden University in the tool MyTimetable. Any teaching activities that you have successfully registered for in MyStudyMap will automatically be displayed in MyTimetable. Any timetables that you add manually will be saved and automatically displayed the next time you sign in.

MyTimetable allows you to integrate your timetable with calendar apps such as Outlook, Google Calendar, Apple Calendar and others on your smartphone. Any timetable changes will be automatically synced with your calendar. If you wish, you can also receive an email notification of the change. You can turn notifications on in ‘Settings’ (after login).

For more information, watch the video or go to the 'help-page' in MyTimetable. Please note: Joint Degree students Leiden/Delft have to merge their two different timetables into one. This video explains how to do this.

## Mode of instruction

The material will be covered using lectures, quizzes and practical sessions. The course is given in a blended learning style, integrating online media with traditional face-to-face, on-campus teaching:

During the lectures, the theory will be covered and worked-out examples will be discussed. The lectures are delivered mainly through online media, combined with quizzes and face-to-face teaching sessions in which a short review of the material is followed by questions and discussion of the topics covered.

During the practical sessions, the theory covered will be applied by analysing real datasets. Questions on the online components and the practicals may be posted online on the forum before each face-to-face teaching session.

The lecture notes are leading, and worked-out case studies in R are provided with solutions for self-study. Some books are suggested as optional further reading. Study material, including data sets for the case studies mentioned, is available on Brightspace.

About halfway through the course, students will start working in groups on case studies that are handed out, under supervision of the teacher. Each group will hand in a written report about their case study. This report will be graded and, together with the grade for the written exam, determines the final grade of an individual student.

## Assessment method

The final grade is based on a written exam with open questions (2/3) and a case study report (1/3). The case study report and the written exam must each receive a grade of at least 5 to obtain the course credits. The final grade must be at least 5.5 (which will be rounded to 6) to pass. Students may take a written re-exam in accordance with the university rules. Unless the student decides to follow the course again in a later year, the final grade for the case study is binding. The date for handing in the case study report will be agreed upon during the course.
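The weighting and the minimum-grade rules can be checked with a short sketch. The function name is hypothetical, and the code returns the unrounded weighted grade because the only rounding rule stated here is that 5.5 becomes 6.

```python
def final_grade(exam: float, report: float) -> tuple:
    """Return (weighted grade, passed) under the rules stated above.

    Weights: written exam 2/3, case study report 1/3.
    Passing requires each component >= 5 and the weighted grade >= 5.5.
    """
    weighted = (2 * exam + report) / 3
    passed = exam >= 5 and report >= 5 and weighted >= 5.5
    return weighted, passed


if __name__ == "__main__":
    # Hypothetical grades: exam 6, report 5
    print(final_grade(6, 5))
    # A high exam grade does not compensate for a report below 5
    print(final_grade(7, 4))
```

For example, an exam grade of 6 and a report grade of 5 give a weighted grade of 17/3, roughly 5.67, which meets all three conditions.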

## Reading list

The following books are occasionally referred to for further reading, but they are not compulsory reading for the exam.

Fitzmaurice, G., Laird, N., and Ware, J. (2011). Applied Longitudinal Analysis, 2nd Ed. Hoboken: John Wiley & Sons.

Verbeke, G. and Molenberghs, G. (2000). Linear Mixed Models for Longitudinal Data. New York: Springer-Verlag.

Diggle, P., Heagerty, P., Liang, K.-Y., and Zeger, S. (2002). Analysis of Longitudinal Data, 2nd edition. New York: Oxford University Press.

Faraway, J. (2006). Extending the Linear Model with R: Generalized Linear, Mixed Effects and Nonparametric Regression Models. Chapman & Hall/CRC.

McCulloch, C., Searle, S., and Neuhaus, J. (2008). Generalized, Linear, and Mixed Models. Wiley Blackwell.

The first two books are indicative of the applied level of this course. The third and fifth books are more technical and intended as reference. The Faraway book is also relevant for the course on linear and generalized linear models.

## Registration

From the academic year 2022-2023 onwards, every student has to register for courses with the new enrollment tool MyStudyMap. There are two registration periods per year: registration for the fall semester opens in July and registration for the spring semester opens in December. Please see this page for more information.

Please note that it is compulsory to both preregister and confirm your participation for every exam and retake. Not being registered for a course means that you are not allowed to participate in the final exam of the course. Confirming your exam participation is possible until ten days before the exam.

Extensive FAQs on MyStudyMap can be found here.

## Contact

s.tsonaka@lumc.nl
