## Prerequisites

A basic understanding of introductory statistical concepts and some familiarity with R, as taught in Inleiding Mathematische Statistiek.

## Description

An overview of each of the four topics presented in this course is given below.

### Safe Testing (Prof. Dr. P. D. Grünwald)

In traditional hypothesis testing, the sample size, or at least the sampling protocol, must be determined in advance. In practice, it is desirable to use more flexible stopping rules. Researchers use such rules even though the methods do not allow for it, leading to false results appearing in the literature. We will outline some exciting recent techniques that can guarantee small error probabilities under 'optional stopping' after all. The underlying mathematics builds upon the insight that, in a casino, you do not expect to get rich, no matter what rule you use for continuing to gamble or going home.
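The effect of optional stopping can be illustrated with a small simulation. The sketch below is an illustration only, not the course's own material: it uses Python rather than R, and the function `simulate`, the alternative mean `mu_alt`, and all settings are assumptions chosen for the example. Data are drawn under the null hypothesis N(0, 1). A z-test repeated after every observation is compared with a likelihood-ratio process that rejects once it reaches 1/alpha, a rule whose false-rejection probability is bounded by alpha under any stopping rule (Ville's inequality).

```python
import math
import random


def simulate(n_runs=2000, max_n=100, alpha=0.05, mu_alt=0.5, seed=1):
    """Estimate false-rejection rates when the null N(0, 1) is true.

    Returns (rate for repeated z-tests, rate for the likelihood-ratio rule).
    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    z_crit = 1.96  # two-sided 5% critical value, valid for ONE fixed sample size
    threshold = math.log(1.0 / alpha)  # reject when the likelihood ratio >= 1/alpha
    peek_rejections = 0
    lr_rejections = 0
    for _ in range(n_runs):
        total = 0.0   # running sum of the observations
        log_lr = 0.0  # log likelihood ratio of N(mu_alt, 1) against N(0, 1)
        peek_hit = False
        lr_hit = False
        for n in range(1, max_n + 1):
            x = rng.gauss(0.0, 1.0)
            total += x
            # Naive optional stopping: test after every new observation.
            if abs(total) / math.sqrt(n) > z_crit:
                peek_hit = True
            # One observation adds mu*x - mu^2/2 to the log likelihood ratio.
            log_lr += mu_alt * x - 0.5 * mu_alt ** 2
            if log_lr >= threshold:
                lr_hit = True
        peek_rejections += peek_hit
        lr_rejections += lr_hit
    return peek_rejections / n_runs, lr_rejections / n_runs


peek_rate, lr_rate = simulate()
print(f"repeated z-test: {peek_rate:.3f}, likelihood-ratio rule: {lr_rate:.3f}")
```

The likelihood ratio plays the role of the gambler's capital in the casino analogy: under the null hypothesis its expected value stays at 1, whatever rule is used to stop.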

### Bayesian methods (Dr. M. A. Hadji)

Bayesian inference is based on the Bayesian interpretation of probabilities. In Bayesian statistics, we assume the parameter is a random variable, which we endow with a prior distribution representing our prior belief. The data update our belief about the parameter through the computation of a posterior distribution. It can be difficult to access the posterior distribution directly. In these cases, it is common to use Markov chain Monte Carlo (MCMC) methods. The most common choices of priors in well-known models will be presented, and some MCMC methods to sample from the posterior will be introduced.
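As a minimal sketch of the idea, consider a binomial model with a Beta(1, 1) prior on the success probability. This model is an assumption chosen because its posterior is known in closed form, so the sampler can be checked. The example is written in Python for illustration (the course practicals use R), and the names `log_posterior` and `metropolis`, the data, and the tuning values are hypothetical. It draws from the posterior with a random-walk Metropolis sampler, one simple MCMC method among many.

```python
import math
import random


def log_posterior(theta, successes, failures, a=1.0, b=1.0):
    """Unnormalised log posterior: Beta(a, b) prior times binomial likelihood."""
    if not 0.0 < theta < 1.0:
        return -math.inf  # zero posterior mass outside (0, 1)
    return ((a - 1.0 + successes) * math.log(theta)
            + (b - 1.0 + failures) * math.log(1.0 - theta))


def metropolis(successes, failures, n_samples=20000, step=0.2, seed=1):
    """Random-walk Metropolis draws for theta (illustrative tuning values)."""
    rng = random.Random(seed)
    theta = 0.5
    current = log_posterior(theta, successes, failures)
    draws = []
    for _ in range(n_samples):
        proposal = theta + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, successes, failures)
        # Accept with probability min(1, posterior ratio); u lies in (0, 1].
        u = 1.0 - rng.random()
        if math.log(u) < candidate - current:
            theta, current = proposal, candidate
        draws.append(theta)
    return draws


# Hypothetical data: 7 successes and 3 failures.
draws = metropolis(successes=7, failures=3)[2000:]  # discard burn-in draws
mcmc_mean = sum(draws) / len(draws)
exact_mean = (1.0 + 7) / (1.0 + 1.0 + 7 + 3)  # mean of the Beta(8, 4) posterior
print(f"MCMC mean: {mcmc_mean:.3f}, exact posterior mean: {exact_mean:.3f}")
```

The sampler only needs the posterior up to a normalising constant, which is why MCMC remains usable when the posterior has no closed form.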

### Survival analysis (Prof. Dr. M. Fiocco)

This area of statistics deals with time-to-event data, whose analysis is complicated not only by the dynamic nature of events occurring in time but also by censoring, where some events are not observed directly and it is only known that they fall in some interval or range. Different types of censored and truncated data, non-parametric methods to estimate the survival function, and regression models to study the effect of risk factors on survival outcomes will be discussed. Special aspects such as time-dependent covariates and stratification will be introduced.
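A standard non-parametric estimator of the survival function for right-censored data is the Kaplan-Meier estimator. The sketch below is an assumption about what such a method looks like, not the course's own code: it is written in Python for illustration (the course practicals use R), and the function name `kaplan_meier` and the toy data are hypothetical.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate for right-censored data.

    times  : observed follow-up times
    events : 1 if the event was observed at that time, 0 if censored
    Returns a list of (time, estimated survival) at each distinct event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all subjects whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            # Subjects censored at t still count as at risk at t.
            survival *= 1.0 - n_events / n_at_risk
            curve.append((t, survival))
        n_at_risk -= n_leaving
    return curve


# Toy data: the subjects with event indicator 0 are censored.
for t, s in kaplan_meier([2, 3, 3, 5, 6, 8], [1, 1, 0, 1, 0, 1]):
    print(f"t = {t}: S(t) = {s:.4f}")
```

For these toy data the estimate drops to 5/6, 2/3, 4/9 and 0 at times 2, 3, 5 and 8. Censored subjects never cause a drop, but they do shrink the risk set for later event times.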

### Longitudinal data analysis (Dr. M. Signorelli)

Longitudinal data (sometimes called panel data) are data collected through a series of repeated observations of the same subjects over time. Since repeated measurements from the same subject are typically correlated, the analysis of longitudinal data requires statistical methods that do not rely on the usual independence assumptions. In this part of the course, the two most widely used statistical models for longitudinal data, linear mixed models and generalized linear mixed models, will be discussed. Estimation of the models will be performed using the R software environment.
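To see why repeated measurements on the same subject are correlated, it can help to simulate the simplest mixed model, a random-intercept model in which each measurement is a subject-specific effect plus independent noise. The sketch below is an illustration under assumed settings, written in Python rather than R; the function name `within_subject_correlation` and the standard deviations are hypothetical. Under this model the correlation between two measurements of one subject equals the share of the total variance that is due to the subject effect.

```python
import math
import random


def within_subject_correlation(n_subjects=5000, sd_subject=1.0, sd_noise=1.0, seed=1):
    """Sample correlation between two measurements taken on the same subject."""
    rng = random.Random(seed)
    first, second = [], []
    for _ in range(n_subjects):
        intercept = rng.gauss(0.0, sd_subject)  # shared by both measurements
        first.append(intercept + rng.gauss(0.0, sd_noise))
        second.append(intercept + rng.gauss(0.0, sd_noise))
    mean1 = sum(first) / n_subjects
    mean2 = sum(second) / n_subjects
    cov = sum((x - mean1) * (y - mean2) for x, y in zip(first, second))
    var1 = sum((x - mean1) ** 2 for x in first)
    var2 = sum((y - mean2) ** 2 for y in second)
    return cov / math.sqrt(var1 * var2)


r = within_subject_correlation()
theoretical = 1.0 ** 2 / (1.0 ** 2 + 1.0 ** 2)  # subject variance / total variance
print(f"sample correlation: {r:.3f}, theoretical value: {theoretical:.3f}")
```

A method that treats the two measurements as independent ignores this shared subject effect, which is what mixed models account for explicitly.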

## Course objectives

The overall aim of the course is to introduce students to four different areas of statistics. By the end of the course, students are expected to have a basic understanding of the topics discussed and to be able to use existing software to apply the methods covered during the course.

## Mode of instruction

Weekly 2 × 45 min of lectures in class and 2 × 45 min of practical sessions with exercises. A laptop with the statistical package R (http://www.r-project.org) already installed is required for each practical session.

## Assessment method

Four individually written reports (20% each) and a presentation (20%) on a selected topic. The presentations will be given individually or in pairs, depending on the group size. The reports are regarded as practical assignments and cannot be retaken. The presentation can be retaken.

## Literature

Lecture material provided in class.

## Registration

Enroll in Usis to obtain the course material and course updates from Brightspace.

## Contact

Tijn Jacobs - t.jacobs.3@umail.leidenuniv.nl