Advice: prior knowledge of basic statistics (t-test, ANOVA and linear regression), basic probability theory (normal, binomial and Poisson distributions) and basic mathematics (linear algebra at life science bachelor level) is recommended.

To let you check your statistical knowledge, a test with accompanying video lectures will be made available. If the result of the test is unsatisfactory, we advise you to first follow the introductory course Basic Statistics for Master students; please contact the study advisor for further details.

N.B. For the combination master programs (Business Studies, Education and Science Communication) Advanced Statistics is not compulsory. These students can also choose to take Basic Statistics for Master students.

Description

This course discusses probability theory, statistical analysis and statistical modelling in the context of research in the life sciences. Leading concepts in statistics are introduced from the perspective of empirical inquiry and study design. Basic statistics are briefly reviewed, and more advanced statistical methods are introduced to deal with data that cannot be analyzed using standard classical methods:

• Mixed models are introduced to deal with data that are not independent, such as data from repeated-measures and nested designs.

• Generalized models are introduced to deal with deviations from normality and with heteroscedasticity.

• Machine learning methodologies are discussed to deal with high-dimensional data, allowing for both prediction and confirmatory statistics.

• Some of the statistics discussed will be evaluated in the context of bioinformatics.

• In short detours we explain important statistical perspectives such as the Bayesian view on statistics, information entropy, generalized additive models (GAMs) and statistical network theory. These topics may vary depending on interests.

All statistical examples and assignments are done in R and RStudio, including simulation of data based on your own experimental designs.

Course Objectives

After completion of the course, students are able to:
1. Apply the methods discussed in GRS/Basic Statistics, extended to generalized methods and to supervised and unsupervised learning methodologies.
2. Identify key data properties of complex study designs from which the student can infer the correct statistical methods to be used and analytic strategies to be followed.
3. Identify statistical pitfalls and fallacies that can occur in statistical analysis.
4. Motivate the use of statistics based on the fundamental principle of an imbalance between the degrees of freedom of the model and the data, infer the expected distribution of the residuals, and apply this knowledge to the interpretation of statistical results.
5. Deduce and interpret generic mathematical formulas of important statistical concepts.
6. Reason why a particular test statistic takes on certain values under the null hypothesis.
7. Combine statistical data from different literature sources in a meta-analysis and relate the underlying methodology to mixed models.
8. Convert complex data to tidy format, create subsets, and produce detailed data summaries using a scientific programming language (e.g., R).
9. Simulate and analyze complex data in a scientific programming language (e.g., R).
10. Produce publication-grade figures in a scientific programming language (e.g., R), using basic and advanced plotting routines (e.g., ggplot2, plotly).

Mode of instruction

Lectures, tutorials and an assignment. Students must prepare for some lectures using web lectures and tutorials.

Assessment method

Written exam and a group assignment.
The exam counts for 75% and the group assignment for 25% of the final grade.

Contact

Coordinator: Dr. H.G.J. van Mil
Email: h.g.j.van.mil@umail.leidenuniv.nl

Remarks

The timetable will be communicated through Brightspace.