# Statistics for Computer Scientists

Course
2024-2025

Not applicable.

## Description

Statistics is the science concerning the description and analysis of data, with the aim of drawing generally valid conclusions. Statistics forms the core of many methods in data science and artificial intelligence, making it an essential foundation for other courses (e.g., machine learning, data mining).

The focus of the course is on thoroughly understanding and correctly applying statistical methods, not on the formal justification or derivation of those methods. We consider both descriptive statistics, i.e., methods for describing a given collection of data, and inferential statistics, i.e., methods for inferring properties of a population based on a limited yet representative sample.

The course introduces the necessary basic concepts (probability, random variables, statistics, parameters, probability distributions, inference, point and interval estimates, hypothesis testing), various inference methods for specific parameters (e.g., for a single mean or proportion, for two samples, for correlation), and methods for constructing predictive models (linear regression).

## Course objectives

At the end of the course, students should be able to:

• Explain basic probability concepts, such as outcome space, events, random variables, independence, and various types of probability distributions (including population, sample, sampling, normal distributions).

• Summarise a dataset using appropriate descriptive statistics (including numeric ones such as the mean, median, mode, and standard deviation, and graphical ones such as bar plots, scatter plots, and box plots).

• Explain various data collection methods (such as simple random, cluster, stratified, and multistage sampling), and their potential forms of bias (including selection, response, and non-response bias).

• Explain the central limit theorem, the concepts of hypothesis testing, hypothesis pairs, Type I and Type II errors, and supervised learning and model selection.

• Identify the appropriate statistical method (from estimation and hypothesis testing) for a research question.

• Perform various estimates and statistical tests by hand, including point estimates, confidence intervals, one-sample and two-sample tests for a mean and a proportion, the chi-square test, and the F-test.

• Interpret (software) results from estimates and statistical tests (including all mentioned in the previous learning outcome).

• Construct a (simple and multiple) linear regression model, interpret its results, and test for independence.

• Discuss limitations of statistical inference and especially statistical hypothesis testing, including the concepts of p-hacking and publication bias.

## Timetable

In MyTimetable, you can find all course and programme schedules, allowing you to create your personal timetable. Activities for which you have enrolled via MyStudyMap will automatically appear in your timetable.

Additionally, you can easily link MyTimetable to a calendar app on your phone, and schedule changes will be automatically updated in your calendar. You can also choose to receive email notifications about schedule changes. You can enable notifications in Settings after logging in.

Questions? Watch the video, read the instructions, or contact the ISSC helpdesk.

Note: Joint Degree students from Leiden/Delft need to combine information from both the Leiden and Delft MyTimetables to see a complete schedule. This video explains how to do it.

## Mode of instruction

One 2-hour lecture and one 2-hour tutorial per week.

## Assessment method

• There will be an in-person written, open-book exam.

• During the semester there will be two assignments, to be completed individually and submitted via Brightspace.

• The assignments are not mandatory, do not have any resit possibilities, and grades from previous years cannot be used.

• Final grade F will be computed as
F = max(E, 0.1 A + 0.9 E),
where E is the grade for the exam and A is the average of the grades for the two assignments.
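As an illustration of the formula above, the following is a minimal sketch in Python. The function name `final_grade` and its parameter names are hypothetical and chosen only for this example. Rounding of the final grade is not described here, so the sketch returns the unrounded value.

```python
def final_grade(exam: float, assignments: tuple[float, float]) -> float:
    """Compute F = max(E, 0.1 * A + 0.9 * E).

    exam        -- E, the grade for the exam
    assignments -- the grades for the two assignments; A is their average

    Note: no rounding is applied, since the rounding rule is not
    specified in the course description (assumption of this sketch).
    """
    average = sum(assignments) / len(assignments)  # A
    weighted = 0.1 * average + 0.9 * exam          # 0.1 A + 0.9 E
    # The assignments can only raise the final grade, never lower it.
    return max(exam, weighted)


if __name__ == "__main__":
    # Assignments above the exam grade: the weighted term is the larger one.
    print(round(final_grade(6.0, (8.0, 9.0)), 2))
    # Assignments below the exam grade: the exam grade alone counts.
    print(round(final_grade(7.0, (4.0, 5.0)), 2))
```

Because of the `max`, an assignment average below the exam grade leaves the final grade equal to the exam grade, which matches the assignments not being mandatory.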

The teacher will inform the students how the exam inspection and the follow-up discussion will take place.

## Reading list

• Statistical Methods for the Social Sciences, Global Edition – Fifth Edition, Alan Agresti, Pearson Education, ISBN 9781292220314. Needed for tutorials and open-book exam, hence having a physical copy is mandatory. (Pearson New International Edition – Fourth Edition, ISBN 9781292021669 is also allowed.)

• The slides that will be published via Brightspace.

## Registration

As a student, you are responsible for enrolling on time through MyStudyMap.

In this short video, you can see step-by-step how to enrol for courses in MyStudyMap.
Extensive information about the operation of MyStudyMap can be found here.

There are two enrolment periods per year:

• Enrolment for the fall opens in July

• Enrolment for the spring opens in December

Note:

• It is mandatory to enrol for all activities of a course that you are going to follow.

• Your enrolment is only complete when you submit your course planning in the ‘Ready for enrolment’ tab by clicking ‘Send’.

• Not being enrolled for an exam/resit means that you are not allowed to participate in the exam/resit.

## Contact

Marieke Vinkenoog

Matthijs van Leeuwen

## Remarks

Software
Starting from the 2024/2025 academic year, the Faculty of Science will use the software distribution platform Academic Software. Through this platform, you can access the software needed for specific courses in your studies. For some software, your laptop must meet certain system requirements, which will be specified with the software. It is important to install the software before the start of the course. More information about the laptop requirements can be found on the student website.