
Corpus Lexicography


Admission requirements

Not applicable.


In this course, students are introduced to the field of lexicography, with a focus on its computational and corpus-linguistic aspects. Lexicography is concerned with the theory and practice of compiling dictionaries.

The course discusses various computational tools lexicographers use whilst making a dictionary. In technologically advanced dictionary-making, lexicographers work with two main systems on their computer: the Corpus Query System for analysis and the Dictionary Writing System for synthesis. Both systems will be covered in this course.

This course also teaches students about theoretical and practical issues involved in compiling complex data sets for lexicographic purposes. Students will learn about corpus design and annotation, computational lexicons, semantic networks, etc. They will learn how to manipulate text using regular expressions and get a basic background in databases for lexicography.

We will illustrate the theory with practical activities such as compiling a corpus and preparing it for lexicographic use as well as using a dictionary writing system and a corpus query system.
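As a rough illustration of the kind of practical activity meant here, the sketch below tokenises a toy text with a regular expression, builds a frequency list, and prints a simple keyword-in-context concordance. It is written in Python rather than the Perl named in the course objectives, and the sample text, function names, and context width are all assumptions made for this example, not course material.

```python
import re
from collections import Counter

# Hypothetical toy corpus; a real corpus would be far larger and carefully sampled.
TEXT = "The cat sat on the mat. The dog sat on the log."


def tokenise(text):
    """Lowercase the text and extract word tokens with a regular expression."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def frequency_list(tokens):
    """Return (token, count) pairs, most frequent first."""
    return Counter(tokens).most_common()


def concordance(tokens, keyword, width=2):
    """Return keyword-in-context lines with `width` tokens on either side."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{tok}] {right}")
    return lines


tokens = tokenise(TEXT)

if __name__ == "__main__":
    for word, count in frequency_list(tokens):
        print(f"{count}\t{word}")
    for line in concordance(tokens, "sat"):
        print(line)
```

A Corpus Query System performs the same kind of operations at scale, over annotated corpora rather than a raw token list.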

Course objectives

By the end of the programme, students will have acquired knowledge of the following notions and subjects:

  • Computational lexicography

  • Dictionary Writing Systems

  • Corpus linguistics

    • History of corpus linguistics
    • Corpus design: size, sampling, text type, genre, XML/TEI encoding, metadata
    • Corpus analysis and annotation: POS-tagging, tokenisation, lemmatisation, parsing, frequency lists
    • Corpus Query Systems: concordance, collocations, word sketches
  • Computational lexicons

    • Knowledge representation
    • Inheritance
  • Perl/Databases

    • Regular expressions
    • Simple Perl scripts
    • Database structure for lexicography
  • Semantic Networks

    • WordNet, FrameNet, semagram, ontology, taxonomy, hierarchy
    • Semantic relations
  • Dictionary use

    • Log files
    • Forum


The timetable will be available by June 1st on the website of the Research Master in Linguistics.

Mode of instruction

Lectures and tutorials/practical sessions.

Assessment method



Blackboard

A Blackboard site will be made available.

Reading list

Textbook:
Atkins and Rundell (2008) The Oxford Guide to Practical Lexicography. Oxford University Press. 540 pages. ISBN 978-0-19-927770-4 (Hardback). Price: £85.00 (Paperback £29.99).

The reading list will be distributed at the beginning of the course.


Exchange and Study Abroad students, please see the Study in Leiden website for information on how to apply.

Application for Contractonderwijs

Students can register for courses and exams through uSis.

Contact information

Coordinator of Studies.