Natural Language Processing (NLP)

Course content

Have you ever wondered how systems like ChatGPT, which can generate human-like text, are built? Are you intrigued by the idea of creating a system that can process, understand, or generate text automatically? Are you interested in building applications that can translate between languages, answer questions, or recognise named entities in text? If so, this course is designed for you.

This course provides an introduction to the fundamentals of Natural Language Processing (NLP), which involves computational models of language and their applications to text. As language is the core of human intelligence, NLP holds a pivotal role in Artificial Intelligence research and development.

We will combine machine learning (ML), including its fundamental formalisms and algorithms, with extensive hands-on practice, so that you gain practical skills in implementing these methods for real-world NLP problems.

The course utilises interactive lecture materials constructed with Jupyter notebooks. Course materials from last year are publicly available at https://github.com/coastalcph/nlp-course. The course will closely follow the structure of the previous year's iteration. If you're unsure about the course prerequisites or content, please review these materials.

 

The course covers the following topics:

  • NLP tasks: tokenisation, text classification, language modelling, named entity recognition, part-of-speech tagging, parsing, information extraction, machine translation, question answering
  • Methods: log-linear models, structured prediction, and neural network models such as recurrent neural networks and transformers, including representation learning, pre-training, transfer learning and interpretability methods
  • Implementations: relationship between NLP tasks, efficient implementations, and the use of modern NLP libraries such as Hugging Face's Transformers

 

Throughout the course, we will also explore the themes of discriminative and generative learning and various ways of obtaining supervision for training statistical NLP models. An important aspect of our discussions will be the application of these techniques in multilingual settings, understanding how NLP can be adapted and applied to a variety of languages beyond English.

Learning outcome

Knowledge of

  • core NLP tasks (e.g. machine translation, question answering, information extraction)

  • methods (e.g. classification, structured prediction, representation learning)

  • implementations (e.g. relationship between NLP tasks, efficient implementations)

 

Skills to

  • identify the different kinds of NLP tasks

  • choose the correct algorithm for a given problem situation

  • implement core algorithms in Python using PyTorch

  • assess the most appropriate algorithms to solve a given NLP problem

  • distinguish and evaluate the advantages of different approaches to the same task

 

Competences to

  • decompose natural language processing tasks into manageable components

  • evaluate systems quantitatively and qualitatively

  • apply the learned skills in a wider context to areas that face similar challenges, e.g., data science, social science, or bioinformatics

  • critically assess the limitations and use cases of language models, and apply this knowledge to the development and deployment of these models in real-world scenarios

The format of the class consists of lectures (possibly including guest lectures), exercises, and project work.

See Absalon for a list of course literature.

Knowledge of machine learning (probability theory, linear algebra, classification, neural networks) and programming (Python) is required, either through formal education or self-study. No prior knowledge of natural language processing or linguistics is required.

Relevant machine learning competencies can be obtained through one of the following courses:
- NDAK22002U Advanced Deep Learning (ADL) or Deep Learning (DL)
- NDAK22000U Machine Learning A (MLA)
- NDAK22001U Machine Learning B (MLB)
- NDAK16003U Introduction to Data Science (IDS)

Academic qualifications equivalent to a BSc degree are recommended.

If you are in doubt about whether you meet the course prerequisites, you can check the course materials from last year here: https://github.com/coastalcph/nlp-course.

Feedback form
Written
Oral
Individual
Collective
Continuous feedback during the course of the semester
ECTS
7.5 ECTS
Type of assessment
Written assignment, with ongoing preparation throughout the course and submission at the end of the course.
Type of assessment details
A group project report, in which each student’s individual contribution is clearly specified, written during the course.
Aid
All aids allowed
Marking scale
7-point grading scale
Censorship form
No external censorship
Several internal examiners
Re-exam

The re-exam is a 30-minute individual oral examination without preparation, based on the full syllabus. No aids allowed.

Criteria for exam assessment

See Learning Outcome.

 

Single subject courses (day)

  • Lectures: 28 hours
  • Preparation: 14 hours
  • Theory exercises: 57 hours
  • Practical exercises: 57 hours
  • Project work: 50 hours
  • Total: 206 hours

Course information

Language
English
Course number
NDAK18000U
ECTS
7.5 ECTS
Programme level
Full Degree Master
Duration

1 block

Placement
Block 1
Schedule group
B
Capacity
No limitation – unless you register in the late-registration period (BSc and MSc) or as a credit or single subject student.
Study board
Study Board of Mathematics and Computer Science
Contracting department
  • Department of Computer Science
Contracting faculty
  • Faculty of Science
Course Coordinator
  • Daniel Hershcovich
Teachers

Daniel Hershcovich
Anders Søgaard

Saved on the 24-02-2025
