Machine Learning for NLP (RM)

Course

URL study guide

https://studiegids.vu.nl/en/courses/2024-2025/L_AAMPLIN019

Course Objective

This course provides students with the means to conduct NLP research using machine learning. Students will learn: (a) what the main machine learning technologies used in Natural Language Processing are, (b) how they work and how they can be used, (c) the methodologies for using these technologies in NLP research, and (d) how to represent linguistic data and what the impact of choices in data representation is. By the end of this course, students will be able to (1) name the main machine learning technologies in NLP and describe how they work, (2) apply these technologies to specific NLP tasks, (3) design a research environment where machine learning is used to solve an NLP problem, and (4) interpret and analyze evaluation results from machine learning experiments.

Course Content

Machine learning is a dynamic and active research field. Its main goal is to develop systems that can automatically solve different problems without being explicitly programmed, i.e. by learning from data. In this course, we will focus on the use of machine learning as a methodology for solving NLP tasks (e.g. POS tagging, syntactic parsing, information extraction). We cover both 'traditional' machine learning methods and the latest deep learning approaches. Representation of language as data plays a prominent role in this course. Particular attention will be paid to the methodologies for using machine learning in NLP research. We will cover the experimental setup, running existing packages on new tasks, evaluation of overall results, and error analysis. The course covers practical skills that can be useful in industry as well as in academia. The course can be followed by any student with sufficient linguistic and programming knowledge. It should be noted, though, that the course does not provide an introduction to NLP or linguistics (knowledge of basic linguistic concepts is an explicit prerequisite). The course consists of two components. The first covers basic machine learning algorithms and their use in NLP through theory and practical assignments (6 ECTS, Period 2). The second is an additional 3 ECTS follow-up in Period 3, in which the acquired skills are applied in a practical assignment. There is also a variant of the course that includes only the 6 ECTS component (L_AAMPLIN024).

Teaching Methods

Lectures and work group (2 hours/week)

Method of Assessment

Period 2: exam (33.3%) and portfolio of assignments (33.3%). Period 3: final project (code and report) (33.3%). Students need to obtain a passing grade for each individual component in order to pass the course.

Literature

TBA

Target Audience

Students of the Research Master's Humanities, specialization Linguistics (Human Language Technology).

Recommended background knowledge

Basic linguistic knowledge and basic programming skills
Academic year: 1/09/24 - 31/08/25
Course level: 9.00 EC

Language of Tuition

  • English

Study type

  • Master