https://studiegids.vu.nl/en/courses/2024-2025/L_AAMPLIN024

This course provides students with the means to conduct NLP research using machine learning. Students will learn: a) what the main machine learning technologies used in Natural Language Processing are, b) how they work and how they can be used, c) the methodologies for using these technologies in NLP research, and d) how to represent linguistic data and what the impact is of choices in data representation. By the end of this course, students will be able to (1) name and describe the workings of the main machine learning technologies in NLP, (2) apply these technologies to specific NLP tasks, (3) design a research environment where machine learning is used to solve an NLP problem, and (4) interpret and analyze evaluation results from machine learning experiments.

Machine learning is a dynamic and active research field. The main goal of machine learning is to develop systems that can automatically solve different problems without being explicitly programmed, i.e. by learning from data. In this course, we will focus on the use of machine learning as a methodology for solving NLP tasks (e.g. POS tagging, syntactic parsing, information extraction). We cover both 'traditional' machine learning methods and the latest deep learning approaches. Representation of language as data plays a prominent role in this course. This course distinguishes itself from other ML courses taught at this university in its focus on analysing tasks and data representation and, of course, natural language. We will cover experimental setup, running existing packages on new tasks, evaluation of overall results, and error analysis. The course covers practical skills that can be useful in industry as well as in academia. The course can be followed by any student with sufficient linguistic and programming knowledge.
Note, however, that this course is explicitly not meant as an introduction to NLP or linguistics: knowledge of basic linguistic concepts is an explicit prerequisite.

Two lectures of 2 hours each. One focuses on machine learning algorithms, the other on linguistic properties and practical aspects.

Students hand in a portfolio of exercises carried out during the course and take a final exam. Both components need to receive a passing grade (at least 5.5) in order to pass the course.

TBA

This course is specifically designed for students in the Text Mining 1-year master. It can also be followed by Computer Science students (among others Business Analytics and AI) if they have sufficient knowledge of linguistics or are willing to invest (independent) research time to obtain this (materials are provided).

Programming (Python, corresponding to the end level of 'Programming Python for Text Analysis'), Linguistics, Natural Language Processing/Human Language Technology (NLP/HLT). Students with some prior knowledge of machine learning may manage without prior knowledge of linguistics or NLP/HLT.