Spring 2019
We want you to get your hands dirty with most of the core topics covered in the course. To that end, we prepared two projects.
Groups: check Canvas or our blog posts.
Project 1
Deadline: April 26th, 2019
In this project you will learn about lexical alignment, the task of learning correspondences between words in different languages. You will apply latent-variable modelling techniques, in particular learning with directed graphical models (also known as locally normalised models). You will parameterise the model using categorical distributions and experiment with maximum likelihood estimation.
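For reference, here is a minimal sketch of how such a locally normalised aligner factorises, in the usual IBM-model notation (not necessarily the exact notation of the project description): the alignment a_j of each French position j is a latent variable, and every factor is a categorical distribution,

P(f_1^m, a_1^m | e_0^l) = \prod_{j=1}^{m} P(a_j | j, l, m) \, P(f_j | e_{a_j}),  with  a_j \in {0, 1, ..., l},

where IBM1 fixes the alignment factor to the uniform distribution 1/(l+1), and IBM2 turns it into a learned categorical, for instance over relative jumps.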
- Maximum likelihood estimation for IBM1 and IBM2: EM algorithm
Note: in the IBM2 experiment, use relative jumps (a jump function).
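To make the EM procedure concrete, below is a hedged sketch of EM for IBM1 only, with toy data structures and a function name of my own choosing; the interface required by the project description takes precedence.

from collections import defaultdict

def ibm1_em(corpus, iterations=10):
    # corpus: list of (english_tokens, french_tokens) pairs.
    # A NULL token is prepended to every English sentence so that
    # French words may remain "unaligned".
    corpus = [(["<NULL>"] + list(e), list(f)) for e, f in corpus]

    # Uniform initialisation of the translation categoricals t(f | e).
    french_vocab = {fw for _, f in corpus for fw in f}
    t = defaultdict(lambda: 1.0 / len(french_vocab))

    for _ in range(iterations):
        counts = defaultdict(float)  # expected counts of (e, f) links
        totals = defaultdict(float)  # expected counts of e being linked

        # E-step: the posterior over each latent alignment link factorises
        # over French positions, so it can be computed word by word.
        for e_sent, f_sent in corpus:
            for fw in f_sent:
                norm = sum(t[(ew, fw)] for ew in e_sent)
                for ew in e_sent:
                    gamma = t[(ew, fw)] / norm
                    counts[(ew, fw)] += gamma
                    totals[ew] += gamma

        # M-step: renormalise expected counts into new categoricals.
        for (ew, fw), c in counts.items():
            t[(ew, fw)] = c / totals[ew]

    return t

# Toy usage on made-up data (illustration only):
toy = [(["the", "house"], ["la", "maison"]),
       (["the", "book"], ["le", "livre"])]
t = ibm1_em(toy, iterations=5)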
Resources:
- Project description
- Training data
- Validation data
- Test data
- Neural IBM1 new!
- Helper functions for validation AER EXTRA!
- Note that I wrote this helper class using python3; if you are using python2 you will need to add from __future__ import division to avoid the default integer division (whereby 1/2 evaluates to 0 instead of 0.5).
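For orientation, the quantity the helper computes is the alignment error rate of Och and Ney, which compares a set A of predicted links against gold sure links S and possible links P (with S a subset of P). A minimal sketch follows; use the provided helper class for the actual evaluation.

def aer(predicted, sure, possible):
    # Och & Ney's alignment error rate:
    #   AER = 1 - (|A & S| + |A & P|) / (|A| + |S|)
    # predicted, sure, possible: sets of (english_pos, french_pos) links,
    # with sure being a subset of possible.
    a, s, p = set(predicted), set(sure), set(possible)
    return 1.0 - (len(a & s) + len(a & p)) / (len(a) + len(s))

The division in the last line is exactly where python2's integer division would bite without the __future__ import mentioned above.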
Submission:
TBA
Assessment: guidelines / grades on Canvas.
Project 2
Deadline: May 20th, 2019
Extension Deadline: May 28th, 2019 new!
In this project you will learn about and implement a deep generative language model.
Resources:
Assessment: described in the project description / grades on Canvas.