Homework 4: Sequence labeling (CS 2731 Spring 2024)

Due 2024-03-25, 11:59pm. Instructions last updated 2024-03-18.

In this assignment, you will manually decode the highest-probability sequence of part-of-speech tags from a trained HMM using the Viterbi algorithm. You will also fine-tune BERT-based models for part-of-speech (POS) tagging for English and Norwegian.

The learning goals of this assignment are to:

  1. Understand hidden Markov models and the Viterbi algorithm by decoding the highest-probability tag sequence by hand
  2. Gain practical experience fine-tuning pretrained BERT-based models for POS tagging

1. POS tagging with an HMM

Consider a Hidden Markov Model with the following parameters: POS tags = {NOUN, AUX, VERB}, words = {‘Patrick’, ‘Cherry’, ‘can’, ‘will’, ‘see’, ‘spot’}.

Initial probabilities:

        π
NOUN    0.6
AUX     0.2
VERB    0.2

Transition probabilities: The format is P(column_tag | row_tag), e.g. P(AUX | NOUN) = 0.3.

        NOUN   AUX    VERB
NOUN    0.2    0.3    0.5
AUX     0.4    0.1    0.5
VERB    0.8    0.1    0.1

Emission probabilities:

        Patrick   Cherry   can    will   see    spot
NOUN    0.3       0.2      0.1    0.1    0.1    0.2
AUX     0         0        0.4    0.6    0      0
VERB    0         0        0.1    0.2    0.5    0.2

See chapter 8 of the Jurafsky & Martin textbook for details on the Viterbi algorithm. Using the Viterbi algorithm and the given HMM, find the most likely tag sequence for each of the following two sentences.

  1. “Patrick can see Cherry”
  2. “will Cherry spot Patrick”

To get you started on the Viterbi tables, here are the first two columns for the first sentence, with backtraces.

POS state   Patrick   can                        see   Cherry
NOUN        0.18      0.0036 (backtrace: NOUN)
AUX         0         0.0216 (backtrace: NOUN)
VERB        0         0.009  (backtrace: NOUN)

For example, the AUX cell in the “can” column is the max over previous states of v(prev) × P(AUX | prev) × P(can | AUX) = 0.18 × 0.3 × 0.4 = 0.0216, reached from NOUN, so its backtrace points to NOUN.
Deliverables for part 1

In your report, show your work for calculating the Viterbi tables (lattices) for both example sentences, including the probability calculations and backtraces. Report the most likely tag sequence for each of the two sentences.
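
If you want to sanity-check your hand calculations before writing them up, here is a minimal Viterbi sketch in Python; the dictionaries simply transcribe the HMM tables above, and the function and variable names are only illustrative.

    # Viterbi decoding for the HMM defined above (a checking aid, not a deliverable).
    states = ["NOUN", "AUX", "VERB"]
    pi = {"NOUN": 0.6, "AUX": 0.2, "VERB": 0.2}
    trans = {  # trans[prev][curr] = P(curr | prev)
        "NOUN": {"NOUN": 0.2, "AUX": 0.3, "VERB": 0.5},
        "AUX":  {"NOUN": 0.4, "AUX": 0.1, "VERB": 0.5},
        "VERB": {"NOUN": 0.8, "AUX": 0.1, "VERB": 0.1},
    }
    emit = {  # emit[tag][word] = P(word | tag)
        "NOUN": {"Patrick": 0.3, "Cherry": 0.2, "can": 0.1, "will": 0.1, "see": 0.1, "spot": 0.2},
        "AUX":  {"Patrick": 0.0, "Cherry": 0.0, "can": 0.4, "will": 0.6, "see": 0.0, "spot": 0.0},
        "VERB": {"Patrick": 0.0, "Cherry": 0.0, "can": 0.1, "will": 0.2, "see": 0.5, "spot": 0.2},
    }

    def viterbi(words):
        # v[t][s] is the probability of the best path ending in state s at time t;
        # back[t][s] records which previous state achieved that maximum.
        v = [{s: pi[s] * emit[s][words[0]] for s in states}]
        back = [{}]
        for t in range(1, len(words)):
            v.append({})
            back.append({})
            for s in states:
                prev = max(states, key=lambda p: v[t - 1][p] * trans[p][s])
                v[t][s] = v[t - 1][prev] * trans[prev][s] * emit[s][words[t]]
                back[t][s] = prev
        # Follow backtraces from the best final state to recover the path.
        path = [max(states, key=lambda s: v[-1][s])]
        for t in range(len(words) - 1, 0, -1):
            path.append(back[t][path[-1]])
        return list(reversed(path))

    print(viterbi("Patrick can see Cherry".split()))

Remember that the deliverable is the worked table with probabilities and backtraces, not just the decoded sequence.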

2. Fine-tune BERT-based models for POS tagging

In this section, you will fine-tune pretrained BERT-based models for part-of-speech tagging in English and Norwegian.

Copy this skeleton Colab notebook, run the cells, and fill in the sections that are specified.
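
The notebook walks you through the full pipeline, but as a rough sketch of what token-classification fine-tuning involves, the subword/label alignment step with the Hugging Face transformers library looks roughly like the code below. The model name, tag list, and field names here are illustrative placeholders, not the notebook's actual choices; follow the skeleton notebook where they differ.

    from transformers import AutoTokenizer, AutoModelForTokenClassification

    # Placeholder checkpoint; choose appropriate English and Norwegian models yourself.
    model_name = "bert-base-cased"
    tag_list = ["NOUN", "VERB", "AUX"]  # in practice, the dataset's full tag set

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForTokenClassification.from_pretrained(
        model_name, num_labels=len(tag_list))

    def tokenize_and_align(words, tag_ids):
        # BERT tokenizers split words into subwords, so each word-level tag must
        # be aligned to the word's first subword; special tokens and continuation
        # subwords get label -100, which the cross-entropy loss ignores.
        enc = tokenizer(words, is_split_into_words=True, truncation=True)
        labels, prev = [], None
        for wid in enc.word_ids():
            labels.append(tag_ids[wid] if wid is not None and wid != prev else -100)
            prev = wid
        enc["labels"] = labels
        return enc

Training itself can then use the transformers Trainer or a plain PyTorch loop, whichever the skeleton notebook sets up.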

Deliverables for part 2

In your report, include:

  1. The 5 most frequent POS tags in the English and Norwegian datasets (specified in the notebook) and how many tokens are tagged with each; a counting sketch is given after this list
  2. For each of those 5 POS tags in each language, the 5 most frequent word types annotated with that tag in the training data
  3. The names of the pretrained BERT-based models you chose for both English and Norwegian
  4. A brief discussion of any choices you made about hyperparameters in training
  5. (Optionally, for extra credit) A description of changes you made or different pretrained models you tried, and the accuracy you obtained on the dev set. One point of extra credit will be given if any change results in improved accuracy on the dev set.
  6. Accuracy of the fine-tuned models on the test set for both English and Norwegian
  7. POS tags predicted for the words of a sentence of your choice in both English and Norwegian
  8. A link to your copied and filled out Colab notebook
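
For deliverables 1 and 2, the counts take only a few lines of Python. A sketch, assuming the training data is available as a list of (word, tag) pairs; the variable name train_pairs is hypothetical:

    from collections import Counter

    # train_pairs: hypothetical list of (word, tag) tuples from the training split.
    tag_counts = Counter(tag for _, tag in train_pairs)
    for tag, n in tag_counts.most_common(5):
        # For each frequent tag, find the 5 word types most often annotated with it.
        word_counts = Counter(w for w, t in train_pairs if t == tag)
        print(tag, n, word_counts.most_common(5))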

Submission

Please submit your report on Canvas. It should contain the deliverables for parts 1 and 2, including the link to your completed Colab notebook.

Grading

This homework assignment is worth 56 points. See rubric on Canvas.

Acknowledgments

Part 1 of this assignment is based on homework assignments by Prof. Hyeju Jang and Prof. Diane Litman. Part 2 is adapted from Jacob Eisenstein and Prof. Yulia Tsvetkov.