Project

Last revised 2024-12-03.

A major component of this course is a hands-on final project guided by students’ own interests. In this project, students will demonstrate an ability to summarize current approaches and challenges in a subfield of NLP and implement some sort of contribution (however small) to this NLP area of research or practice.

Groups

Projects will be done in groups of 2-4 students. Groups will be assigned by teaching staff based on interests, skills, and group preferences from students.

Deliverables

Project idea submission form. Due 09-20. Use this form to submit potential project ideas you might be interested in working on. Ideas can come from the example projects listed on this website, research you are a part of, interesting text datasets you’d like to work on, really anything! You can submit as many ideas as you’d like. Ideas do not have to be fully sketched out. Submitting an idea does not mean you will necessarily work on it. These ideas will be presented to all students anonymously. Each student must submit at least one idea for credit on this assignment, even if it’s just chosen from the example projects.
Project idea ranking and survey. Due 09-26. In a form, you will rank which project ideas you would prefer to work on and list any personnel preferences, interests, and skills you have. Teaching staff will take all of this information into account when assigning groups.
Project proposal presentation. In class 10-16. Groups will make a brief presentation to the class outlining their proposed project, with Q&A and opportunities for feedback from other students. Please plan for a presentation of at most 7 minutes, not including Q&A, which will be held right after each group’s presentation. Please add your slides to this shared PowerPoint presentation. Presentations are not graded. Cover at least these key points:
1. Project motivation (what is the value of this work?)
2. Briefly, what 1-2 other related papers have done
3. What data you are planning to use
4. What approach/methods you will be taking
5. Evaluation of your approach (or dataset, if it’s a dataset contribution)
Project proposal and literature review. Due 10-18. Please submit one per group on Canvas. There is no required length or format for this report, but it is recommended to use the ACL format that the final report will be formatted in. This proposal will be a report with answers to the following questions:
1. Will you be making a contribution of a new dataset, new application, new approach, combination of these or some other type of contribution?
2. What is the problem or task you are focusing on?
3. What is the expected output of your project? This could be a new approach and its evaluation on particular datasets. It could be a new dataset with a particular format, or new analysis. In any of these cases, please describe the format of your desired output.
4. How does your contribution build on or extend prior work? This literature review will be of at least 3 papers relevant to your project area. It will group and summarize relevant papers into types of tasks, datasets, and/or approaches. Good places to look for NLP papers include the ACL Anthology, Semantic Scholar, and Google Scholar.
5. What data are you using (or contributing)? Please explain where these datasets are from and how they were constructed.
6. What algorithm or approach are you taking to address the task?
7. How are you evaluating your contribution? What performance metrics are you going to use?
8. What kinds of ethical issues may be raised by your model or data?
9. What are the proposed steps needed for completion of the project?
10. What are the roles and tasks of each person in the group? Though group members will contribute in various capacities, it is best if each person is responsible for at least one aspect of the project.
Project peer review. Due 11-14. In a form, you will rate your own performance and the performance of other group members. This feedback will not be used for grading, but to identify any workload distribution issues early on and assign roles accordingly.
Progress report. Due 11-14. A brief progress report on a basic working system. Not everything needs to be done or fully functional, but there needs to be some sort of basic functionality. Also list any questions you have or resources you will need to successfully complete the project by the final deadline. This report should be in the ACL format that the final report will be in and should be a maximum of 3 pages, not including references. You do not have to repeat information from the project proposal except for basic descriptions of the project.
Final presentation. In class on 12-11. Groups will present their finished work to the class, with Q&A and feedback opportunities from students. Please prepare a presentation of at most 5 minutes. You can divide up speaking responsibilities however you see fit, though having more than one group member speak is encouraged. Add your slides to this shared PowerPoint presentation. Cover at least these key points:
1. Project motivation (briefly)
2. Data
3. Methods, or annotation/collection approach for dataset projects
4. Results
Final report. Due 12-12. At the end of the course, groups will provide a written report of their project. This report will be in the ACL format found here (Overleaf template here). The report should be a maximum of 8 pages, not including limitations, ethics, group member task breakdown, references sections or appendices. Outstanding reports would be of a quality and structure that could be submitted to an NLP workshop or conference, but other types of projects can also achieve an A. There is flexibility in section names, but please provide information about the following aspects of the project:
1. Project motivation
2. Literature review. Please provide full citations in a references section for works cited throughout the paper (not just URLs).
3. Data
4. Methods. Please clearly specify which techniques are novel/your own versus methods directly or indirectly from prior work (which is also fine).
5. Results
6. Discussion
7. Future work. This is a good place to describe things you thought about but never had time to complete!
8. Limitations (doesn’t count toward page limit)
9. Ethical issues (doesn’t count toward page limit)
10. Group member task breakdown (doesn’t count toward page limit). This section details the high-level tasks that each group member completed.
11. References (doesn’t count toward page limit)
12. Appendices (optional, doesn’t count toward page limit). Additional figures or explanation in one or more appendices is allowed, but they will not necessarily be considered in grading.

Here is the rubric that will be used in grading:

Rubric category	Points
Clear motivation for the work is provided	4
Research questions and/or task definition is clear	8
Sufficient grounding in relevant related literature	10
Applicable dataset/s are chosen	5
Methods are relevant. For new approach contributions, multiple methods are compared. For dataset contributions, annotation methodology is explained	15
Results are provided. For new approach contributions, results from multiple methods (at least one baseline) are presented. For dataset contributions, this may be a single set of results from a simple classifier, or other results if discussed with the instructor	17
Discussion is provided of the results and/or the potential uses or contributions of any new datasets contributed	10
Limitations of your approach or dataset are sufficiently discussed	3
Ethical issues that may be raised by your system or dataset are sufficiently discussed	3
Potential future work is discussed	3
*Project content total*	78
Meets all formatting requirements. Is maximum 8 pages, not including references or group member task breakdown	8
Writing is clear	9
*Writing total*	17
Group member had a sufficient amount of workload in the project	13
Task and roles assigned to this group member were completed sufficiently	13
*Individual contribution total*	26
*Grand total*	*121*

Example projects

Your goal is to make a contribution, even a small one, to NLP research or practice. You can select from the following types of contributions, combine multiple of them, or define a different type of contribution with instructor approval. Example project ideas and projects are provided (with a bias toward computational social science and hate speech, the instructor’s research area). You are also encouraged to come up with your own ideas! Is there a text dataset in a field or industry that you are familiar with that has not been analyzed? Projects can be related to students’ research, but should not be projects for other classes.

New dataset, annotations, or analysis of existing datasets

Data is at the heart of machine learning and NLP systems; it enables further modeling and encapsulates what NLP systems “know”.

Example project ideas

Build a dataset of social media posts and ads discussing prescription drugs and natural medications for comparison, with Prof. Ryan Shi and a collaborator in Pitt Family Medicine.
Hate speech is culturally specific, yet the majority of NLP work focuses on English in North American and European contexts. A quantitative analysis of different features of datasets annotated for hate speech in multiple languages and from multiple cultural contexts would illuminate global similarities and culturally specific contexts.
Build a dataset of text from different personas from fiction roleplaying sites (“language cosplaying” in China). This could be useful for dialogue systems that adopt personas, or for story generation.

New approach or application

This is perhaps the most common sort of NLP research contribution, in which a new method or algorithm for approaching a task (which could be a new task) is presented. Applying an existing method in a new context or task, as might be necessary in an industry setting, would also fit within this contribution.

Example project ideas

New tasks and applications:

New identity terms are commonly developed in online communities, some of them hateful. Develop methods to find in-group hate jargon and identity terms.
Trace and compare the language of legislative bills introduced at state legislatures across US states. Data is provided by the instructor and a collaborator at Carnegie Mellon University.
Hate speech identification without the text: Identify the discourse contexts in which hate speech is likely to occur, without allowing classifiers to look at the exact text of the hate speech.
From a set of descriptions of characters, develop a classifier to predict which ones will generate the most fanfiction. This could be a lens into online community and media norms.
Fanfiction, online writing by fans of media works, is known for celebrating queer identity but still may center the experiences of white authors and characters. Use FanfictionNLP to compare representations of characters of color to white characters in fanfiction at scale.
Analyze how different newspapers cover topics differently in English-language editorials from Sri Lankan newspapers. Data is provided by the instructor and a collaborator at Carnegie Mellon University.
Predict arch support, comfort and durability of shoes from Amazon reviews, with a collaborator in Pitt Engineering.
Quantitative analysis of hateful, white supremacist narratives usually centers on contemporary online discourse. Yet much white supremacist language and many of its narratives have roots that predate online discourse. Compare narratives, topics and themes presented in historic and contemporary white supremacist discourse with data provided by the instructor.
Evaluate LLMs for their factuality in summarization of class reflections using a dataset provided by the instructor and Prof. Diane Litman.
Evaluate the fairness of quality scores automatically assigned to student reflections using a dataset provided by the instructor and Prof. Diane Litman.
Build networks of characters and predict relations among characters in fiction using this dataset.
Stancetaking, a concept from sociolinguistics, is when speakers take an evaluative position toward something, and these positions are often nuanced (e.g. “No, I actually don’t like Taylor Swift’s music that much, but she’s great as a person”). Develop automated methods for identifying the “stance object”, who or what the speaker is evaluating, likely from Reddit data.
Automatically summarize movies based on their subtitles from this dataset developed by former students in the class.
Predict “speech acts”, intentions behind utterances, based on emojis with a dataset assembled by former students in the class.
Explore similarities and differences between language in podcasts and Reddit communities based on those podcasts using a dataset assembled by former students in the class.
Computational analysis of Nakba narratives. See workshop and datasets.
Examine the framing of different entities in police Facebook posts from the Plain View Project.

Existing tasks to work on (some ideas are drawn from Graham Neubig’s Advanced NLP class):

Lexicons are often used to identify emotional language or other concepts text analysts may be interested in measuring in large corpora. Develop and evaluate approaches for expanding lexicons to specific contexts. This could be applied to expand lists of identity terms in a variety of online communities, with data supplied by the instructor.
WASSA Shared Task on Empathy Detection and Emotion Classification and Personality Detection in Interaction
WASSA Shared Task on Explainability for Cross-Lingual Emotion in Tweets
SemEval Multilingual characterization and extraction of narratives from online news
Information retrieval from regulatory documents in a shared task and RegNLP workshop
Semantic pleonasm detection with LLMs, on a dataset developed here at Pitt
Shared tasks in identifying AI-generated content
Any of the SemEval 2021 tasks
Sexism identification in social networks shared task
X-FACTR multilingual knowledge probing in QA
GoEmotions Fine-grained Emotion Detection Dataset
SciREX Scientific Information Extraction
Subjective Intent Classification in Discourse
Very Low Resource machine translation
Predict the points at which speakers switch languages when code-switching
Sign language translation
Develop best approaches for training hate speech classifiers that generalize across targeted identities. Data would be provided by the instructor
Style transfer of offensive language into inoffensive language. See existing paper and dataset+code.
Develop new approaches for hate speech detection by comparing with knowledge bases or pretrained models of stereotypes from a stereotype dataset
Cross-lingual emotion detection. See existing paper.

New survey or position paper

Surveys are especially needed for new, emerging research areas. All projects will require a literature review, but a survey paper would be both broader and deeper. It would summarize key approaches and key challenges and present lines for future work. Some sort of implementation is necessary for this type of contribution as well, such as applying multiple established methods to a new dataset or in a new context to show challenges that need to be addressed. Position papers argue for a certain viewpoint or shortcoming of existing approaches, e.g. arguing for the utility of techniques from a discipline outside NLP in NLP tasks.

Example project ideas

Survey how NLP is used and applied in other fields. What have been our most useful contributions to scholars in the social sciences, physical sciences, or humanities? This survey would assemble papers across disciplines for mentions of NLP and summarize what is most useful, what is lacking, and what approaches from NLP could be helpful to others.
Computational social science using NLP generally relies on data from online communities. But this is missing non-online interactions and the practices of those who are not active online. Survey datasets and approaches that use quantitative and computational techniques on recordings of offline linguistic interaction.
A growing area of research in computational social science aims to capture the framing and portrayal of entities across large text corpora (such as in news media). Survey existing approaches and challenges.

How your project will be graded

To get an A, your group’s project should make progress toward an achievable, concrete contribution specified in your project proposal. The project does not necessarily need to be successful in the sense that it outperforms baselines or contributes to our knowledge of a phenomenon. Sometimes ideas don’t work, and that’s okay. But you need to provide evidence of progress toward that contribution. If you are building a dataset, for example, the dataset needs to be built in some form, even if it is not as large or as useful as you may have hoped. If you are evaluating a new method for a task, you must have an implementation that tests that method against baselines, even if it doesn’t perform as well as you would have hoped or you didn’t get to evaluate it against all the baselines you wanted to. If you are doing a survey, you must distill a sufficient number of papers into themes that comprehensively describe a research area, even if you don’t end up finding groundbreaking gaps in knowledge that must be addressed. Feel free to take on riskier ideas, but only if you know you’ll have something to show for it at the end. During the planning phase, through the proposal, teaching staff will guide you toward scoping a project that can fulfill this goal.