Machine Learning with Limited Annotation

COMP 152 Spring 2021

Description

This course will cover the latest research on machine learning in settings where labeled data is not available. In such settings, classical methods are not necessarily the best approaches for algorithms to learn from data. Students will lead and participate in discussions on recent research publications. Students will also complete a research project applying these new methods. In addition to the specific knowledge about the topic area, students will also gain experience reading and understanding research literature, planning and executing research projects, and communicating about machine learning research.

We will read about ongoing research on how to solve the challenging task of doing machine learning with limited labeled data. We will read a lot of papers and try to understand the problems they solve and their methods. You will do a mini research project during the semester where you either apply an existing method to an application you are interested in or work on developing and evaluating new methods.

The course will take place virtually from 1:30-2:45pm on Mondays and Wednesdays. The Zoom link for class sessions is https://tufts.zoom.us/j/99531912173?pwd=WUFWb1Azcm1vUTVHUldLYUpwUklXUT09. The session requires you to be logged into your Tufts Zoom account.

  • Instructor:
    Bert Huang, Assistant Professor of Computer Science
    Office hours: Tue. 4pm–5pm, Thurs. 11am–12pm
    [email protected]
  • Teaching Assistant:
    Sasha Fedchin, Computer Science Ph.D. Student
    Office hours: Mon. 5:20pm–6:20pm, Wed. 4:30pm–5:30pm
    [email protected].
  • See Canvas page for office hour Zoom links.

The course homepage (this page) is located at the URL http://berthuang.com/courses/limited.

The course Canvas page, where you'll submit the homework assignments and discuss papers, is at the URL https://canvas.tufts.edu/courses/27334.

Topics

  • Semi-supervised learning
  • Weakly supervised learning
  • Self-supervised learning
  • Few/zero-shot learning
  • Active learning
  • Transfer learning
  • Data augmentation
  • Crowdsourcing

Prerequisites

Students are expected to have taken COMP 135 or have equivalent experience, broadly defined. In other words, students in this course should have previously studied machine learning and/or artificial intelligence and basic statistics and probability.

Please speak with the instructor if you are concerned about your background. Note: If any student needs special accommodations because of any disabilities, please contact the instructor during the first week of classes.

Learning Objectives

A student who successfully completes this class should

  • Have a broad perspective on tools for machine learning in settings with limited annotation;
  • Gain experience reading and understanding research literature;
  • Have practiced planning and executing a research project;
  • Have practiced communicating about the latest machine learning research.

Schedule

The class schedule is available here and is embedded below. We will update the schedule regularly.

Takeaway Summaries

Each student will be required to read the assigned text and submit a summary of the reading before class begins. These summaries should be between four and ten sentences long, covering the important points of the reading. The summaries do not need to include fine detail, but they should highlight things you believe are important to discuss in class, where we will examine the reading in detail. Questions and points that you are uncertain about are welcome parts of these summaries.

Leading Classes

Each student will be required to lead the discussion for one class session during the semester. The discussion leader is free to structure the class discussion as they like, but a suggested format is:

  • Discussion leader summarizes the reading (15–30 minutes), highlighting unclear parts or open research questions
  • Class splits into breakout rooms; each breakout room discusses its members' uncertainties about the reading or research questions (15–20 minutes)
    Suggested breakout discussion prompts:
    • Work with your group to try to decipher any unclear parts of the reading.
    • Do you believe the claims of the paper? Why or why not?
    • What is the actual takeaway result?
    • What parts of the paper are especially well written, well evaluated, useful, etc.?
    • What unanswered research questions are you left with (missing comparisons, unevaluated ideas, etc.)?
  • Full-class discussion on questions and points from each breakout room (remainder of class session).

Students are not expected to "teach" the topics from the reading. We will all learn the concepts together as a class. Therefore, it is okay, and encouraged, for students to highlight questions or passages in the reading they want help understanding.

Projects

Students will complete a project during this course. Projects will be done in groups of one, two, or three students. Projects may be one of two types:

  1. Reproduction studies. Take an existing result from a recent paper and attempt to reproduce the result with experimentation or theoretical analysis.
  2. Novel research. Propose and evaluate a new application, algorithm, or theoretical idea.

Broadly speaking, the expectation for course projects is that they will include preliminary results. If the results are promising, you are encouraged to continue pursuing the research direction, but that continued work will not be graded. You are also welcome to incorporate your existing research interests into a project. More details about project requirements are forthcoming.

Policies

Accessibility

If any student needs special accommodations because of any disabilities, please contact the instructor during the first week of classes. Such students are encouraged to work with Tufts' Student Accessibility and Academic Resource Center (https://students.tufts.edu/staar-center/accessibility-services) to help coordinate accessibility arrangements.

Grading

  • 10%: Class attendance
  • 50%: Takeaway summaries
  • 20%: Project
  • 20%: Class leading

Based on the grading breakdown above, each student's final grade for the course will be determined by the final percentage of points earned. The grade ranges are as follows:

A: 93.3%–100%, A-: 90.0%–93.3%, B+: 86.6%–90.0%, B: 83.3%–86.6%, B-: 80.0%–83.3%, C+: 76.6%–80.0%,
C: 73.3%–76.6%, C-: 70.0%–73.3%, D+: 66.6%–70.0%, D: 63.3%–66.6%, D-: 60.0%–63.3%, F: 00.0%–60.0%.
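
As a rough illustration of how the breakdown and grade ranges above combine, here is a minimal sketch in Python. The component names, the example scores, and the rule that a score exactly on a boundary earns the higher grade are assumptions made for illustration; the syllabus does not specify how boundary cases are resolved.

```python
# Component weights from the grading breakdown above (they sum to 1.0).
# The dictionary keys are hypothetical labels chosen for this sketch.
WEIGHTS = {
    "attendance": 0.10,
    "summaries": 0.50,
    "project": 0.20,
    "leading": 0.20,
}

# Lower bound of each letter grade, highest first.
# Assumption: a score exactly on a boundary earns the higher grade.
CUTOFFS = [
    (93.3, "A"), (90.0, "A-"), (86.6, "B+"), (83.3, "B"), (80.0, "B-"),
    (76.6, "C+"), (73.3, "C"), (70.0, "C-"), (66.6, "D+"), (63.3, "D"),
    (60.0, "D-"),
]


def final_percentage(scores):
    """Weighted average of component scores, each given as a percentage (0-100)."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def letter_grade(percentage):
    """Return the first letter grade whose lower bound the percentage meets."""
    for cutoff, letter in CUTOFFS:
        if percentage >= cutoff:
            return letter
    return "F"


# Made-up example scores, one percentage per component.
example = {"attendance": 100.0, "summaries": 90.0, "project": 85.0, "leading": 95.0}
total = final_percentage(example)
print(round(total, 1), letter_grade(total))
```

With these made-up scores the weighted total is 10 + 45 + 17 + 19 = 91%, which falls in the A- range.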

Class Attendance

All students are expected to attend all lectures unless they have given sufficient notice for justifiable absences. Absence will be excused for reasons including health needs, conference attendance, family emergencies, and job-search interviews.

Retroactive permission will be granted if you miss class on an attendance day for a valid reason (e.g., a health emergency) that prevented you from giving notice in advance.

Academic Integrity

Students enrolled in this course are responsible for abiding by the academic integrity code in the Tufts Student Code of Conduct. For additional information about the code, please visit: https://students.tufts.edu/student-affairs/student-code-conduct/academic-integrity-resources

This course will have a zero-tolerance philosophy regarding plagiarism or other forms of cheating. Your homework assignments must be your own group's work, and any external source of code, ideas, or language must be cited to give credit to the original source. I am required to report any suspicion of academic misconduct to the Dean of Student Affairs Office without warning the suspected students or discussing it with them.

Standards of Community

Because the course will include in-class discussions, we will adhere to Tufts' Standards of the Tufts Community. For class discussions, one principle is most relevant:

Tufts University students demonstrate respect for themselves, for each other, and for the entire community. Respect includes promoting safety of all people and property. It also includes respecting the privacy and autonomy of all community members. In both the intellectual and social community, respect transcends disagreement to promote learning and understanding.

The remaining principles are also important, and we will take them seriously as a class.