6.882 Ethical Machine Learning in Human Deployments
Welcome to 6.882!
6.882 Spring 2022
Friday, 10:00 AM - 1:00 PM
Instructor: Dr. Marzyeh Ghassemi
TA: Lelia Marie Hampton
Piazza: piazza.com/mit/spring2022/6882
TA Office Hours: Fridays, 9-10 AM, 56-114
Overview
This course focuses on the human-facing considerations across the pipeline of machine learning (ML) development in settings like healthcare, employment, and education. Students will learn about the issues involved in ethical machine learning, including framing the ethics of ML in healthcare through the lens of social justice. Students will read papers related to ongoing efforts and challenges in ethical ML, ranging from problem selection to post-deployment considerations. Guest lectures will be given by experts in data access, ethics, fairness, privacy, and deployments, and the course will center on a project that students use to explore how machine learning can be brought into human-facing deployments ethically.
The course will involve an algorithmic fairness measurement and mitigation exercise and a final project. A major component of this course is the final project. Project presentations will be on the last day of class (May 6).
Students should have taken two introductory machine learning courses; known courses that fulfill this prerequisite are listed below, but others with similar content may be accepted:
- Graduate Students: 6.867, 6.806, 6.819
- Undergraduate Students: 6.036, 6.401, 6.419, or 6.402
Grading
1. Weekly Reflections: The weekly reflections, corresponding to Week 2 - Week 10, are a total of 10% of your grade.
- Done as a Canvas discussion
- Must reply to at least one other student
- Due before class
- Worth 1.25 points per week
2. Problem Sets: Problem sets are worth a total of 85% of your grade, as broken down below.
HW | Subject | Percentage of Grade
1 | Algorithmic Fairness Assignment | 15%
2 | Project Literature Review | 15%
3 | Project Proposal and Outline | 15%
4 | Final Project Presentation | 20%
5 | Final Project Write-Up | 20%
Note on Plagiarism: Student code submissions may be submitted by the instructors to a plagiarism detection tool for a review of similarity and detection of possible plagiarism. Submissions will be used solely for the purpose of detecting similarity, and are not retained indefinitely on the server; typically results are deleted after 14 days but may be removed sooner. For more information on the tool used, refer to https://theory.stanford.edu/~aiken/moss/.
3. Scribing: 5% of your grade is assigned for scribing the week's lectures in LaTeX and submitting the notes to the class staff for review. Students can sign up in pairs or trios to record the class lectures, and the scribing grade will be shared collectively by all members of a scribing session.
Scribe Signup
Schedule
Project Details
Project Overview
The goal is to have a submission-ready manuscript by the end of the semester, formatted according to the journal the team plans to submit to. The project should tackle an ethical issue that could occur, or has occurred, in the process of using machine learning in a human setting. Projects can be more technical, involving the reproduction or development of models and their evaluation, or more socio-technical or policy focused, examining the complex choices, interactions, and implications of machine learning use in these spaces.
Teams can be composed of 2-4 students, and there will be one project report/presentation per group. There are many options for the class project but, crucially, students should choose a major undertaking because various components of the project (Project Literature Review, Project Proposal and Outline, Final Project Presentation, Final Project Write-Up) account for a total of 70% of your grade.
Data Sources
Kaggle is a platform for many kinds of data, and competitions from this platform can be modified for relevant investigations.
MIMIC is an open platform for health data. To obtain access to MIMIC, students must obtain CITI certification and request access on PhysioNet.
Nightingale is an open platform for health data. Register for the platform as soon as possible, and have a look at existing questions by reading through the documents here. CITI training is required to use the platform. These projects focus mostly on fairness audits of machine learning systems trained on real health data.
Nightingale Project 1: Identifying fairness violations in breast cancer risk using digital pathology images
Every year, 40 million women get a mammogram; some go on to have an invasive biopsy to better examine a concerning area. Since the 1990s, we have found far more ‘cancers’, which has in turn prompted vastly more surgical procedures and chemotherapy. But death rates from metastatic breast cancer have hardly changed. When a pathologist looks at a biopsy slide, they are looking for known signs of cancer: tubules, cells with atypical-looking nuclei, evidence of rapid cell division. These features still underlie critical decisions today with respect to how to manage the patient (e.g. give surgery/chemo or wait). Students can train predictive models on pathology images to predict critical patient outcomes such as mortality and metastasis. They will identify patients at high risk of poor outcomes and compare prediction rates across gender/ethnicity groups. https://docs.nightingalescience.org/brca-psj-path.html
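A fairness audit like the one described above often starts by comparing how often a model flags patients as high-risk in each demographic group. A minimal sketch, assuming illustrative risk scores and group labels (the function name, threshold, and data below are hypothetical, not part of the Nightingale dataset):

```python
# Sketch of a group-wise audit: fraction of patients flagged as
# high-risk per demographic group, and the gap between groups.
# All scores, groups, and the 0.5 threshold are illustrative.
import numpy as np

def rate_by_group(scores, groups, threshold=0.5):
    """Fraction of patients whose risk score meets the threshold, per group."""
    scores = np.asarray(scores)
    groups = np.asarray(groups)
    return {g: float((scores[groups == g] >= threshold).mean())
            for g in np.unique(groups)}

# Toy example: risk scores for six patients in two groups
scores = [0.9, 0.2, 0.7, 0.4, 0.8, 0.1]
groups = ["A", "A", "A", "B", "B", "B"]
rates = rate_by_group(scores, groups)
gap = max(rates.values()) - min(rates.values())  # demographic-parity gap
```

A large gap between groups is a signal to investigate further, e.g. whether it reflects base-rate differences or model bias.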
Nightingale Project 2: Subtyping cardiac arrest with ECG
A patient is rushed into the ER, unconscious and in cardiac arrest. As the physician begins the resuscitation, they know only that the patient’s heart has stopped—but nothing else. What happened to cause the arrest? What immediate actions need to be taken? One of the only pieces of data available to the emergency physician in this situation is the electrocardiogram (ECG), which measures the electrical activity of the heart. Physicians use this to determine which immediate actions are needed. This rich signal might also contain other clues: about why the heart stopped, what physicians can do in the ER to give the patient the best possible chance of surviving, and the likelihood that a patient who survives will have a normal life, without profound physical or neurological impairments. Students will evaluate the fairness of algorithms using ECGs to predict clinical tasks such as cardiac arrest cause and patient survival after hospital discharge. https://docs.nightingalescience.org/arrest-ntuh-ecg.html
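For a binary outcome like post-discharge survival, one common fairness check is whether the model's sensitivity (true-positive rate) is similar across groups. A minimal sketch under illustrative assumptions (the labels, predictions, and group names below are placeholders, not real patient data):

```python
# Sketch of an equal-opportunity check: true-positive rate per group
# for a binary outcome (1 = survived). All data here is illustrative.
import numpy as np

def tpr_by_group(y_true, y_pred, groups):
    """True-positive rate (sensitivity) within each group."""
    y_true, y_pred, groups = map(np.asarray, (y_true, y_pred, groups))
    out = {}
    for g in np.unique(groups):
        mask = (groups == g) & (y_true == 1)  # positives in this group
        out[g] = float(y_pred[mask].mean()) if mask.any() else float("nan")
    return out

# Toy example: per-patient labels and model predictions
y_true = [1, 1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 1, 1]
groups = ["F", "F", "F", "M", "M", "M"]
tprs = tpr_by_group(y_true, y_pred, groups)  # {"F": 0.5, "M": 1.0}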
Authorship
A note on collaboration: Research is a collaborative activity and we encourage all students to collaborate and learn from each other. In general, when you put your name on something for research, you must: a) have materially contributed to the work, b) be able to defend the research, and c) acknowledge the contribution of others. Keep this in mind when working together and submitting material for evaluation.
A note on authorship: By the end of the course your final project may be sufficiently developed to submit to a peer-reviewed journal. Author order can be a somewhat controversial issue and is left to the project participants to decide. We strongly encourage you to discuss what the order will be, or what philosophy you will use to decide the order, while forming groups. In the case of a dispute during or after the course, the instructors will likely not be able to mediate in any meaningful way. We would also recommend equal authorship (now more common), but the decision is left to each team.
A note on acknowledgement: Papers that result from work done during this course should recognize the contributions of the course in an acknowledgement or in other sections. The suggested language is: "This manuscript was composed by participants in the EECS 6.882 course on Ethical Machine Learning in Human Deployments at the Massachusetts Institute of Technology, Spring 2022."
Other MIT Courses on Ethical ML:
Data-Driven Decision Making and Society
STS Computing and Society Concentration
Exploring Fairness in Machine Learning for International Development
