6.882 Ethical Machine Learning in Human Deployments
Welcome to 6.882!
6.882 Spring 2022
Friday, 10:00 AM - 1:00 PM
Instructor: Dr. Marzyeh Ghassemi
TA: Lelia Marie Hampton
Piazza: piazza.com/mit/spring2022/6882
TA Office Hours: Fridays, 9-10 AM, 56-114
Overview
This course focuses on the human-facing considerations across the pipeline of machine learning (ML) development in settings like healthcare, employment, and education. Students will learn about the issues involved in ethical machine learning, including framing the ethics of ML in healthcare through the lens of social justice. Students will read papers related to ongoing efforts and challenges in ethical ML, ranging from problem selection to post-deployment considerations. Guest lectures will be given by experts in data access, ethics, fairness, privacy, and deployments, and the course will center on a project that students use to explore how machine learning can be brought into human-facing deployments ethically.
The course will involve an algorithmic fairness measurement and mitigation exercise and a final project. A major component of this course is the final project. Project presentations will be on the last day of class (May 6).
Students should have taken two introductory machine learning courses; known courses that fulfill this prerequisite are listed below, but others with similar content may be accepted:
- Graduate Students: 6.867, 6.806, 6.819
- Undergraduate Students: 6.036, 6.401, 6.419, or 6.402
Grading
1. Weekly Reflections: The weekly reflections, corresponding to Week 2 - Week 10, are a total of 10% of your grade.
- Done as a Canvas discussion
- Must reply to at least one other student
- Due before class
- Worth 1.25 points per week
2. Problem Sets: Problem sets are worth a total of 85% of your grade, as broken down below.
HW | Subject | Percentage of Grade
1 | Algorithmic Fairness Assignment | 15%
2 | Project Literature Review | 15%
3 | Project Proposal and Outline | 15%
4 | Final Project Presentation | 20%
5 | Final Project Write-Up | 20%
Note on Plagiarism: Student code submissions may be submitted by the instructors to a plagiarism detection tool for a review of similarity and detection of possible plagiarism. Submissions will be used solely for the purpose of detecting similarity, and are not retained indefinitely on the server; typically results are deleted after 14 days but may be removed sooner. For more information on the tool used, refer to https://theory.stanford.edu/~aiken/moss/.
3. Scribing: 5% of your grade is assigned for scribing the week's lectures in LaTeX and submitting the notes to the class staff for review. Students can sign up in pairs or trios to record the class lectures, and the scribing grade will be shared collectively by all members of a scribing session.
Scribe Signup
Schedule
Project Details
Project Overview
The goal is to have a submission-ready manuscript by the end of the semester, formatted according to the journal the team plans to submit to. The project should tackle an ethical issue that could occur, or has occurred, in the process of using machine learning in a human setting. Projects can be more technical, involving the reproduction or development of models and their evaluation, or more socio-technical or policy focused, examining the complex choices, interactions, and implications of machine learning use in these spaces.
Teams can be composed of 2-4 students, and there will be one project report/presentation per group. There are many options for the class project but, crucially, students should choose a major undertaking because various components of the project (Project Literature Review, Project Proposal and Outline, Final Project Presentation, Final Project Write-Up) account for a total of 70% of your grade.
Data Sources
Kaggle is a platform for many kinds of data, and competitions from this platform can be modified for relevant investigations.
MIMIC is an open platform for health data. To obtain access to MIMIC, students must obtain CITI certification and request access on PhysioNet.
Nightingale is an open platform for health data. Register for the platform as soon as possible, and have a look at existing questions by reading through the documents here. CITI training is required to use the platform. These projects focus mostly on fairness audits of machine learning systems trained on real health data.
Nightingale Project 1: Identifying fairness violations in breast cancer risk using digital pathology images
Every year, 40 million women get a mammogram; some go on to have an invasive biopsy to better examine a concerning area. Since the 1990s, we have found far more ‘cancers’, which has in turn prompted vastly more surgical procedures and chemotherapy. But death rates from metastatic breast cancer have hardly changed. When a pathologist looks at a biopsy slide, they are looking for known signs of cancer: tubules, cells with atypical-looking nuclei, evidence of rapid cell division. These features still underlie critical decisions today with respect to how to manage the patient (e.g. give surgery/chemo or wait). Students can train predictive models on pathology images to predict critical patient outcomes such as mortality and metastasis. They will identify patients at high risk of poor outcomes and compare prediction rates across gender/ethnicity groups. https://docs.nightingalescience.org/brca-psj-path.html
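A fairness audit like the one described above often starts by comparing how often a model flags patients as high-risk in each demographic group. A minimal sketch, assuming illustrative risk scores and group labels (the function name, threshold, and data below are hypothetical, not part of the Nightingale dataset):

```python
# Sketch of a group-wise audit: fraction of patients flagged as
# high-risk per demographic group, and the gap between groups.
# All scores, groups, and the 0.5 threshold are illustrative.
import numpy as np

def rate_by_group(scores, groups, threshold=0.5):
    """Fraction of patients whose risk score meets the threshold, per group."""
    scores = np.asarray(scores)
    groups = np.asarray(groups)
    return {g: float((scores[groups == g] >= threshold).mean())
            for g in np.unique(groups)}

# Toy example: risk scores for six patients in two groups
scores = [0.9, 0.2, 0.7, 0.4, 0.8, 0.1]
groups = ["A", "A", "A", "B", "B", "B"]
rates = rate_by_group(scores, groups)
gap = max(rates.values()) - min(rates.values())  # demographic-parity gap
```

A large gap between groups is a signal to investigate further, e.g. whether it reflects base-rate differences or model bias.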
Nightingale Project 2: Subtyping cardiac arrest with ECG
A patient is rushed into the ER, unconscious and in cardiac arrest. As the physician begins the resuscitation, they know only that the patient’s heart has stopped—but nothing else. What happened to cause the arrest? What immediate actions need to be taken? One of the only pieces of data available to the emergency physician in this situation is the electrocardiogram (ECG), which measures the electrical activity of the heart. Physicians use this to determine which immediate actions are needed. This rich signal might also contain other clues: about why the heart stopped, what physicians can do in the ER to give the patient the best possible chance of surviving, and the likelihood that a patient who survives will have a normal life, without profound physical or neurological impairments. Students will evaluate the fairness of algorithms using ECGs to predict clinical tasks such as cardiac arrest cause and patient survival after hospital discharge. https://docs.nightingalescience.org/arrest-ntuh-ecg.html
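For a binary outcome like post-discharge survival, one common fairness check is whether the model's sensitivity (true-positive rate) is similar across groups. A minimal sketch under illustrative assumptions (the labels, predictions, and group names below are placeholders, not real patient data):

```python
# Sketch of an equal-opportunity check: true-positive rate per group
# for a binary outcome (1 = survived). All data here is illustrative.
import numpy as np

def tpr_by_group(y_true, y_pred, groups):
    """True-positive rate (sensitivity) within each group."""
    y_true, y_pred, groups = map(np.asarray, (y_true, y_pred, groups))
    out = {}
    for g in np.unique(groups):
        mask = (groups == g) & (y_true == 1)  # positives in this group
        out[g] = float(y_pred[mask].mean()) if mask.any() else float("nan")
    return out

# Toy example: per-patient labels and model predictions
y_true = [1, 1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 1, 1]
groups = ["F", "F", "F", "M", "M", "M"]
tprs = tpr_by_group(y_true, y_pred, groups)  # {"F": 0.5, "M": 1.0}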
Authorship
A note on collaboration: Research is a collaborative activity and we encourage all students to collaborate and learn from each other. In general, when you put your name on something for research, you must: a) have materially contributed to the work, b) be able to defend the research, and c) acknowledge the contribution of others. Keep this in mind when working together and submitting material for evaluation.
A note on authorship: By the end of the course your final project may be sufficiently developed to submit to a peer-reviewed journal. Author order can be a somewhat controversial issue and is left to the project participants to decide. We strongly encourage you to discuss what the order will be, or what philosophy you will use to decide the order, while forming groups. In the case of a dispute during or after the course, the instructors will likely not be able to mediate in any meaningful way. We would also recommend equal authorship (now more common), but the decision is left to each team.
A note on acknowledgement: Papers that result from work done during this course should recognize the contributions of the course in an acknowledgement or in other sections. The suggested language is: "This manuscript was composed by participants in the EECS 6.882 course on Ethical Machine Learning in Human Deployments at the Massachusetts Institute of Technology, Spring 2022."
Other MIT Courses on Ethical ML:
Data-Driven Decision Making and Society
STS Computing and Society Concentration
Exploring Fairness in Machine Learning for International Development
