HST.953/6.8850 Clinical Data Learning

Welcome to HST.953/6.8850!

Clinical Data Learning, Visualization, and Deployments

Artificial intelligence (AI) has the potential to transform healthcare worldwide, bearing promises of increased accuracy, efficiency, and cost-effectiveness in areas as diverse as drug discovery, clinical diagnosis, and disease management. Furthermore, AI has been promoted as a tool that could expand the reach of quality healthcare to traditionally underserved patients and regions. But even with appropriate representation of marginalized communities with high-quality data, the social patterning of the data generation process can still produce AI that preserves and even scales existing disparities in care, with resulting inequities in patient outcomes. When AI developers who are not cognizant of the backstory of the data create algorithms from the digital exhaust of flawed human systems, they risk cementing inequities as permanent fixtures in healthcare delivery systems. This course will introduce students to a portfolio of methodologies that learn patterns from data. More importantly, it will explore data issues that, if not addressed, will have profound consequences for downstream prediction, classification, and optimization tasks.

Instructors
Marzyeh Ghassemi 
Leo Anthony Celi
Adam Rodman
Ned McCague

Teaching Assistants
Alessandro Hammond
Omar Dahleh

Course Email: hst953faculty@mit.edu

Credits: 12 MIT credits; 5 Harvard credits

Fridays, 9:30 AM - 12:00 PM
E25-117


Useful Links

Syllabus

OH Zoom link


Overview

HST.953/6.8850 is a course about the practical considerations for operationalizing machine learning in healthcare settings. 

The course will involve three homework assignments (one each on dataset creation, machine learning/visualization and implementation) followed by a course project proposal and presentation.

  • All students are required to complete human subjects training and submit proof of access for MIMIC-III and the eICU-CRD databases.
  • All students regardless of their enrollment status are expected to join a project group and contribute to a final project.

HST.953/6.8850 is not intended to teach graduate machine learning or visualization skills to students, and we expect that students will have some working knowledge of both in order to complete homework assignments and the project. 

We recommend the following courses, or some equivalent experience with subject matter in ML, visualization and HCI:

Recommended Courses:
CS Grad ML 6.867
CS Grad ML in Health 6.S897 / HST.956
CS Grad Visualization 6.813

We will start with a primer on machine learning concepts including but not limited to cross-validation, data leakage, benchmarks, performance metrics, and fairness evaluation. Publicly available high-resolution datasets (not registries) will be leveraged.

Other semester activities:

1. The Bias-athon is designed to address and mitigate biases in artificial intelligence (AI) systems. This workshop will leverage interdisciplinarity to identify, understand, and develop strategies to mitigate biases in clinical AI datasets. Participants will engage in hands-on sessions where they explore various types of bias, such as measurement bias and variation in the degree of monitoring arising from social determinants of care, and their impact on AI performance.

2. A prompt-athon and red-teaming workshop will focus on enhancing the effectiveness and reducing the bias of large language models. This workshop is designed for clinicians who already use, or are thinking of using, these tools for summarizing a patient's course, drafting content for progress notes and letters to other providers and to patients, and soliciting differential diagnoses, treatment recommendations, and prognostication. Participants will be introduced to various prompt engineering techniques that can leverage the power of this technology. Through collaborative exercises, attendees will experiment with different types of prompts, analyze the outputs, and refine their strategies to achieve better results. The event will also include discussions on the challenges of prompt design, such as avoiding ambiguity and ensuring context-appropriateness.

3. The Health AI Systems Thinking for Equity (HASTE) Policy Workshop is organized to explore the regulatory and ethical frameworks surrounding the use of AI technologies. Sessions will cover a range of topics, including transparency and accountability, power structures and the political economy that drives the impact of AI. Participants will engage in brainstorming and dialogue and propose solutions to complex policy issues. The goal is to engender a systems thinking mindset among developers and users of AI to improve population health.


Grading

Weekly Reflections: Weekly reflections for Week 2 through Week 10 are submitted as a Canvas discussion, are due before class, and are worth 1 point (1.67% of your grade) per week. This means that reflections are worth a total of 15% of your grade.

Three Problem Sets: Problem sets 1, 2, and 3 are each worth 10 points, or 16.67% of your grade. This means problem sets are worth a total of 50% of your grade. 

Course Final Project: The submission of the project teams is worth 1 point (1.67% of your grade), the final project presentation is worth 10 points (16.67% of your grade), and the final project write-up is worth 10 points (16.67% of your grade). This means the final project is worth a total of 35% of your grade.
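
The point values above imply a 60-point course total, which is where the 1.67% and 16.67% figures come from (1/60 and 10/60). The following is a minimal sketch of that arithmetic, assuming nine reflections (Week 2 through Week 10) at 1 point each; the dictionary keys and function names are illustrative only and are not part of any official course tool.

```python
# Point values as stated in the Grading section.
# The component names below are hypothetical labels chosen for this sketch.
POINTS = {
    "weekly_reflections": 9 * 1,    # Weeks 2-10, 1 point per week
    "problem_sets": 3 * 10,         # Problem sets 1-3, 10 points each
    "project_team_submission": 1,
    "project_presentation": 10,
    "project_write_up": 10,
}

TOTAL_POINTS = sum(POINTS.values())


def grade_weights(points=POINTS):
    """Return each component's share of the final grade, in percent."""
    total = sum(points.values())
    return {name: 100 * value / total for name, value in points.items()}


def final_grade(earned, points=POINTS):
    """Convert points earned per component into a percentage of the total.

    Components missing from `earned` count as zero points.
    """
    total = sum(points.values())
    return 100 * sum(earned.get(name, 0) for name in points) / total


if __name__ == "__main__":
    for name, weight in grade_weights().items():
        print(f"{name}: {weight:.2f}%")
```

Working in points rather than percentages avoids the rounding in the published figures: nine reflections at a rounded 1.67% each would add up to 15.03%, while 9/60 is exactly 15%.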

Plagiarism: Student code submissions may be submitted by the instructors to a plagiarism detection tool for a review of similarity and detection of possible plagiarism. Submissions will be used solely for the purpose of detecting similarity, and are not retained indefinitely on the server; typically results are deleted after 14 days but may be removed sooner. For more information on the tool used, refer to https://theory.stanford.edu/~aiken/moss/.


Schedule

Week Date Lecture Materials Assignments
1/DATA Sept 6, 2024
  • 9:30 - 10:10 Dr. Marzyeh Ghassemi "Ethical Machine Learning in Health"

  • 10:20 - 11:20 Course Staff "Course Overview"
                             

    --- BREAK ---


  • 11:30 - 12:30 Catherine Bielick "The problems in healthcare"
  • [Slides]

Readings:

 

2/DATA Sept 13, 2024
  • 9:40 - 10:40 Mohammad Mamdani on Challenges of implementation

    --- BREAK ---

  • 10:50 - 11:50 Adam Rodman on Project Implementation

    --- BREAK ---

  • 12:00 - 12:30 Jack Gallifant on Ethical Considerations in data collection and use
  • [Slides]

Readings:

Extra Material:

 

3/DATA Sept 20, 2024

Lecture by Adam Rodman and Jack Gallifant

[Slides]

Readings:

4/ML.VIS Sept 27, 2024
  • 9:30 - 10:40 Adam Rodman - "Representation learning in ML and Health"

    --- BREAK ---

  • 10:50 - 11:50 Takashi and Rodrigo on TRIPOD-AI exercise

Readings:

5/ML.VIS Oct 4, 2024
  • 9:30 - 10:40 Adam Rodman on Model evaluation: health system

    --- BREAK ---

  • 10:50 - 12:00 Tom Pollard on Data Sharing

Lecture Slides

Readings:

6/ML.VIS Oct 11, 2024
  • 9:30 - 10:40 Matthew McDermott on Model evaluation: ML perspective

    --- BREAK ---

  • 10:50 - 11:50 Leo Anthony Celi on HASTE policy camp

Lecture Slides

Readings:

7/IMP Oct 18, 2024
  • 9:40 - 10:40 Jim Smit on Causal machine learning

    --- BREAK ---

  • 10:50 - 12:00 Shalmali Joshi on Causal machine learning

Lecture Slides

Readings:

8/IMP Oct 25, 2024
  • 9:30 - 10:40 Leo Celi

    --- BREAK ---
  • 10:50 - 11:10 Liam McCoy

    --- BREAK ---

  • 11:10 - 11:50 Midpoint Presentation

Lecture Slides

The midpoint presentation is scheduled for Friday, October 25, 2024.

 

9/IMP Nov 1, 2024
  • 9:30 - 10:40 Amol Verma and Gabe Brat

    --- BREAK ---

  • 10:50 - 11:50 Team Work on Projects

Lecture Slides

Readings:

10/IMP Nov 8, 2024
  • 9:30 - 10:40 Heather Mattie
    --- BREAK ---

  • 10:50 - 11:50 Ned McCague

Lecture Slides #1

Lecture Slides #2

Readings: 

What Artificial Intelligence Means for Health Care

Revolutionizing healthcare: the role of artificial intelligence in clinical practice

AI is Already Reshaping Care: Here's What it Means for Doctors

  • Reflection 9 Due
11 Nov 15, 2024
  • 9:30 - 10:40 Charlotta Lindvall
    --- BREAK ---

  • 10:50 - 11:50 Thomas Souneck

 

12 Nov 22, 2024
  • Leo Anthony Celi on HASTE Policy Camp

  • Leo Anthony Celi on HASTE Policy Camp

Readings: 

Explainable artificial intelligence in breast cancer detection and risk prediction: A systematic scoping review

The ethical issues of the application of artificial intelligence in healthcare: a systematic scoping review

Artificial Intelligence in Health Care: A Report From the National Academy of Medicine

  • Reflection 10 Due

 

13 Nov 29, 2024

THANKSGIVING WEEK, Project Work Week

 

14 Dec 6, 2024
  • Final Presentations

Final Projects to be Presented in Class

Slides should be sent by 9 am on Dec 6th.
Each team (19 total) will present for 8 minutes in class.
All team members are expected to be present; a subset may present.
If a specific block of class time is required, let the instructors know.

 


Project Details

Projects and Authorship

A note on collaboration: Research is a collaborative activity and we encourage all students to collaborate and learn from each other. In general, when you put your name on something for research, you must: a) have materially contributed to the work, b) be able to defend the research, and c) acknowledge the contribution of others. Keep this in mind when working together and submitting material for evaluation.

A note on authorship: As noted, the expectation is that by the end of the course the final project will be sufficiently developed to submit to a peer-reviewed journal. The author order can be a somewhat controversial issue and is left to the project participants to decide. We would strongly encourage you to discuss what the order will be, or what philosophy you will use to decide the order while forming groups. In the case of a dispute during or after the course, the instructors will likely not be able to mediate in any meaningful way. We would also recommend equal authorship (now more common), but the decision is left to each team.

For the clinicians: If you expect a certain level of authorship (first, last, etc.) you should mention this in your project pitch. Keep in mind that this is a two-way street involving both clinicians and data scientists. If a project fails to garner enough interest, it may not be able to be completed as part of the course.

A note on acknowledgement: Papers that result from work done during this course should recognize the contributions of the course in an acknowledgement or in other sections. The suggested language is: "This manuscript was composed by participants in the HST.953 course at the Massachusetts Institute of Technology, Fall 2022."

 

Computing Credits