Results
- 0.932 Macro-AUC (SOTA ensemble)
- 0.844 AUROC on external dataset
- 21K+ ECG records trained on
The Problem
Cardiovascular diseases are the leading cause of death globally. Early diagnosis via ECG is critical but requires specialized expertise. I built an end-to-end deep learning pipeline to classify ECG signals into 5 diagnostic superclasses: Normal, Myocardial Infarction, ST/T Change, Conduction Disturbance, and Hypertrophy.
What I Built
- SE-ResNet architecture with Squeeze-and-Excitation blocks for lead-wise attention across all 12 ECG leads
- Custom data pipeline with bandpass filtering, Z-score normalization, and stratified splitting on PTB-XL (21,799 records)
- Mixup augmentation to generate synthetic training samples and Focal Loss to address heavy class imbalance
- Ensemble of baseline and Focal Loss model variants for state-of-the-art multi-label performance (0.932 Macro-AUC on PTB-XL)
- External validation on Chapman-Shaoxing dataset (10,646 records) to test generalization
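To make the Squeeze-and-Excitation bullet above concrete, here is a minimal NumPy sketch of a single SE block's forward pass on 1D feature maps. It is an illustration under assumptions, not the project's actual layer: the function name `se_block`, the reduction ratio of 4, the random weights, and the choice of 12 channels (picked to mirror the 12 leads) are all hypothetical. In a full SE-ResNet the block would typically sit inside each residual unit and act on learned feature channels.

```python
import numpy as np


def se_block(x, w1, b1, w2, b2):
    """Forward pass of a Squeeze-and-Excitation block on 1D feature maps.

    x:  (batch, channels, time) feature maps
    w1: (channels, channels // r) bottleneck weights, r = reduction ratio
    w2: (channels // r, channels) expansion weights
    """
    # Squeeze: global average pooling over the time axis -> (batch, channels)
    squeezed = x.mean(axis=2)
    # Excitation: bottleneck layer with ReLU, then sigmoid gates in (0, 1)
    hidden = np.maximum(squeezed @ w1 + b1, 0.0)
    gates = 1.0 / (1.0 + np.exp(-(hidden @ w2 + b2)))
    # Scale: reweight every channel by its gate, broadcast over time
    return x * gates[:, :, None], gates


# Illustrative sizes only (assumed, not taken from the project)
rng = np.random.default_rng(0)
batch, channels, time, reduction = 2, 12, 1000, 4
x = rng.standard_normal((batch, channels, time))
w1 = rng.standard_normal((channels, channels // reduction)) * 0.1
b1 = np.zeros(channels // reduction)
w2 = rng.standard_normal((channels // reduction, channels)) * 0.1
b2 = np.zeros(channels)

out, gates = se_block(x, w1, b1, w2, b2)
print(out.shape, gates.shape)  # (2, 12, 1000) (2, 12)
```

Each gate is a single scalar per channel per record, so the block adds very few parameters while letting the network emphasize or suppress whole channels.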
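The Mixup and Focal Loss bullet above can be sketched in a few lines of NumPy. This is a hedged illustration rather than the project's training code: the helper names `mixup` and `focal_loss`, the values `alpha=0.4` and `gamma=2.0`, and the random stand-in for model outputs are assumptions. The focal loss is written in a form that accepts the soft targets Mixup produces and omits any per-class alpha weighting.

```python
import numpy as np


def mixup(x, y, alpha, rng):
    """Blend every sample with a randomly chosen partner from the same batch."""
    lam = rng.beta(alpha, alpha)        # mixing coefficient in [0, 1]
    perm = rng.permutation(len(x))      # partner index for each sample
    x_mix = lam * x + (1.0 - lam) * x[perm]
    y_mix = lam * y + (1.0 - lam) * y[perm]   # soft multi-label targets
    return x_mix, y_mix


def focal_loss(probs, targets, gamma=2.0, eps=1e-7):
    """Binary focal loss averaged over samples and labels.

    With gamma = 0 this reduces to ordinary binary cross-entropy; larger
    gamma down-weights labels the model already predicts confidently.
    """
    probs = np.clip(probs, eps, 1.0 - eps)
    pos = targets * (1.0 - probs) ** gamma * np.log(probs)
    neg = (1.0 - targets) * probs ** gamma * np.log(1.0 - probs)
    return float(-(pos + neg).mean())


rng = np.random.default_rng(0)
x = rng.standard_normal((8, 12, 1000))               # 8 twelve-lead signals
y = rng.integers(0, 2, size=(8, 5)).astype(float)    # 5 superclass labels
x_mix, y_mix = mixup(x, y, alpha=0.4, rng=rng)

probs = rng.uniform(0.05, 0.95, size=(8, 5))         # stand-in for model outputs
loss_focal = focal_loss(probs, y_mix, gamma=2.0)
loss_bce = focal_loss(probs, y_mix, gamma=0.0)
print(loss_focal, loss_bce)
```

Because the modulating factors never exceed 1, the focal loss is never larger than the cross-entropy on the same predictions; the difference comes from easy labels contributing less.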
Why It Matters
This project sits at the intersection of AI and clinical practice. The ensemble achieved 0.932 Macro-AUC on the PTB-XL test set and 0.844 AUROC on the external Chapman-Shaoxing dataset. It demonstrates how deliberate architecture choices and training techniques can narrow the gap between research models and clinical-grade tools.
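For readers unfamiliar with the metric, the sketch below shows one common way to ensemble two models (averaging their predicted probabilities) and to compute Macro-AUC as the unweighted mean of per-class AUROC. The averaging rule, the function names `binary_auc` and `macro_auc`, and the toy data are assumptions for illustration; the project's actual ensembling scheme and scores are not reproduced here.

```python
import numpy as np


def binary_auc(y_true, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a pair.
    """
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    diff = pos[:, None] - neg[None, :]
    correct = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(correct / (len(pos) * len(neg)))


def macro_auc(y_true, scores):
    """Unweighted mean of per-class AUROC over the label columns."""
    per_class = [
        binary_auc(y_true[:, k], scores[:, k]) for k in range(y_true.shape[1])
    ]
    return float(np.mean(per_class))


# Toy data: 6 records, 2 labels (the project itself uses 5 superclasses)
y_true = np.array([[1, 0], [0, 1], [1, 1], [0, 0], [1, 0], [0, 1]])
probs_baseline = np.array(
    [[0.9, 0.2], [0.3, 0.7], [0.6, 0.4], [0.4, 0.1], [0.7, 0.5], [0.2, 0.8]]
)
probs_focal = np.array(
    [[0.8, 0.1], [0.5, 0.9], [0.7, 0.6], [0.2, 0.3], [0.4, 0.2], [0.1, 0.6]]
)

# Ensemble by averaging the two models' probabilities
probs_ensemble = (probs_baseline + probs_focal) / 2.0

for name, probs in [
    ("baseline", probs_baseline),
    ("focal", probs_focal),
    ("ensemble", probs_ensemble),
]:
    print(name, round(macro_auc(y_true, probs), 4))
```

On this hand-made example each model misranks one pair on a different label, so averaging corrects both mistakes. That outcome is a property of the toy data, not a general guarantee.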