I am working on a deep learning project to forecast Sudden Cardiac Death (SCD) using ECG data from PhysioNet. Specifically, I need to download and preprocess the following databases:
- MIT-BIH Normal Sinus Rhythm (NSR) Database (PhysioNet slug: nsrdb)
- Sudden Cardiac Death Holter (SCDH) Database (PhysioNet slug: sddb)
Goals:
- Download the full datasets programmatically in Python.
- Extract raw ECG signals and annotations from .dat, .hea, and .atr files.
- Preprocess the ECG data (denoising, R-peak detection, feature extraction).
- Train a deep learning model for SCD prediction.
Attempted Approach: I tried the wfdb package (and also looked at physionet-datasets), but ran into issues as soon as I moved from single records to complete databases. Here's my current single-record code:
!pip install wfdb

import wfdb

# Read one record and its annotations straight from PhysioNet.
# Note: with pn_dir set, the record name must NOT repeat the database
# prefix ('16265', not 'nsrdb/16265'), otherwise wfdb requests a
# non-existent remote path.
record = wfdb.rdrecord('16265', pn_dir='nsrdb')
annotation = wfdb.rdann('16265', 'atr', pn_dir='nsrdb')

print(record.__dict__)
print(annotation.__dict__)
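Based on my reading of the wfdb docs, get_record_list and dl_database should cover the bulk download, but I'm not sure this is the intended or most efficient route (hence question 2 below). The data/ directory names are just my arbitrary choices, and sddb is the slug I found for the SCDH database:

import wfdb

# List every record name in each database (wfdb reads the RECORDS file).
nsr_records = wfdb.get_record_list('nsrdb')
scd_records = wfdb.get_record_list('sddb')  # sddb = Sudden Cardiac Death Holter Database
print(len(nsr_records), 'NSR records;', len(scd_records), 'SCDH records')

# Download all signal, header, and annotation files for both databases.
wfdb.dl_database('nsrdb', dl_dir='data/nsrdb')
wfdb.dl_database('sddb', dl_dir='data/sddb')

The PhysioNet project pages also show a wget -r -N -c -np mirror command for grabbing a whole database, which may be faster for the multi-gigabyte Holter records, but I'd prefer to stay in Python if possible.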
Issues & Questions:
- How do I download ALL records from both databases in one go, instead of manually specifying each file? (My wfdb.dl_database attempt is shown above.)
- Are there more efficient ways to bulk-download PhysioNet datasets?
- How can I extract the full ECG signals together with their annotations for preprocessing? (A rough loader sketch is below.)
- Are there best practices for handling large ECG datasets in deep learning pipelines? (My windowing/storage idea is sketched at the end.)
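For question 3, this is the rough loader I'm working with once the files are local. I'm assuming p_signal, fs, sample, and symbol are the right attributes to read, and that every usable record has a .hea file next to its .dat:

import os
import wfdb

def load_database(db_dir, ann_ext='atr'):
    """Yield (name, signal, fs, beat_samples, beat_symbols) for every
    record found in a locally downloaded PhysioNet database folder."""
    names = [f[:-4] for f in sorted(os.listdir(db_dir)) if f.endswith('.hea')]
    for name in names:
        path = os.path.join(db_dir, name)
        record = wfdb.rdrecord(path)      # physical units -> record.p_signal
        try:
            ann = wfdb.rdann(path, ann_ext)
        except FileNotFoundError:
            continue                      # skip records without this annotator
        yield name, record.p_signal, record.fs, ann.sample, ann.symbol

for name, sig, fs, beats, symbols in load_database('data/nsrdb'):
    print(name, sig.shape, fs, len(beats))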
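For the preprocessing step, my current sketch is a zero-phase band-pass filter plus wfdb's built-in XQRS detector. The 0.5-40 Hz cutoffs are values I've commonly seen for ECG denoising, not ones I've validated for SCD work, and I truncate the record only to keep the demo fast:

import numpy as np
import wfdb
from scipy.signal import butter, filtfilt
from wfdb import processing

def bandpass(sig, fs, low=0.5, high=40.0, order=3):
    """Zero-phase Butterworth band-pass; cutoffs are common ECG choices."""
    b, a = butter(order, [low / (fs / 2), high / (fs / 2)], btype='band')
    return filtfilt(b, a, sig)

# One NSR record as a worked example (first channel, first 10 minutes).
record = wfdb.rdrecord('data/nsrdb/16265', sampto=128 * 60 * 10)
clean = bandpass(record.p_signal[:, 0], record.fs)

# R-peak detection with wfdb's XQRS detector.
rpeaks = processing.xqrs_detect(sig=clean, fs=record.fs)

# RR intervals in seconds -- the basic ingredient for HRV-type features.
rr = np.diff(rpeaks) / record.fs
print(len(rpeaks), 'R-peaks; mean RR:', rr.mean(), 's')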
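And for question 4, my current idea (continuing from clean and record in the previous sketch) is to window each record once, save the windows as float32 .npy files, and memory-map them during training so a full Holter recording never has to sit in RAM. The 30 s window length is an arbitrary placeholder:

import os
import numpy as np

def window_signal(sig, fs, win_sec=30, step_sec=30):
    """Split a 1-D signal into fixed-length, non-overlapping windows."""
    win, step = int(win_sec * fs), int(step_sec * fs)
    n = max((len(sig) - win) // step + 1, 0)
    if n == 0:
        return np.empty((0, win), dtype=sig.dtype)  # signal shorter than one window
    return np.stack([sig[i * step:i * step + win] for i in range(n)])

os.makedirs('data/windows', exist_ok=True)

# Persist once, then memory-map at training time.
windows = window_signal(clean, record.fs).astype(np.float32)
np.save('data/windows/16265.npy', windows)
loaded = np.load('data/windows/16265.npy', mmap_mode='r')
print(loaded.shape)

Is this a reasonable pattern, or is there a more standard pipeline (HDF5, TFRecords, etc.) for datasets of this size?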