Publications
59 Publications
Tech
Conference
Non-cardiovascular
Benchmarking ECG Delineation using Deep Neural Network-based Semantic Segmentation Models
Accurate electrocardiogram (ECG) delineation is essential for automated cardiac diagnosis, enabling the precise identification of key waveforms such as the P wave, QRS complex, and T wave. This study presents the first comprehensive benchmarking of neural network-based semantic segmentation models for ECG delineation, evaluating their accuracy, resource efficiency, and robustness across both public and private datasets. Our results demonstrate that convolutional neural network (CNN)-based approaches consistently achieve superior accuracy compared to Transformer-based approaches. Additionally, we observed the presence of fragmented segments in the delineation results. To address this issue, we explored post-processing techniques to consolidate or eliminate fragmented segments using an optimal configuration, leading to performance improvements. Furthermore, by analyzing performance variations across different waveform labels, we provide critical insights into key considerations for ECG segmentation tasks. Notably, our findings also reveal that larger model sizes do not necessarily correlate with better performance. Based on our findings, we propose a set of practical guidelines for leveraging segmentation models in ECG delineation, offering valuable direction for future research and clinical applications.
Proceedings of the Conference on Health, Inference, and Learning
June 25, 2025
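The post-processing step mentioned in the abstract above (consolidating or eliminating fragmented segments) can be illustrated with a minimal sketch. The abstract does not specify the exact procedure, so everything here is an assumption: the per-sample integer label encoding (0 = background, 1 = P wave, 2 = QRS complex, 3 = T wave), the function names `runs` and `remove_fragments`, and the rule of absorbing any run shorter than `min_len` into the run that precedes it.

```python
from itertools import groupby


def runs(labels):
    """Run-length encode a per-sample label sequence as [label, length] pairs."""
    return [[lab, sum(1 for _ in grp)] for lab, grp in groupby(labels)]


def remove_fragments(labels, min_len):
    """Absorb runs shorter than min_len into the preceding run.

    Hypothetical rule for illustration only; a short run at the very start
    has no predecessor and is kept unchanged.
    """
    cleaned = []
    for lab, length in runs(labels):
        if length < min_len and cleaned:
            # Eliminate the fragment by relabeling it as the previous segment.
            cleaned[-1][1] += length
        elif cleaned and cleaned[-1][0] == lab:
            # Consolidate two same-label segments that are now adjacent.
            cleaned[-1][1] += length
        else:
            cleaned.append([lab, length])
    return [lab for lab, length in cleaned for _ in range(length)]


# Toy prediction: a P wave split by a one-sample gap, plus a stray QRS sample.
predicted = [0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0, 2, 0, 0, 0]
smoothed = remove_fragments(predicted, min_len=2)
```

With `min_len=2`, the one-sample gap inside the P wave is filled and the isolated QRS sample is dropped, while the sequence length is preserved. A real configuration would presumably use label-specific minimum durations expressed in milliseconds.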
Tech
Conference
Non-cardiovascular
Test-Time Calibration: A Framework for Personalized Test-Time Adaptation in Real-World Biosignals
Test-Time Adaptation (TTA) methods have been widely used to enhance model robustness by continuously updating pre-trained models with unlabeled target data. However, in real-world biosignal applications, where factors such as age, lifestyle, and comorbidities induce significant variability, traditional TTA often falls short in addressing personalization needs. To satisfy such needs, we introduce a novel Test-Time Calibration (TTC) framework that integrates continuous self-supervised adaptation on unlabeled samples with periodic supervised calibration using sporadically available ground-truth labels. Our approach leverages a model equipped with dual heads for supervised learning (SL) and self-supervised learning (SSL), and further incorporates a dual buffer along with a weighted batch sampling strategy to effectively manage and utilize both data types during the test phase. We evaluate our framework on two distinct datasets: the publicly available PulseDB, a benchmark for cuff-less blood pressure estimation, and a private ICU dataset collected from critically ill patients. Experimental results demonstrate that our approach improves blood pressure prediction accuracy and robustness, highlighting its suitability for dynamic, personalized biosignal applications.
Proceedings of the Conference on Health, Inference, and Learning
June 25, 2025
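The dual buffer with weighted batch sampling described in the abstract above might look roughly like the following sketch. It is not the authors' implementation: the class name `DualBuffer`, the fixed-capacity queues, and the interpretation of the weight as the per-draw probability of picking a labeled sample are all assumptions made for illustration.

```python
import random
from collections import deque


class DualBuffer:
    """Hypothetical dual buffer: one queue per data type, fixed capacity each."""

    def __init__(self, capacity, labeled_weight, seed=0):
        self.unlabeled = deque(maxlen=capacity)  # signals for the SSL head
        self.labeled = deque(maxlen=capacity)    # (signal, label) pairs for the SL head
        self.labeled_weight = labeled_weight     # assumed: probability of a labeled draw
        self.rng = random.Random(seed)

    def add(self, signal, label=None):
        """Route an incoming test-time sample to the matching queue."""
        if label is None:
            self.unlabeled.append(signal)
        else:
            self.labeled.append((signal, label))

    def sample(self, batch_size):
        """Draw a mixed batch with replacement; returns (labeled, unlabeled)."""
        labeled_batch, unlabeled_batch = [], []
        for _ in range(batch_size):
            pick_labeled = bool(self.labeled) and (
                not self.unlabeled or self.rng.random() < self.labeled_weight
            )
            if pick_labeled:
                labeled_batch.append(self.rng.choice(self.labeled))
            elif self.unlabeled:
                unlabeled_batch.append(self.rng.choice(self.unlabeled))
        return labeled_batch, unlabeled_batch


# Toy stream: many unlabeled windows, one sporadic ground-truth measurement.
buffer = DualBuffer(capacity=16, labeled_weight=0.3)
for step in range(5):
    buffer.add([0.1 * step, 0.2 * step])
buffer.add([0.5, 0.7], label=120.0)
labeled_batch, unlabeled_batch = buffer.sample(batch_size=8)
```

In a full pipeline, the labeled part of the batch would feed the supervised head and the unlabeled part the self-supervised head. Sampling with a weight lets the rare labeled samples appear in most update steps without discarding the unlabeled stream.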
Tech
Journal
Non-cardiovascular
Unveiling the secrets of neural network scaling for ECG classification
We present a new perspective on scaling neural networks for electrocardiograms (ECG). Although ResNet-based models are widely used in ECG classification, the potential benefits of network scaling remain unexplored. Our research investigates the impact of changes in network depth, the number of channels, and the size of the convolution kernels on performance. Contrary to computer vision practices, we found that shallower networks with more channels and smaller kernels lead to better performance for ECG classification. Based on these findings, we provide insights that can guide the efficient development of models in practice. Finally, we explore why scaling hyperparameters affects ECG and computer vision differently. Our findings suggest that the inherent periodicity of ECG signals plays a crucial role in this difference.
Tech
Conference
Non-cardiovascular
New Test-Time Scenario for Biosignal: Concept and Its Approach
Online Test-Time Adaptation (OTTA) enhances model robustness by updating pretrained models with unlabeled data during testing. In healthcare, OTTA is vital for real-time tasks like predicting blood pressure from biosignals, which demand continuous adaptation. We introduce a new test-time scenario with streams of unlabeled samples and occasional labeled samples. Our framework combines supervised and self-supervised learning, employing a dual-queue buffer and weighted batch sampling to balance data types. Experiments show improved accuracy and adaptability under real-world conditions.
Findings paper presented at Machine Learning for Health (ML4H)
November 26, 2024
Medical
Journal
Cardiovascular
Artificial Intelligence-Enhanced 6-Lead Portable Electrocardiogram Device for Detecting Left Ventricular Systolic Dysfunction: A Prospective Single-Centre Cohort Study
The real-world effectiveness of the artificial intelligence model based on electrocardiogram (AI-ECG) signals from portable devices for detection of left ventricular systolic dysfunction (LVSD) requires further exploration. In this prospective, single-centre study, we assessed the diagnostic performance of AI-ECG for detecting LVSD using a six-lead hand-held portable device (AliveCor KardiaMobile 6L). We retrained the AI-ECG model, previously validated with 12-lead ECG, to interpret the 6-lead ECG inputs. Patients aged 19 years or older underwent six-lead ECG recording during transthoracic echocardiography. The primary outcome was the area under the receiver operating characteristic curve (AUROC) for detecting LVSD, defined as an ejection fraction below 40%. Of the 1716 patients recruited prospectively, 1635 were included for the final analysis (mean age 60.6 years, 50% male), among whom 163 had LVSD on echocardiography. The AI-ECG model based on the six-lead portable device demonstrated an AUROC of 0.924 [95% confidence interval (CI) 0.903–0.944], with 83.4% sensitivity (95% CI 77.8–89.0%) and 88.7% specificity (95% CI 87.1–90.4%). Of the 1079 patients evaluated using the AI-ECG model based on the conventional 12-lead ECG, the AUROC was 0.962 (95% CI 0.947–0.977), with 90.1% sensitivity (95% CI 85.0–95.2%) and 91.1% specificity (95% CI 89.3–92.9%). The AI-ECG model constructed with the six-lead hand-held portable ECG device effectively identifies LVSD, demonstrating comparable accuracy to that of the conventional 12-lead ECG. This highlights the potential of hand-held portable ECG devices leveraged with AI as efficient tools for early LVSD screening.
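For readers less familiar with the metrics reported in the abstract above, the following is a minimal sketch of how AUROC, sensitivity, and specificity are computed from model scores. The scores, labels, and threshold are made-up toy values, not data from the study, and the function names are illustrative. AUROC is computed here through its pairwise-ranking interpretation: the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, with ties counted as one half.

```python
def auroc(scores, labels):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(scores, labels, threshold):
    """Sensitivity and specificity when scores >= threshold are called positive."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Toy example: label 1 = condition present on the reference test, 0 = absent.
toy_scores = [0.9, 0.8, 0.35, 0.6, 0.3, 0.2, 0.1]
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_auroc = auroc(toy_scores, toy_labels)
toy_sens, toy_spec = sensitivity_specificity(toy_scores, toy_labels, threshold=0.5)
```

AUROC does not depend on a threshold, whereas sensitivity and specificity do, which is why studies typically report the AUROC alongside sensitivity and specificity at a chosen operating point.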
Medical
Journal
Cardiovascular
Electrocardiographic-Driven artificial intelligence Model: A new approach to predicting One-Year mortality in heart failure with reduced ejection fraction patients
Despite the proliferation of heart failure (HF) mortality prediction models, their practical utility is limited. Addressing this, we utilized a significant dataset to develop and validate a deep learning artificial intelligence (AI) model for predicting one-year mortality in heart failure with reduced ejection fraction (HFrEF) patients. The study’s focus was to assess the effectiveness of an AI algorithm, trained on an extensive collection of ECG data, in predicting one-year mortality in HFrEF patients.
Medical
Journal
Cardiovascular
Artificial intelligence applied to electrocardiogram to rule out acute myocardial infarction: the ROMIAE multicentre study
Emerging evidence supports artificial intelligence–enhanced electrocardiogram (AI-ECG) for detecting acute myocardial infarction (AMI), but real-world validation is needed. The aim of this study was to evaluate the performance of AI-ECG in detecting AMI in the emergency department (ED). The Rule-Out acute Myocardial Infarction using Artificial intelligence Electrocardiogram analysis (ROMIAE) study is a prospective cohort study conducted in the Republic of Korea from March 2022 to October 2023, involving 18 university-level teaching hospitals. Adult patients presenting to the ED within 24 h of symptom onset concerning for AMI were assessed. Exposure included AI-ECG score, HEART score, GRACE 2.0 score, high-sensitivity troponin level, and Physician AMI score. The primary outcome was diagnosis of AMI during index admission, and the secondary outcome was 30 day major adverse cardiovascular event (MACE). The study population comprised 8493 adults, of whom 1586 (18.6%) were diagnosed with AMI. The area under the receiver operating characteristic curve for AI-ECG was 0.878 (95% CI, 0.868–0.888), comparable with the HEART score (0.877; 95% CI, 0.869–0.886) and superior to the GRACE 2.0 score, high-sensitivity troponin level, and Physician AMI score. For predicting 30 day MACE, AI-ECG (area under the receiver operating characteristic, 0.866; 95% CI, 0.856–0.877) performed comparably with the HEART score (0.858; 95% CI, 0.848–0.868). The integration of the AI-ECG improved risk stratification and AMI discrimination, with a net reclassification improvement of 19.6% (95% CI, 17.38–21.89) and a C-index of 0.926 (95% CI, 0.919–0.933), compared with the HEART score alone. In this multicentre prospective study, the AI-ECG demonstrated diagnostic accuracy and predictive power for AMI and 30 day MACE, which was similar to or better than that of traditional risk stratification methods and ED physicians.
Medical
Journal
Cardiovascular
AI-Enabled Smartwatch ECG: A Feasibility Study for Early Prediction and Prevention of Heart Failure Rehospitalization
This study explores the use of artificial intelligence-enabled electrocardiogram (AI-ECG) technology combined with smartwatch-based daily monitoring to predict heart failure (HF) rehospitalization by detecting early signs of heart failure exacerbation, such as left ventricular systolic dysfunction (LVSD), left ventricular diastolic dysfunction (LVDD), and myocardial infarction (MI). Traditional monitoring methods have limitations, and AI-ECG offers a scalable, noninvasive, cost-effective solution. The study will evaluate the impact of this technology on reducing rehospitalization rates in a prospective, multicenter trial involving 220 adult patients recently discharged after HF hospitalization. The primary endpoint is a reduction in HF rehospitalization rates within 90 days, with secondary endpoints including time to clinical intervention, unplanned hospitalizations, and improvements in mortality and quality of life. This approach represents a promising, patient-friendly solution for better HF management.
Tech
Journal
Cardiovascular
Transparent and robust Artificial intelligence-driven Electrocardiogram model for Left Ventricular Systolic Dysfunction
Heart failure (HF) is an escalating global health concern, worsened by an aging population and limitations in traditional diagnostic methods like electrocardiograms (ECG). The advent of deep learning has shown promise for utilizing 12-lead ECG models for the early detection of left ventricular systolic dysfunction (LVSD), a crucial HF indicator. This study validates the AiTiALVSD, an AI/machine learning-enabled Software as a Medical Device, for its effectiveness, transparency, and robustness in detecting LVSD. Conducted at Mediplex Sejong Hospital in the Republic of Korea, this retrospective single-center cohort study involved patients suspected of LVSD. The AiTiALVSD model, which is based on a deep learning algorithm, was assessed against echocardiography findings. To improve model transparency, the study utilized Testing with Concept Activation Vectors (TCAV) and included clustering analysis and robustness tests against ECG noise and lead reversals. The study involved 688 participants and found AiTiALVSD to have a high diagnostic performance, with an AUROC of 0.919. There was a significant correlation between AiTiALVSD scores and left ventricular ejection fraction values, confirming the model’s predictive accuracy. TCAV analysis showed the model’s alignment with medical knowledge, establishing its clinical plausibility. Despite its robustness to ECG artifacts, there was a noted decrease in specificity in the presence of ECG noise. AiTiALVSD’s high diagnostic accuracy, transparency, and resilience to common ECG discrepancies underscore its potential for early LVSD detection in clinical settings. This study highlights the importance of transparency and robustness in AI/ML-based diagnostics, setting a new benchmark in cardiac care.
Tech
Journal
Non-cardiovascular
Unveiling AI-ECG using Generative Counterfactual XAI Framework
The application of artificial intelligence (AI) to electrocardiograms (ECGs) has shown great promise in the screening and diagnosis of cardiovascular diseases, often matching or surpassing human expertise. However, the “black-box” nature of deep learning models poses significant challenges to their clinical adoption. While Explainable AI (XAI) techniques, such as Saliency Maps, have attempted to address these issues, they have not been able to provide clear, clinically relevant explanations. We developed the Generative Counterfactual ECG XAI (GCX) framework, which uses counterfactual scenarios to explain AI predictions, enhancing interpretability and aligning with medical knowledge. We designed a study to validate the GCX framework by applying it to eight AI-ECG models, including those focused on regression of six ECG features, potassium level regression, and atrial fibrillation (AF) classification. The PTB-XL and MIMIC-IV datasets were used for development and testing. GCX generated counterfactual (CF) ECGs to visualize how changes in the ECG relate to AI-ECG predictions. We visualized CF ECGs for qualitative comparisons, statistically compared ECG features, and validated these findings with conventional ECG knowledge. The GCX framework successfully generated interpretable ECGs aligned with clinical knowledge, particularly in the context of ECG feature regression, potassium level regression, and AF classification. For ECG feature regression, GCX demonstrated clear and consistent changes in features, reflecting the corresponding morphological alterations. CF ECGs for hyperkalemia showed a prolonged PR, discernible P wave, increased T wave amplitude, and widened QRS complex, whereas those for AF demonstrated the disappearance of the P wave and irregular rhythms. The GCX framework enhances the interpretability of AI-ECG models, offering clear, relevant explanations for AI predictions. This approach holds substantial potential for improving the trust and utility of AI in clinical practice, although further validation across diverse datasets is required.