Using Optimal Machine Learning Algorithms to Predict Heart Failure Patient Classification
DOI:
https://doi.org/10.63665/vfxanb21
Keywords:
Heart Failure Prediction, Machine Learning, XGBoost, SMOTE, Healthcare Analytics, Survival Prediction, Predictive Modeling
Abstract
Heart failure remains one of the leading causes of mortality worldwide and poses a significant challenge to healthcare systems because it is difficult to diagnose and is often detected late. Early and accurate prediction of heart failure can support clinical decision-making, improve patient outcomes, and reduce mortality. This study presents a machine learning framework for heart failure prediction that combines the gradient boosting algorithm XGBoost with the Synthetic Minority Over-sampling Technique (SMOTE) to address the class imbalance commonly present in clinical datasets.
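To make the oversampling step concrete, the following is a minimal sketch of the core SMOTE idea: each synthetic minority sample is placed at a random point on the line segment between an existing minority sample and one of its nearest minority-class neighbours. It is not the study's implementation; the function name `smote`, its parameters, and the toy feature values are assumptions chosen for illustration.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples
    and their k nearest neighbours within the minority class."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy rows: (age, serum creatinine, ejection fraction).
minority = [
    [65.0, 1.9, 25.0],
    [70.0, 2.1, 30.0],
    [60.0, 1.8, 20.0],
    [75.0, 2.7, 35.0],
]
new_points = smote(minority, n_synthetic=6, k=2)
print(len(new_points), "synthetic minority samples")
```

Because every synthetic point is a convex combination of two real minority samples, it stays inside the range already spanned by that class. In practice, oversampling is normally applied to the training split only, so that synthetic points never leak into the evaluation data.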
The dataset used in this study consists of critical patient health indicators such as age, serum creatinine levels, ejection fraction, blood pressure, and diabetes status, all of which play a vital role in determining patient survival. To enhance model performance and reduce dimensionality, SelectKBest feature selection based on statistical significance was employed to identify the most relevant clinical attributes. A comparative analysis was conducted using multiple machine learning algorithms, including Logistic Regression, Decision Tree, Random Forest, K-Nearest Neighbors, Support Vector Machine, and XGBoost, to evaluate their effectiveness in predicting heart failure outcomes.
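The feature selection step can be sketched as follows. The abstract does not state which scoring function was used, so this example assumes the one-way ANOVA F-statistic, which is the default score for scikit-learn's `SelectKBest`. The helper names `anova_f` and `select_k_best` and the toy patient rows are hypothetical.

```python
from statistics import mean


def anova_f(values, labels):
    """One-way ANOVA F-statistic of a single feature across the classes."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    grand = mean(values)
    n, g = len(values), len(groups)
    ss_between = sum(len(vs) * (mean(vs) - grand) ** 2 for vs in groups.values())
    ss_within = sum((v - mean(vs)) ** 2 for vs in groups.values() for v in vs)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (g - 1)) / (ss_within / (n - g))


def select_k_best(rows, labels, names, k):
    """Return the k feature names with the highest F-score, plus all scores."""
    scores = {
        name: anova_f([row[j] for row in rows], labels)
        for j, name in enumerate(names)
    }
    ranked = sorted(names, key=lambda name: scores[name], reverse=True)
    return ranked[:k], scores


# Hypothetical toy data; label 1 marks a death event, 0 marks survival.
names = ["age", "serum_creatinine", "ejection_fraction", "diabetes"]
rows = [
    [60, 1.0, 45, 0],
    [55, 0.9, 50, 1],
    [62, 1.1, 40, 0],
    [58, 1.0, 55, 1],
    [70, 2.5, 25, 0],
    [65, 2.1, 20, 1],
    [72, 2.9, 30, 0],
    [68, 2.3, 25, 1],
]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

selected, scores = select_k_best(rows, labels, names, k=2)
print(selected)
```

A feature whose class means differ by much more than its within-class spread receives a high score; a feature distributed identically in both classes, like `diabetes` in this toy data, scores zero and is dropped.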
Experimental results demonstrate that the XGBoost model significantly outperforms other algorithms, achieving an exceptional prediction accuracy of 99.70%, along with superior precision, recall, F1-score, and Area Under the ROC Curve (AUC-ROC). The integration of SMOTE contributed to improved classification of minority cases, thereby reducing bias and enhancing the model’s reliability. Furthermore, the proposed system was successfully deployed as a real-time prediction tool using the Flask framework, providing an interactive and user-friendly interface for healthcare practitioners to input patient data and obtain instant survival predictions.
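The evaluation metrics named above can be computed directly from predictions, which also shows why accuracy alone can look strong on an imbalanced dataset. The sketch below is illustrative only: the function names and the toy labels and scores are assumptions, not results from the study.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def roc_auc(y_true, scores, positive=1):
    """AUC as the probability that a random positive outranks a random
    negative, counting ties as half a win."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical imbalanced toy set: 8 negatives, 2 positives.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_score = [0.1, 0.2, 0.1, 0.3, 0.2, 0.1, 0.4, 0.3, 0.35, 0.9]
y_pred = [1 if s >= 0.5 else 0 for s in y_score]

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, y_score)
print(metrics, auc)
```

In this toy case the classifier misses half of the positive cases, yet its accuracy remains high because negatives dominate. That gap is the reason recall, F1-score, and AUC-ROC are reported alongside accuracy.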
