| Abstract | Cardiovascular disease (CVD) is a leading global health concern, which makes effective predictive tools a priority. This study compared four machine learning models (Logistic Regression, Naive Bayes, K-Nearest Neighbors (KNN), and AdaBoost) for predicting cardiovascular risk using data from the Azar cohort study, a large dataset of 15,001 individuals. After preprocessing, including handling of missing values and feature scaling, the models were trained and tested, with hyperparameters tuned by grid search. Performance was assessed using accuracy, precision, recall, and F1-score. KNN achieved the best balanced performance, with an accuracy of 0.94 and an F1-score of 0.91. AdaBoost matched that accuracy (0.94) with a slightly lower F1-score of 0.89. Logistic Regression performed well, while Naive Bayes showed high precision but lower recall. The results suggest that KNN is the most effective of the evaluated models for CVD risk prediction in the Azar cohort, with AdaBoost a close second. These findings highlight the potential of machine learning, and of KNN and AdaBoost in particular, for improving CVD risk stratification. |
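
The abstract reports accuracy, precision, recall, and F1-score, and notes that a model can combine high precision with lower recall. The sketch below shows how these four metrics follow from confusion-matrix counts for a binary outcome. It is a minimal illustration, not the study's evaluation code: the names `ConfusionCounts`, `count_outcomes`, and `classification_metrics` are hypothetical, and the labels are made-up toy values, not data from the Azar cohort.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts for a binary classifier, with 1 as the positive (at-risk) class."""
    tp: int  # predicted 1, truly 1
    fp: int  # predicted 1, truly 0
    fn: int  # predicted 0, truly 1
    tn: int  # predicted 0, truly 0


def count_outcomes(y_true, y_pred):
    """Tally the four confusion-matrix cells from paired 0/1 labels."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 1 and pred == 0:
            fn += 1
        else:
            tn += 1
    return ConfusionCounts(tp, fp, fn, tn)


def classification_metrics(c):
    """Return accuracy, precision, recall, and F1; empty denominators give 0.0."""
    total = c.tp + c.fp + c.fn + c.tn
    accuracy = (c.tp + c.tn) / total if total else 0.0
    precision = c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0
    recall = c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels (illustrative only): a cautious classifier that never raises a
# false alarm but misses half of the true positive cases.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

metrics = classification_metrics(count_outcomes(y_true, y_pred))
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

In this toy case precision is perfect while recall is only one half, so the F1-score falls well below the accuracy. This is why F1 is a more informative summary than accuracy alone when missed positive cases matter, as they do in risk screening.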