dc.description.abstract | This study proposes a machine learning method to predict whether a patient has heart disease. The model outputs a binary result: the patient either has heart disease or does not. I built a logistic regression model using a dataset of 303 patient records, each with 13 clinical variables (columns).
For the methodology, I preprocessed the data, performed exploratory data analysis, and built the model with scikit-learn. Using stratified sampling to preserve the target variable's distribution, I split the data 80/20: 80% for training and 20% for testing.
The logistic regression model reached an accuracy of 85.12% on the training data and 81.97% on the test data; the small gap between the two suggests that it generalizes reasonably well to new data. I also implemented a predictive system that classifies an individual patient's data in real time.
My model has some limitations: only a single algorithm was used, and the dataset is very small, containing only 303 patient records with 13 columns. Future studies should investigate more complex algorithms and a dataset with more patient records.
This study shows that, despite the small dataset and the relatively simple algorithm, this approach can benefit the medical sector, especially in the field of heart disease prediction. | en_US |
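
The figures reported in the abstract can be checked with a short arithmetic sketch. It assumes that the 20% test portion of the 303 records is rounded up to 61 records, leaving 242 for training; the abstract does not state these counts, so they and the inferred numbers of correct predictions are assumptions, not reported results. The helper `closest_correct_count` is a hypothetical name introduced only for this illustration.

```python
import math

# Figures reported in the abstract.
n_records = 303
test_fraction = 0.20
reported_train_acc = 85.12  # percent
reported_test_acc = 81.97   # percent

# Assumption: the test set size is rounded up and the rest goes to training.
n_test = math.ceil(n_records * test_fraction)
n_train = n_records - n_test


def closest_correct_count(n_samples, accuracy_percent):
    """Return the whole number of correct predictions nearest to the reported accuracy."""
    return round(n_samples * accuracy_percent / 100)


# Inferred counts (not stated in the abstract).
test_correct = closest_correct_count(n_test, reported_test_acc)
train_correct = closest_correct_count(n_train, reported_train_acc)

print(f"test:  {test_correct}/{n_test} = {100 * test_correct / n_test:.2f}%")
print(f"train: {train_correct}/{n_train} = {100 * train_correct / n_train:.2f}%")
```

Under these assumptions, the reported percentages correspond to 50 of 61 test records and 206 of 242 training records classified correctly. With a test set this small, a single additional misclassification shifts the test accuracy by about 1.6 percentage points, which is worth keeping in mind when reading the reported gap between training and test accuracy.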