A Project Report on Heart Disease Prediction System
Abstract
This study proposes a machine learning method to predict whether a patient has heart disease. I built a logistic regression model on a dataset of 303 patient records, each described by 13 clinical variables (columns). For each patient, the model outputs a binary result: heart disease or no heart disease.
The methodology consisted of data preprocessing, exploratory data analysis, and model building with scikit-learn. I split the data 80/20, with 80% used for training and 20% for testing, and used stratified sampling to preserve the distribution of the target variable in both parts.
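The idea behind the stratified 80/20 split can be sketched in plain Python as follows. This is a minimal illustration, not the code used in the project: the helper `stratified_split` and the class balance of the labels are assumptions made up for the example. In scikit-learn, the same effect is normally obtained by passing `stratify=y` to `train_test_split`.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split record indices into train/test so each class keeps
    roughly the same share in both parts (hypothetical helper)."""
    rng = random.Random(seed)

    # Group record indices by their class label.
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Take the same fraction of every class for the test set.
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Illustrative labels only: 303 records with an assumed class balance.
labels = [1] * 165 + [0] * 138
train_idx, test_idx = stratified_split(labels)


def positive_share(indices):
    """Fraction of records in `indices` labelled 1."""
    return sum(labels[i] for i in indices) / len(indices)


print(len(train_idx), len(test_idx))
print(positive_share(train_idx), positive_share(test_idx))
```

Because each class is sampled separately, the share of positive cases in the test set stays close to the share in the full dataset, which a purely random split does not guarantee on a dataset this small.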
The logistic regression model reached an accuracy of 85.12% on the training data and 81.97% on the test data. The small gap between the two figures suggests that the model generalizes reasonably well to new data. I also implemented a predictive system that classifies an individual patient's data in real time.
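As a rough sanity check on these figures, the sketch below shows how accuracy relates to the number of correctly classified records. The counts are inferred rather than reported: the sketch assumes the 20% test share is rounded up to 61 of the 303 records (leaving 242 for training), and that 50 test records and 206 training records were classified correctly, since those are the counts consistent with the stated percentages.

```python
import math

N_RECORDS = 303

# Assumption: the 20% test share is rounded up to a whole record.
n_test = math.ceil(0.2 * N_RECORDS)
n_train = N_RECORDS - n_test


def accuracy(n_correct, n_samples):
    """Accuracy = correctly classified records / all records."""
    return n_correct / n_samples


# Inferred counts (not reported in the abstract).
test_accuracy = accuracy(50, n_test)
train_accuracy = accuracy(206, n_train)

print(f"test accuracy:  {test_accuracy:.2%}")
print(f"train accuracy: {train_accuracy:.2%}")
print(f"gap: {(train_accuracy - test_accuracy) * 100:.2f} percentage points")
```

With only 61 test records, a single misclassified patient changes the test accuracy by about 1.6 percentage points, so the reported figure should be read with that granularity in mind.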
The model has some limitations: only a single algorithm was used, and the dataset is very small, containing only 303 patient records with 13 columns. Future studies should explore more complex algorithms and datasets with more patient records.
This study shows that, despite the small dataset and the relatively simple algorithm, this approach can benefit the medical sector, especially in the field of heart disease prediction.