Final project repo for James, Nate, Cooper and Alan
Project overview:
- Apply an ML algorithm to solve an interesting problem; if possible, tackle the problem with more than one approach.
- We will apply three classification algorithms to patients' medical information and determine whether they have a lower (0) or higher (1) chance of heart attack. This work is based on the dataset found on Kaggle: https://www.kaggle.com/datasets/rashikrahmanpritom/heart-attack-analysis-prediction-dataset?resource=download
- Our choices of classification algorithms are: 1) decision tree (binary tree) 2) random forest 3) logistic regression
- We will train all three models, validate them on a test set held out from the full dataset above, and evaluate the accuracy of each model based on its false-positive rate.
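As a rough illustration of the evaluation step, here is a minimal sketch of how the false-positive rate and accuracy could be computed from true and predicted labels. The function names are our own placeholders, the labels in the example are made up (they are not from the Kaggle dataset), and the actual project code may well use a library routine instead.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn), assuming every label is 0 (lower) or 1 (higher)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == 1 and truth == 1:
            tp += 1
        elif pred == 1 and truth == 0:
            fp += 1
        elif pred == 0 and truth == 0:
            tn += 1
        else:
            # Remaining case under the 0/1 assumption: pred == 0, truth == 1.
            fn += 1
    return tp, fp, tn, fn


def false_positive_rate(y_true, y_pred):
    """FP / (FP + TN): share of true 0s that were predicted as 1."""
    _, fp, tn, _ = confusion_counts(y_true, y_pred)
    negatives = fp + tn
    return fp / negatives if negatives else 0.0


def accuracy(y_true, y_pred):
    """(TP + TN) / total: share of all predictions that were correct."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    total = tp + fp + tn + fn
    return (tp + tn) / total if total else 0.0


# Made-up labels for illustration only (not taken from the dataset).
example_true = [0, 0, 1, 1, 0, 1]
example_pred = [0, 1, 1, 0, 0, 1]
example_fpr = false_positive_rate(example_true, example_pred)
example_acc = accuracy(example_true, example_pred)
```

The same two functions could be called once per model on the held-out test set, so that all three classifiers are compared on identical data.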
Writeup requirements:
- Max of 4 pages and can have 1 page for bibliography
- Must be in LaTeX
Structure of report:
- Intro (background on why determining heart attacks is important, how other people do it)
- Methods: briefly describe the 3 algorithms we will use (probably 1/2 page max for this??)
- Results: describe the accuracy of each method and list the relevant hyperparameter values for repeatability (can be tabulated if needed); maybe show some plots (not sure what those would be yet)
- Conclusions/Impact: why ML-based predictors like this are very important
Bibliography: this is super easy in LaTeX as long as we have sources to cite.
Link to Overleaf document: this is a view-only link (for when we submit the GitHub repo on Sakai): https://www.overleaf.com/read/hkhjbmzwtghs