
Statistical Learning Project II

Topessayz Expert · 8 April 2026

S412/S512/T650 Statistical Learning Project II

•  Important: This is not a group project. Please work independently.

•  Data: proj2data.csv, a retrospective sample of 462 males in a heart-disease high-risk region. The data set contains the following variables:

–    Response: HD, coronary heart disease (Yes) or not (No)

–    Predictor variables

∗ adiposity: the accumulation of excessive body fat;

∗ age: age in years;

∗ alcohol: current alcohol consumption;

∗ famhist: family history of heart disease (Yes, No);

∗ ldl: low density lipoprotein cholesterol;

∗ obesity: UN: underweight, HE: healthy weight, OV: overweight, OB: obese;

∗ sbp: systolic blood pressure;

∗ tobacco: cumulative tobacco (kg);

∗ typea: score of type-A personality;

•  Training data: the first 230 cases.

•  Testing data: the last 232 cases.

•  Apply all of the following models and find the one that has the lowest test data error rate and can be used to classify whether a patient has coronary heart disease or not. The final model should be chosen by comparing the test data error rates.

–    1. LDA

–    2. QDA

–    3. Logistic regression

–    4. Naive Bayes

–    5. KNN classifier

–    6. Tree-based methods

–    7. Support vector machine
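The positional split and the test error rate described above can be sketched in base R as follows. This is only an illustration under stated assumptions: the data frame `heart` is a toy stand-in with made-up values (the real file is not reproduced here), `error_rate` is a hypothetical helper name, and the majority-class "model" is a placeholder baseline, not one of the seven required methods.

```r
# Toy stand-in for proj2data.csv (hypothetical values, for illustration only).
# With the real file one would presumably use: heart <- read.csv("proj2data.csv")
set.seed(1)  # fixed seed so the toy data are reproducible
n <- 462
heart <- data.frame(
  age = sample(15:64, n, replace = TRUE),
  HD  = factor(sample(c("No", "Yes"), n, replace = TRUE))
)

# Split by position: first 230 rows for training, last 232 rows for testing.
train <- heart[1:230, ]
test  <- heart[231:n, ]

# Test error rate = fraction of test cases whose predicted class is wrong.
error_rate <- function(predicted, actual) {
  mean(as.character(predicted) != as.character(actual))
}

# Placeholder baseline: always predict the majority class of the training data.
majority      <- names(which.max(table(train$HD)))
baseline_pred <- rep(majority, nrow(test))
baseline_err  <- error_rate(baseline_pred, test$HD)
```

Any fitted classifier's test-set predictions could be passed to the same helper, which makes the error rates directly comparable across models.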

 

Project Report Format:

•  Please typeset your report using R Markdown in RStudio to produce a PDF file. Your report should have the following:

–    A title

–    Abstract

–    Introduction

–    Section(s) containing details of your data analysis. Include only the relevant R code used for the analysis, graphs, explanations of the methods you used, etc. If random numbers are generated, you must specify a random seed so that your results can be reproduced.

–    Summary of Results and Discussion, which includes a table summarizing the test error rates of the trained models and gives the prediction formula of your final model, if possible.

•  Limit your report to 15 pages. Report only necessary and relevant R code, graphs, and printouts. Make your report neat, clean, and readable, not like a draft. Do not include the data in your report. Remove all unwanted “warnings” and “messages” by using message=FALSE, warning=FALSE in the R chunk, or fix the issues that cause the messages.

•  Important: Using AI tools or anything similar is prohibited. Use only the R libraries and functions taught or mentioned in the lectures or the textbook; otherwise no credit will be given.
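The message=FALSE, warning=FALSE options and the random seed mentioned in the format requirements can be set once for the whole document rather than chunk by chunk. A minimal sketch, assuming the lines sit in a setup chunk near the top of the .Rmd file; the seed value is arbitrary and purely illustrative.

```r
# Apply message = FALSE and warning = FALSE to every later chunk,
# so package start-up messages and warnings stay out of the PDF.
knitr::opts_chunk$set(message = FALSE, warning = FALSE)

# Fix the random seed (arbitrary illustrative value) so that any
# randomized step gives the same result each time the report is knitted.
set.seed(2026)
```

The same options can also be given in an individual chunk header when only one chunk needs them.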
