
EXPLORATORY DATA ANALYSIS AND DECISION TREE ANALYSIS


Task 2 Exploratory Data Analysis and Decision Tree Analysis (Worth 25 Marks)

Task 2.1) Conduct an exploratory data analysis of the patient-health.csv data set using the RapidMiner Studio data mining tool. Summarise your findings by describing the key characteristics of each variable in the patient-health.csv data set, such as maximum and minimum values, average, standard deviation, most frequent values (mode), missing values and invalid values, and relationships with other variables where relevant. Present the summary in a table named Table 2.1 Results of Exploratory Data Analysis for the patient-health.csv Data Set.
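The per-variable statistics requested for Table 2.1 can be cross-checked outside RapidMiner. The following is a minimal Python sketch, not part of the assignment brief: the function name `summarise` and the sample values are hypothetical, and a value is assumed to be invalid when it falls outside the range given in the data dictionary.

```python
import statistics

def summarise(values, lo=None, hi=None):
    """Summarise one numeric column; None marks a missing value."""
    present = [v for v in values if v is not None]
    # Assumption: a value is invalid when it lies outside the [lo, hi] range.
    valid = [v for v in present
             if (lo is None or v >= lo) and (hi is None or v <= hi)]
    return {
        "min": min(valid),
        "max": max(valid),
        "mean": statistics.mean(valid),
        "stdev": statistics.stdev(valid),
        "mode": statistics.mode(valid),
        "missing": len(values) - len(present),
        "invalid": len(present) - len(valid),
    }

# Hypothetical exerany values (1 or 0 per the data dictionary); 7 is out of range.
exerany = [1, 0, 1, 1, None, 7, 0, 1]
summary = summarise(exerany, lo=0, hi=1)
print(summary)
```

Statistics are computed on the valid values only, so that a single out-of-range entry does not distort the maximum or the mean reported in the table.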

Hint: The Statistics Tab and the Chart Tab in RapidMiner provide descriptive statistical information and useful charts such as bar charts and scatterplots. You might also like to run some correlations and chi-square tests to indicate which variables you consider to be the top five key variables, that is, those which contribute most to determining whether a patient is healthy. Note that in completing Task 2.1 you will find it useful to refer to the data dictionary for the patient-health.csv data set provided in this document, which defines each of the variables in terms of its data type and range of values.
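To illustrate what a chi-square test measures when ranking categorical variables, here is a small Python sketch of the Pearson chi-square statistic. It is an illustration only: the function `chi_square`, the binary `healthy` label, and the counts are all hypothetical and are not taken from patient-health.csv.

```python
from collections import Counter

def chi_square(xs, ys):
    """Pearson chi-square statistic for two categorical columns of equal length."""
    n = len(xs)
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    observed = Counter(zip(xs, ys))
    stat = 0.0
    for r, r_total in row_totals.items():
        for c, c_total in col_totals.items():
            # Expected count if the two variables were independent.
            expected = r_total * c_total / n
            stat += (observed[(r, c)] - expected) ** 2 / expected
    return stat

# Hypothetical data: 30 patients who exercise and 30 who do not.
exerany = [1] * 30 + [0] * 30
healthy = ["yes"] * 20 + ["no"] * 10 + ["yes"] * 10 + ["no"] * 20

stat = chi_square(exerany, healthy)
dof = (len(set(exerany)) - 1) * (len(set(healthy)) - 1)
print(f"chi-square = {stat:.3f}, degrees of freedom = {dof}")
```

A larger statistic, for the same degrees of freedom, suggests a stronger association with the target; with one degree of freedom the usual 5% critical value is 3.841. Computing the statistic for each candidate variable is one possible way to support a top-five ranking.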

Briefly discuss the key results of your exploratory data analysis presented in Table 2.1 and the rationale for why you have selected your five top variables for predicting Patient Health. (About 250 words)

Task 2.2) Build a Decision Tree model for predicting Patient Health using RapidMiner and an appropriate set of data mining operators and a reduced patient-health.csv data set determined by your exploratory data analysis in Task 2.1. Provide these outputs from RapidMiner (1) Final Decision Tree Model process, (2) Final Decision Tree diagram, and (3) Decision Tree rules for Task 2.2.

Briefly describe your final Decision Tree Model Process, and discuss the results of the Final Decision Tree Model drawing on the key outputs (Decision Tree Diagram, Decision Tree Rules) for predicting Patient Health and relevant supporting literature on the interpretation of decision trees (About 250 words).
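When interpreting the tree, it can help to see how a split attribute is chosen. The sketch below uses information gain, which is one common splitting criterion; it is not necessarily the criterion configured in your RapidMiner process. The rows are hypothetical, and the helper names `entropy`, `information_gain` and `best_split` are illustrative.

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows, attribute, target):
    """Reduction in target entropy obtained by splitting rows on attribute."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[attribute]].append(row[target])
    n = len(rows)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy([row[target] for row in rows]) - remainder

def best_split(rows, attributes, target):
    """Attribute with the highest information gain."""
    return max(attributes, key=lambda a: information_gain(rows, a, target))

# Hypothetical rows using variable names from the data dictionary.
rows = [
    {"exerany": 1, "smoke100": 0, "genhealth": "Good"},
    {"exerany": 1, "smoke100": 1, "genhealth": "Good"},
    {"exerany": 0, "smoke100": 0, "genhealth": "Poor"},
    {"exerany": 0, "smoke100": 1, "genhealth": "Poor"},
]

best = best_split(rows, ["exerany", "smoke100"], "genhealth")

# Turn the chosen split into simple if-then rules (majority class per branch).
for value in sorted({row[best] for row in rows}):
    labels = [row["genhealth"] for row in rows if row[best] == value]
    majority = Counter(labels).most_common(1)[0][0]
    print(f"if {best} = {value} then genhealth = {majority}")
```

In this toy data exerany separates the classes perfectly while smoke100 carries no information, so exerany is placed at the root. The printed rules have the same if-then shape as the Decision Tree rules requested in Task 2.2.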

Include all appropriate RapidMiner outputs, such as RapidMiner Processes, Graphs and Tables, that support the key aspects of your exploratory data analysis and decision tree model analysis of the data set in your Assignment 2 report. Note that you need to export these outputs from RapidMiner using the File/Print/Export Image option and, where relevant, include them in Task 2 and/or in Appendix A of the Assignment 2 report.

Table 1 Patient Health Data Set Data Dictionary

No. | Variable Name | Type and description of variable | Range of values
1. | Patient_id | Integer, patient ID | Range 1 to 20,000
2. | genhealth | Polynominal, health rating of each patient | Poor, Fair, Good, Very Good, Excellent
3. | exerany | Integer, does the patient exercise? | 1 or 0
4. | hlthplan | Integer, health insurance plan? | 1 or 0
5. | smoke100 | Integer, smoker? | 1 or 0
6. | height | Integer, height in inches of patient | Height range in inches
7. | weight | Integer, weight in pounds of each patient | Weight range in pounds
8. | wtdesire | Integer, desired weight of each patient; can be used to calculate if a patient is overweight etc. | Desired weight of each patient in pounds
9. | age | Integer | Age of each patient
10. | gender | Polynominal, gender of each patient | M = Male; F = Female
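The dictionary notes that wtdesire can be used to calculate whether a patient is overweight. One possible way to derive such attributes is sketched below in Python. The function names are hypothetical, and the body mass index (BMI) variant is an addition that the data dictionary does not mention; it uses the standard imperial-unit formula, BMI = 703 × weight (lb) / height (in)².

```python
def weight_gap(weight_lb, wtdesire_lb):
    """Pounds above (positive) or below (negative) the patient's desired weight."""
    return weight_lb - wtdesire_lb

def bmi(weight_lb, height_in):
    """Body mass index from pounds and inches (imperial-unit formula)."""
    return 703 * weight_lb / height_in ** 2

# Hypothetical patient: 180 lb, wants to weigh 165 lb, 70 inches tall.
gap = weight_gap(180, 165)
index = bmi(180, 70)
print(f"weight gap = {gap} lb, BMI = {index:.1f}")
```

Either derived value could be added as a generated attribute before modelling, provided the report states clearly that it is derived rather than part of the original data set.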

 



Details

  • Title: EXPLORATORY DATA ANALYSIS AND DECISION TREE ANALYSIS
  • Price: £ 109
  • Post Date: 2018-11-10T06:27:42+00:00
  • Category: Assignment