
EXPLORATORY DATA ANALYSIS AND DECISION TREE ANALYSIS


Task 2 Exploratory Data Analysis and Decision Tree Analysis (Worth 25 Marks)

Task 2.1) Conduct an exploratory data analysis of the patient-health.csv data set using the RapidMiner Studio data mining tool. Summarise your findings in a table named Table 2.1 Results of Exploratory Data Analysis for the patient-health.csv Data Set. For each variable in the data set, describe its key characteristics, such as maximum and minimum values, average, standard deviation, most frequent value (mode), missing values and invalid values, and, where relevant, its relationships with other variables.
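The per-variable statistics requested for Table 2.1 can also be cross-checked outside RapidMiner. The following is a minimal Python sketch, not part of the assignment brief. The function name `summarise_numeric`, the sample ages and the valid range of 0 to 120 are assumptions made for illustration and are not taken from patient-health.csv.

```python
import statistics


def summarise_numeric(values, valid_range=None):
    """Summarise one numeric column for an EDA table.

    `values` may contain None for missing entries. `valid_range` is an
    optional (low, high) tuple; present values outside it count as invalid
    and are excluded from the statistics.
    """
    present = [v for v in values if v is not None]
    if valid_range is not None:
        low, high = valid_range
        valid = [v for v in present if low <= v <= high]
    else:
        valid = present
    return {
        "min": min(valid),
        "max": max(valid),
        "mean": statistics.mean(valid),
        "std": statistics.stdev(valid),  # sample standard deviation
        "mode": statistics.mode(valid),
        "missing": len(values) - len(present),
        "invalid": len(present) - len(valid),
    }


# Hypothetical ages for illustration only.
ages = [23, 35, 35, None, 41, 230, 52]
summary = summarise_numeric(ages, valid_range=(0, 120))
print(summary)
```

Here the value 230 is counted as invalid and the None entry as missing, so neither distorts the minimum, maximum or average reported for the column.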

Hint: The Statistics Tab and the Chart Tab in RapidMiner provide descriptive statistical information and useful charts such as bar charts and scatterplots. You might also like to run some correlations and chi-square tests to identify the five variables that you consider contribute most to determining whether a patient is healthy. Note: in completing Task 2.1 you will find it useful to refer to the data dictionary for the patient-health.csv data set provided in this document (Table 1), which defines each variable in terms of its data type and range of values.
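To make the two tests mentioned in the hint concrete, the sketch below computes a Pearson correlation for two numeric columns and a chi-square statistic for two categorical columns, using only the Python standard library. It is an illustration under stated assumptions: the function names and all sample values are hypothetical, and the sketch returns the chi-square statistic and degrees of freedom only, not a p-value.

```python
import math
from collections import Counter


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric columns."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def chi_square_statistic(xs, ys):
    """Chi-square statistic and degrees of freedom for two categorical columns."""
    n = len(xs)
    observed = Counter(zip(xs, ys))  # counts per (x, y) cell
    x_totals = Counter(xs)
    y_totals = Counter(ys)
    stat = 0.0
    for x, x_count in x_totals.items():
        for y, y_count in y_totals.items():
            expected = x_count * y_count / n  # count expected under independence
            stat += (observed[(x, y)] - expected) ** 2 / expected
    dof = (len(x_totals) - 1) * (len(y_totals) - 1)
    return stat, dof


# Hypothetical values for illustration only.
weight = [150, 160, 170, 180]
wtdesire = [140, 150, 160, 170]
r = pearson_correlation(weight, wtdesire)

exerany = [1, 1, 1, 1, 0, 0, 0, 0]
genhealth = ["Good", "Good", "Good", "Poor", "Poor", "Poor", "Poor", "Good"]
stat, dof = chi_square_statistic(exerany, genhealth)
print(r, stat, dof)
```

A larger chi-square statistic for a variable paired with genhealth suggests a stronger association, which is one possible basis for ranking candidate variables.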

Briefly discuss the key results of your exploratory data analysis presented in Table 2.1 and the rationale for selecting your top five variables for predicting Patient Health. (About 250 words)

Task 2.2) Build a Decision Tree model for predicting Patient Health using RapidMiner, an appropriate set of data mining operators, and a reduced patient-health.csv data set determined by your exploratory data analysis in Task 2.1. Provide these outputs from RapidMiner for Task 2.2: (1) the final Decision Tree Model process, (2) the final Decision Tree diagram, and (3) the Decision Tree rules.
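Decision tree learners pick each split with a purity criterion. The sketch below illustrates one such criterion, information gain, in plain Python. It is a teaching aid under stated assumptions, not the RapidMiner process the task asks for: the function names and the four sample rows are hypothetical, and the criterion configured in your own RapidMiner operator may differ.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature, labels):
    """Entropy reduction obtained by splitting `labels` on `feature` values."""
    groups = defaultdict(list)
    for value, label in zip(feature, labels):
        groups[value].append(label)
    n = len(labels)
    # Weighted average entropy of the groups produced by the split.
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical rows for illustration only.
health = ["Good", "Good", "Poor", "Poor"]
exerany = [1, 1, 0, 0]
smoke100 = [1, 0, 1, 0]

gain_exerany = information_gain(exerany, health)
gain_smoke100 = information_gain(smoke100, health)
print(gain_exerany, gain_smoke100)
```

In this toy data exerany separates the two classes perfectly while smoke100 does not separate them at all, so a tree grown on it would split on exerany first.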

Briefly describe your final Decision Tree Model Process, and discuss the results of the Final Decision Tree Model drawing on the key outputs (Decision Tree Diagram, Decision Tree Rules) for predicting Patient Health and relevant supporting literature on the interpretation of decision trees (About 250 words).

Include all appropriate RapidMiner outputs, such as RapidMiner Processes, Graphs and Tables, that support the key aspects of your exploratory data analysis and decision tree model analysis of the data set in your Assignment 2 report. Note that you need to export these outputs from RapidMiner using the File/Print/Export Image option and, where relevant, include them in Task 2 and/or in Appendix A of the Assignment 2 report.

Table 1 Patient Health Data Set Data Dictionary

| No. | Variable Name | Type and description of variable | Range of values |
|-----|---------------|----------------------------------|-----------------|
| 1 | Patient_id | Integer, Patient Id | Range 1 to 20,000 |
| 2 | genhealth | Polynominal, Health Rating of each patient | Poor, Fair, Good, Very Good, Excellent |
| 3 | exerany | Integer, does the patient exercise? | 1 or 0 |
| 4 | hlthplan | Integer, Health insurance plan? | 1 or 0 |
| 5 | smoke100 | Integer, Smoker? | 1 or 0 |
| 6 | height | Integer, height in inches of patient | Height range in inches |
| 7 | weight | Integer, weight in pounds of each patient | Weight range in pounds |
| 8 | wtdesire | Integer, desired weight of each patient; can be used to calculate if a patient is overweight etc. | Desired weight of each patient in pounds |
| 9 | age | Integer | Age of each patient |
| 10 | gender | Polynominal, Gender of each patient | M = Male; F = Female |
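The data dictionary notes that wtdesire can be used to calculate whether a patient is overweight. One simple way to do this, sketched below in Python, is to subtract the desired weight from the actual weight. The function name `weight_gap` and the two sample patients are hypothetical, and treating any positive gap as "overweight" is an assumption rather than a definition given in the brief.

```python
def weight_gap(weight, wtdesire):
    """Pounds above (positive) or below (negative) the patient's desired weight."""
    return weight - wtdesire


# Hypothetical patients for illustration only.
patients = [
    {"weight": 180, "wtdesire": 160},
    {"weight": 130, "wtdesire": 135},
]
gaps = [weight_gap(p["weight"], p["wtdesire"]) for p in patients]
over_desired = [gap > 0 for gap in gaps]
print(gaps, over_desired)
```

A derived attribute of this kind could be added to the reduced data set as a single variable in place of weight and wtdesire.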

 



Details

  • Title: EXPLORATORY DATA ANALYSIS AND DECISION TREE ANALYSIS
  • Post Date: 2018-11-10T06:27:42+00:00
  • Category: Assignment