Conduct an exploratory data analysis of the weatherAUS.csv data set using RapidMiner to understand the characteristics of each variable and the relationship of each variable to the other variables in the data set.

In this paper, you are required to conduct an exploratory data analysis of the weatherAUS.csv data set using RapidMiner to understand the characteristics of each variable and the relationship of each variable to the other variables in the data set.

Task 1.1 Conduct an exploratory data analysis of the weatherAUS.csv data set using RapidMiner to understand the characteristics of each variable and the relationship of each variable to the other variables in the data set. Summarise the findings of your exploratory data analysis by describing the key characteristics of each variable in the weather data set, such as maximum and minimum values, average, standard deviation, most frequent values (mode), missing values and invalid values, and relationships with other variables where relevant, in a table named Task 1.1 Results of Exploratory Data Analysis for weatherAUS Data Set.
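As a rough illustration of the statistics the Task 1.1 table asks for, the following Python sketch summarises one numeric column outside RapidMiner. The sample values, the "NA" missing-value token and the function name summarise are assumptions made for illustration; they are not taken from the data set or from RapidMiner.

```python
import statistics
from collections import Counter

# Hypothetical sample values for one numeric variable.
# "NA" is assumed to mark a missing entry in the CSV.
raw_humidity = ["22", "25", "NA", "30", "25", "91", "NA", "58"]

def summarise(raw_values, missing_token="NA"):
    """Return basic descriptive statistics for one numeric column."""
    present = [float(v) for v in raw_values if v != missing_token]
    counts = Counter(present)
    return {
        "min": min(present),
        "max": max(present),
        "mean": statistics.mean(present),
        "stdev": statistics.stdev(present),  # sample standard deviation
        "mode": counts.most_common(1)[0][0],  # most frequent value
        "missing": len(raw_values) - len(present),
    }

summary = summarise(raw_humidity)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Each key of the returned dictionary corresponds to one column of the results table; invalid values would need a separate, variable-specific check (for example a humidity above 100).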

 

Hint: The Statistics Tab and Chart Tab in RapidMiner provide a lot of descriptive statistical information and useful charts such as bar charts and scatter plots. You might also like to run some correlations and chi-square tests. Indicate in the Task 1.1 table which variables you consider to be the key variables that contribute most to determining whether it is likely to rain tomorrow.
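To make the two tests mentioned in the hint concrete, here is a minimal Python sketch of a Pearson correlation between two numeric variables and a chi-square statistic for a contingency table of two categorical variables. All numbers are made up, and the variable names (humidity, rainfall, a rain-today by rain-tomorrow table) are assumed for illustration only.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def chi_square(table):
    """Chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat

# Hypothetical numeric pair: afternoon humidity vs. rainfall.
humidity = [40, 55, 60, 80, 90]
rainfall = [0.0, 0.2, 1.0, 5.4, 12.0]
print("correlation:", pearson(humidity, rainfall))

# Hypothetical counts: rows = rain today (yes, no),
# columns = rain tomorrow (yes, no).
rain_table = [[30, 10], [20, 40]]
print("chi-square:", chi_square(rain_table))
```

A correlation close to +1 or -1, or a chi-square statistic that is large for the table's degrees of freedom, suggests the variable is worth considering as one of the key predictors.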

 

Briefly discuss the key results of your exploratory data analysis and the justification for selecting your five top variables for predicting whether it is likely to rain tomorrow based on today’s weather conditions. (About 250 words)

 

Task 1.2 Build a Decision Tree model for predicting whether it is likely to rain tomorrow based on today’s weather conditions, using RapidMiner with an appropriate set of data mining operators and a reduced weatherAUS.csv data set determined by your exploratory data analysis in Task 1.1. Provide these outputs from RapidMiner: (1) Final Decision Tree Model process, (2) Final Decision Tree diagram, and (3) associated decision tree rules.

 

Briefly explain your final Decision Tree Model Process, and discuss the results of the Final Decision Tree Model drawing on the key outputs (Decision Tree Diagram, Decision Tree Rules) for predicting whether it is likely to rain tomorrow based on today’s weather conditions and relevant supporting literature on the interpretation of decision trees.
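When interpreting a decision tree, it can help to see how a single split is scored. The Python sketch below computes entropy and information gain for one candidate split. It illustrates the general idea only and is not RapidMiner's implementation (whose splitting criterion is configurable); the labels and the humidity threshold are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )

def information_gain(parent, children):
    """Entropy of the parent minus the size-weighted entropy of the children."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted

# Hypothetical "rain tomorrow" labels for eight days.
parent = ["Yes", "Yes", "Yes", "Yes", "No", "No", "No", "No"]

# Hypothetical split on an assumed rule such as "humidity > 60".
low_humidity = ["No", "No", "No", "Yes"]
high_humidity = ["Yes", "Yes", "Yes", "No"]

gain = information_gain(parent, [low_humidity, high_humidity])
print("information gain:", gain)
```

The variable and threshold with the highest gain would be placed nearest the root, which is why the top of the tree diagram usually shows the most informative variables.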

 

Task 1.3 Build a Logistic Regression model for predicting whether it is likely to rain tomorrow based on today’s weather conditions, using RapidMiner with an appropriate set of data mining operators and a reduced weatherAUS.csv data set determined by your exploratory data analysis in Task 1.1. Provide these outputs from RapidMiner: (1) Final Logistic Regression Model process, (2) Coefficients, and (3) Odds Ratios. Hint: you will need to install the Weka Extension in RapidMiner and use the W-Logistic Regression Operator for Task 1.3, and you may need to change the data types of some variables.

 

Briefly explain your final Logistic Regression Model Process, and discuss the results of the Final Logistic Regression Model drawing on the key outputs (Coefficients, Odds Ratios) for predicting whether it is likely to rain tomorrow based on today’s weather conditions and relevant supporting literature on the interpretation of logistic regression models (About 250 words).
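The relationship between coefficients and odds ratios can be sketched in a few lines of Python: an odds ratio is the exponential of the corresponding coefficient. The coefficient values, the intercept and the variable names below are hypothetical and are not results from the data set.

```python
import math

# Hypothetical coefficients (change in log-odds per unit of each variable).
coefficients = {"Humidity3pm": 0.05, "Sunshine": -0.2, "WindGustSpeed": 0.0}
intercept = -3.0  # hypothetical

# Odds ratio = exp(coefficient).
# > 1 raises the odds of rain, < 1 lowers them, = 1 has no effect.
odds_ratios = {name: math.exp(beta) for name, beta in coefficients.items()}

def predict_probability(row):
    """Probability of rain tomorrow for one day's measurements."""
    log_odds = intercept + sum(coefficients[n] * row[n] for n in coefficients)
    return 1.0 / (1.0 + math.exp(-log_odds))

for name, ratio in odds_ratios.items():
    print(f"{name}: odds ratio {ratio:.3f}")

# Hypothetical day: humid, no sunshine, moderate gusts.
day = {"Humidity3pm": 60, "Sunshine": 0, "WindGustSpeed": 40}
print("P(rain tomorrow):", predict_probability(day))
```

Under these assumed numbers, each extra unit of humidity multiplies the odds of rain by about 1.05, while each extra unit of sunshine multiplies them by about 0.82.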

Task 1.4 You will need to validate your Final Decision Tree Model and Final Logistic Regression Model. Note: you will need to use the X-Validation Operator, Apply Model Operator and Performance Operator in your data mining process models here.
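The idea behind cross-validation can be sketched as follows: the rows are divided into k folds, and each fold is held out once for testing while the model is trained on the rest. This Python sketch only produces the row indices for each fold; the function name k_fold_indices is hypothetical, and it does no shuffling or stratification, which a real validation would usually want.

```python
def k_fold_indices(n_rows, k):
    """Yield (train_indices, test_indices) for each of k consecutive folds."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_rows // k + (1 if i < n_rows % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_rows) if i < start or i >= stop]
        yield train, test
        start = stop

folds = list(k_fold_indices(10, 3))
for number, (train, test) in enumerate(folds, start=1):
    print(f"fold {number}: train={train} test={test}")
```

Every row is used for testing exactly once, so the performance figures averaged over the folds reflect data the model did not see during training.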

Discuss and compare the accuracy of your Final Decision Tree Model with that of the Final Logistic Regression Model for predicting whether it is likely to rain tomorrow based on today’s weather conditions, using the results of the confusion matrix and ROC chart for each final model. You should use a table here to compare the key results of the confusion matrix for the Final Decision Tree Model and Final Logistic Regression Model (About 250 words).
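The figures typically read off a confusion matrix, and the area under the ROC curve, can be computed as in the Python sketch below. The counts, labels and scores are hypothetical and do not come from either model; the AUC is computed by pairwise comparison of positive and negative scores, which is one standard way to obtain it.

```python
def confusion_metrics(tp, fp, fn, tn):
    """Accuracy, precision and recall from the four confusion matrix cells."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
    }

def roc_auc(labels, scores):
    """AUC as the share of positive/negative pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half
    return wins / (len(positives) * len(negatives))

# Hypothetical confusion matrix counts for the two final models.
tree = confusion_metrics(tp=30, fp=10, fn=20, tn=40)
logistic = confusion_metrics(tp=35, fp=15, fn=15, tn=35)
for metric in tree:
    print(f"{metric}: tree={tree[metric]:.2f} logistic={logistic[metric]:.2f}")

# Hypothetical labels (1 = rain tomorrow) and predicted confidences.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]
auc = roc_auc(labels, scores)
print("AUC:", auc)
```

In this made-up example both models have the same accuracy but differ in precision and recall, which is the kind of contrast the comparison table should bring out.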

Note: the important outputs from your data mining analyses conducted in RapidMiner for Task 1 should be included in your Assignment 3 report to support the conclusions reached for each analysis conducted for Task 1.1, Task 1.2, Task 1.3 and Task 1.4. You can export the important outputs from RapidMiner as jpg image files and include these screenshots in the relevant Task 1 parts of your Assignment 3 report.

Note: you will find the North textbook a useful reference for the data mining process activities conducted in Task 1, namely the exploratory data analysis, decision tree analysis, logistic regression analysis and evaluation of the accuracy of the Final Decision Tree Model and the Final Logistic Regression Model.


Details

  • Title: Conduct an exploratory data analysis of the weatherAUS.csv data set using RapidMiner to understand the characteristics of each variable and the relationship of each variable to the other variables in the data set.
  • Price: £ 115
  • Post Date: 2020-04-14T15:24:23+00:00
  • Category: Assignment Queries