DATA MINING AND DATA ANALYSIS KEY FRAMEWORKS AND CONCEPTS COVERED IN MODULES


Description | Possible Marks and Wtg (%) | Word Count | Due date
Assignment 2 Written Practical Report | 100 marks, 30% weighting | 3500 | 05/09/16

The key frameworks and concepts covered in modules 1–5 are particularly relevant for this assignment. Assignment 2 relates to the specific course learning objectives 1, 2 and 4 and associated MBA program learning goals and skills: Global Content, Problem solving, Critical thinking, and Written Communication at level 3:

  1. Demonstrate applied knowledge of people, markets, finances, technology and management in a global context of business intelligence practice (data warehouse design, data mining process, data visualisation and performance management) and resulting organisational change and how these apply to implementation of business intelligence in organisation systems and business processes.

 

  1. Identify and solve complex organisational problems creatively and practically through the use of business intelligence and critically reflect on how evidence based decision making and sustainable business performance management can effectively address real world problems.

 

  1. Demonstrate the ability to communicate effectively in a clear and concise manner in written report style for senior management with correct and appropriate acknowledgment of main ideas presented and discussed.

Note that you must use RapidMiner Studio for Task 1 and Tableau Desktop for Task 3 in this Assignment 2. Failure to do so may result in Task 1 and/or Task 3 not being marked and zero marks being awarded.

 

Note carefully the University policy on Academic Misconduct such as plagiarism, collusion and cheating. If any of these occur they will be found and dealt with under the USQ Academic Integrity Procedures. If proven, Academic Misconduct may result in failure of an individual assessment, failure of the entire course, or exclusion from a University program or programs.

 

Assignment 2 consists of three main tasks and a number of sub tasks

 

Task 1 Exploratory data analysis and Decision Tree Analysis (Worth 35 Marks)

Task 1a) Identify, critically review and discuss literature which determines higher adult income. This research will inform your assessment and identification of the key variables for determining higher level adult incomes in the data set adult-income.csv. I suggest you relate your discussion here to the variables in the adult-income.csv data set where possible (About 600 words).

Task 1b) Conduct an exploratory data analysis of the adult-income.csv data set using the RapidMiner Studio data mining tool to understand the characteristics of each variable and its relationship to the other variables in the data set. Summarise your findings in a table named Task 1b Results of Exploratory Data Analysis for Adults Income Data Set, describing the key characteristics of each variable, such as maximum and minimum values, average, standard deviation, most frequent values (mode), missing values and invalid values, and relationships with other variables if relevant.

Hint: The Statistics Tab and the Chart Tab in RapidMiner provide descriptive statistical information and useful charts such as bar charts and scatterplots. You might also like to run some correlations and chi-square tests. Indicate in this table which variables you consider to be the top five key variables, that is, those which contribute most to determining whether an adult is on a high income over $50,000. Note that in completing Task 1b you will find it useful to refer to the data dictionary for the Adult Income data set provided in this document, which defines each variable in terms of its data type and range of values.
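As a rough illustration of the descriptive statistics named above (minimum, maximum, average, standard deviation, mode, missing values), here is a minimal Python sketch. It is only a cross-check on what RapidMiner's Statistics Tab reports and does not replace RapidMiner, which the assignment requires. The `ages` values and the `describe_numeric` helper are hypothetical and are not taken from adult-income.csv.

```python
import statistics
from collections import Counter

# Hypothetical sample values for the "age" variable; None marks a missing value.
# Real values would come from adult-income.csv.
ages = [39, 50, 38, 53, 28, 37, None, 49, 52, 31, 38]


def describe_numeric(values):
    """Return summary statistics for a numeric column, ignoring missing values."""
    present = [v for v in values if v is not None]
    return {
        "min": min(present),
        "max": max(present),
        "mean": statistics.mean(present),
        "stdev": statistics.stdev(present),  # sample standard deviation
        "mode": Counter(present).most_common(1)[0][0],
        "missing": len(values) - len(present),
    }


summary = describe_numeric(ages)
for name, value in summary.items():
    print(f"{name}: {value}")
```

The same dictionary layout could serve as one row of the Task 1b results table, with one row per variable.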

 

Briefly discuss the key results of your exploratory data analysis and how you have selected your five top variables for predicting adult income higher than $50,000. (About 600 words)
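One way to reason about which categorical variables relate most strongly to the income label is the chi-square statistic mentioned in the hint. The sketch below computes it by hand for a small contingency table. The records are invented for illustration, the `chi_square` helper is hypothetical, and the "<=50K" label is an assumption, since the data dictionary in this document lists only ">50K".

```python
from collections import Counter

# Hypothetical (education, SalaryGreater50K) pairs, for illustration only.
# The "<=50K" label is an assumed name for the lower income class.
records = [
    ("Bachelors", ">50K"), ("Bachelors", ">50K"), ("Bachelors", "<=50K"),
    ("HS-grad", "<=50K"), ("HS-grad", "<=50K"), ("HS-grad", ">50K"),
    ("HS-grad", "<=50K"), ("Bachelors", ">50K"),
]


def chi_square(pairs):
    """Pearson chi-square statistic for two categorical variables."""
    n = len(pairs)
    row_totals = Counter(a for a, _ in pairs)
    col_totals = Counter(b for _, b in pairs)
    observed = Counter(pairs)
    stat = 0.0
    for a in row_totals:
        for b in col_totals:
            # Expected count if the two variables were independent.
            expected = row_totals[a] * col_totals[b] / n
            stat += (observed[(a, b)] - expected) ** 2 / expected
    return stat


print(chi_square(records))
```

A larger statistic suggests a stronger association, but judging significance requires comparing it against the chi-square distribution with (rows - 1) x (columns - 1) degrees of freedom, and a sample this small would not support any real conclusion.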

 

Task 1c) Build a Decision Tree model for predicting higher Adult incomes using RapidMiner, an appropriate set of data mining operators, and a reduced adult-income.csv data set determined by your exploratory data analysis in Task 1b. Provide these outputs from RapidMiner: (1) Final Decision Tree Model process, (2) Final Decision Tree diagram, and (3) associated decision tree rules.

 

Briefly explain your final Decision Tree Model Process, discuss the results of the Final Decision Tree Model drawing on the key outputs (Decision Tree Diagram, Decision Tree Rules) for predicting higher Adult Income and relevant supporting literature on the interpretation of decision trees (About 600 words).
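When interpreting a decision tree, it can help to see how a split criterion ranks candidate variables. The sketch below computes information gain, one common criterion, for two attributes. It is an illustration under stated assumptions and not a description of how RapidMiner builds its tree: the rows are invented, the "<=50K" label is assumed, and the `entropy` and `information_gain` helpers are hypothetical names.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Reduction in target entropy after splitting rows on one attribute."""
    base = entropy([r[target] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[attribute], []).append(r[target])
    # Weighted average entropy of the groups produced by the split.
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


# Hypothetical rows, for illustration only; "<=50K" is an assumed label.
rows = [
    {"marital-status": "Married-civ-spouse", "gender": "Male", "SalaryGreater50K": ">50K"},
    {"marital-status": "Married-civ-spouse", "gender": "Female", "SalaryGreater50K": ">50K"},
    {"marital-status": "Never-married", "gender": "Male", "SalaryGreater50K": "<=50K"},
    {"marital-status": "Never-married", "gender": "Female", "SalaryGreater50K": "<=50K"},
]

for attr in ("marital-status", "gender"):
    print(attr, information_gain(rows, attr, "SalaryGreater50K"))
```

In this toy data the attribute that separates the classes perfectly receives the highest gain and would be chosen for the root split, which is the same logic to look for when reading the tree diagram and rules.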

 

Note For Task 1b and Task 1c completing the Tutorial Activities for RapidMiner and postings on the Assignment 2 discussion forum will assist you in determining what are appropriate RapidMiner operators to use.

 

Include all appropriate RapidMiner outputs such as RapidMiner Processes, Graphs and Tables that support the key aspects of your exploratory data analysis and decision tree model analysis of the adult-income.csv data set in Appendix A. Note that you need to export these outputs from RapidMiner using the File/Print/Export Image option and include them where relevant in Task 1 and Appendix A of the Assignment 2 report.

Table 1 Adult Income Data Dictionary

No. | Variable Name | Type and description of variable | Range of values
1. | SalaryGreater50K | Nominal, Target/Label | >50K,
2. | age | Integer, Age of Adult | continuous
3. | workclass | Polynominal, Category of work class | Private, Self-emp-not-inc, Self-emp-inc, Federal-gov, Local-gov, State-gov, Without-pay, Never-worked.
4. | fnlwgt | Integer, final weighting for each adult income record | continuous.
5. | education | Polynominal, Category of Education level obtained | Bachelors, Some-college, 11th, HS-grad, Prof-school, Assoc-acdm, Assoc-voc, 9th, 7th-8th, 12th, Masters, 1st-4th, 10th, Doctorate, 5th-6th, Preschool.
6. | education-num | Integer, education level ranking | continuous.
7. | marital-status | Polynominal, Category Marital status of Adult | Married-civ-spouse, Divorced, Never-married, Separated, Widowed, Married-spouse-absent, Married-AF-spouse.
8. | occupation | Polynominal, Category of occupation of each Adult | Tech-support, Craft-repair, Other-service, Sales, Exec-managerial, Prof-specialty, Handlers-cleaners, Machine-op-inspct, Adm-clerical, Farming-fishing, Transport-moving, Priv-house-serv, Protective-serv, Armed-Forces.
9. | relationship | Polynominal, Category of Adult relationship | Wife, Own-child, Husband, Not-in-family, Other-relative, Unmarried.
10. | race | Polynominal, Race of each Adult | , White, Asian-Pac-Islander, Amer-Indian-Eskimo, Other, Black.
11. | gender | Nominal | Female, Male.
12. | capital-gain | Integer, capital gain for each adult | continuous.
13. | capital-loss | Integer, capital loss for each adult | continuous.
14. | hours-per-week | Integer, hours per week worked by each adult | continuous.
15. | native-country | Polynominal, Native country of each adult | United-States, Cambodia, England, Puerto-Rico, Canada, Germany, Outlying-US(Guam-USVI-etc), India, Japan, Greece, South, China, Cuba, Iran, Honduras, Philippines, Italy, Poland, Jamaica, Vietnam, Mexico, Portugal, Ireland, France, Dominican-Republic, Laos, Ecuador, Taiwan, Haiti, Columbia, Hungary, Guatemala, Nicaragua, Scotland, Thailand, Yugoslavia, El-Salvador, Trinadad&Tobago, Peru, Hong, Holland-Netherlands.

 


Task 2 Data Warehousing Architecture Design (Worth 35 Marks)

 

A data warehouse is the foundation of any Business Intelligence or Business Analytics initiative. Consider the following scenario:

 

The scenario is a large regional University consisting of five divisions (Academics, Academic Services, Students, Research and Campus Services), with a number of functional groups within each division. There are many different data sets residing in functional groups within the five divisions. The University wants high-level advice on the logical design of a data warehouse architecture that will meet its reporting and decision making needs into the future.

 

Task 2a) Discuss the Kimball Model versus the Inmon Model as possible approaches for designing and developing a data warehouse architecture that would meet the reporting and decision making needs of the large regional University described above. Consider the relative advantages and disadvantages of each approach, with appropriate in-text reference support (about 1000 words).

 

Task 2b) Provide a high level diagrammatic representation of your proposed data warehouse architecture design for a large Regional University as outlined above

 

Task 2c) Describe and justify your proposed data warehouse architecture design for a large regional University presented diagrammatically for Task 2b with appropriate in-text referencing support (about 500 words)

 

Note that the coverage of these concepts in the textbook Chapter 2 Data Warehousing is somewhat limited and dated and may not be current thinking for such a fast moving field. Hence you will need to research and critically review the current literature in relation to the concept of a data warehouse, different data warehouse design architectures and data warehouse design and implementation methodologies in more detail.
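To make the Kimball side of the comparison more concrete, the sketch below builds a tiny star schema (one fact table joined to dimension tables), which is the dimensional modelling style associated with the Kimball approach. Everything here is an assumption for illustration: the table names `dim_division`, `dim_date` and `fact_enrolment`, the columns, and the values are hypothetical and are not part of the assignment scenario. It uses an in-memory SQLite database from the Python standard library.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical star schema: one fact table referencing two dimension tables.
conn.executescript("""
CREATE TABLE dim_division (
    division_key INTEGER PRIMARY KEY,
    division_name TEXT
);
CREATE TABLE dim_date (
    date_key INTEGER PRIMARY KEY,
    year INTEGER,
    semester INTEGER
);
CREATE TABLE fact_enrolment (
    division_key INTEGER REFERENCES dim_division(division_key),
    date_key INTEGER REFERENCES dim_date(date_key),
    enrolment_count INTEGER
);
""")

# Invented sample data.
conn.executemany("INSERT INTO dim_division VALUES (?, ?)",
                 [(1, "Academics"), (2, "Students")])
conn.executemany("INSERT INTO dim_date VALUES (?, ?, ?)",
                 [(1, 2016, 1), (2, 2016, 2)])
conn.executemany("INSERT INTO fact_enrolment VALUES (?, ?, ?)",
                 [(1, 1, 120), (1, 2, 95), (2, 1, 30)])

# A typical reporting query: aggregate the fact table by a dimension attribute.
totals = conn.execute("""
    SELECT d.division_name, SUM(f.enrolment_count)
    FROM fact_enrolment AS f
    JOIN dim_division AS d ON d.division_key = f.division_key
    GROUP BY d.division_name
    ORDER BY d.division_name
""").fetchall()
print(totals)
```

An Inmon-style design would typically load a normalised enterprise warehouse first and derive data marts like this one from it, so a schema of this shape can appear in either architecture, at different layers.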

 

Task 3 Global Bike Company Sales Reports using Tableau Desktop (Worth 20 Marks)

 

The bicycles-sales.xlsx file provided for Assignment 2 on course study desk contains the following dimensions and information:

 

Region | Sales Date
Sub Region | Sales Period
Market | List Price
Business Segment | Unit Price
Category | Order Quantity
Model | Sales Amount
Color |

 

 

With the bicycles-sales file use Tableau Desktop to produce four sales reports:

 

Task 3a) Create a sales report in a Text Table or Graph view that lists by sub region, business segment and model for all mountain bikes sold for the years 2002, 2003 and 2004. Analyse this sales report and comment on key trends and patterns that are apparent (about 500 words).

Task 3b) Create a sales report in a Text Table or Graph view that lists by region, sub region, business segment, order quantity and sales amount for all bicycle clothing for the years 2002, 2003 and 2004. Analyse this sales report and comment on key trends and patterns that are apparent (about 50 words).

 

Task 3c) Create a sales report in a Text Table or Graph view that lists by region, sub region, business segment and model for all the road bicycles in order of the total sales and profit for years 2002, 2003 and 2004. Analyse this sales report and comment on key trends and patterns that are apparent (about 50 words).

 

Task 3d) Create a sales report in a Text Table or Graph view that lists by category, model, colour and order quantity for all bicycles for the years 2002, 2003 and 2004. Analyse this sales report and comment on key trends and patterns that are apparent (about 50 words).

 

Note that you need to provide a copy of the Text Table or Graph view of each sales report in your Assignment 2 report for the relevant sub Tasks 3a, 3b, 3c and 3d. The Tableau Menu Option Worksheet and then the Copy or Export Options will allow you to copy and paste the view for each sales report into the relevant section of Task 3 for your Assignment 2 report.
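The four reports all follow the same pattern: filter by category and year, then total a measure by a set of dimensions. The sketch below shows that logic for a Task 3a style report as a way to sanity-check figures from Tableau. It does not replace Tableau Desktop, which the assignment requires. The column names follow the dimension list above, but the rows, the category and model values, and the `sales_report` helper are hypothetical.

```python
from collections import defaultdict
from datetime import date

# Invented rows for illustration; real data comes from bicycles-sales.xlsx.
sales = [
    {"Sub Region": "Northwest", "Business Segment": "Bikes", "Category": "Mountain Bikes",
     "Model": "Model A", "Sales Date": date(2002, 5, 1), "Sales Amount": 1000.0},
    {"Sub Region": "Northwest", "Business Segment": "Bikes", "Category": "Mountain Bikes",
     "Model": "Model A", "Sales Date": date(2002, 8, 15), "Sales Amount": 500.0},
    {"Sub Region": "Northwest", "Business Segment": "Bikes", "Category": "Mountain Bikes",
     "Model": "Model A", "Sales Date": date(2003, 1, 10), "Sales Amount": 750.0},
    {"Sub Region": "Northwest", "Business Segment": "Bikes", "Category": "Road Bikes",
     "Model": "Model B", "Sales Date": date(2002, 3, 2), "Sales Amount": 900.0},
    {"Sub Region": "Northwest", "Business Segment": "Bikes", "Category": "Mountain Bikes",
     "Model": "Model A", "Sales Date": date(2005, 3, 1), "Sales Amount": 400.0},
]


def sales_report(rows, category, years):
    """Total Sales Amount by sub region, business segment, model and year."""
    totals = defaultdict(float)
    for row in rows:
        year = row["Sales Date"].year
        if row["Category"] == category and year in years:
            key = (row["Sub Region"], row["Business Segment"], row["Model"], year)
            totals[key] += row["Sales Amount"]
    return dict(totals)


report = sales_report(sales, "Mountain Bikes", {2002, 2003, 2004})
for key, amount in sorted(report.items()):
    print(key, amount)
```

Changing the grouping key and the summed field gives the equivalent of Tasks 3b to 3d, for example adding Region to the key or summing Order Quantity instead of Sales Amount.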

 

Your assignment 2 report must be structured in report format as follows:

 

  1. Title Page for Assignment 2 Report
  2. Table of Contents
  3. Body of report – main sections and subsections for assignment 2 tasks and sub tasks, so:
     • Task 1 will be a main heading with appropriate sub headings for each sub task
     • Task 2 …
     • Task 3 …
  4. List of References
  5. List of Appendices

 

You need to submit two files for Assignment 2:

 

  1. Assignment 2 Report for Tasks 1, 2 and 3 in Word document format with the extension .docx
  2. Tableau packaged workbook with the extension .twbx containing the four required sales reports for Task 3

 

Use the following file naming convention:

  1. Student_no_Student_name_CIS8008_Ass2.docx and
  2. Student_no_Student_name_CIS8008_Ass2.twbx
