This assignment consists of a report worth 20 marks. Delays caused by a student's own computer downtime cannot be accepted as a valid reason for late submission without penalty. Students must plan their work to allow for both scheduled and unscheduled downtime.
Submission instructions: You must submit an electronic copy of all your assignment files via CloudDeakin. You must include your report, source code, any necessary data files and, optionally, a presentation file. Assignments will not be accepted through any other manner of submission. Students should note that email and paper-based submissions will ordinarily be rejected.
Special requirements to prove the originality of your work: On-campus students (B and G) are required to demonstrate the execution of their classification programs in R to their tutor in Week 10; Cloud students are required to attach a 3-5 minute video presentation demonstrating how their R code is executed to derive the claimed results. The video should be uploaded to cloud storage. Failure to do so will result in a delayed assessment of your submission.
Late submissions: Submissions received after the due date are penalized at a rate of 5% of the full mark per day, with no exceptions. Submissions more than 5 days late will be penalized at a rate of 100% of the full mark. Submissions close on the due date, and each subsequent day's penalty applies, at 05:00 pm Australian Eastern Time (UTC+10). Students outside of Victoria should note that the normal time zone in Victoria is UTC+10 hours. No extensions will be granted.
It is the student's responsibility to ensure that they understand the submission instructions. If you have ANY difficulties, ask the Lecturer/Tutor for assistance (prior to the submission date).
Copying, Plagiarism Notice
This is an individual assignment. You are not permitted to work as part of a group when writing this assignment. The University's policy on plagiarism can be viewed online.
Overview
The popularity of social media networks such as Twitter has led to an increasing amount of spamming activity. Researchers have employed various machine learning methods to detect Twitter spam. In this assignment, you are required to classify spam tweets using the provided datasets. The features have been extracted and clearly structured in JSON format. The extracted features can be categorized into two groups: user profile-based features and tweet content-based features, as summarized in Table 1.
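As a rough sketch of what the extracted features might look like once loaded into R (the feature names below are hypothetical illustrations, not taken from the actual dataset), each record combines both feature groups plus a class label:

```r
# Hypothetical feature layout: one row per tweet, combining user profile-based
# and tweet content-based features with a class label (names are illustrative).
tweets <- data.frame(
  account_age   = c(1200, 3),          # profile-based: account age in days
  follower_cnt  = c(500, 10),          # profile-based: number of followers
  following_cnt = c(300, 2000),        # profile-based: accounts followed
  hashtag_cnt   = c(1, 8),             # content-based: hashtags in the tweet
  url_cnt       = c(0, 3),             # content-based: URLs in the tweet
  label         = factor(c("non-spam", "spam"))
)
str(tweets)
```

Packages such as jsonlite can read the provided JSON files into a structure like this; check the actual field names in the dataset before building your classifiers.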
The provided training and testing datasets are listed in Table 2 and Table 3 respectively. In the testing datasets, the ratio of spam to non-spam is 1:1 in Dataset 1, while the ratio is 1:19 in Dataset 2. In most previous work, the testing datasets are nearly evenly distributed. However, in the real world only around 5% of tweets on Twitter are spam, which indicates that testing Dataset 2 simulates the real-world scenario. You are required to classify spam tweets, evaluate the classifiers' performance and compare the Dataset 1 and Dataset 2 outcomes by conducting experiments.
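The effect of this imbalance can be sketched in a few lines of base R: a trivial classifier that labels every tweet non-spam already reaches 95% accuracy on a 1:19 dataset while detecting no spam at all, which is why multiple evaluation metrics are needed (the counts below are illustrative, not the actual dataset sizes):

```r
# Sketch: why accuracy alone is misleading on the imbalanced testing set.
n_spam    <- 1000                      # illustrative counts at a 1:19 ratio
n_nonspam <- 19000

truth   <- c(rep("spam", n_spam), rep("non-spam", n_nonspam))
predict_all_nonspam <- rep("non-spam", n_spam + n_nonspam)

# Accuracy of the do-nothing classifier: 19000/20000 = 0.95,
# despite zero spam tweets being detected.
accuracy <- mean(predict_all_nonspam == truth)
accuracy
```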
Twitter Spam Detection Work Flow
Problem Statement
This is an individual assessment task. Each student is required to submit a report of approximately 2,000-2,500 words along with exhibits to support findings with respect to the provided spam and non-spam messages. This report should consist of:
• Overview of classifiers and evaluation metrics
• Construction of data sets, identification of features and the process of conducting classification
• Technical findings of experiment results
• Justified discussion of the performance evaluation outcomes for different classifiers
To demonstrate your achievement of these goals, you must write a report of at least 2,000 words (2,500 words maximum). Your report should consist of the following chapters:
1. A proper title which matches the contents of your report.
2. Your name and Deakin student number in the author line.
3. An executive summary which summarizes your findings. (You may find hints on writing good executive summaries at http://unilearning.uow.edu.au/report/4bi1.html.)
4. An introduction chapter which lists the classification algorithms of your choice (at least 5 algorithms), the features used for classification, the performance evaluation metrics (at least 5 evaluation metrics), a brief summary of your findings, and the organization of the rest of your report. (You may find hints on features used for classification in the Twitter Developer Documentation.)
5. A literature review chapter which surveys the latest academic papers regarding the classifiers and performance evaluation metrics of your choice. For each classifier and performance evaluation metric, you are advised to identify and cite at least one paper published in ACM or IEEE journals or conference proceedings. The aim of this part of the report is to demonstrate a deep and thorough understanding of the existing body of knowledge encompassing multiple classification techniques for security data analytics; specifically, your argument should explain why machine learning algorithms should be used rather than human readers. (Please read through the hints on this web page before writing this chapter.)
6. A technical demonstration chapter which consists of fully explained screenshots of your experiments conducted in R. That is, you should explain each step of the classification procedure and the performance results for your classifiers. Note that the classifiers you present in the literature review should be the same classifiers you use in your experiments.
7. A performance evaluation chapter which evaluates the performance of the classifiers. You should analyse each classifier's performance with respect to the performance metrics of your choice. In addition, you should compare the performance results in terms of evaluation metrics, e.g., accuracy, false positive rate, recall, F-measure, speed and so on, for the selected classifiers and datasets.
8. A conclusions chapter which summarizes the major findings of the study, discusses whether the results match your hypotheses prior to the experiments and recommends the best performing classification algorithm. (You should use at least 5 evaluation metrics to evaluate and compare the performance of the different classifiers, and you can present your experiment results in the form of tables and plots.)
9. A bibliography of all cited papers and other resources. You must use in-text citations in Harvard style, and each citation must correspond to a bibliography entry. There must be no bibliography entries that are not cited in the report. (See http://www.deakin.edu.au/students/study-support/referencing/harvard for details of Harvard style.)
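As a sketch of how several of the evaluation metrics mentioned above can be computed in base R from a confusion matrix (the counts below are made-up numbers for illustration, not experiment results):

```r
# Hypothetical confusion-matrix counts for one classifier on one dataset.
TP <- 80    # spam correctly classified as spam
FP <- 20    # non-spam wrongly classified as spam
FN <- 10    # spam wrongly classified as non-spam
TN <- 890   # non-spam correctly classified as non-spam

accuracy  <- (TP + TN) / (TP + FP + FN + TN)
precision <- TP / (TP + FP)
recall    <- TP / (TP + FN)                  # also called true positive rate
fpr       <- FP / (FP + TN)                  # false positive rate
f_measure <- 2 * precision * recall / (precision + recall)

round(c(accuracy = accuracy, precision = precision,
        recall = recall, fpr = fpr, f_measure = f_measure), 3)
```

In practice, packages such as caret can compute these metrics directly from predicted and true labels; the formulas above show what each metric measures.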