ITC516 Data Mining and Visualisation for Business Intelligence - Assessment Item 3 - Weka Data Mining

Task

Assessment Description

Weka Data Mining Practical and Report

This assignment consists of two parts.

Part 1: Comparison of classification algorithms on a nominal data set. There are two steps to perform in part 1.

Step 1: [5 marks]. In this step, you are required to perform a data mining task to evaluate two different classification algorithms on a nominal data set. Load the contact-lenses.arff data set into Weka and compare the performance on this data set for the following algorithms:

  • Decision Tree - J48 algorithm
  • Naive Bayes

Step 2: [5 marks]. Using the outputs from Step 1, write a report that shows the performance of the two algorithms and comments on their accuracy, drawing on the confusion matrix and the other performance metrics reported by Weka. In your report consider:

  • Is there a difference in performance between the algorithms?
  • Which algorithm performs best?

Your report should include the necessary screenshots, tables, graphs, etc. to make it understandable to the reader.
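
As a rough illustration of how the figures in a confusion matrix relate to accuracy, precision and recall, the sketch below works through the arithmetic in plain Python. The matrix counts and class labels are invented for illustration and are not Weka output; the function names are hypothetical. It assumes the layout Weka uses, with rows as actual classes and columns as predicted classes.

```python
# Hypothetical 3-class confusion matrix: rows = actual class, columns = predicted class.
# The counts are made up for illustration only; they are NOT results from Weka.
labels = ["a", "b", "c"]
matrix = [
    [5, 1, 0],
    [1, 6, 1],
    [0, 2, 4],
]


def accuracy(m):
    """Fraction of instances on the diagonal (correctly classified)."""
    correct = sum(m[i][i] for i in range(len(m)))
    total = sum(sum(row) for row in m)
    return correct / total


def precision(m, k):
    """Of the instances predicted as class k, the fraction that really are class k."""
    predicted_k = sum(row[k] for row in m)
    return m[k][k] / predicted_k if predicted_k else 0.0


def recall(m, k):
    """Of the instances that really are class k, the fraction predicted as class k."""
    actual_k = sum(m[k])
    return m[k][k] / actual_k if actual_k else 0.0


print(f"accuracy = {accuracy(matrix):.3f}")
for k, name in enumerate(labels):
    print(f"class {name}: precision = {precision(matrix, k):.3f}, "
          f"recall = {recall(matrix, k):.3f}")
```

Applying the same arithmetic to the matrix each classifier produces gives a like-for-like basis for the comparison asked for above.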

Part 2: Classification on a numeric data set.

Load the cpu.arff data set into Weka. Run the Linear Regression algorithm on this data set and answer the following questions:

  1. Write down the linear regression model generated by the algorithm. [1 mark]
  2. List the weights that have been assigned to the attributes. [1 mark]
  3. What is the value of the root mean squared error? Explain the significance of this in the regression model. [1 mark]
  4. What is the value of the correlation coefficient? Explain the significance of this in the linear regression model. [1 mark]
  5. What measure has been taken in this algorithm to ensure that the training and testing data sets are not biased or over-fitted? [1 mark]
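
For questions 3 and 4, it may help to see how the two statistics are usually defined. The sketch below computes the root mean squared error and the Pearson correlation coefficient between actual and predicted values in plain Python. The sample numbers and function names are hypothetical and unrelated to cpu.arff; the values Weka reports for the assignment come from its own evaluation.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error: square root of the mean squared prediction error."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def correlation(actual, predicted):
    """Pearson correlation coefficient between actual and predicted values."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


# Invented sample values for illustration only; NOT taken from cpu.arff.
actual = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]

print(f"RMSE = {rmse(actual, predicted):.4f}")
print(f"correlation = {correlation(actual, predicted):.4f}")
```

The RMSE is expressed in the same units as the target attribute, so lower is better, while a correlation coefficient close to 1 indicates that predictions rise and fall closely with the actual values.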

The task is worth 15 marks of the overall marks available for the assessment.

Rationale

This assessment task will assess the following learning outcome/s:

  • be able to identify and analyse business requirements for the identification of patterns and trends in data sets.
  • be able to appraise the different approaches and categories of data mining problems.
  • be able to compare and evaluate output patterns.
  • be able to explore and critically analyse data sets and evaluate their data quality, integrity and security requirements.
  • be able to compare and evaluate appropriate techniques for detecting and evaluating patterns in a given data set.

These tasks aim to assess your progress towards:

  • be able to identify and analyse business requirements for the identification of patterns and trends in data sets;
  • be able to appraise the different approaches and categories of data mining problems;
  • be able to compare and evaluate output patterns;
  • be able to explore and critically analyse data sets and evaluate their data quality, integrity and security requirements;
  • be able to compare and evaluate appropriate techniques for detecting and evaluating patterns in a given data set;
  • be able to explain the importance of current and future trends likely to affect data mining and visualisation.