
Introduction to data mining

By: Tan, Pang-Ning
Contributor(s): Steinbach, Michael | Kumar, Vipin, 1956-
Series: Pearson custom library
Publisher: Essex : Pearson, c2016
Edition: 2nd ed.
Description: ii, 732 p. : ill. ; 28 cm.
ISBN: 9780273769224
Program: INFO911
Subject(s): Data mining
DDC classification: 006.312 TA IN
Online resources: eBook
Summary:
Introduction to Data Mining, Second Edition, is intended for use in the Data Mining course. Introduction to Data Mining presents fundamental concepts and algorithms for those learning data mining for the first time. Each concept is explored thoroughly and supported with numerous examples. The text requires only a modest background in mathematics. Each major topic is organized into two chapters, beginning with basic concepts that provide necessary background for understanding each data mining technique, followed by more advanced concepts and algorithms. Teaching and Learning Experience.
Item type: eBook
Home library: University of Wollongong in Dubai
Call number: 006.312 TA IN
Status: Available
Barcode: T0065289
Total holds: 0

Front Cover; Title Page; Copyright Page; Dedication; Preface to the Second Edition; Contents
1 Introduction; 1.1 What Is Data Mining?; 1.2 Motivating Challenges; 1.3 The Origins of Data Mining; 1.4 Data Mining Tasks; 1.5 Scope and Organization of the Book; 1.6 Bibliographic Notes; 1.7 Exercises
2 Data; 2.1 Types of Data; 2.1.1 Attributes and Measurement; 2.1.2 Types of Data Sets; 2.2 Data Quality; 2.2.1 Measurement and Data Collection Issues; 2.2.2 Issues Related to Applications; 2.3 Data Preprocessing; 2.3.1 Aggregation; 2.3.2 Sampling; 2.3.3 Dimensionality Reduction; 2.3.4 Feature Subset Selection; 2.3.5 Feature Creation; 2.3.6 Discretization and Binarization; 2.3.7 Variable Transformation; 2.4 Measures of Similarity and Dissimilarity; 2.4.1 Basics; 2.4.2 Similarity and Dissimilarity Between Simple Attributes; 2.4.3 Dissimilarities Between Data Objects; 2.4.4 Similarities Between Data Objects; 2.4.5 Examples of Proximity Measures; 2.4.6 Mutual Information; 2.4.7 Kernel Functions*; 2.4.8 Bregman Divergence*; 2.4.9 Issues in Proximity Calculation; 2.4.10 Selecting the Right Proximity Measure; 2.5 Bibliographic Notes; 2.6 Exercises
3 Classification: Basic Concepts and Techniques; 3.1 Basic Concepts; 3.2 General Framework for Classification; 3.3 Decision Tree Classifier; 3.3.1 A Basic Algorithm to Build a Decision Tree; 3.3.2 Methods for Expressing Attribute Test Conditions; 3.3.3 Measures for Selecting an Attribute Test Condition; 3.3.4 Algorithm for Decision Tree Induction; 3.3.5 Example Application: Web Robot Detection; 3.3.6 Characteristics of Decision Tree Classifiers; 3.4 Model Overfitting; 3.4.1 Reasons for Model Overfitting; 3.5 Model Selection; 3.5.1 Using a Validation Set; 3.5.2 Incorporating Model Complexity; 3.5.3 Estimating Statistical Bounds; 3.5.4 Model Selection for Decision Trees; 3.6 Model Evaluation; 3.6.1 Holdout Method; 3.6.2 Cross-validation; 3.7 Presence of Hyper-parameters; 3.7.1 Hyper-parameter Selection; 3.7.2 Nested Cross-validation; 3.8 Pitfalls of Model Selection and Evaluation; 3.8.1 Overlap Between Training and Test Sets; 3.8.2 Use of Validation Error as Generalization Error; 3.9 Model Comparison*; 3.9.1 Estimating the Confidence Interval for Accuracy; 3.9.2 Comparing the Performance of Two Models; 3.10 Bibliographic Notes; 3.11 Exercises
4 Association Analysis: Basic Concepts and Algorithms; 4.1 Preliminaries; 4.2 Frequent Itemset Generation; 4.2.1 The Apriori Principle; 4.2.2 Frequent Itemset Generation in the Apriori Algorithm; 4.2.3 Candidate Generation and Pruning; 4.2.4 Support Counting; 4.2.5 Computational Complexity; 4.3 Rule Generation; 4.3.1 Confidence-based Pruning; 4.3.2 Rule Generation in Apriori Algorithm; 4.3.3 An Example: Congressional Voting Records; 4.4 Compact Representation of Frequent Itemsets; 4.4.1 Maximal Frequent Itemsets; 4.4.2 Closed Itemsets; 4.5 Alternative Methods for Generating Frequent Itemsets*


