Table of Contents
1 Introduction
1.1 What is Data Mining?
1.2 Motivating Challenges
1.3 The Origins of Data Mining
1.4 Data Mining Tasks
1.5 Scope and Organization of the Book
1.6 Bibliographic Notes
1.7 Exercises
2 Data
2.1 Types of Data
2.2 Data Quality
2.3 Data Preprocessing
2.4 Measures of Similarity and Dissimilarity
2.5 Bibliographic Notes
2.6 Exercises
3 Exploring Data
3.1 The Iris Data Set
3.2 Summary Statistics
3.3 Visualization
3.4 OLAP and Multidimensional Data Analysis
3.5 Bibliographic Notes
3.6 Exercises
4 Classification: Basic Concepts, Decision Trees, and Model Evaluation
4.1 Preliminaries
4.2 General Approach to Solving a Classification Problem
4.3 Decision Tree Induction
4.4 Model Overfitting
4.5 Evaluating the Performance of a Classifier
4.6 Methods for Comparing Classifiers
4.7 Bibliographic Notes
4.8 Exercises
5 Classification: Alternative Techniques
5.1 Rule-Based Classifier
5.2 Nearest-Neighbor Classifiers
5.3 Bayesian Classifiers
5.4 Artificial Neural Network (ANN)
5.5 Support Vector Machine (SVM)
5.6 Ensemble Methods
5.7 Class Imbalance Problem
5.8 Multiclass Problem
5.9 Bibliographic Notes
5.10 Exercises
6 Association Analysis: Basic Concepts and Algorithms
6.1 Problem Definition
6.2 Frequent Itemset Generation
6.3 Rule Generation
6.4 Compact Representation of Frequent Itemsets
6.5 Alternative Methods for Generating Frequent Itemsets
6.6 FP-Growth Algorithm
6.7 Evaluation of Association Patterns
6.8 Effect of Skewed Support Distribution
6.9 Bibliographic Notes
6.10 Exercises
7 Association Analysis: Advanced Concepts
7.1 Handling Categorical Attributes
7.2 Handling Continuous Attributes
7.3 Handling a Concept Hierarchy
7.4 Sequential Patterns
7.5 Subgraph Patterns