Introduction to Data Mining / Edition 1

Original price: $159.99

Introduction to Data Mining presents fundamental concepts and algorithms for those learning data mining for the first time. Each major topic is organized into two chapters, beginning with basic concepts that provide necessary background for understanding each data mining technique, followed by more advanced concepts and algorithms.

Product Details

ISBN-13: 9780321321367
Publisher: Pearson
Publication date: 05/16/2005
Edition description: Older Edition
Pages: 769
Product dimensions: 7.60(w) x 9.40(h) x 1.70(d)

Table of Contents

1 Introduction

1.1 What is Data Mining?

1.2 Motivating Challenges

1.3 The Origins of Data Mining

1.4 Data Mining Tasks

1.5 Scope and Organization of the Book

1.6 Bibliographic Notes

1.7 Exercises

2 Data

2.1 Types of Data

2.2 Data Quality

2.3 Data Preprocessing

2.4 Measures of Similarity and Dissimilarity

2.5 Bibliographic Notes

2.6 Exercises

3 Exploring Data

3.1 The Iris Data Set

3.2 Summary Statistics

3.3 Visualization

3.4 OLAP and Multidimensional Data Analysis

3.5 Bibliographic Notes

3.6 Exercises

4 Classification: Basic Concepts, Decision Trees, and Model Evaluation

4.1 Preliminaries

4.2 General Approach to Solving a Classification Problem

4.3 Decision Tree Induction

4.4 Model Overfitting

4.5 Evaluating the Performance of a Classifier

4.6 Methods for Comparing Classifiers

4.7 Bibliographic Notes

4.8 Exercises

5 Classification: Alternative Techniques

5.1 Rule-Based Classifier

5.2 Nearest-Neighbor Classifiers

5.3 Bayesian Classifiers

5.4 Artificial Neural Network (ANN)

5.5 Support Vector Machine (SVM)

5.6 Ensemble Methods

5.7 Class Imbalance Problem

5.8 Multiclass Problem

5.9 Bibliographic Notes

5.10 Exercises

6 Association Analysis: Basic Concepts and Algorithms

6.1 Problem Definition

6.2 Frequent Itemset Generation

6.3 Rule Generation

6.4 Compact Representation of Frequent Itemsets

6.5 Alternative Methods for Generating Frequent Itemsets

6.6 FP-Growth Algorithm

6.7 Evaluation of Association Patterns

6.8 Effect of Skewed Support Distribution

6.9 Bibliographic Notes

6.10 Exercises

7 Association Analysis: Advanced Concepts

7.1 Handling Categorical Attributes

7.2 Handling Continuous Attributes

7.3 Handling a Concept Hierarchy

7.4 Sequential Patterns

7.5 Subgraph Patterns

7.6 Infrequent Patterns

7.7 Bibliographic Notes

7.8 Exercises

8 Cluster Analysis: Basic Concepts and Algorithms

8.1 Overview

8.2 K-means

8.3 Agglomerative Hierarchical Clustering

8.4 DBSCAN

8.5 Cluster Evaluation

8.6 Bibliographic Notes

8.7 Exercises

9 Cluster Analysis: Additional Issues and Algorithms

9.1 Characteristics of Data, Clusters, and Clustering Algorithms

9.2 Prototype-Based Clustering

9.3 Density-Based Clustering

9.4 Graph-Based Clustering

9.5 Scalable Clustering Algorithms

9.6 Which Clustering Algorithm?

9.7 Bibliographic Notes

9.8 Exercises

10 Anomaly Detection

10.1 Preliminaries

10.2 Statistical Approaches

10.3 Proximity-Based Outlier Detection

10.4 Density-Based Outlier Detection

10.5 Clustering-Based Techniques

10.6 Bibliographic Notes

10.7 Exercises

Appendix A Linear Algebra

Appendix B Dimensionality Reduction

Appendix C Probability and Statistics

Appendix D Regression

Appendix E Optimization

Author Index

Subject Index

Customer Reviews

Introduction to Data Mining: rated 5 out of 5 (1 review).
Guest More than 1 year ago
As databases keep growing unabated, so does the need for smart data mining. For a competitive edge in business, it helps to be able to analyse your data in unique ways. This text gives you a thorough education in state-of-the-art data mining, and it is appropriate for both students and professionals in the field. The extensive problem sets are well suited to students; they often expand on concepts in the narrative and are worth tackling.

The central theme of the book is how to classify data, or to find associations or clusters within it. Cluster analysis gets two chapters that are superbly done. These summarise decades of research into methods of grouping data into clusters, which is usually hard to do because an element of subjectivity can creep into the results. If your data is scattered in some n-dimensional space, then clusters might exist. But how do you find them? The chapters show that the number of clusters and their constituents can depend on which method you adopt and on various initial conditions, such as the seed values for clusters if you choose a prototype-based method like K-means.

The descriptions of the cluster algorithms are succinct. Why is this useful? Because it helps you understand how the algorithms operate without drowning you in low-level detail. Plus, by presenting a meta-level comparison of the algorithms, the book helps you develop insight into rolling your own methods, specific to your data. Part of my research involves finding new ways to make clusters, and the text was very useful in explaining the existing ideas.
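
The review's remark that K-means results can depend on the seed values can be illustrated with a small sketch. Everything below is an assumption made for illustration and is not taken from the book: the four sample points, the function names `kmeans` and `sse`, and the two seed choices. It is a plain Lloyd-style K-means in pure Python, run twice on the same data with different starting centroids.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def kmeans(points, seeds, max_iter=100):
    """Minimal K-means sketch: returns (labels, centroids) for the given seeds."""
    centroids = [tuple(s) for s in seeds]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                mean = tuple(sum(axis) / len(members) for axis in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return labels, centroids


def sse(points, labels, centroids):
    """Sum of squared distances from each point to its assigned centroid."""
    return sum(dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))


# Hypothetical data: the corners of a wide, short rectangle.
points = [(0, 0), (0, 1), (4, 0), (4, 1)]

# Seeds taken from the two left-hand corners.
labels_a, cents_a = kmeans(points, seeds=[(0, 0), (0, 1)])
# Seeds taken from one left-hand and one right-hand corner.
labels_b, cents_b = kmeans(points, seeds=[(0, 0), (4, 0)])

print(labels_a, sse(points, labels_a, cents_a))  # [0, 1, 0, 1] 16.0
print(labels_b, sse(points, labels_b, cents_b))  # [0, 0, 1, 1] 1.0
```

Both runs converge, but to different partitions: the first splits the points into a bottom pair and a top pair, the second into a left pair and a right pair with a much lower squared error. This is why practical implementations often restart K-means from several seeds and keep the best result.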