Introduction to Data Mining / Edition 1

Original price: $166.65

Introduction to Data Mining presents fundamental concepts and algorithms for those learning data mining for the first time. Each major topic is organized into two chapters, beginning with basic concepts that provide necessary background for understanding each data mining technique, followed by more advanced concepts and algorithms.

Product Details

ISBN-13: 9780321321367
Publisher: Pearson Education
Publication date: 05/02/2005
Edition description: Older Edition
Pages: 792
Product dimensions: 7.60(w) x 9.40(h) x 1.70(d)

About the Author

Dr. Pang-Ning Tan is a Professor in the Department of Computer Science and Engineering at Michigan State University. He received his M.S. degree in Physics and Ph.D. degree in Computer Science from the University of Minnesota. His research interests focus on the development of novel data mining algorithms for a broad range of applications, including climate and ecological sciences, cybersecurity, and network analysis. He has published more than 130 technical papers in the area of data mining, in top conferences and journals such as KDD, ICDM, SDM, CIKM, and TKDE.

Dr. Michael Steinbach is a Research Scientist in the Department of Computer Science and Engineering at the University of Minnesota, from which he earned a B.S. degree in Mathematics, an M.S. degree in Statistics, and M.S. and Ph.D. degrees in Computer Science. His research interests are in the areas of data mining, machine learning, and statistical learning, and their applications to fields such as climate, biology, and medicine. This research has resulted in more than 100 papers published in the proceedings of major data mining conferences and in computer science and domain journals. Before his academic career, he held a variety of software engineering, analysis, and design positions in industry at Silicon Biology, Racotek, and NCR.

Dr. Anuj Karpatne is a Postdoctoral Associate in the Department of Computer Science and Engineering at the University of Minnesota. He received his M.Tech in Mathematics and Computing from the Indian Institute of Technology Delhi, and a Ph.D. in Computer Science from the University of Minnesota under the guidance of Prof. Vipin Kumar. His research interests lie in the development of data mining and machine learning algorithms for solving scientific and socially relevant problems in varied disciplines such as climate science, hydrology, and healthcare. His research has been published in top-tier journals and conferences such as SDM, ICDM, KDD, NIPS, TKDE, and ACM Computing Surveys.

Dr. Vipin Kumar is a Regents Professor at the University of Minnesota, where he holds the William Norris Endowed Chair in the Department of Computer Science and Engineering. His research interests include data mining, high-performance computing, and their applications in climate/ecosystems and health care. Kumar's foundational research has been honored by the ACM SIGKDD 2012 Innovation Award, which is the highest award for technical excellence in the field of Knowledge Discovery and Data Mining (KDD), and the 2016 IEEE Computer Society Sidney Fernbach Award, one of the IEEE Computer Society's highest awards in high-performance computing.

Table of Contents

1 Introduction

1.1 What is Data Mining?

1.2 Motivating Challenges

1.3 The Origins of Data Mining

1.4 Data Mining Tasks

1.5 Scope and Organization of the Book

1.6 Bibliographic Notes

1.7 Exercises

2 Data

2.1 Types of Data

2.2 Data Quality

2.3 Data Preprocessing

2.4 Measures of Similarity and Dissimilarity

2.5 Bibliographic Notes

2.6 Exercises

3 Exploring Data

3.1 The Iris Data Set

3.2 Summary Statistics

3.3 Visualization

3.4 OLAP and Multidimensional Data Analysis

3.5 Bibliographic Notes

3.6 Exercises

4 Classification: Basic Concepts, Decision Trees, and Model Evaluation

4.1 Preliminaries

4.2 General Approach to Solving a Classification Problem

4.3 Decision Tree Induction

4.4 Model Overfitting

4.5 Evaluating the Performance of a Classifier

4.6 Methods for Comparing Classifiers

4.7 Bibliographic Notes

4.8 Exercises

5 Classification: Alternative Techniques

5.1 Rule-Based Classifier

5.2 Nearest-Neighbor Classifiers

5.3 Bayesian Classifiers

5.4 Artificial Neural Network (ANN)

5.5 Support Vector Machine (SVM)

5.6 Ensemble Methods

5.7 Class Imbalance Problem

5.8 Multiclass Problem

5.9 Bibliographic Notes

5.10 Exercises

6 Association Analysis: Basic Concepts and Algorithms

6.1 Problem Definition

6.2 Frequent Itemset Generation

6.3 Rule Generation

6.4 Compact Representation of Frequent Itemsets

6.5 Alternative Methods for Generating Frequent Itemsets

6.6 FP-Growth Algorithm

6.7 Evaluation of Association Patterns

6.8 Effect of Skewed Support Distribution

6.9 Bibliographic Notes

6.10 Exercises

7 Association Analysis: Advanced Concepts

7.1 Handling Categorical Attributes

7.2 Handling Continuous Attributes

7.3 Handling a Concept Hierarchy

7.4 Sequential Patterns

7.5 Subgraph Patterns
