ISBN-10: 1119967546
ISBN-13: 9781119967545
Pub. Date: 12/10/2012
Publisher: Wiley
Visual Data Mining: The VisMiner Approach / Edition 1

by Russell K. Anderson
Original price: $84.00

Product Details

Pages: 208
Product dimensions: 6.10(w) x 9.30(h) x 0.50(d)

About the Author

Russell K. Anderson, Information & Decision Management Department, West Texas A&M University, USA.

Table of Contents

Preface ix

Acknowledgments xi

1. Introduction 1

Data Mining Objectives 1

Introduction to VisMiner 2

The Data Mining Process 3

Initial Data Exploration 4

Dataset Preparation 5

Algorithm Selection and Application 8

Model Evaluation 8

Summary 9

2. Initial Data Exploration and Dataset Preparation Using VisMiner 11

The Rationale for Visualizations 11

Tutorial – Using VisMiner 13

Initializing VisMiner 13

Initializing the Slave Computers 14

Opening a Dataset 16

Viewing Summary Statistics 16

Exercise 2.1 17

The Correlation Matrix 18

Exercise 2.2 20

The Histogram 21

The Scatter Plot 23

Exercise 2.3 28

The Parallel Coordinate Plot 28

Exercise 2.4 33

Extracting Sub-populations Using the Parallel Coordinate Plot 37

Exercise 2.5 41

The Table Viewer 42

The Boundary Data Viewer 43

Exercise 2.6 47

The Boundary Data Viewer with Temporal Data 47

Exercise 2.7 49

Summary 49

3. Advanced Topics in Initial Exploration and Dataset Preparation Using VisMiner 51

Missing Values 51

Missing Values – An Example 53

Exploration Using the Location Plot 56

Exercise 3.1 61

Dataset Preparation – Creating Computed Columns 61

Exercise 3.2 63

Aggregating Data for Observation Reduction 63

Exercise 3.3 65

Combining Datasets 66

Exercise 3.4 67

Outliers and Data Validation 68

Range Checks 69

Fixed Range Outliers 69

Distribution Based Outliers 70

Computed Checks 72

Exercise 3.5 74

Feasibility and Consistency Checks 74

Data Correction Outside of VisMiner 75

Distribution Consistency 76

Pattern Checks 77

A Pattern Check of Experimental Data 80

Exercise 3.6 81

Summary 82

4. Prediction Algorithms for Data Mining 83

Decision Trees 84

Stopping the Splitting Process 86

A Decision Tree Example 87

Using Decision Trees 89

Decision Tree Advantages 89

Limitations 90

Artificial Neural Networks 90

Overfitting the Model 93

Moving Beyond Local Optima 94

ANN Advantages and Limitations 96

Support Vector Machines 97

Data Transformations 99

Moving Beyond Two-dimensional Predictors 100

SVM Advantages and Limitations 100

Summary 101

5. Classification Models in VisMiner 103

Dataset Preparation 103

Tutorial – Building and Evaluating Classification Models 104

Model Evaluation 104

Exercise 5.1 109

Prediction Likelihoods 109

Classification Model Performance 113

Interpreting the ROC Curve 119

Classification Ensembles 124

Model Application 125

Summary 127

Exercise 5.2 128

Exercise 5.3 128

6. Regression Analysis 131

The Regression Model 131

Correlation and Causation 132

Algorithms for Regression Analysis 133

Assessing Regression Model Performance 133

Model Validity 135

Looking Beyond R² 135

Polynomial Regression 137

Artificial Neural Networks for Regression Analysis 137

Dataset Preparation 137

Tutorial 138

A Regression Model for Home Appraisal 139

Modeling with the Right Set of Observations 139

Exercise 6.1 145

ANN Modeling 145

The Advantage of ANN Regression 148

Top-Down Attribute Selection 149

Issues in Model Interpretation 150

Model Validation 152

Model Application 153

Summary 154

7. Cluster Analysis 155

Introduction 155

Algorithms for Cluster Analysis 158

Issues with K-Means Clustering Process 158

Hierarchical Clustering 159

Measures of Cluster and Clustering Quality 159

Silhouette Coefficient 161

Correlation Coefficient 161

Self-Organizing Maps (SOM) 161

Self-Organizing Maps in VisMiner 163

Choosing the Grid Dimensions 168

Advantages of a 3-D Grid 169

Extracting Subsets from a Clustering 170

Summary 173

Appendix A VisMiner Reference by Task 175

Appendix B VisMiner Task/Tool Matrix 187

Appendix C IP Address Look-up 189

Index 191