ISBN-10: 047007471X
ISBN-13: 9780470074718
Pub. Date: 11/28/2006
Publisher: Wiley
Making Sense of Data: A Practical Guide to Exploratory Data Analysis and Data Mining / Edition 1

by Glenn J. Myatt

Paperback

Original price: $108.00

Product Details

Edition description: Older Edition
Pages: 292
Product dimensions: 6.18 in. (w) x 9.15 in. (h) x 0.62 in. (d)

About the Author

GLENN J. MYATT, PhD, is cofounder of Leadscope, Inc., a data mining company providing solutions to the pharmaceutical and chemical industry. He has also acted as a part-time lecturer in chemoinformatics at The Ohio State University and has held a series of industrial and academic research positions. Dr. Myatt is the author of numerous journal articles.

Table of Contents

Preface.

1. Introduction.

1.1 Overview.

1.2 Problem definition.

1.3 Data preparation.

1.4 Implementation of the analysis.

1.5 Deployment of the results.

1.6 Book outline.

1.7 Summary.

1.8 Further reading.

2. Definition.

2.1 Overview.

2.2 Objectives.

2.3 Deliverables.

2.4 Roles and responsibilities.

2.5 Project plan.

2.6 Case study.

2.6.1 Overview.

2.6.2 Problem.

2.6.3 Deliverables.

2.6.4 Roles and responsibilities.

2.6.5 Current situation.

2.6.6 Timetable and budget.

2.6.7 Cost/benefit analysis.

2.7 Summary.

2.8 Further reading.

3. Preparation.

3.1 Overview.

3.2 Data sources.

3.3 Data understanding.

3.3.1 Data tables.

3.3.2 Continuous and discrete variables.

3.3.3 Scales of measurement.

3.3.4 Roles in analysis.

3.3.5 Frequency distribution.

3.4 Data preparation.

3.4.1 Overview.

3.4.2 Cleaning the data.

3.4.3 Removing variables.

3.4.4 Data transformations.

3.4.5 Segmentation.

3.5 Summary.

3.6 Exercises.

3.7 Further reading.

4. Tables and graphs.

4.1 Introduction.

4.2 Tables.

4.2.1 Data tables.

4.2.2 Contingency tables.

4.2.3 Summary tables.

4.3 Graphs.

4.3.1 Overview.

4.3.2 Frequency polygrams and histograms.

4.3.3 Scatterplots.

4.3.4 Box plots.

4.3.5 Multiple graphs.

4.4 Summary.

4.5 Exercises.

4.6 Further reading.

5. Statistics.

5.1 Overview.

5.2 Descriptive statistics.

5.2.1 Overview.

5.2.2 Central tendency.

5.2.3 Variation.

5.2.4 Shape.

5.2.5 Example.

5.3 Inferential statistics.

5.3.1 Overview.

5.3.2 Confidence intervals.

5.3.3 Hypothesis tests.

5.3.4 Chi-square.

5.3.5 One-way analysis of variance.

5.4 Comparative statistics.

5.4.1 Overview.

5.4.2 Visualizing relationships.

5.4.3 Correlation coefficient (r).

5.4.4 Correlation analysis for more than two variables.

5.5 Summary.

5.6 Exercises.

5.7 Further reading.

6. Grouping.

6.1 Introduction.

6.1.1 Overview.

6.1.2 Grouping by values or ranges.

6.1.3 Similarity measures.

6.1.4 Grouping approaches.

6.2 Clustering.

6.2.1 Overview.

6.2.2 Hierarchical agglomerative clustering.

6.2.3 K-means clustering.

6.3 Associative rules.

6.3.1 Overview.

6.3.2 Grouping by value combinations.

6.3.3 Extracting rules from groups.

6.3.4 Example.

6.4 Decision trees.

6.4.1 Overview.

6.4.2 Tree generation.

6.4.3 Splitting criteria.

6.4.4 Example.

6.5 Summary.

6.6 Exercises.

6.7 Further reading.

7. Prediction.

7.1 Introduction.

7.1.1 Overview.

7.1.2 Classification.

7.1.3 Regression.

7.1.4 Building a prediction model.

7.1.5 Applying a prediction model.

7.2 Simple regression models.

7.2.1 Overview.

7.2.2 Simple linear regression.

7.2.3 Simple nonlinear regression.

7.3 K-nearest neighbors.

7.3.1 Overview.

7.3.2 Learning.

7.3.3 Prediction.

7.4 Classification and regression trees.

7.4.1 Overview.

7.4.2 Predicting using decision trees.

7.4.3 Example.

7.5 Neural networks.

7.5.1 Overview.

7.5.2 Neural network layers.

7.5.3 Node calculations.

7.5.4 Neural network predictions.

7.5.5 Learning process.

7.5.6 Backpropagation.

7.5.7 Using neural networks.

7.5.8 Example.

7.6 Other methods.

7.7 Summary.

7.8 Exercises.

7.9 Further reading.

8. Deployment.

8.1 Overview.

8.2 Deliverables.

8.3 Activities.

8.4 Deployment scenarios.

8.5 Summary.

8.6 Further reading.

9. Conclusions.

9.1 Summary of process.

9.2 Example.

9.2.1 Problem overview.

9.2.2 Problem definition.

9.2.3 Data preparation.

9.2.4 Implementation of the analysis.

9.2.5 Deployment of the results.

9.3 Advanced data mining.

9.3.1 Overview.

9.3.2 Text data mining.

9.3.3 Time series data mining.

9.3.4 Sequence data mining.

9.4 Further reading.

Appendix A Statistical tables.

A.1 Normal distribution.

A.2 Student’s t-distribution.

A.3 Chi-square distribution.

A.4 F-distribution.

Appendix B Answers to exercises.

Glossary.

Bibliography.

Index.

What People are Saying About This

From the Publisher

“This book is written in plain language and is useful to you who wants to learn about the business of gathering and analyzing data and accomplishing the objectives you set beforehand. Its most important message I think is that you define the problem so that the data gathered will be tailored to solving it.” (Biz India, 22 December 2012)

"…a well-written book on data analysis and data mining that provides an excellent foundation…" (CHOICE, May 2007)

"This is a must-read book for learning practical statistics and data analysis" (Computing Reviews.com, May 22, 2007)

"…the book should be accessible to all its intended readers." (MAA Reviews, December 28, 2006)
