Statistical and Machine-Learning Data Mining: Techniques for Better Predictive Modeling and Analysis of Big Data, Third Edition

by Bruce Ratner



Product Details

ISBN-13: 9781498797603
Publisher: Taylor & Francis
Publication date: 06/07/2017
Edition description: Revised
Pages: 696
Product dimensions: 7.00(w) x 10.00(h)

About the Author

Bruce Ratner, The Significant Statistician™, is President and Founder of DM STAT-1 Consulting, the ensample for Statistical Modeling, Analysis and Data Mining, and Machine-learning Data Mining in the DM Space. DM STAT-1 specializes in all standard statistical techniques, and methods using machine-learning/statistics algorithms, such as its patented GenIQ Model, to achieve its clients' goals – across industries including Direct and Database Marketing, Banking, Insurance, Finance, Retail, Telecommunications, Healthcare, Pharmaceutical, Publication & Circulation, Mass & Direct Advertising, Catalog Marketing, e-Commerce, Web-mining, B2B, Human Capital Management, Risk Management, and Nonprofit Fundraising. Bruce holds a doctorate in mathematics and statistics, with a concentration in multivariate statistics and response model simulation. His research interests include developing hybrid-modeling techniques, which combine traditional statistics and machine learning methods. He holds a patent for a unique application in solving the two-group classification problem with genetic programming.

Table of Contents

Preface to Third Edition

Preface of Second Edition



1. Introduction

2. Science Dealing with Data: Statistics and Data Science

3. Two Basic Data Mining Methods for Variable Assessment

4. CHAID-Based Data Mining for Paired-Variable Assessment

5. The Importance of Straight Data: Simplicity and Desirability for Good Model-Building Practice

6. Symmetrizing Ranked Data: A Statistical Data Mining Method for Improving the Predictive Power of Data

7. Principal Component Analysis: A Statistical Data Mining Method for Many-Variable Assessment

8. Market Share Estimation: Data Mining for an Exceptional Case

9. The Correlation Coefficient: Its Values Range between Plus and Minus 1, or Do They?

10. Logistic Regression: The Workhorse of Response Modeling

11. Predicting Share of Wallet without Survey Data

12. Ordinary Regression: The Workhorse of Profit Modeling

13. Variable Selection Methods in Regression: Ignorable Problem, Notable Solution

14. CHAID for Interpreting a Logistic Regression Model

15. The Importance of the Regression Coefficient

16. The Average Correlation: A Statistical Data Mining Measure for Assessment of Competing Predictive Models and the Importance of the Predictor Variables

17. CHAID for Specifying a Model with Interaction Variables

18. Market Segmentation Classification Modeling with Logistic Regression

19. Market Segmentation Based on Time-Series Data Using Latent Class Analysis

20. Market Segmentation: An Easy Way to Understand the Segments

21. The Statistical Regression Model: An Easy Way to Understand the Model

22. CHAID as a Method for Filling in Missing Values

23. Model Building with Big Complete and Incomplete Data

24. Art, Science, Numbers, and Poetry

25. Identifying Your Best Customers: Descriptive, Predictive, and Look-Alike Profiling

26. Assessment of Marketing Models

27. Decile Analysis: Perspective and Performance

28. Net T-C Lift Model: Assessing the Net Effects of Test and Control Campaigns

29. Bootstrapping in Marketing: A New Approach for Validating Models

30. Validating the Logistic Regression Model: Try Bootstrapping

31. Visualization of Marketing Models: Data Mining to Uncover Innards of a Model

32. The Predictive Contribution Coefficient: A Measure of Predictive Importance

33. Regression Modeling Involves Art, Science, and Poetry, Too

34. Opening the Dataset: A Twelve-Step Program for Dataholics

35. Genetic and Statistic Regression Models: A Comparison

36. Data Reuse: A Powerful Data Mining Effect of the GenIQ Model

37. A Data Mining Method for Moderating Outliers Instead of Discarding Them

38. Overfitting: Old Problem, New Solution

39. The Importance of Straight Data: Revisited

40. The GenIQ Model: Its Definition and an Application

41. Finding the Best Variables for Marketing Models

42. Interpretation of Coefficient-Free Models

43. Text Mining: Primer, Illustration, and TXTDM Software

44. Some of My Favorite Statistical Subroutines

