Multivariate Statistics and Machine Learning: An Introduction to Applied Data Science Using R and Python

by Daniel J. Denis

Paperback

$69.99 


Overview

Multivariate Statistics and Machine Learning is a hands-on textbook providing an in-depth guide to multivariate statistics and select machine learning topics using R and Python software.

The book offers a theoretical orientation to the concepts needed to introduce or review statistical and machine learning topics. In addition to teaching the techniques themselves, it instructs readers on how to perform, implement, and interpret code and analyses in R and Python across multivariate, data science, and machine learning domains. For readers wishing for additional theory, numerous references are provided throughout the textbook to deeper and less “hands-on” works.

With its unique breadth of topics covering a wide range of modern quantitative techniques, its user-friendliness, and the quality of its expository writing, Multivariate Statistics and Machine Learning will serve as a key and unifying introductory textbook for students in the social, natural, statistical, and computational sciences for years to come.


Product Details

ISBN-13: 9781032454283
Publisher: Taylor & Francis
Publication date: 10/10/2025
Pages: 560
Product dimensions: 7.00(w) x 10.00(h) x (d)

About the Author

Daniel J. Denis, Ph.D., is Professor of Quantitative Psychology at the University of Montana, where he has taught applied statistics courses since 2004. He is the author of the books Applied Univariate, Bivariate, and Multivariate Statistics and Applied Univariate, Bivariate, and Multivariate Statistics Using Python.

Table of Contents

Preface
/ Acknowledgements


PART I – Preliminaries and Foundations

Chapter 0 – Introduction, Motivation, Pedagogy and Ideas About Learning

0.1. The Paradigm Shift (What Has Changed)
/ 0.1.1. A Wide Divide
/ 0.2. A Unified Vision – The Bridge
/ 0.3. The Data Science and Machine Learning Invasion (Questions and Answers)
/ 0.4. Who Should Read this Book?
/ 0.4.1. Textbook Limbo
/ 0.4.2. Theoretical vs. Applied vs. Software Books vs. “Cookbooks”
/ 0.4.2.1. Watered Down Statistics
/ 0.4.3. Prerequisites to Reading this Book
/0.5. Pedagogical Approach and the Trade-Offs of Top-Down, Bottom-Up Learning
/ 0.5.1. Top-Down, Bottom-Up Learning
/ 0.5.2. Ways of Writing a Book: Making it Pedagogical Instead of Cryptic
/ 0.5.3. Standing on the Shoulders of Giants (A Companion to Advanced Texts)
/ 0.5.4. Making Equations “Speak”
/ 0.5.5. The Power of Problems
/ 0.5.6. Computing Languages
/ 0.5.7. Notation Used in the Book
/0.6. Nobody Learns a Million Things (The Importance of Foundations and Learning How to Learn)
/ 0.6.1. Essential Philosophy of Science and History
/ 0.6.2. Beyond the Jargon, Beyond the Hype
/ 0.7. The Power and Dangers of Analogy and Metaphor (Ways of Understanding)
/0.7.1. The Infinite Regress of Knowledge – A Venture into What it Means to “Understand” Something and Why Epistemology is Important
/ 0.7.1.2. Epistemological Maturity
/ 0.8. Format and Organization of Chapters

Chapter 1 – First Principles and Philosophical Foundations
/ 1.1. Science, Statistics, Machine Learning, Artificial Intelligence
/ 1.1.1. Mathematics, Statistics, Computation
/ 1.1.2. Mathematical Systems as a Narrative to Understanding
/ 1.2. The Scope of Data Analysis and Data Science (Expertise in Everything!)
/ 1.2.1. Theoretical vs. Applied Statistics & Specialization
/ 1.3. The Role of Computers
/ 1.3.1. The Nature of Algorithms
/ 1.3.1.2. Algorithmic Stability
/ 1.4. The Importance of Design, Experimental or Otherwise
/1.5. Inductive, Deductive, and Other Logics
/ 1.5.1. Consistency and Gödel’s Incompleteness Theorems
/ 1.5.1.2. What is the Relevance of Gödel?
/ 1.6. Supervised vs. Unsupervised Learning
/ 1.6.1. Fuzzy Distinctions
/ 1.7. Theoretical vs. Empirical Justification
/ 1.7.1. Airplanes and Oceanic Submersibles
/ 1.7.2. Will the Bridge Stay Up if the Mathematics Fail?
/ 1.8. Level of Analysis Problem
/ 1.9. Base Rates, Common Denominators and Degrees
/1.9.1. Base Rates and Splenic Masses
/ 1.9.2. Probability Neglect
/ 1.9.3. The “Zero Group”
/ 1.10. Statistical Regularities and Perceptions of Risk
/ 1.10.1. Beck Depression Inventory: How Depressed Are You?
/ 1.11. Decision, Risk Analysis and Optimization
/ 1.11.1. The Risk of Making a Wrong Decision
/ 1.11.2. Statistical Lives and Optimization
/ 1.11.3. Medical Decision-Making and Dominating Criteria
/ 1.12. All Knowledge, Scientific and Other, is Tentative
/ 1.13. Occam’s Razor
/ 1.13.1. Parsimony vs. Complexity Trade-Off
/1.14. Overfitting vs. Underfitting
/ 1.14.1. Solutions to Overfitting
/ 1.14.2. The Idea of Regularization
/1.15. The Measurement Problem
/ 1.15.1. What is Data?
/ 1.15.2. The Philosophy and Scales of Measurement
/ 1.15.3. Reliability
/ 1.15.3.1. Coefficient Alpha
/ 1.15.3.2. Test-Retest Reliability
/ 1.15.4. Validity
/ 1.15.5. Scales of Measurement
/ 1.15.6. Likert Scales
/ 1.15.6.1. Statistical Models for Likert Data
/ 1.15.6.2. Models for Ordinal and Monotonically Increasing/Decreasing Data

Overview of Statistical and Machine Learning Concepts
/1.16. Probably Approximately Correct
/1.17. No Free Lunch Theorem
/1.18. V-C Dimension and Complexity
/1.19. Parametric vs. Nonparametric Learning Methods
/ 1.19.1. Flexibility and Number of Parameters
/ 1.19.1.1. Concept of Degrees of Freedom
/ 1.19.2. Instance or Memory-Based Learning
/ 1.19.3. Revisiting Classical Nonparametric Tests
/1.20. Dimension Reduction, Distance, and Error Functions: Commonalities in Modeling
/1.20.1. Dimension Reduction: What’s the Big Idea?
/1.20.2. The Curse of Dimensionality
/1.21. Distance
/1.22. Error Minimization
/1.23. Training vs. Test Error
/1.24. Cross-Validation and Model Selection
/1.25. Monte Carlo Methods
/1.26. Missing Data
/1.27. Quantitative Approaches to Data Analysis
/1.28. Chapter Review Exercises


Chapter 2 – Mathematical and Statistical Foundations

2.1. Mathematical “Previews” vs. the “Appendix” Approach (Why Previews are Better)
/ 2.1.2. About Proofs

2.2. Elementary Probability and Fundamental Statistics
/ 2.3. Interpretations of Probability
/ 2.4. Mathematical Probability
/ 2.4.1. Unions and Intersections of Events
/ 2.5. Conditional Probability
/ 2.5.1. Unconditional vs. Conditional Statistical Models
/ 2.6. Probabilistic Independence
/ 2.6.1. Everything is About Independence vs. Dependence!
/ 2.7. Marginal vs. Conditional Distributions
/ 2.8. Independence Implies Covariance of Zero, But Covariance of Zero Does Not (Necessarily) Imply Independence
/ 2.9. Sensitivity and Specificity: More Conditional Probabilities
/ 2.10. Bayes’ Theorem and Conditional Probabilities
/ 2.10.1. Bayes’ Factor
/ 2.10.2. Bayesian Model Selection
/ 2.10.3. Bayes’ Theorem as Rational Belief or Theorizing
/ 2.11. Law of Large Numbers
/ 2.11.1. Law of Large Numbers and the Idea of Committee Machines
/ 2.12. Random Variables and Probability Density Functions
/ 2.13. Convergence of Random Variables
/ 2.14. Probability Density Functions
/ 2.15. Normal (Gaussian) Distributions
/ 2.15.1. Univariate Gaussian
/ 2.15.2. Mixtures of Gaussians
/ 2.15.3. Evaluating Univariate Normality
/ 2.15.4. Multivariate Gaussian
/ 2.15.5. Evaluating Multivariate Normality
/2.16. Binomial Distributions
/2.16.1. Approximation to the Normal Distribution
/ 2.17. Multinomial Distributions
/2.18. Poisson Distribution
/2.19. Chi-Square Distributions
/2.20. Expectation and Expected Value
/ 2.21. Measures of Central Tendency
/ 2.21.1. The Arithmetic Mean (Average)
/ 2.21.1.1. Averaging Over Cases (Why Thinking in Terms of Averages Can Be Dangerous)
/ 2.21.2. The Median
/ 2.22. Measures of Variability
/ 2.22.1. Variance and Standard Deviation
/ 2.22.2. Mean Absolute Deviation
/ 2.23. Skewness and Kurtosis
/ 2.24. Coefficient of Variation
/ 2.25. Statistical Estimation
/ 2.26. Bias-Variance Trade-Off
/ 2.26.1. Is Irreducible Error Really Irreducible?
/ 2.27. Maximum Likelihood Estimation
/ 2.27.1. Why ML is so Popular and Alternatives
/ 2.27.2. Estimation and Confidence Intervals
/ 2.28. The Bootstrap (A Way of Estimating Nonparametrically)
/ 2.28.1. Simple Examples of the Bootstrap
/ 2.28.2. Why not Bootstrap Everything?
/ 2.28.3. Variations and Extensions of the Bootstrap
/2.29. Elements of Classic Null Hypothesis Significance Testing
/ 2.29.1. One-Tailed vs. Two-Tailed Tests
/ 2.29.2. Effect Size
/ 2.29.3. Cohen’s d (Measure of Effect Size)
/ 2.29.4. Are p-values that Evil?
/ 2.29.5. Absolute vs. Relative Size of Effect (Context Matters)
/ 2.29.6. Comparability of Effect Sizes Across Studies
/ 2.29.7. Operationalizing Predictors
/ 2.30. Central Limit Theorem
/ 2.31. Covariance and Correlation
/ 2.31.1. Why Does rxy Have Limits -1 to +1?
/ 2.31.2. Covariance and Correlation in R and Python
/ 2.31.3. Correlating Linear Combinations
/ 2.31.4. Covariance and Correlation Matrices
/ 2.32. Z-Scores and Z-Tests
/ 2.32.1. Z-tests and T-tests for the Mean
/ 2.33. Unequal Variances: Welch-Satterthwaite Approximation
/ 2.34. Paired Data
/ 2.35. Review Exercises

2.36 Linear Algebra and Matrices

2.36.1. Vectors
/ 2.36.1.2. Vector Spaces and Fields
/ 2.36.1.3. Zero, Unit Vectors, and One-Hot Vectors
/ 2.36.1.4. Transpose of a Vector
/ 2.36.1.5. Vector Addition and Length
/ 2.36.1.6. Eigen Analysis and Decomposition
/2.36.1.7. Points vs. Vectors
/ 2.37. Matrices
/ 2.37.1. Identity Matrix
/ 2.37.2. Transpose of a Matrix
/ 2.37.3. Symmetric Matrices
/ 2.37.4. Matrix Addition and Multiplication
/ 2.37.5. Meaning of Matrices (Matrices as Data and Transformations)
/ 2.37.6. Kernel (Null Space)
/ 2.37.7. Trace of a Matrix
/ 2.38. Linear Combinations
/ 2.39. Determinants
/ 2.40. Means and Variances of Matrices
/ 2.41. Determinant as a Generalized Variance
/ 2.42. Matrix Inverse
/ 2.42.1. Nonexistence of an Inverse and Singularity
/ 2.43. Quadratic Forms
/2.44. Positive Definite Matrices
/ 2.45. Inner Products
/ 2.46. Linear Independence
/ 2.47. Rank of a Matrix
/ 2.48. Orthogonal Matrices
/ 2.49. Kernels, the Kernel Trick, and Dual Representations
/ 2.49.1. When are Kernel Methods Useful?
/ 2.50. Systems of Equations
/ 2.51. Distance
/ 2.52. Projections and Basis
/ 2.53. The Meaning of Linearity
/ 2.54. Basis and Dimension
/ 2.54.1. Orthogonal Basis
/ 2.55. Review Exercises

2.56. Calculus and Optimization
/ 2.57. Functions, Approximation and Continuity
/ 2.57.1. Definition of Continuity
/2.58. The Derivative
/ 2.58.1. Local Behavior and Approximation
/ 2.58.2. Composite Functions and Basis Expansions
/2.59. The Partial Derivative
/2.60. Optimization and Gradients
/ 2.60.1. What Does “Optimal” Mean?
/ 2.60.2. Minima and Maxima via Calculus
/ 2.60.3. Convex vs. Non-Convex Functions and Sets
/2.61. Gradient Descent
/ 2.61.1. How Does Gradient Descent Find Minima?
/2.62. Integral Calculus
/ 2.62.1. Double and Triple Integrals
/2.63. Review Exercises


Chapter 3 – R and Python Software

3.1. The Dominance of R and Python
/3.2. The R-Project
/ 3.2.1. Installing R
/ 3.2.2. Working with Data
/ 3.2.2.1. Building a Data Frame
/ 3.2.3. Installing Packages in R
/ 3.2.4. Writing Functions in R
/ 3.2.5. Mathematics and Statistics Using R
/ 3.2.5.1. Addition, Subtraction, Multiplication and Division
/ 3.2.5.2. Logarithms and Exponentials
/ 3.2.5.3. Vectors and Matrices
/ 3.2.5.4. Means
/ 3.2.5.5. Covariance and Correlation
/ 3.2.5.6. Sampling with Replacement in R
/ 3.2.5.7. Visualization and Plots
/ 3.2.5.7.1. Boxplots
/ 3.2.6. Further Readings and Resources in R

3.3. Python
/ 3.3.1. Installing Python
/ 3.3.2. Elements of Python
/ 3.3.3. Working With Data
/ 3.3.4. Python Functions for Data Analysis
/ 3.3.4.1. Mathematics Using Python
/ 3.3.4.2. Splitting Data into Train and Test Sets
/ 3.3.4.3. Preprocessing Data
/ 3.3.5. Further Readings and Resources in Python
/3.4. Chapter Review Exercises


PART II – Models and Methods

Chapter 4 – Univariate and Multivariate Analysis of Variance Models

4.1. The Classic ANOVA Model
/ 4.1.1. Mean Squares
/ 4.1.2. Expected Mean Squares of ANOVA
/ 4.1.3. Effect Sizes for ANOVA
/ 4.1.4. Contrasts and Post-Hoc Tests for ANOVA
/ 4.1.5. ANOVA in Python
/ 4.1.6. ANOVA in R
/ 4.2. Factorial ANOVA and Higher-Order Models
/ 4.2.1. Factorial ANOVA in Python
/ 4.3. Random Effects and Mixed Models
/ 4.3.1. The Meaning of a Fixed vs. Random Effect
/ 4.3.2. Is the Fixed-Effects Model Actually Fixed? A Look at the Error Term
/ 4.3.3. Mixed Models in Python
/ 4.3.4. Mixed Models in R
/ 4.4. Multilevel Modeling
/ 4.4.1. A Garbled Mess of Jargon
/ 4.4.2. Why Do Multilevel Models Often Include Random Effects?
/ 4.4.3. A Priori vs. Post-Hoc “Nesting”
/ 4.4.4. Blocking as an Example of Hierarchical/Multilevel Structure
/ 4.4.5. Non-Parametric Random-Effects Model
/ 4.5. Repeated Measures and Longitudinal Models
/ 4.5.1. Classic Repeated Measures Models
/ 4.6. Multivariate Analysis of Variance (MANOVA)
/ 4.6.1. Suitability of MANOVA
/ 4.6.2. Extending the Univariate Model (Hotelling’s T2)
/ 4.6.3. Multivariate Test Statistics
/ 4.6.4. Evaluating Equality of Covariance Matrices (The Box-M Test)
/ 4.6.5. MANOVA in Python
/ 4.6.6. MANOVA in R
/ 4.7. Linear Discriminant Analysis (as the “Reverse” of MANOVA)
/ 4.8. Chapter Review Exercises


Chapter 5 – Simple Linear and Multiple Regression Models (and Extensions)
/ 5.1. Simple Linear Regression – Fixed Predictor Case
/ 5.1.1. Parameter Estimates
/ 5.1.2. Simple Linear Regression in R
/ 5.1.3. Simple Linear Regression in Python
/ 5.2. Multiple Linear Regression
/ 5.2.1. Minimizing Squared vs. Absolute Deviations
/ 5.2.2. Hypothesis-Testing in Multiple Regression
/ 5.2.3. Multiple Linear Regression in Python
/ 5.2.4. Multiple Linear Regression in R
/ 5.3. Geometry of Least-Squares
/ 5.4. Gauss-Markov Theorem (What We Like About Least-Squares Estimates)
/ 5.4.1. Are Unbiased Estimators Always Best?
/ 5.5. Time Series (An Example of Correlated Errors)
/ 5.6. Model Selection in Regression (Is There an Optimal Model?)
/5.7. Effect Size and Adjusting the Training Error Rate
/ 5.7.1. R2, Adjusted R2, Cp, AIC, BIC
/ 5.7.2. Comparing R2, Adjusted R2, AIC, BIC to Cross-Validation
/ 5.8. Assumptions for Regression
/ 5.8.1. Collinearity
/ 5.8.1.1. Variance Inflation Factor
/ 5.8.2. Collinearity Necessarily Implies Redundancy Only in Terms of Variance
/5.9. Variable Selection Methods (Building the Regression Model)
/ 5.9.1. Forward, Backward and Stepwise in R
/5.10. Mediated and Moderated Regression Models
/ 5.10.1. Statistical Mediation
/ 5.10.2. Statistical Moderation
/ 5.10.3. Moderated Mediation
/ 5.10.4. Mediation in Python
/ 5.11. Further Directions and a Final Word of Warning on Mediation and Moderation
/5.12. Principal Components Regression
/ 5.12.1. What is Principal Components Analysis?
/ 5.12.2. PCA Regression and Singularity
/ 5.12.3. Principal Components Regression in R
/5.13. Partial Least-Squares Regression
/ 5.13.1. Partial Least Squares in R
/ 5.13.2. Partial Least Squares in Python
/5.14. Multivariate Reduced-Rank Regression
/5.15. Canonical Correlation
/5.15.1. Canonical Correlation in R
/5.16. Chapter Review Exercises

Chapter 6 – Regularization Methods in Regression: Ridge, Lasso, Elastic Net
/ 6.1. The Concept of Regularization
/ 6.1.1. Regularization in Regression and Beyond
/ 6.2. Ridge Regression
/ 6.2.1. Mathematics of Ridge Regression
/ 6.2.2. Consequence of Ridge Estimator
/ 6.2.3. Revisiting the Bias-Variance Tradeoff (Why Ridge is Useful)
/ 6.2.4. A Visual Look at Ridge Regression
/ 6.2.5. Ridge Regression in Python
/ 6.2.6. Ridge Regression in R
/ 6.3. Lasso Regression
/ 6.3.1. Lasso Regression in Python
/ 6.3.2. Lasso Regression in R
/ 6.4. Elastic Net
/ 6.4.1. Elastic Net in Python
/ 6.5. Which Regularization Penalty is Better?
/ 6.6. Least-Angle Regression
/ 6.6.1. Least-Angle Regression in R
/ 6.7. Additional Variable Selection Algorithms
/ 6.8. Chapter Review Exercises

Chapter 7 – Nonlinear and Nonparametric Regression

7.1. Polynomial Regression
/ 7.1.1. Polynomial Regression in Python
/ 7.1.2. Polynomial Regression in R
/ 7.1.3. Polynomial Regression as a Global Strategy
/ 7.1.4. A More Local Alternative
/ 7.1.5. Least-Squares Regression Line as a “Floating Mean” (Toward a Localized Approach)
/ 7.1.5.1. Zooming in on Locality
/ 7.2. Basis Functions and Expansions
/ 7.2.1. Basis Functions and Locality
/ 7.2.2. Neural Networks as a Basis Expansion (Generalizing the Concept)
/ 7.2.3. Regression Splines and the Concept of a “Knot”
/ 7.2.4. Conceptualizing Regression Splines
/ 7.2.5. Problem with Splines and Imposing Constraints
/ 7.2.6. Polynomial Regression vs. Regression Splines
/ 7.3. Nonparametric Regression: Local and Kernel Regression
/ 7.3.1. Motivating Kernel Regression via Local-Averaging
/ 7.3.2. Kernel Regression – “Locally Weighted Averaging”
/7.3.3. A Variety of Kernels
/ 7.3.4. Kernel Regression is not Nonlinear; It is Nonparametric
/ 7.3.5. Kernel Regression in R
/ 7.4. Chapter Review Exercises

Chapter 8 – Generalized Linear and Additive Models: Logistic, Poisson, and Related Models

8.1. How to Operationalize the Response
/ 8.1.1. Pros and Cons of Binning
/ 8.1.2. Detecting New Classes or Categories
/ 8.2. The Generalized Linear Model
/ 8.2.1. Intrinsically Linear Models
/ 8.2.2. General vs. Generalized Linear Models
/ 8.3. The Logistic Regression Model
/ 8.3.1. Odds and Odds Ratios
/ 8.3.2. Logistic Regression in R
/ 8.3.3. Logistic Regression in Python
/ 8.4. Generalized Linear Models and Neural Networks
/ 8.5. Multiple Logistic Regression
/ 8.5.1. Multiple Logistic Regression in R
/ 8.6. Poisson Regression
/ 8.6.1. Poisson Regression in R
/ 8.6.2. Poisson Regression in Python
/ 8.7. Generalized Additive Models (A Flexible Nonparametric Alternative)
/ 8.7.1. Why Use a Smoother Instead of Linear Weights?
/ 8.7.2. Deriving the Generalized Additive Model
/ 8.7.3. GAM as a Smooth Extension to GLM
/ 8.7.4. Generalized Additive Models and Neural Networks
/ 8.7.5. Linking the Logit to the Additive Logistic Model
/ 8.8. Overview and Recap of Nonlinear Approaches for Nonlinear Regression
/ 8.9. Discriminant Analysis
/ 8.9.1. Bayes is Best for Classification
/ 8.9.2. Why Not Always Bayes?
/ 8.9.3. The Linear Discriminant Analysis Model
/ 8.9.4. How LDA Approximates Bayes
/ 8.9.5. Estimating the Prior Probability
/ 8.10. Multiclass Discriminant Analysis
/ 8.11. Discriminant Analysis in a Simple Scatterplot
/ 8.12. Quadratic Discriminant Analysis
/ 8.13. Regularized Discriminant Analysis
/8.14. Discriminant Analysis in R
/8.15. Discriminant Analysis in Python
/ 8.16. Naïve Bayes (Approximating the Bayes Classifier by Assuming (Conditional) Independence)
/ 8.16.1. What Makes Naïve Bayes “Naïve”?
/ 8.16.2. Naïve Bayes in Python
/ 8.17. LDA, QDA, Naïve Bayes: Which is Best?
/ 8.18. Nonparametric K-Nearest Neighbors
/ 8.18.1. K-Nearest Neighbor (KNN): Only Looking at Nearby Points
/ 8.18.2. Example of KNN
/ 8.18.3. Disadvantages of KNN
/ 8.19. Chapter Review Exercises


Chapter 9 – Support Vector Machines

9.1. Maximum Margin Classifier
/ 9.1.1. When Sum Does Not Equal Zero
/ 9.1.1.1. So, What’s the Problem?
/ 9.1.2. Building the Maximal Margin Classifier
/ 9.2. The Case of No Separating Hyperplane
/ 9.3. Support Vector Classifier for the Non-Separable Case
/ 9.4. Support Vector Machines (Introducing the Kernel for Nonlinearity)
/ 9.4.1. Enlarging the Feature Space with Kernels
/ 9.4.2. Support Vector Machines in Python
/ 9.4.3. Support Vector Machines in R
/9.5. Chapter Review Exercises

Chapter 10 – Decision Trees, Bagging, Random Forests and Committee Machines

10.1. Why Combining Weak Learners Works (Concept of Variance Reduction Using Averages)
/ 10.2. Decision Trees
/ 10.2.1. How Should Trees Be Grown?
/ 10.2.2. Optimization Criteria for Tree-Building
/ 10.2.3. Why not Multiway Splits?
/ 10.2.4. Overfitting, Saturation, and Tree Pruning
/ 10.2.5. Cost-Complexity or Weakest-Link Pruning
/ 10.3. Classification Trees
/ 10.3.1. Gini Index
/ 10.3.2. Decision Trees in R
/10.4. Committee Machines
/10.5. Overview of Bagging and Boosting
/10.5.1. Bagging
/ 10.5.2. A Familiar Example (Bagging Samples and the Variance of the Mean)
/ 10.5.3. A Deeper Look at Bagging
/ 10.5.4. Out-of-Bag Error
/ 10.5.5. Interpreting Results from Bagging
/ 10.5.6. Bagging in R
/10.6. Random Forests
/ 10.6.1. The Problem with Bagging Decision Trees
/ 10.6.2. Equivalency of Random Forests and Bagging
/ 10.6.3. Random Forests in R
/10.7. Boosting
/ 10.7.1. Boosting Using R
/10.8. Stacked Generalization
/10.9. Chapter Review Exercises

Chapter 11 – Principal Components Analysis, Blind Source Separation, and Manifold Learning

11.1. Dimension Reduction and Jargon
/ 11.2. Deriving Classic Principal Components Analysis
/ 11.2.1. The 2nd Principal Component
/ 11.2.2. PCA as a Least-Squares Technique and Minimizing Reconstruction Error
/ 11.2.3. Choosing the Number of Derived Components
/ 11.2.4. Why Reconstruction Error is Insufficient for Choosing Number of Components
/ 11.2.5. Constraints on Components
/11.2.6. Centering Observed Variables
/11.2.7. Orthogonality of Components
/11.2.8. Proportion of Variance Explained by Each Component (Covariance vs. Correlation Matrices)
/11.2.9. Principal Components as a Rotation of Axes
/ 11.2.10 Principal Components, Discriminant Functions, Canonical Variates (Linking Foundations)
/ 11.2.11. Principal Components in Python
/11.2.12. Principal Components in R
/11.2.13. Cautionary Concerns and Caveats Regarding Principal Components
/ 11.3. Independent Components Analysis
/ 11.3.1. Principal Components vs. Independent Components Analysis
/ 11.4. Probabilistic PCA
/ 11.4.1. Motivation for Probabilistic PCA
/ 11.4.2. Probabilistic PCA in R
/ 11.5. PCA for Discrete, Binary, and Categorical Data
/ 11.6. Nonlinear Dimension Reduction
/ 11.6.1. Kernel PCA
/ 11.6.2. How KPCA Works
/ 11.6.3. Kernelizing and Computational Complexity
/ 11.6.4. Reconstruction Error in Kernel PCA
/ 11.6.5. The Matrices of Kernel PCA
/ 11.6.6. Classical PCA as a Special Case of Kernel PCA
/ 11.6.7. “Kernel Trick” is Not Simply About Cost
/ 11.6.8. Kernel PCA in Python
/ 11.6.9. Kernel PCA in R
/ 11.7. Principal Curves
/ 11.7.1. Principal Components as a Special Case of Principal Curves and Surfaces
/11.7.2. Principal Curves in R
/ 11.8. Principal Components Analysis as an Encoder
/ 11.9. Neural Networks and PCA as Autoencoders
/ 11.10. Multidimensional Scaling
/ 11.10.1. Merits of MDS
/ 11.10.2. Metric vs. Non-Metric (Ordinal) Scaling
/ 11.10.3. Weakness of MDS: “Closeness” Can be Arbitrary
/ 11.10.4. Standardization of Distances
/ 11.10.5. MDS in Python
/ 11.10.6. MDS in R
/ 11.11. Self-Organizing Maps
/ 11.12. Manifold Learning
/ 11.12.1. Manifold Hypothesis
/ 11.12.2. Example of a Simple Manifold
/ 11.12.3. Nonparametric Manifolds
/ 11.12.4. Geodesic Distances
/ 11.13. Local Linear Embedding
/ 11.13.1. LLE in Python
/ 11.14. Isomap
/ 11.14.1. Isomap in Python
/ 11.15. Stochastic Neighbor Embedding (SNE)
/ 11.15.1. SNE in R
/ 11.16. t-SNE
/ 11.16.1. Performance of t-SNE Compared to Other Techniques
/ 11.16.2. t-SNE in Python
/ 11.16.3. t-SNE in R
/ 11.17. Manifold Learning and Beyond
/11.18. Chapter Review Exercises

Chapter 12 – Exploratory Factor Analysis

12.1. Why Treat Factor Analysis in its Own Chapter?
/12.2. Common Orthogonal Factor Model
/ 12.2.1. Factor Analysis is a Regression Model
/ 12.2.2. Assumptions Underlying the Factor Analysis Model
/ 12.2.3. Implied Covariance Matrix
/ 12.3. The Problem with Factor Analysis
/ 12.3.1. The Problem is the Users, Not the Method
/ 12.3.2. Factor Analysis Generalizes to Machine Learning
/ 12.4. Factor Estimation
/ 12.4.1. Principal Factor (Principal Axis Factoring)
/ 12.4.2. Maximum Likelihood
/ 12.5. Factor Rotation
/ 12.5.1. Varimax
/ 12.5.2. Quartimax
/ 12.6. Bartlett’s Test of Sphericity
/ 12.6.1. Factor Analysis in Python
/ 12.6.2. Factor Analysis in R
/12.7. Independent Factor Analysis
/12.8. Nonlinear Factor Analysis (and Autoencoders)
/ 12.8.1. Unpacking the Autoencoder
/ 12.8.2. Factor Analysis as a Neural Network
/ 12.9. Probabilistic “Sensible” PCA (again)
/ 12.10. Mixtures of Factor Analysis (Modeling Local Linearity)
/ 12.11. Item Factor Analysis
/ 12.12. Sparse Factor Analysis
/ 12.13. Chapter Review Exercises

Chapter 13 – Confirmatory Factor Analysis, Path Analysis and Structural Equation Modeling

13.1. What Makes a Model “Exploratory” vs. “Confirmatory”?
/ 13.2. Why “Causal Modeling” is not Causal at all
/ 13.2.1. Misguided History
/ 13.2.2. Baron and Kenny (1986)
/ 13.3. Is the Variable Measurable? The Observed vs. Unobserved Distinction
/ 13.4. Path Analysis (Extending Regression and Previewing SEM)
/ 13.4.1. Exogenous vs. Endogenous Variables
/ 13.5. Confirmatory Factor Analysis Model
/ 13.6. Structural Equation Models
/ 13.6.1. Covariance Modeling
/ 13.6.2. Evaluating Model Fit
/ 13.6.3. Overall (Absolute) Measures
/ 13.6.4. Incremental Fit Indices
/ 13.7. Structural Equation Modeling with Nonlinear Effects
/ 13.7.1. Example of a Nonlinear SEM
/ 13.7.2. Structural Equation Nonparametric and Semiparametric Mixture Models
/ 13.8. Caveats Regarding SEM Models
/ 13.9. SEM in R
/13.10. Chapter Review Exercises

Chapter 14 – Cluster Analysis and Data Segmentation

14.1. Cluster Paradigms and Classifications
/ 14.2. Are Clusters Meaningful?
/ 14.3. Dissimilarity Metrics (The Basis of Clustering Algorithms)
/ 14.4. Association Rules (Market Basket Analysis)
/ 14.5. Why Not Consider All Groups?
/ 14.5.1. What Makes a “Good” Clustering Algorithm?
/ 14.6. Distance and Proximity Metrics
/ 14.7. Is the Data Clusterable?
/ 14.8. Algorithms for Cluster Analysis
/ 14.8.1. K-Means Clustering and K-Representatives
/ 14.8.2. How K-Means Works
/ 14.8.3. Defining Proximity for K-Means
/ 14.8.4. Setting k in K-Means
/ 14.8.5. Weakness of K-Means
/ 14.8.6. K-means vs. ANOVA vs. Discriminant Analysis
/ 14.8.7. Making K-Means Probabilistic via K-Means++
/ 14.8.8. Using the Data: K-Medoids Clustering
/ 14.8.9. K-Means in Python
/ 14.8.10. K-Means in R
/ 14.9. Sparse and Longitudinal K-Means
/ 14.10. Hierarchical Clustering
/ 14.10.1. Agglomerative Clustering in Python
/ 14.10.2. Agglomerative Clustering in R
/ 14.11. Density-Based Clustering (DBSCAN)
/ 14.11.1. Dense Points and Crowded Regions
/ 14.11.2. DBSCAN in R
/ 14.12. Clustering via Mixture Models
/ 14.12.1. Model Selection for Clustering Solutions
/ 14.13. Cluster Validation
/ 14.14. Cluster Analysis and Beyond
/ 14.15. Chapter Review Exercises


Chapter 15 – Artificial Neural Networks and Deep Learning

15.1. The Rise of Neural Networks: Original Motivation
/ 15.2. Rosenblatt’s Perceptron
/ 15.3. Big Picture Overview of Machine Learning and Neural Networks
/ 15.3.1. What is a Neural Network? (Minimizing the Hype)
/ 15.3.2. Neural Networks are Composite Functions
/ 15.4. Single Layer Feedforward Neural Network
/ 15.5. What is an Activation Function?
/ 15.5.1. Activation Functions do not “Activate” Anything
/ 15.5.2. Types of Activation Functions
/ 15.5.3. Saturating vs. Non-Saturating Activation Functions
/ 15.5.3.1. The Problem with ReLU
/ 15.5.3.2. LeakyReLU
/ 15.5.4. Which Activation Function to Use?
/ 15.6. The Multilayer Perceptron – A Deeper Look at Neural Networks
/ 15.7. Training Neural Networks
/ 15.7.1. Backpropagation and Minimizing Error Sums of Squares
/ 15.8. How Many Hidden Nodes and Layers to Include?
/ 15.9. Overfitting in Neural Networks
/ 15.9.1. Early Stopping
/ 15.9.2. Dropout Method
/ 15.9.3. Regularized Network
/ 15.10. Types of Networks
/15.11. The Universal Approximation Theorem (The Appeal of Neural Networks)
/ 15.11.1. Visualizing the Universal Approximation Theorem
/ 15.12. Neural Networks and Projection Pursuit
/ 15.12.1. Projection Pursuit Regression and Relation to Neural Networks
/ 15.13. Summary, Warnings and Caveats of Neural Networks
/ 15.14. Neural Networks in Python
/15.15. Neural Networks in R
/15.16. Chapter Review Exercises

Concluding Remarks

References

Index
