Fundamentals of Robust Machine Learning: Handling Outliers and Anomalies in Data Science
An essential guide for tackling outliers and anomalies in machine learning and data science.

Hardcover

$110.00 

Overview

An essential guide for tackling outliers and anomalies in machine learning and data science.

In recent years, machine learning (ML) has transformed virtually every area of research and technology, becoming one of the key tools for data scientists. Robust machine learning is a newer approach to handling outliers in datasets, an often-overlooked aspect of data science. Ignoring outliers can lead to bad business decisions, incorrect medical diagnoses, faulty conclusions, or misjudged feature importance, to name just a few consequences.
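
To make the effect concrete, here is a tiny sketch (our illustration with made-up numbers, not code from the book) of how a single corrupted measurement drags the sample mean away from the bulk of the data while the median, the robust statistic discussed in Section 1.4 of the book, barely moves:

```python
# Minimal illustration (not from the book): one bad reading distorts the mean
# but leaves the median essentially unchanged.
import numpy as np

readings = np.array([9.8, 10.1, 10.0, 9.9, 10.2, 98.0])  # 98.0 is a hypothetical data-entry error

print(f"mean:   {readings.mean():.2f}")      # ~24.67, pulled far from the typical value of ~10
print(f"median: {np.median(readings):.2f}")  # 10.05, essentially unaffected
```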

Fundamentals of Robust Machine Learning offers a thorough but accessible overview of this subject by focusing on how to properly handle outliers and anomalies in datasets. The book describes two main approaches: using outlier-tolerant ML tools, or detecting and removing outliers before applying conventional tools. Balancing theoretical foundations with practical Python code, it provides the skills needed to enhance the accuracy, stability, and reliability of ML models.
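
As a rough illustration of these two approaches (a minimal scikit-learn sketch on assumed synthetic data, not the book's own code), one can either fit an outlier-tolerant estimator such as a Huber-loss regressor directly, or first flag and drop suspect points, here with a simple MAD-based edit rule of the kind covered in Chapter 4, and then fit an ordinary least-squares model:

```python
# Illustrative sketch only: two ways to cope with outliers in a regression task.
import numpy as np
from sklearn.linear_model import HuberRegressor, LinearRegression

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))
y = 3.0 * X.ravel() + 1.0 + rng.normal(0, 1, 200)
y[:10] += 50.0  # inject a handful of gross outliers

# Approach 1: use an outlier-tolerant estimator (Huber loss) directly.
robust = HuberRegressor().fit(X, y)

# Approach 2: detect and remove outliers first, then use a conventional tool.
resid = y - LinearRegression().fit(X, y).predict(X)
mad = np.median(np.abs(resid - np.median(resid)))
keep = np.abs(resid - np.median(resid)) < 4.5 * mad  # a simple MAD edit rule
classic = LinearRegression().fit(X[keep], y[keep])

print("Huber slope:           ", robust.coef_[0])
print("Cleaned-then-OLS slope:", classic.coef_[0])
```

On data like this, either route should recover a slope near 3, whereas a plain least-squares fit on all of the contaminated points is pulled upward by the injected outliers.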

Readers of Fundamentals of Robust Machine Learning will also find:

  • A blend of robust statistics and machine learning principles
  • Detailed discussion of a wide range of robust machine learning methodologies, from robust clustering, regression, and classification to neural networks and anomaly detection (one such method is sketched just after this list)
  • Python code with immediate application to data science problems
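
As a small illustration of the anomaly-detection theme mentioned above (a minimal sketch using scikit-learn's IsolationForest, one of the methods surveyed in Chapter 11, rather than the book's own code):

```python
# Illustrative sketch only: flagging anomalies with an isolation forest.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(1)
inliers = rng.normal(loc=0.0, scale=1.0, size=(300, 2))
outliers = rng.uniform(low=-8.0, high=8.0, size=(10, 2))
X = np.vstack([inliers, outliers])

detector = IsolationForest(contamination=0.05, random_state=1).fit(X)
labels = detector.predict(X)  # +1 for inliers, -1 for flagged anomalies

print("points flagged as anomalies:", int((labels == -1).sum()))
```

With contamination set to 5%, the forest flags roughly 15 of the 310 points; on this synthetic data the flagged set will typically contain the injected points plus a few borderline inliers.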

Fundamentals of Robust Machine Learning is ideal for undergraduate or graduate students in data science, machine learning, and related fields, as well as for practicing professionals looking to deepen their understanding of building models in the presence of outliers.


Product Details

ISBN-13: 9781394294374
Publisher: Wiley
Publication date: 05/13/2025
Pages: 416
Product dimensions: 7.40 in (w) x 9.20 in (h) x 1.10 in (d)

About the Author

Resve Saleh (PhD, UC Berkeley) is a Professor Emeritus at the University of British Columbia. He worked for a decade as a professor at the University of Illinois and as a visiting professor at Stanford University. He was Founder and Chairman of Simplex Solutions, Inc., which went public in 2001. He is an IEEE Fellow and a Fellow of the Canadian Academy of Engineering.

Sohaib Majzoub (PhD, University of British Columbia) is an Associate Professor at the University of Sharjah, UAE. He has also taught at the American University in Dubai, UAE, and at King Saud University, KSA, and was a visiting professor at Delft University of Technology in the Netherlands. He is a Senior Member of the IEEE.

A. K. Md. Ehsanes Saleh (PhD, University of Western Ontario) is a Professor Emeritus and Distinguished Professor in the School of Mathematics and Statistics, Carleton University, Ottawa, Canada. He has also taught at Simon Fraser University, the University of Toronto, and Stanford University. He is a Fellow of the IMS and the ASA and an Honorary Member of the SSC, Canada.

Table of Contents

Preface xv

About the Companion Website xix

1 Introduction 1

1.1 Defining Outliers 2

1.2 Overview of the Book 3

1.3 What Is Robust Machine Learning? 3

1.3.1 Machine Learning Basics 4

1.3.2 Effect of Outliers 6

1.3.3 What Is Robust Data Science? 7

1.3.4 Noise in Datasets 7

1.3.5 Training and Testing Flows 8

1.4 Robustness of the Median 9

1.4.1 Mean vs. Median 9

1.4.2 Effect on Standard Deviation 10

1.5 ℓ1 and ℓ2 Norms 11

1.6 Review of Gaussian Distribution 12

1.7 Unsupervised Learning Case Study 13

1.7.1 Clustering Example 14

1.7.2 Clustering Problem Specification 14

1.8 Creating Synthetic Data for Clustering 16

1.8.1 One-Dimensional Datasets 16

1.8.2 Multidimensional Datasets 17

1.9 Clustering Algorithms 19

1.9.1 k-Means Clustering 19

1.9.2 k-Medians Clustering 21

1.10 Importance of Robust Clustering 22

1.10.1 Clustering with No Outliers 22

1.10.2 Clustering with Outliers 23

1.10.3 Detection and Removal of Outliers 25

1.11 Summary 27

Problems 28

References 34

2 Robust Linear Regression 35

2.1 Introduction 35

2.2 Supervised Learning 35

2.3 Linear Regression 36

2.4 Importance of Residuals 38

2.4.1 Defining Errors and Residuals 38

2.4.2 Residuals in Loss Functions 39

2.4.3 Distribution of Residuals 40

2.5 Estimation Background 42

2.5.1 Linear Models 42

2.5.2 Desirable Properties of Estimators 43

2.5.3 Maximum-Likelihood Estimation 44

2.5.4 Gradient Descent 47

2.6 M-Estimation 49

2.7 Least Squares Estimation (LSE) 52

2.8 Least Absolute Deviation (LAD) 54

2.9 Comparison of LSE and LAD 55

2.9.1 Simple Linear Model 55

2.9.2 Location Problem 56

2.10 Huber’s Method 58

2.10.1 Huber Loss Function 58

2.10.2 Comparison with LSE and LAD 63

2.11 Summary 64

Problems 64

References 67

3 The Log-Cosh Loss Function 69

3.1 Introduction 69

3.2 An Intuitive View of Log-Cosh 69

3.3 Hyperbolic Functions 71

3.4 M-Estimation 71

3.4.1 Asymptotic Behavior 72

3.4.2 Linear Regression Using Log-Cosh 74

3.5 Deriving the Distribution for Log-Cosh 75

3.6 Standard Errors for Robust Estimators 79

3.6.1 Example: Swiss Fertility Dataset 81

3.6.2 Example: Boston Housing Dataset 82

3.7 Statistical Properties of Log-Cosh Loss 83

3.7.1 Maximum-Likelihood Estimation 83

3.8 A General Log-Cosh Loss Function 84

3.9 Summary 88

Problems 88

References 93

4 Outlier Detection, Metrics, and Standardization 95

4.1 Introduction 95

4.2 Effect of Outliers 95

4.3 Outlier Diagnosis 97

4.3.1 Boxplots 98

4.3.2 Histogram Plots 100

4.3.3 Exploratory Data Analysis 101

4.4 Outlier Detection 102

4.4.1 3-Sigma Edit Rule 102

4.4.2 4.5-MAD Edit Rule 104

4.4.3 1.5-IQR Edit Rule 105

4.5 Outlier Removal 105

4.5.1 Trimming Methods 105

4.5.2 Winsorization 105

4.5.3 Anomaly Detection Method 106

4.6 Regression-Based Outlier Detection 107

4.6.1 LS vs. LC Residuals 108

4.6.2 Comparison of Detection Methods 109

4.6.3 Ordered Absolute Residuals (OARs) 110

4.6.4 Quantile–Quantile Plot 111

4.6.5 Quad-Plots for Outlier Diagnosis 113

4.7 Regression-Based Outlier Removal 114

4.7.1 Iterative Boxplot Method 114

4.8 Regression Metrics with Outliers 116

4.8.1 Mean Square Error (MSE) 117

4.8.2 Median Absolute Error (MAE) 118

4.8.3 MSE vs. MAE on Realistic Data 119

4.8.4 Selecting Hyperparameters for Robust Regression 120

4.9 Dataset Standardization 121

4.9.1 Robust Standardization 122

4.10 Summary 126

Problems 126

References 131

5 Robustness of Penalty Estimators 133

5.1 Introduction 133

5.2 Penalty Functions 133

5.2.1 Multicollinearity 133

5.2.2 Penalized Loss Functions 135

5.3 Ridge Penalty 136

5.4 LASSO Penalty 137

5.5 Effect of Penalty Functions 138

5.6 Penalty Functions with Outliers 139

5.7 Ridge Traces 142

5.8 Elastic Net (Enet) Penalty 143

5.9 Adaptive LASSO (aLASSO) Penalty 145

5.10 Penalty Effects on Variance and Bias 146

5.10.1 Effect on Variance 146

5.10.2 Geometric Interpretation of Bias 148

5.11 Variable Importance 151

5.11.1 The t-Statistic 151

5.11.2 LASSO and aLASSO Traces 153

5.12 Summary 155

Problems 156

References 159

6 Robust Regularized Models 161

6.1 Introduction 161

6.2 Overfitting and Underfitting 161

6.3 The Bias–Variance Trade-Off 162

6.4 Regularization with Ridge 164

6.4.1 Selection of Hyperparameter λ 165

6.4.2 Example: Diabetes Dataset 167

6.5 Generalization using Robust Estimators 169

6.5.1 Training and Test Sets 169

6.5.2 k-Fold Cross-validation 171

6.6 Robust Generalization and Regularization 173

6.6.1 Regularization with LC-Ridge 174

6.7 Model Complexity 175

6.7.1 Variable Selection Using LS-LASSO 176

6.7.2 Variable Ordering Using LC-aLASSO 176

6.7.3 Building a Compact Model 179

6.8 Summary 182

Problems 182

References 186

7 Quantile Regression Using Log-Cosh 187

7.1 Introduction 187

7.2 Understanding Quantile Regression 188

7.3 The Crossing Problem 189

7.4 Standard Quantile Loss Function 190

7.5 Smooth Regression Quantiles (SMRQ) 192

7.6 Evaluation of Quantile Methods 195

7.6.1 Qualitative Assessment 196

7.6.2 Quantitative Assessment 198

7.7 Selection of Robustness Coefficient 200

7.8 Maximum-Likelihood Procedure for SMRQ 202

7.9 Standard Error Computation 204

7.10 Summary 206

Problems 207

References 209

8 Robust Binary Classification 211

8.1 Introduction 211

8.2 Binary Classification Problem 212

8.2.1 Why Linear Regression Fails 212

8.2.2 Outliers in Binary Classification 213

8.3 The Cross-Entropy (CE) Loss 215

8.3.1 Deriving the Cross-Entropy Loss 216

8.3.2 Understanding Logistic Regression 218

8.3.3 Gradient Descent 221

8.4 The Log-Cosh (LC) Loss Function 221

8.4.1 General Formulation 223

8.5 Algorithms for Logistic Regression 224

8.6 Example: Motor Trend Cars 226

8.7 Regularization of Logistic Regression 227

8.7.1 Overfitting and Underfitting 228

8.7.2 k-Fold Cross-Validation 229

8.7.3 Penalty Functions 229

8.7.4 Effect of Outliers 230

8.8 Example: Circular Dataset 231

8.9 Outlier Detection 234

8.10 Robustness of Binary Classifiers 235

8.10.1 Support Vector Classifier (SVC) 235

8.10.2 Support Vector Machines (SVMs) 238

8.10.3 k-Nearest Neighbors (k-NN) 241

8.10.4 Decision Trees and Random Forest 243

8.11 Summary 244

Problems 244

Reference 249

9 Neural Networks Using Log-Cosh 251

9.1 Introduction 251

9.2 A Brief History of Neural Networks 251

9.3 Defining Neural Networks 252

9.3.1 Basic Computational Unit 253

9.3.2 Four-Layer Neural Network 254

9.3.3 Activation Functions 255

9.4 Training of Neural Networks 257

9.5 Forward and Backward Propagation 258

9.5.1 Forward Propagation 259

9.5.2 Backward Propagation 260

9.5.3 Log-Cosh Gradients 263

9.6 Cross-entropy and Log-Cosh Algorithms 264

9.7 Example: Circular Dataset 266

9.8 Classification Metrics and Outliers 269

9.8.1 Precision, Recall, F1 Score 269

9.8.2 Receiver Operating Characteristics (ROCs) 271

9.9 Summary 273

Problems 273

References 280

10 Multi-class Classification and Adam Optimization 281

10.1 Introduction 281

10.2 Multi-class Classification 281

10.2.1 Multi-class Loss Functions 282

10.2.2 Softmax Activation Function 284

10.3 Example: MNIST Dataset 288

10.3.1 Neural Network Architecture 289

10.3.2 Comparing Cross-Entropy with Log-Cosh Losses 289

10.3.3 Outliers in MNIST 291

10.4 Optimization of Neural Networks 291

10.4.1 Momentum 293

10.4.2 rmsprop Approach 294

10.4.3 Optimizer Warm-Up Phase 295

10.4.4 Adam Optimizer 296

10.5 Summary 297

Problems 297

References 302

11 Anomaly Detection and Evaluation Metrics 303

11.1 Introduction 303

11.2 Anomaly Detection Methods 303

11.2.1 k-Nearest Neighbors 304

11.2.2 DBSCAN 308

11.2.3 Isolation Forest 311

11.3 Anomaly Detection Using MADmax 316

11.3.1 Robust Standardization 317

11.3.2 k-Medians Clustering 317

11.3.3 Selecting MADmax 319

11.3.4 k-Nearest Neighbors (k-NN) 319

11.3.5 k-Nearest Medians (k-NM) 320

11.4 Qualitative Evaluation Methods 323

11.5 Quantitative Evaluation Methods 326

11.6 Summary 330

Problems 330

Reference 336

12 Case Studies in Data Science 337

12.1 Introduction 337

12.2 Example: Boston Housing Dataset 337

12.2.1 Exploratory Data Analysis 338

12.2.2 Neural Network Architecture 339

12.2.3 Comparison of LSNN and LCNN 342

12.2.4 Predicting Housing Prices 344

12.2.5 RMSE vs. MAE 344

12.2.6 Correlation Coefficients 345

12.3 Example: Titanic Dataset 346

12.3.1 Exploratory Data Analysis 346

12.3.2 LCLR vs. CELR 351

12.3.3 Outlier Detection and Removal 353

12.3.4 Robustness Coefficient for Log-Cosh 355

12.3.5 The Implications of Robustness 356

12.3.6 Ridge and aLASSO 357

12.4 Application to Explainable Artificial Intelligence (XAI) 359

12.4.1 Case Study: Logistic Regression 360

12.4.2 Case Study: Neural Networks 365

12.5 Time Series Example: Climate Change 366

12.5.1 Autoregressive Model 367

12.5.2 Forecasting Using AR(p) 369

12.5.3 Stationary Time Series 371

12.5.4 Moving Average 374

12.5.5 Finding Outliers in Time Series 375

12.6 Summary and Conclusions 376

Problems 376

References 382

Index 383
