Foundations of Machine Learning, second edition / Edition 2 available in Hardcover, eBook

Foundations of Machine Learning, second edition / Edition 2
- ISBN-10:
- 0262039400
- ISBN-13:
- 9780262039406
- Pub. Date:
- 12/25/2018
- Publisher:
- MIT Press
- ISBN-10:
- 0262039400
- ISBN-13:
- 9780262039406
- Pub. Date:
- 12/25/2018
- Publisher:
- MIT Press

Foundations of Machine Learning, second edition / Edition 2
Buy New
$85.00Buy Used
-
SHIP THIS ITEM— Temporarily Out of Stock Online
-
PICK UP IN STORE
Your local store may have stock of this item.
Available within 2 business hours
Temporarily Out of Stock Online
-
SHIP THIS ITEM
Temporarily Out of Stock Online
Please check back later for updated availability.
Overview
This book is a general introduction to machine learning that can serve as a textbook for graduate students and a reference for researchers. It covers fundamental modern topics in machine learning while providing the theoretical basis and conceptual tools needed for the discussion and justification of algorithms. It also describes several key aspects of the application of these algorithms. The authors aim to present novel theoretical tools and concepts while giving concise proofs even for relatively advanced topics.
Foundations of Machine Learning is unique in its focus on the analysis and theory of algorithms. The first four chapters lay the theoretical foundation for what follows; subsequent chapters are mostly self-contained. Topics covered include the Probably Approximately Correct (PAC) learning framework; generalization bounds based on Rademacher complexity and VC-dimension; Support Vector Machines (SVMs); kernel methods; boosting; on-line learning; multi-class classification; ranking; regression; algorithmic stability; dimensionality reduction; learning automata and languages; and reinforcement learning. Each chapter ends with a set of exercises. Appendixes provide additional material including concise probability review.
This second edition offers three new chapters, on model selection, maximum entropy models, and conditional entropy models. New material in the appendixes includes a major section on Fenchel duality, expanded coverage of concentration inequalities, and an entirely new entry on information theory. More than half of the exercises are new to this edition.
Product Details
ISBN-13: | 9780262039406 |
---|---|
Publisher: | MIT Press |
Publication date: | 12/25/2018 |
Series: | Adaptive Computation and Machine Learning series |
Edition description: | second edition |
Pages: | 504 |
Product dimensions: | 7.00(w) x 9.10(h) x 1.20(d) |
Age Range: | 18 Years |
About the Author
Afshin Rostamizadeh is a Research Scientist at Google Research.
Ameet Talwalkar is Assistant Professor in the Machine Learning Department at Carnegie Mellon University.
Table of Contents
Preface xi
1 Introduction 1
1.1 Applications and problems 1
1.2 Definitions and terminology 3
1.3 Cross-validation 5
1.4 Learning scenarios 7
1.5 Outline 8
2 The PAC Learning Framework 11
2.1 The PAC learning model 11
2.2 Guarantees for finite hypothesis sets - consistent case 17
2.3 Guarantees for finite hypothesis sets - inconsistent case 21
2.4 Generalities 24
2.4.1 Deterministic versus stochastic scenarios 24
2.4.2 Bayes error and noise 25
2.4.3 Estimation and approximation errors 26
2.4.4 Model selection 27
2.5 Chapter notes 28
2.6 Exercises 29
3 Rademacher Complexity and VC-Dimension 33
3.1 Rademacher complexity 34
3.2 Growth function 38
3.3 VC-dimension 41
3.4 Lower bounds 48
3.5 Chapter notes 54
3.6 Exercises 55
4 Support Vector Machines 63
4.1 Linear classification 63
4.2 SVMs - separable case 64
4.2.1 Primal optimization problem 64
4.2.2 Support vectors 66
4.2.3 Dual optimization problem 67
4.2.4 Leave-one-out analysis 69
4.3 SVMs - non-separable case 71
4.3.1 Primal optimization problem 72
4.3.2 Support vectors 73
4.3.3 Dual optimization problem 74
4.4 Margin theory 75
4.5 Chapter notes 83
4.6 Exercises 84
5 Kernel Methods 89
5.1 Introduction 89
5.2 Positive definite symmetric kernels 92
5.2.1 Definitions 92
5.2.2 Reproducing kernel Hilbert space 94
5.2.3 Properties 96
5.3 Kernel-based algorithms 100
5.3.1 SVMs with PDS kernels 100
5.3.2 Representer theorem 101
5.3.3 Learning guarantees 102
5.4 Negative definite symmetric kernels 103
5.5 Sequence kernels 106
5.5.1 Weighted transducers 106
5.5.2 Rational kernels 111
5.6 Chapter notes 115
5.7 Exercises 116
6 Boosting 121
6.1 Introduction 121
6.2 AdaBoost 122
6.2.1 Bound on the empirical error 124
6.2.2 Relationship with coordinate descent 126
6.2.3 Relationship with logistic regression 129
6.2.4 Standard use in practice 129
6.3 Theoretical results 130
6.3.1 VC-dimension-based analysis 131
6.3.2 Margin-based analysis 131
6.3.3 Margin maximization 136
6.3.4 Game-theoretic interpretation 137
6.4 Discussion 140
6.5 Chapter notes 141
6.6 Exercises 142
7 On-Line Learning 147
7.1 Introduction 147
7.2 Prediction with expert advice 148
7.2.1 Mistake bounds and Halving algorithm 148
7.2.2 Weighted majority algorithm 150
7.2.3 Randomized weighted majority algorithm 152
7.2.4 Exponential weighted average algorithm 156
7.3 Linear classification 159
7.3.1 Perceptron algorithm 160
7.3.2 Winnow algorithm 168
7.4 On-line to batch conversion 171
7.5 Game-theoretic connection 174
7.6 Chapter notes 175
7.7 Exercises 176
8 Multi-Class Classification 183
8.1 Multi-class classification problem 183
8.2 Generalization bounds 185
8.3 Uncombined multi-class algorithms 191
8.3.1 Multi-class SVMs 191
8.3.2 Multi-class boosting algorithms 192
8.3.3 Decision trees 194
8.4 Aggregated multi-class algorithms 198
8.4.1 One-versus-all 198
8.4.2 One-versus-one 199
8.4.3 Error-correction codes 201
8.5 Structured prediction algorithms 203
8.6 Chapter notes 206
8.7 Exercises 207
9 Ranking 209
9.1 The problem of ranking 209
9.2 Generalization bound 211
9.3 Ranking with SVMs 213
9.4 RankBoost 214
9.4.1 Bound on the empirical error 216
9.4.2 Relationship with coordinate descent 218
9.4.3 Margin bound for ensemble methods in ranking 220
9.5 Bipartite ranking 221
9.5.1 Boosting in bipartite ranking 222
9.5.2 Area under the ROC curve 224
9.6 Preference-based setting 226
9.6.1 Second-stage ranking problem 227
9.6.2 Deterministic algorithm 229
9.6.3 Randomized algorithm 230
9.6.4 Extension to other loss functions 231
9.7 Discussion 232
9.8 Chapter notes 233
9.9 Exercises 234
10 Regression 237
10.1 The problem of regression 237
10.2 Generalization bounds 238
10.2.1 Finite hypothesis sets 238
10.2.2 Rademacher complexity bounds 239
10.2.3 Pseudo-dimension bounds 241
10.3 Regression algorithms 245
10.3.1 Linear regression 245
10.3.2 Kernel ridge regression 247
10.3.3 Support vector regression 252
10.3.4 Lasso 257
10.3.5 Group norm regression algorithms 260
10.3.6 On-line regression algorithms 261
10.4 Chapter notes 262
10.5 Exercises 263
11 Algorithmic Stability 267
11.1 Definitions 267
11.2 Stability-based generalization guarantee 268
11.3 Stability of kernel-based regularization algorithms 270
11.3.1 Application to regression algorithms: SVR and KRR 274
11.3.2 Application to classification algorithms: SVMs 276
11.3.3 Discussion 276
11.4 Chapter notes 277
11.5 Exercises 277
12 Dimensionality Reduction 281
12.1 Principal Component Analysis 282
12.2 Kernel Principal Component Analysis (KPCA) 283
12.3 KPCA and manifold learning 285
12.3.1 Isomap 285
12.3.2 Laplacian eigenmaps 286
12.3.3 Locally linear embedding (LLE) 287
12.4 Johnson-Lindenstrauss lemma 288
12.5 Chapter notes 290
12.6 Exercises 290
13 Learning Automata and Languages 293
13.1 Introduction 293
13.2 Finite automata 294
13.3 Efficient exact learning 295
13.3.1 Passive learning 296
13.3.2 Learning with queries 297
13.3.3 Learning automata with queries 298
13.4 Identification in the limit 303
13.4.1 Learning reversible automata 304
13.5 Chapter notes 309
13.6 Exercises 310
14 Reinforcement Learning 313
14.1 Learning scenario 313
14.2 Markov decision process model 314
14.3 Policy 315
14.3.1 Definition 315
14.3.2 Policy value 316
14.3.3 Policy evaluation 316
14.3.4 Optimal policy 318
14.4 Planning algorithms 319
14.4.1 Value iteration 319
14.4.2 Policy iteration 322
14.4.3 Linear programming 324
14.5 Learning algorithms 325
14.5.1 Stochastic approximation 326
14.5.2 TD(0) algorithm 330
14.5.3 Q-learning algorithm 331
14.5.4 SARSA 334
14.5.5 TD(λ) algorithm 335
14.5.6 Large state space 336
14.6 Chapter notes 337
Conclusion 339
A Linear Algebra Review 341
A.1 Vectors and norms 341
A.1.1 Norms 341
A.1.2 Dual norms 342
A.2 Matrices 344
A.2.1 Matrix norms 344
A.2.2 Singular value decomposition 345
A.2.3 Symmetric positive semidefinite (SPSD) matrices 346
B Convex Optimization 349
B.1 Differentiation and unconstrained optimization 349
B.2 Convexity 350
B.3 Constrained optimization 353
B.4 Chapter notes 357
C Probability Review 359
C.1 Probability 359
C.2 Random variables 359
C.3 Conditional probability and independence 361
C.4 Expectation, Markov's inequality, and moment-generating function 363
C.5 Variance and Chebyshev's inequality 365
D Concentration inequalities 369
D.1 Hoeffding's inequality 369
D.2 McDiarmid's inequality 371
D.3 Other inequalities 373
D.3.1 Binomial distribution: Slud's inequality 374
D.3.2 Normal distribution: tail bound 374
D.3.3 Khintchine-Kahane inequality 374
D.4 Chapter notes 376
D.5 Exercises 377
E Notation 379
References 381
Index 397
What People are Saying About This
Finally, a book that is both broad enough to cover many algorithmic topics of machine learning and mathematically deep enough to introduce the required theory for a graduate level course. Foundations of Machine Learning is a great achievement and a significant contribution to the machine learning community.
Yishay Mansour, School of Computer Science, Tel Aviv UniversityFinally, a book that is both broad enough to cover many algorithmic topics of machine learning and mathematically deep enough to introduce the required theory for a graduate level course. Foundations of Machine Learning is a great achievement and a significant contribution to the machine learning community.
“A clear, rigorous treatment of machine learning that covers a broad range of problems and methods from a theoretical perspective. This edition includes many updates, including new chapters on model selection and maximum entropy methods. It will be a standard graduate-level reference.”
Peter Bartlett, Professor of Computer Science, University of California, Berkeley"Foundations of Machine Learning is a neat and mathematically rigorous book providing broad coverage of basic and advanced topics in Machine Learning, but also a valuable textbook for graduate-level courses in the modern theory of Machine Learning. This book is unique in its content and style, a 'must-have' reference book for researchers and students."
Claudio Gentile, Inria Lille and Google Research, New York"I've found the first edition of this book to be a valuable resource in five or so years of teaching -- and look forward to using the much-improved and expanded second edition in future courses."
Aryeh Kontorovich, Associate Professor of Computer Science, Ben-Gurion UniversityA solid, comprehensive, and self-contained book providing a uniform treatment of a very broad collection of machine learning algorithms and problems. Foundations of Machine Learning is an essential reference book for corporate and academic researchers, engineers, and students.