Fundamentals of Deep Learning: Designing Next-Generation Machine Intelligence Algorithms

eBook

$67.99 

Overview

We're in the midst of an AI research explosion. Deep learning has unlocked superhuman perception to power our push toward creating self-driving vehicles, defeating human experts at a variety of difficult games including Go, and even generating essays with shockingly coherent prose. But deciphering these breakthroughs often takes a PhD in machine learning and mathematics.

The updated second edition of this book describes the intuition behind these innovations without jargon or complexity. Python-proficient programmers, software engineering professionals, and computer science majors will be able to reimplement these breakthroughs on their own and reason about them with a level of sophistication that rivals some of the best developers in the field.

  • Learn the mathematics behind machine learning jargon
  • Examine the foundations of machine learning and neural networks
  • Manage problems that arise as you begin to make networks deeper
  • Build neural networks that analyze complex images
  • Perform effective dimensionality reduction using autoencoders
  • Dive deep into sequence analysis to examine language
  • Explore methods in interpreting complex machine learning models
  • Gain theoretical and practical knowledge on generative modeling
  • Understand the fundamentals of reinforcement learning
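
For a sense of the book's hands-on approach (Chapter 5 builds an MNIST classifier in PyTorch), a minimal sketch of the kind of code involved might look like the following. The layer sizes, learning rate, and fake batch below are illustrative assumptions, not a listing from the book.

    # Illustrative sketch only: a small feed-forward network and one
    # training step in PyTorch, in the spirit of the book's Chapter 5.
    import torch
    import torch.nn as nn

    # Assumed sizes: 28x28 input images, one hidden layer, 10 classes.
    model = nn.Sequential(
        nn.Flatten(),
        nn.Linear(28 * 28, 128),
        nn.ReLU(),
        nn.Linear(128, 10),
    )

    loss_fn = nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

    # One training step on a stand-in batch of images and labels.
    images = torch.randn(32, 1, 28, 28)
    labels = torch.randint(0, 10, (32,))

    optimizer.zero_grad()
    loss = loss_fn(model(images), labels)
    loss.backward()   # backpropagation
    optimizer.step()  # gradient descent update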

Product Details

ISBN-13: 9781492082132
Publisher: O'Reilly Media, Incorporated
Publication date: 05/16/2022
Sold by: Barnes & Noble
Format: eBook
Pages: 390
File size: 13 MB
Note: This product may take a few minutes to download.

About the Author

Nithin Buduma is one of the first machine learning engineers at XY.ai, a start-up based out of Harvard and Stanford working to help healthcare companies leverage their massive datasets.


Nikhil Buduma is the cofounder and chief scientist of Remedy, a San Francisco-based company that is building a new system for data-driven primary healthcare. At the age of 16, he managed a drug discovery laboratory at San Jose State University and developed novel low-cost screening methodologies for resource-constrained communities. By the age of 19, he was a two-time gold medalist at the International Biology Olympiad. He later attended MIT, where he focused on developing large-scale data systems to impact healthcare delivery, mental health, and medical research. At MIT, he cofounded Lean On Me, a national nonprofit organization that provides an anonymous text hotline to enable effective peer support on college campuses and leverages data to effect positive mental health and wellness outcomes. Today, Nikhil spends his free time investing in hard technology and data companies through his venture fund, Q Venture Partners, and managing a data analytics team for the Milwaukee Brewers baseball team.


Joe Papa has over 25 years of experience in research and development and is the founder of INSPIRD.ai. He holds an MSEE and has led AI research teams with PyTorch at Booz Allen and Perspecta Labs. Joe has mentored hundreds of data scientists and has taught more than 6,000 students around the world on Udemy.

Table of Contents

Preface ix

1 Fundamentals of Linear Algebra for Deep Learning 1

Data Structures and Operations 1

Matrix Operations 3

Vector Operations 6

Matrix-Vector Multiplication 7

The Fundamental Spaces 7

The Column Space 7

The Null Space 10

Eigenvectors and Eigenvalues 13

Summary 15

2 Fundamentals of Probability 17

Events and Probability 17

Conditional Probability 20

Random Variables 22

Expectation 24

Variance 25

Bayes' Theorem 27

Entropy, Cross Entropy, and KL Divergence 29

Continuous Probability Distributions 32

Summary 36

3 The Neural Network 39

Building Intelligent Machines 39

The Limits of Traditional Computer Programs 40

The Mechanics of Machine Learning 41

The Neuron 45

Expressing Linear Perceptrons as Neurons 47

Feed-Forward Neural Networks 48

Linear Neurons and Their Limitations 51

Sigmoid, Tanh, and ReLU Neurons 51

Softmax Output Layers 54

Summary 54

4 Training Feed-Forward Neural Networks 55

The Fast-Food Problem 55

Gradient Descent 57

The Delta Rule and Learning Rates 58

Gradient Descent with Sigmoidal Neurons 60

The Backpropagation Algorithm 61

Stochastic and Minibatch Gradient Descent 63

Test Sets, Validation Sets, and Overfitting 65

Preventing Overfitting in Deep Neural Networks 71

Summary 76

5 Implementing Neural Networks in PyTorch 77

Introduction to PyTorch 77

Installing PyTorch 77

PyTorch Tensors 78

Tensor Init 78

Tensor Attributes 79

Tensor Operations 80

Gradients in PyTorch 83

The PyTorch nn Module 84

PyTorch Datasets and Dataloaders 87

Building the MNIST Classifier in PyTorch 89

Summary 93

6 Beyond Gradient Descent 95

The Challenges with Gradient Descent 95

Local Minima in the Error Surfaces of Deep Networks 96

Model Identifiability 97

How Pesky Are Spurious Local Minima in Deep Networks? 98

Flat Regions in the Error Surface 101

When the Gradient Points in the Wrong Direction 104

Momentum-Based Optimization 106

A Brief View of Second-Order Methods 109

Learning Rate Adaptation 111

AdaGrad: Accumulating Historical Gradients 111

RMSProp: Exponentially Weighted Moving Average of Gradients 112

Adam: Combining Momentum and RMSProp 113

The Philosophy Behind Optimizer Selection 115

Summary 116

7 Convolutional Neural Networks 117

Neurons in Human Vision 117

The Shortcomings of Feature Selection 118

Vanilla Deep Neural Networks Don't Scale 121

Filters and Feature Maps 122

Full Description of the Convolutional Layer 127

Max Pooling 131

Full Architectural Description of Convolutional Networks 132

Closing the Loop on MNIST with Convolutional Networks 134

Image Preprocessing Pipelines Enable More Robust Models 136

Accelerating Training with Batch Normalization 137

Group Normalization for Memory-Constrained Learning Tasks 139

Building a Convolutional Network for CIFAR-10 141

Visualizing Learning in Convolutional Networks 143

Residual Learning and Skip Connections for Very Deep Networks 147

Building a Residual Network with Superhuman Vision 149

Leveraging Convolutional Filters to Replicate Artistic Styles 152

Learning Convolutional Filters for Other Problem Domains 154

Summary 155

8 Embedding and Representation Learning 157

Learning Lower-Dimensional Representations 157

Principal Component Analysis 158

Motivating the Autoencoder Architecture 160

Implementing an Autoencoder in PyTorch 161

Denoising to Force Robust Representations 171

Sparsity in Autoencoders 174

When Context Is More Informative than the Input Vector 177

The Word2Vec Framework 179

Implementing the Skip-Gram Architecture 182

Summary 188

9 Models for Sequence Analysis 189

Analyzing Variable-Length Inputs 189

Tackling seq2seq with Neural N-Grams 190

Implementing a Part-of-Speech Tagger 192

Dependency Parsing and SyntaxNet 197

Beam Search and Global Normalization 203

A Case for Stateful Deep Learning Models 206

Recurrent Neural Networks 207

The Challenges with Vanishing Gradients 210

Long Short-Term Memory Units 213

PyTorch Primitives for RNN Models 218

Implementing a Sentiment Analysis Model 219

Solving seq2seq Tasks with Recurrent Neural Networks 224

Augmenting Recurrent Networks with Attention 227

Dissecting a Neural Translation Network 230

Self-Attention and Transformers 239

Summary 242

10 Generative Models 243

Generative Adversarial Networks 244

Variational Autoencoders 249

Implementing a VAE 259

Score-Based Generative Models 264

Denoising Autoencoders and Score Matching 269

Summary 274

11 Methods in Interpretability 275

Overview 275

Decision Trees and Tree-Based Algorithms 276

Linear Regression 280

Methods for Evaluating Feature Importance 281

Permutation Feature Importance 281

Partial Dependence Plots 282

Extractive Rationalization 283

LIME 288

SHAP 292

Summary 297

12 Memory Augmented Neural Networks 299

Neural Turing Machines 299

Attention-Based Memory Access 301

NTM Memory Addressing Mechanisms 303

Differentiable Neural Computers 307

Interference-Free Writing in DNCs 309

DNC Memory Reuse 310

Temporal Linking of DNC Writes 311

Understanding the DNC Read Head 312

The DNC Controller Network 313

Visualizing the DNC in Action 314

Implementing the DNC in PyTorch 317

Teaching a DNC to Read and Comprehend 321

Summary 323

13 Deep Reinforcement Learning 325

Deep Reinforcement Learning Masters Atari Games 325

What Is Reinforcement Learning? 326

Markov Decision Processes 328

Policy 329

Future Return 330

Discounted Future Return 331

Explore Versus Exploit 331

ε-Greedy 333

Annealed ε-Greedy 333

Policy Versus Value Learning 334

Pole-Cart with Policy Gradients 335

OpenAI Gym 335

Creating an Agent 335

Building the Model and Optimizer 337

Sampling Actions 337

Keeping Track of History 337

Policy Gradient Main Function 338

PGAgent Performance on Pole-Cart 340

Trust-Region Policy Optimization 341

Proximal Policy Optimization 345

Q-Learning and Deep Q-Networks 347

The Bellman Equation 347

Issues with Value Iteration 348

Approximating the Q-Function 348

Deep Q-Network 348

Training DQN 349

Learning Stability 349

Target Q-Network 350

Experience Replay 350

From Q-Function to Policy 350

DQN and the Markov Assumption 351

DQN's Solution to the Markov Assumption 351

Playing Breakout with DQN 351

Building Our Architecture 354

Stacking Frames 354

Setting Up Training Operations 354

Updating Our Target Q-Network 354

Implementing Experience Replay 355

DQN Main Loop 356

DQNAgent Results on Breakout 358

Improving and Moving Beyond DQN 358

Deep Recurrent Q-Networks 359

Asynchronous Advantage Actor-Critic Agent 359

UNsupervised REinforcement and Auxiliary Learning 360

Summary 361

Index 363
