Table of Contents
Preface to the Dover Edition xi
1 Introduction 1
1.1 Introduction 1
1.2 Learning: Perspectives and Context 7
1.3 Learning Automata 24
1.4 Plan of This Book 31
2 The Learning Automaton 35
2.1 Introduction 35
2.2 The Environment 36
2.3 The Automaton 40
2.4 Feedback Connection of Automaton and Environment 52
2.5 Norms of Behavior 54
2.6 Conclusion 57
3 Fixed Structure Automata 59
3.1 Introduction 59
3.2 The Two-State Automaton L_{2,2} 59
3.3 Extensions of the L_{2,2} Automaton 62
3.4 ε-Optimal Schemes 78
3.5 The Cover-Hellman Automaton 85
3.6 Automata with Multiple Actions 90
3.7 Rate of Convergence 96
3.8 Significance of Fixed Structure Automata 100
3.9 Related Historical Developments 101
4 Variable Structure Stochastic Automata 103
4.1 Introduction 103
4.2 Variable Structure Stochastic Automata 104
4.3 Reinforcement Schemes 105
4.4 General Reinforcement Schemes 106
4.5 Variable Structure Learning Automaton as a Markov Process 108
4.6 Learning Automata with Two Actions 109
4.7 Multi-Action Learning Automata 116
4.8 Some Nonlinear Learning Schemes 120
4.9 Absolutely Expedient Schemes 124
4.10 Simulation Results 137
4.11 Related Developments 147
5 Convergence 149
5.1 Introduction 149
5.2 Concepts of Convergence 150
5.3 Ergodic Schemes 157
5.4 Absolutely Expedient Schemes 168
5.5 Rate of Convergence 187
5.6 Caution for Convergence Results 195
5.7 Conclusion 196
6 Q and S Models 199
6.1 Introduction 199
6.2 Performance Criteria 201
6.3 General Reinforcement Scheme 204
6.4 Some Specific Reinforcement Schemes 206
6.5 Absolutely Expedient Schemes 208
6.6 Specific Q Model 210
6.7 Simulation Results 215
6.8 Conclusion 224
7 Nonstationary Environments 227
7.1 Introduction 227
7.2 Nonstationary Environments 229
7.3 Expediency and Related Concepts 231
7.4 Automata in MSE 234
7.5 State-Dependent Nonstationary Environments 240
7.6 Nonstationary Effects in a Hierarchical System 248
7.7 Environments with a Fixed Optimal Action 257
7.8 Multiple Environments 258
7.9 Other Nonstationary Environments 261
7.10 Generalizations of the Learning Automaton 273
7.11 A Parameterized Stochastic Learning Unit 276
8 Interconnected Automata and Games 281
8.1 Introduction 281
8.2 Decentralization, Games, and Uncertainty 282
8.3 Mathematical Formulation of Automata Games 285
8.4 Two-Person Zero-Sum Games of Automata 292
8.5 Games with Identical Payoffs 309
8.6 Nonzero-Sum Games 330
8.7 Interconnected Automata 333
8.8 Decentralized Control of Markov Chains 350
8.9 Conclusion 357
9 Applications of Learning Automata 359
9.1 Introduction 359
9.2 Routing in Networks 362
9.3 Other Applications of Learning Automata 392
9.4 Conclusion 416
Epilogue 419
Appendix A Markov Chains 423
Appendix B Martingales 435
Appendix C Distance Diminishing Operators 443
Bibliography 449
Index 469