Robot Shaping: An Experiment in Behavior Engineering

by Marco Dorigo, Marco Colombetti
     
 

ISBN-10: 0262041642

ISBN-13: 9780262041645

Pub. Date: 11/06/1997

Publisher: MIT Press

Foreword by Lashon Booker

Overview

Programming an autonomous robot to act reliably in a dynamic environment is a complex task. The dynamics of the environment are unpredictable, and the robot's sensors provide noisy input. A learning autonomous robot, one that can acquire knowledge through interaction with its environment and then adapt its behavior, greatly simplifies the designer's work. A learning robot need not be given all of the details of its environment, and its sensors and actuators need not be finely tuned.

Robot Shaping is about designing and building learning autonomous robots. The term "shaping" comes from experimental psychology, where it describes the incremental training of animals. The authors propose a new engineering discipline, "behavior engineering," to provide the methodologies and tools for creating autonomous robots. Their techniques are based on classifier systems, a reinforcement learning architecture originated by John Holland, to which they have added several new ideas, such as "mutespec," classifier system "energy," and dynamic population size. In the book they present Behavior Analysis and Training (BAT) as an example of a behavior engineering methodology.

Product Details

ISBN-13: 9780262041645
Publisher: MIT Press
Publication date: 11/06/1997
Series: Intelligent Robotics and Autonomous Agents series
Pages: 221
Product dimensions: 7.10 (w) x 9.00 (h) x 0.80 (d)
Age Range: 18 Years

Table of Contents

Foreword
Preface
Acknowledgments
Chapter 1 Shaping Robots
1.1 Introduction
1.2 Learning to Behave
1.3 Shaping an Agent's Behavior
1.4 Reinforcement Learning, Evolutionary Computation, and Learning Classifier Systems
1.5 The Agents and Their Environments
1.5.1 The Agents' "Bodies"
1.5.2 The Agents' "Mind"
1.5.3 Types of Behavior
1.5.4 The Environment
1.6 Behavior Engineering as an Empirical Endeavor
1.7 Points to Remember
Chapter 2 ALECSYS
2.1 The Learning Classifier System Paradigm
2.1.1 LCS(o): The Performance System
2.1.2 LCS(o): The Apportionment of Credit System
2.1.3 LCS(o): The Rule Discovery System
2.2 ICS: Improved Classifier System
2.2.1 ICS: Calling the Genetic Algorithm When a Steady State Is Reached
2.2.2 ICS: The Mutespec Operator
2.2.3 ICS: Dynamically Changing the Number of Classifiers Used
2.3 The ALECSYS System
2.3.1 Low-Level Parallelism: A Solution to Speed Problems
2.3.2 High-Level Parallelism: A Solution to Behavioral Complexity Problems
2.4 Points to Remember
Chapter 3 Architectures and Shaping Policies
3.1 The Structure of Behavior
3.2 Types of Architectures
3.2.1 Monolithic Architectures
3.2.2 Flat Architectures
3.2.3 Hierarchical Architectures
3.3 Realizing an Architecture
3.3.1 How to Design an Architecture: Qualitative Criteria
3.3.2 How to Design an Architecture: Quantitative Criteria
3.3.3 Implementing an Architecture with ALECSYS
3.4 Shaping Policies
3.5 Points to Remember
Chapter 4 Experiments in Simulated Worlds
4.1 Introduction
4.1.1 Experimental Methodology
4.1.2 Simulation Environments
4.1.3 The Simulated AutonoMice
4.2 Experiments in the Chase Environment
4.2.1 The Role of Punishments and of Internal Messages
4.2.2 Experimental Evaluation of ICS
4.2.3 A First Step toward Dynamic Behavior
4.3 Experiments in the Chase Escape Environment
4.3.1 The Simulation Environment
4.3.2 Target Behavior
4.3.3 The Learning Architectures: Monolithic and Hierarchical
4.3.4 Representation
4.3.5 The Reinforcement Program
4.3.6 Choice of an Architecture and of a Shaping Policy
4.4 Experiments in the Chase Feed Escape Environment
4.4.1 Monolithic Architecture
4.4.2 Monolithic Architecture with Distributed Input
4.4.3 Two-Level Switch Architecture
4.4.4 Three-Level Switch Architecture
4.4.5 The Issue of Scalability
4.5 Experiments in the FindHidden Environment
4.5.1 The FindHidden Task
4.5.2 Sensor Granularity and Learning Performance
4.6 Points to Remember
Chapter 5 Experiments in the Real World
5.1 Introduction
5.2 Experiments with AutonoMouse II
5.2.1 AutonoMouse II Hardware
5.2.2 Experimental Methodology
5.2.3 The Training Policies
5.2.4 An Experimental Study of Training Policies
5.3 Experiments with AutonoMouse IV
5.3.1 AutonoMouse IV Hardware
5.3.2 Experimental Settings
5.3.3 FindHidden Task: Experimental Results
5.3.4 Sensor Granularity: Experimental Results
5.4 Points to Remember
Chapter 6 Beyond Reactive Behavior
6.1 Introduction
6.2 Reactive and Dynamic Behaviors
6.3 Experimental Settings
6.3.1 The Simulation Environment
6.3.2 The Agent's "Body"
6.3.3 Target Behavior
6.3.4 The Agent's Controller and Sensorimotor Interfaces
6.3.5 Experimental Methodology
6.4 Training Policies
6.4.1 External-Based Transitions
6.4.2 Result-Based Transitions
6.4.3 Meaning and Use of the Reinforcement Sensor
6.5 Experimental Results
6.5.1 Experiment 1: External-Based Transitions and Flexible Reinforcement Program
6.5.2 Experiment 2: External-Based Transitions, Flexible Reinforcement Program, and Agent-to-Trainer Communication
6.5.3 Experiment 3: External-Based Transitions, Flexible Reinforcement Program, and Transfer of the Coordinator
6.5.4 Comparison of Experiments 1-3
6.5.5 Experiment 4: Result-Based Transitions and Rigid Reinforcement Program with Reinforcement Sensor
6.5.6 Experiment 5: Result-Based Transitions, Rigid Reinforcement Program with Reinforcement Sensor, and Agent-to-Trainer Communication
6.5.7 Experiment 6: Result-Based Transitions, Rigid Reinforcement Program, and Transfer of the Coordinator
6.5.8 Comparison of Experiments 4-6
6.6 Points to Remember
Chapter 7 The Behavior Analysis and Training Methodology
7.1 Introduction
7.2 The BAT Methodology
7.2.1 Application Description and Behavior Requirements
7.2.2 Behavior Analysis
7.2.3 Specification
7.2.4 Design, Implementation, and Verification of the Nascent Robot
7.2.5 Training
7.2.6 Behavior Assessment
7.3 Case 1: AutonoMouse V
7.3.1 Application Description and Behavior Requirements
7.3.2 Behavior Analysis
7.3.3 Specification
7.3.4 Design, Implementation, and Verification
7.3.5 Training
7.3.6 Behavior Assessment
7.4 Case 2: HAMSTER
7.4.1 HAMSTER's Shell
7.4.2 The Hoarding Behavior
7.4.3 Specification
7.4.4 Training
7.4.5 Assessment
7.4.6 Conclusions
7.5 Case 3: The CRAB Robotic Arm
7.6 Points to Remember
Chapter 8 Final Thoughts
8.1 A Retrospective Overview
8.2 Related Work
8.2.1 Classifier System Reinforcement Learning
8.2.2 Temporal Difference Reinforcement Learning and Related Algorithms
8.2.3 Evolutionary Reinforcement Learning
8.2.4 Work on Shaping and Teaching in Reinforcement Learning
8.3 Training versus Programming
8.4 Limitations
8.5 The Future
8.5.1 The Near Future
8.5.2 Beyond the Horizon
Notes
References
Index
