How Dogs Learn by Mary R. Burch
...this book should be on every animal trainer's bookshelf for future reference. How Dogs Learn covers the content of an undergraduate course in learning and behavior, but the examples are taken from dog training; it is practical and very useful without sacrificing scientific and technical accuracy. Jack Michael, PhD, Department of Psychology, Western Michigan University
How Dogs Learn explores the fascinating science of operant conditioning, where science and dog training meet. How Dogs Learn explains the basic principles of behavior and how they can be used to teach your dog new skills, diagnose problems and eliminate unwanted behaviors. It's for anyone who wants to better understand the learning process in dogs. Every concept is laid out clearly and precisely, and its relevance to your dog and how you train is explained.
A Howell Dog Book of Distinction
- Turner Publishing Company
- Product dimensions: 6.40(w) x 9.60(h) x 0.79(d)
Read an Excerpt
[Figures are not included in this sample chapter]
How Dogs Learn
- 3 -
What dogs like, what they want to avoid, what they'll work for
Understanding reinforcement is the key to understanding how learning takes place. From a behaviorist's perspective, each day is made up of a series of behaviors that are either reinforced or not reinforced. This is true for both animals and humans. The reinforcement that occurs, along with the strength and timing of that reinforcement, will determine whether the behaviors are likely to occur again.
A stimulus is any object or event that can be detected by the senses and that can affect a person's or an animal's behavior.
Stimuli (the plural of stimulus) can be sounds, food or drinks, smells, touches or visual signals.
In dog training, trainers are said to be providing reinforcement when they provide consequences that increase or maintain a behavior. Some examples of stimuli that affect behavior in dogs include verbal commands, encouraging noises, clickers or whistles, food treats, pats or snaps on a leash.
In operant conditioning, reinforcement can be categorized as either primary or secondary, or as either positive or negative. Let's take a look at what these variables really mean.
Reinforcement occurs when a behavior, followed by a consequent stimulus, is strengthened, or becomes more likely to occur again.
PRIMARY REINFORCEMENT: WILL WORK FOR FOOD
Primary reinforcers are reinforcers that are related to biology. Examples of primary reinforcers include food, drink, some kinds of touch and sexual contact. When dogs are trained using treats, the treats are a primary reinforcer.
But primary reinforcers are more than food. For some breeds that are innately wired to be visual, visual stimuli seem to be primary reinforcers. Because of selective breeding over many centuries, many sporting dogs are very visually oriented. Some spaniels will look out a window and notice a leaf falling from a tree across the street. Owners of dogs with such finely tuned sensitivities often wonder why their dog isn't paying attention in an outdoor obedience class. "I'm giving him treats," they tell us, not understanding that for this dog, a treat just can't compete with the field full of birds across the road.
A reinforcer is a stimulus that, when presented following a behavior, causes that behavior to be more likely to occur again in the future.
SECONDARY REINFORCEMENT: "GOOD BOY!"
Secondary reinforcers are reinforcers that can be related to social conditions. In other words, they have a cultural context. Humans respond to secondary reinforcers such as praise, smiles, thumbs-up gestures and money. Dogs are social creatures, and many dogs also respond well to smiles, praise, attention, clapping, toys and pats. But, just as we have to learn that thumbs-up means "well done," dogs have to learn that praise is something positive.
Secondary reinforcers become reinforcing by being paired with primary reinforcers.
Verbal praise is the most commonly used secondary reinforcer. When a dog is a young puppy, saying "good girl" to her has no meaning. It's just a bunch of sounds. But when you say "good girl" and give her a piece of food or a pat, she learns to associate praise with good things. Praise has become a secondary reinforcer.
Secondary reinforcers are also called conditioned reinforcers. This is because secondary reinforcers depend upon some conditioning taking place. For example, if a dog owner takes the leash out of a particular drawer just before taking her dog for a walk, taking the leash out of the drawer can become a conditioned reinforcer. If one day the dog is chewing on a shoe when she takes out the leash, chewing on a shoe may accidentally become reinforced; the dog may think that going for a walk is its reward for chewing on the shoe.
In place of verbal praise, some trainers use sounds as conditioned reinforcers that essentially mean "good job!" In marine mammal shows at places like Sea World, when the porpoise trainers blow a whistle to let the animal know it has performed correctly, the trainers are delivering conditioned reinforcement. Conditioned reinforcement is a way a trainer can offer reinforcement from a distance and when it is not handy to give food to the animal. Applause is another conditioned reinforcer, for both human performers and many competition dogs.
A conditioned reinforcer is a previously neutral stimulus that begins to function as a reinforcer after being paired a number of times with an established reinforcer.
Now We're Clicking
Recently, the dog training community has seen an increase in the use of clicker training, which uses the clicker as a conditioned reinforcer. The clicker is a small metal and plastic device that makes a clicking sound. Clickers have been around for decades. Years ago they were called "crickets" and were sold as children's toys.
When clickers are used as conditioned reinforcers, the dog is first given food alone as a primary reinforcer. Once the dog has clearly shown that it likes food, the pairing begins. The clicker is clicked; then the food is given. Because the clicking sound is associated with the delivery of food, the click alone begins to take on reinforcing properties. (You'll find more information about clicker training in Chapter 14.)
The clicker is a conditioned reinforcer because it started out as a neutral stimulus, one that had no meaning for the dog. But when it was paired with food, the dog eventually learned that a clicker also means "good job!"
Clicker training is one application of conditioned reinforcement. The clicker is a neutral stimulus that is followed by a primary reinforcer, and eventually becomes a conditioned reinforcer.
POSITIVE REINFORCEMENT: WILL STILL WORK FOR FOOD
Reinforcement can also be described as positive or negative: something an animal wants to acquire more of, or something it wants to escape from or avoid.
A positive reinforcer is a stimulus that, when presented following a behavior, makes it more likely that type of behavior will occur in the future.
The delivery of a positive reinforcer results in positive reinforcement. Here are some examples of positive reinforcement.
Sugar was an English Cocker Spaniel being trained for obedience. In obedience competition, dogs are expected to sit straight in front of their owners, at a precise 90-degree angle, and are penalized for crooked sits. Sugar was sitting straight only some of the time in obedience routines. During training, Sugar's owner began giving her a small piece of dog food when she sat straight, and no food when she sat crooked. Within a few sessions, Sugar was sitting straight.
| Dog Situation | Dog Behavior | Trainer Response | Results (Consequences) |
| Sugar was a spaniel being trained for obedience | Sugar would come to her owner and sit crooked some of the time | Give treats for straight sits (food = primary, positive reinforcement) | Straight sits increased |
Sam was a five-year-old Vizsla whose owner decided to take him to noncompetitive field dog training classes as a leisure activity. Sam was initially more interested in his owner than in finding the birds. In training, as he began to move away from his owner's side in the correct direction, she would excitedly say, "Good boy, good boy Sam!" Sam began wagging his tail and moving forward to find the birds.
| Dog Situation | Dog Behavior | Trainer Response | Results (Consequences) |
| Sam was a Vizsla being trained for field work | Sam was more interested in his owner than in finding birds | Trainer would excitedly say, "Good boy, good Sam," when Sam moved away from her toward the hidden birds (Praise = secondary, positive reinforcement) | Sam began moving out to find the birds |
Cody was a German Shepherd that worked full time as a drug detection dog. Some nights, Cody had to search cars in an airport parking lot. When he stayed on-task and completed his searches, his trainer gave him his toy--a knotted towel that could be used to play tug-of-war. Cody's trainer decided that they could complete their evening duties faster if the playtime was eliminated. Soon after, Cody's searching and on-task behavior decreased. Cody's trainer restored playtime as a reward for searching, and Cody's on-task behavior improved.
| Dog Situation | Dog Behavior | Trainer Response | Results (Consequences) |
| Cody was a German Shepherd involved in drug detection work | Cody's searching behavior decreased when playtime was eliminated | Trainer resumed playtime following searches (Play and toy = secondary, positive reinforcement) | Cody's on-task behavior improved |
The positive reinforcers in these three examples are food, praise and a toy. What makes these items positive reinforcers is the effect each had on the behavior of the dog. In each case, the desired behavior increased when the positive stimulus was presented following the behavior. Given treats, Sugar's sits improved. When Sam was praised, he began moving in the desired direction. If a toy was offered after searches, Cody stayed on-task and searched.
The important thing to remember about positive reinforcement is that it must seem positive to the dog. Some dogs couldn't care less about a trainer saying, "Good boy." Dogs that eat dinner at 6 p.m. and go to obedience class at 7 p.m. may not find treats particularly reinforcing in class. If a dog has free access to a particular toy, that toy may not be a positive reinforcer. If you deliver what you think is a reinforcer and it does not result in the desired behavior, it's possible that what you think is a reinforcer and what the dog thinks is a reinforcer are two different things.
Positive reinforcement can be used to:
- Teach new skills
- Maintain existing behavior
NEGATIVE REINFORCEMENT: ESCAPE FEELS GOOD
Negative reinforcement can be a confusing concept. Many people equate it with punishment, but these two operant conditioning principles are very different. Very simply put, negative reinforcement is used to get the dog to do something more often. Punishment gets the dog to stop doing something.
Punishment decreases the probability that a behavior will occur again. That is, if a dog is punished for a behavior and the punisher is effective, the dog will be less likely to engage in the punished behavior in the future.
Negative reinforcement, like positive reinforcement, will increase the likelihood that a behavior will occur again. So what's the difference? With positive reinforcement, a positive stimulus is used, such as food, toys or petting. With negative reinforcement, a negative stimulus is removed, conditioning an escape response.
With negative reinforcement, the probability of a behavior occurring in the future is increased when the behavior is followed by the removal or avoidance of a negative stimulus.
For example, some dogs are trained to heel with a chain training collar. When the dog lags behind the handler, the handler gives a snap-release correction on the collar. If the dog feels or hears the handler starting to make the correction and hurries to get into the heel position to avoid the correction, negative reinforcement has taken place.
Like secondary reinforcement, negative reinforcement must be conditioned. In other words, the dog has to have been exposed to some punishment and understand it as a negative experience. In the case of the training collar, there must be at least one experience where the dog actually receives the jerk on the collar and finds it unpleasant. When the dog hurries into heel position to avoid the jerk, learning has taken place.
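The distinctions drawn so far can be summarized in a few lines of Python. This is only an illustrative sketch: the function name and the string labels are assumptions made for the example, and it covers only the categories this chapter defines (punishment is identified solely by its effect on behavior).

```python
def classify_consequence(stimulus_change: str, behavior_change: str) -> str:
    """Label a consequence using the chapter's definitions.

    stimulus_change: "presented" or "removed" (what followed the behavior)
    behavior_change: "increases" or "decreases" (what the behavior does afterward)
    """
    if behavior_change == "decreases":
        # Punishment makes the behavior less likely to occur again.
        return "punishment"
    if behavior_change == "increases":
        if stimulus_change == "presented":
            # A positive stimulus (food, toys, petting) follows the behavior.
            return "positive reinforcement"
        if stimulus_change == "removed":
            # A negative stimulus is removed or avoided.
            return "negative reinforcement"
    raise ValueError("unrecognized combination of arguments")


# Sugar gets a treat for a straight sit, and straight sits increase.
print(classify_consequence("presented", "increases"))
# The dog hurries to heel, the collar correction is avoided, and heeling increases.
print(classify_consequence("removed", "increases"))
```

The point of the sketch is that the label depends on what happens to the behavior afterward, not on whether the stimulus seems pleasant to the trainer.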
POSITIVE AND NEGATIVE: HOW CAN THAT BE?
Some stimuli in the environment are positive reinforcers for some people (and dogs), and negative reinforcers for others. Heavy metal music blaring from a boom box at the beach may be a positive reinforcer for some teenagers, while others of us may respond to the same music by leaving the beach. For those who find escape from the music a relief, it's a negative reinforcer.
Individual dogs of the same breed can be as different as two people: one who likes loud music and another who detests it. And different breeds of dogs react differently to different stimuli. In dog training, selecting an effective reinforcement strategy depends on both your knowledge of behavioral principles and a good understanding of the individual dog.
You can also make choices about using neutral stimuli as positive or negative reinforcers. Some stimuli that are traditionally thought of as positive reinforcers may be used as negative reinforcers in dog training. Clickers are most often used as positive conditioned reinforcers. Paired with a food treat, clickers give the dog the message that it has behaved correctly. In a less traditional approach, some trainers use clickers to warn the dog that it must stop what it is doing. As with the positive approach, the clicker must initially be paired with a negative experience to make it function as a negative reinforcer.
ESCAPE CONDITIONING: OK, I'LL SAY "UNCLE"
Negative reinforcement works by making use of escape conditioning and avoidance conditioning. In escape conditioning, a response is more likely to occur in the future if the negative stimulus is removed immediately after the response. Here's an example.
Star was a Labrador Retriever being trained to work as a service dog. As part of her training, she had to learn to hold a wooden dumbbell as the first step in learning how to retrieve. However, Star refused to take the dumbbell in her mouth. She'd clench her teeth shut and would not give in. Her trainer used a common procedure of holding the dumbbell up to her mouth and then pinching her ear. When Star opened her mouth to take the dumbbell, her ear was released and she was praised for taking it.
| Dog Situation | Dog Behavior | Trainer Response | Results (Consequences) |
| Star was a Labrador Retriever that had to learn to hold a dumbbell | Star clenched her teeth and refused to take the dumbbell in her mouth | Trainer held the dumbbell at her mouth and pinched her ear (pinch = negative reinforcement, escape conditioning) | Star opened her mouth, took the dumbbell and was praised |
In escape conditioning, escape comes after the aversive event has occurred.
AVOIDANCE CONDITIONING: A WARNING
In avoidance conditioning, if a behavior can prevent a negative stimulus from occurring, the behavior increases in frequency. In order for this to happen, the dog must know that the negative stimulus is coming and must know what it can do to avoid the negative stimulus. This requires conditioning.
An example is Buddy, a Basset Hound whose owner wanted to take him for short walks in the neighborhood. But Buddy was more interested in his surroundings than in walking, and would stop every few feet to sniff and smell. It made walks tedious. His owner responded by making a loud, nasal noise that sounded a bit like a sneeze every time Buddy stopped. The noise was always followed by a jerk on the leash. When Buddy started walking, he was praised. After several corrections, Buddy understood that the noise came just before a correction. And the praise taught him that he was supposed to be walking. So whenever he heard the noise, he kept walking in order to avoid the correction.
| Dog Situation | Dog Behavior | Trainer Response | Results (Consequences) |
| Buddy's owner wanted to take him for short walks | Buddy stopped to sniff every few feet | His owner made a loud noise, followed by a jerk on the leash, every time Buddy stopped (noise + jerk = negative reinforcement, avoidance conditioning) | Buddy started walking whenever he heard the noise, and was praised |
In Buddy's case, he learned to avoid the jerk on the leash by responding to the warning sound. This is a situation where a knowledge of breeds is critical to providing a solution that is fair and humane to the dog. Bassets are scenthounds, and were bred for centuries for their abilities to trail small game by using their sense of smell. Buddy's owner began using two different leashes and collars to teach Buddy that there were times when he needed to keep walking and times when he could engage in his favorite leisure-time activity, smelling. When Buddy wore his chain collar and leather leash, it told him he would be required to keep moving. When he wore his nylon collar and leash, he was free to take his time and investigate the local smells. Sometimes, even dogs need to take time to smell the roses.
In avoidance conditioning, avoidance occurs before the aversive event takes place.
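The difference between escape and avoidance comes down to timing, which a short Python sketch can make concrete. The function name and the times in seconds are hypothetical, chosen only to mirror the Star and Buddy examples.

```python
def escape_or_avoidance(response_time: float, aversive_onset: float) -> str:
    """Classify a negatively reinforced response by when it happens.

    Both arguments are times in seconds measured from the same starting point.
    """
    if response_time < aversive_onset:
        # The dog acts before the aversive event, so the event never arrives.
        return "avoidance"
    # The aversive event has already begun; the response ends it.
    return "escape"


# Star: the ear pinch starts at 0 s and she opens her mouth at 2 s.
print(escape_or_avoidance(response_time=2.0, aversive_onset=0.0))
# Buddy: the warning noise sounds at 0 s, the leash jerk would follow at 1 s,
# and he starts walking at 0.5 s.
print(escape_or_avoidance(response_time=0.5, aversive_onset=1.0))
```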
SCHEDULES OF REINFORCEMENT: WHICH ONES AND HOW MANY?
To teach a dog a new behavior, improve the proficiency of a previously learned behavior or maintain a behavior, you must know how to use reinforcement effectively. That means understanding what kind of reinforcement is suitable in a particular training situation, and also knowing how and when to deliver that reinforcement. In operant conditioning, the rules pertaining to how many or which specific responses will be reinforced are called schedules of reinforcement.
Schedules of reinforcement define which responses will be reinforced, and how often.
Understanding reinforcement can improve training effectiveness. For example, if a puppy is being taught to sit, should you deliver a reinforcer every time the puppy sits on command, every second time, every tenth time or randomly? It really does make a difference.
Reinforcers can be given for every single correct response, or for only some correct responses. When a behavior is reinforced every time it occurs, it's called continuous reinforcement. In training a puppy to sit, if you give the pup a small piece of food for every correct sit, the puppy is on a continuous reinforcement schedule.
Continuous reinforcement is an excellent tool to use while training a new skill, but it should eventually be faded to a more functional level. The ultimate goal of training is to make the dog as independent as possible, and a dog expecting continuous reinforcement will not perform without the reinforcer.
When responses are reinforced only some of the time, it's called intermittent reinforcement. If the puppy has learned to sit on command and responds to the verbal cue "sit" most of the time, you might choose to give a food reward only some of the time. This is enough to reinforce the behavior, but not enough to make the behavior dependent on the reinforcer.
Intermittent reinforcement schedules can get pretty complex. There are six types:
- Fixed ratio
- Variable ratio
- Fixed interval
- Variable interval
- Fixed duration
- Variable duration
Fixed Ratio Schedules
In a fixed ratio schedule of reinforcement, the same number of responses must be performed before a reinforcer is given. In the example we just used, you might decide to reward the dog for every third sit or every fifth sit, or whatever number you specify.
Fixed ratio schedules are sometimes abbreviated as FR and a number, so FR4 means a dog is on a fixed ratio schedule where it is rewarded every fourth time.
Some factory workers are paid on a fixed ratio schedule. This is called piecework. For example, if the workers are paid $1 for every 10 items they assemble, they are being rewarded every tenth time they perform the desired behavior.
Some things to remember about fixed ratio schedules:
1. They are easy to use. It's easy to remember when to deliver reinforcement because it's on a fixed schedule.
2. Many competitive dog events have the same preset routines, so fixed ratio schedules can be used to teach the whole routine.
3. Since a fixed ratio schedule is predictable and fixed, dogs learn when reinforcement is coming. That means when several responses are required for the dog to earn reinforcement, the dog may hesitate immediately after the reinforcer is given before engaging in the next behavior.
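A fixed ratio schedule is essentially a counter that resets each time a reinforcer is delivered. The Python sketch below shows one way to model it; the class name `FixedRatio` and its `respond` method are assumptions made for illustration, not terminology from the behavioral literature.

```python
class FixedRatio:
    """Reinforce every `ratio`-th correct response (FR4 means every fourth)."""

    def __init__(self, ratio: int) -> None:
        if ratio < 1:
            raise ValueError("ratio must be at least 1")
        self.ratio = ratio
        self.count = 0  # correct responses since the last reinforcer

    def respond(self) -> bool:
        """Record one correct response; return True if it earns a reinforcer."""
        self.count += 1
        if self.count == self.ratio:
            self.count = 0
            return True
        return False


# FR4: only the fourth and eighth sits are rewarded.
fr4 = FixedRatio(4)
sits = [fr4.respond() for _ in range(8)]
assert sits == [False, False, False, True, False, False, False, True]

# Piecework from the text: $1 for every 10 items is an FR10 schedule.
fr10 = FixedRatio(10)
dollars_earned = sum(fr10.respond() for _ in range(35))
assert dollars_earned == 3  # 35 items contain 3 completed sets of 10
```

In this model, `FixedRatio(1)` rewards every response, which corresponds to continuous reinforcement.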
Variable Ratio Schedules
In variable ratio schedules, reinforcement is delivered after a number of responses that varies unpredictably. Well-known variable ratio schedules of reinforcement for humans include slot machines, fishing and the lottery. Each of these activities pays off only after an unpredictable number of responses have occurred.
Variable ratio schedules can also be abbreviated. In a VR4 schedule, the number of responses required before reinforcement is given varies, but the average number of responses is four.
Because of the unpredictability of the reinforcers, variable ratio schedules have the power to keep the animal, whether it is human or dog, performing for a long period of time. They are random reinforcement, and we all want to keep playing to see when the next reinforcer will come.
Some things to remember about variable ratio schedules:
1. The animal will perform at a consistently high rate.
2. If you stop giving the reinforcer, the behavior that has been reinforced on a variable ratio schedule continues for a longer period of time than with other schedules.
3. Beginning trainers may find it harder to use a variable ratio schedule. In attempting to randomize reinforcement, inexperienced trainers often spread the schedule of reinforcement so thin that the dog stops responding.
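One way to keep a variable ratio schedule from being spread too thin is to fix its average in advance. The Python sketch below models a VR schedule whose requirement is redrawn after each reinforcer. The class name and the choice of a uniform draw between 1 and twice the mean minus 1 are assumptions made for illustration; any distribution with the right average would fit the definition in the text.

```python
import random


class VariableRatio:
    """Reinforce after a varying number of responses that averages `mean`."""

    def __init__(self, mean: int, rng: random.Random) -> None:
        if mean < 1:
            raise ValueError("mean must be at least 1")
        self.mean = mean
        self.rng = rng
        self.count = 0  # correct responses since the last reinforcer
        self.required = self._draw()

    def _draw(self) -> int:
        # Assumption: whole numbers from 1 to 2*mean - 1, each equally likely.
        # The average of that range is exactly `mean`.
        return self.rng.randint(1, 2 * self.mean - 1)

    def respond(self) -> bool:
        """Record one correct response; return True if it earns a reinforcer."""
        self.count += 1
        if self.count >= self.required:
            self.count = 0
            self.required = self._draw()
            return True
        return False


# VR4: over many responses, about one in four is reinforced,
# but the dog cannot predict which one.
vr4 = VariableRatio(4, random.Random(0))
responses = 40000
reinforcers = sum(vr4.respond() for _ in range(responses))
print(responses / reinforcers)  # close to 4
```

Capping the draw at twice the mean minus 1 also puts an upper limit on how long the dog can go unrewarded, which is the failure mode described in item 3.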
Fixed Interval Schedules
In a fixed interval schedule, a specified amount of time must pass before the animal is given the reinforcer. Fixed interval schedules are most commonly used in laboratory settings, often to study the effects of certain drugs over time. The time in fixed interval schedules is usually expressed in minutes. For example, in an FI3 schedule, the animal is reinforced for the first response after three minutes have passed.
Ratio schedules set a number of responses that must occur before reinforcement is given. Interval schedules set an interval of time that must pass before reinforcement is given.
In the real world, most fixed interval schedules related to dogs occur in hours rather than minutes. Dogs fed at 7 A.M. and 7 P.M. every day are on a fixed interval schedule of 12 hours. Dogs that go with their owner to fetch the paper that is delivered at 6 A.M. every morning are on a 24-hour fixed interval schedule.
Some things to remember about fixed interval schedules:
1. They are more suitable for laboratory research with rats and pigeons than in dog training.
2. They are often confused with fixed duration schedules.
3. After the reinforcer is given in a fixed interval schedule, there is usually a pause in the animal's responses.
4. Performance with fixed interval schedules is less consistent than with fixed or variable ratio schedules.
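The FI3 rule (reinforce the first response after three minutes have passed) can be sketched in Python as follows. The function name is hypothetical, and the sketch assumes the clock restarts at the moment each reinforcer is delivered; a laboratory procedure might time the interval differently.

```python
def fixed_interval(response_times: list[float], interval: float) -> list[float]:
    """Return the response times (in minutes) that earn a reinforcer.

    response_times must be in ascending order, measured from the session start.
    """
    reinforced = []
    clock_start = 0.0  # assumption: the interval is timed from the last reinforcer
    for t in response_times:
        if t - clock_start >= interval:
            reinforced.append(t)
            clock_start = t
        # Responses made before the interval has elapsed earn nothing.
    return reinforced


# FI3: seven responses over a ten-minute session.
times = [1.0, 2.5, 3.2, 4.0, 6.1, 6.5, 9.5]
assert fixed_interval(times, 3) == [3.2, 6.5, 9.5]
```

Notice that the early responses at 1.0 and 2.5 minutes change nothing. That is consistent with the pause after reinforcement noted in item 3: responding early in the interval is never rewarded.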
Variable Interval Schedules
Like fixed interval schedules of reinforcement, variable interval schedules let some time pass before an animal is reinforced. The difference is that the amount of time varies.
When dogs need to stay on task or behave in a certain way for a period of time, a variable interval schedule might be used. A dog that is not too happy about staying in a crate at a dog show might be reinforced on a variable interval schedule for quiet, calm behavior in the crate. Dogs that are required to work at a task for a long period of time, such as in police or search work, might be reinforced on a variable interval schedule.
Some things to remember about variable interval schedules:
1. These schedules result in a more consistent performance than fixed interval schedules.
2. They require random delivery of reinforcement.
3. They may be harder for beginning trainers to use properly.
4. There are not many training situations where a variable interval schedule is appropriate.
Fixed Duration Schedules
A less well-known schedule of reinforcement is the fixed duration schedule. With this kind of schedule, a reinforcer is given after a behavior has occurred (or has been occurring) for a certain amount of time. A biweekly paycheck is one example (although it is often mistakenly identified as a fixed interval schedule).
In dog training, fixed duration schedules of reinforcement could be used to teach the timed stays in obedience competition, such as the three-minute sit and five-minute down. If you wish to teach dogs to "watch me" for extended periods of time, a fixed duration schedule could be used to improve eye contact.
Some things to remember about fixed duration schedules:
1. They are easy to administer.
2. They are different from fixed interval schedules. In fixed interval schedules, the first response after the time has passed is reinforced. In fixed duration schedules, the reinforcer is given if the behavior has been engaged in continuously for the fixed time.
3. In a fixed duration schedule, the behavior must occur for the whole time period to be reinforced. For example, if the dog breaks the three-minute down after two minutes, no reinforcer is given.
4. Fixed duration schedules can be used to produce long periods of continuous performance.
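The continuity requirement is what separates fixed duration from fixed interval, and a small Python sketch makes it visible. The function name and the once-per-second observation log are assumptions made for illustration.

```python
def fixed_duration_earned(held_each_second: list[bool], required_seconds: int) -> bool:
    """Return True only if the behavior was held for the whole required period.

    held_each_second has one entry per second: True while the dog is holding
    the behavior (for example, staying down), False once it breaks.
    """
    if len(held_each_second) < required_seconds:
        return False
    return all(held_each_second[:required_seconds])


three_minutes = 180
full_stay = [True] * 180                    # the dog stays down the whole time
broken_stay = [True] * 120 + [False] * 60   # the dog gets up after two minutes

assert fixed_duration_earned(full_stay, three_minutes) is True
assert fixed_duration_earned(broken_stay, three_minutes) is False
```

A single `False` anywhere in the period cancels the reinforcer, matching the three-minute down example in item 3.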
Variable Duration Schedules
In the variable duration schedule, the interval of time that the dog must be engaged in a behavior in order to be reinforced changes unpredictably. For example, a guide dog that has to wait different amounts of time at busy city intersections before helping its owner cross the street is on a variable duration schedule.
Some things to remember about variable duration schedules:
1. Like other variable schedules, these are more difficult for inexperienced trainers to use.
2. They result in long periods of continuous performance and do not have the postreinforcement delays seen in fixed duration schedules.
LIMITED HOLD: DON'T BE LATE!
Some dogs work well once they get started, but getting them started is a problem. Limited hold is a principle that relates to both schedules of reinforcement and this problem. A limited hold is a limited window of time during which a response will produce the reinforcer. One example for humans is arriving at the airport to catch a plane. Each flight only accepts passengers for a limited period of time. Late arrival means the reinforcer (the trip) will not be available.
Limited hold can be used to build quick responses.
Dogs in many performance events must begin a behavior within a certain amount of time after the handler's cue. In advanced obedience exercises, for example, when the handler directs the dog to go out to a specified area for directed jumping, the judge only waits for so long before disqualifying a dog that does not start.
Limited hold is used in conjunction with schedules of reinforcement to give the animal a deadline for performing. After the deadline has passed, there is no reward for the behavior.
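A limited hold amounts to a deadline check on the time between the cue and the start of the response. Here is a minimal Python sketch; the function name and the three-second window are hypothetical values chosen for the example.

```python
def within_limited_hold(cue_time: float, response_time: float, hold: float) -> bool:
    """Return True if the response began within `hold` seconds of the cue."""
    latency = response_time - cue_time
    # A response before the cue (negative latency) does not count either.
    return 0 <= latency <= hold


HOLD_SECONDS = 3.0  # hypothetical deadline

# The dog starts 1.5 s after the cue: the reinforcer is still available.
assert within_limited_hold(cue_time=10.0, response_time=11.5, hold=HOLD_SECONDS)
# The dog starts 5 s after the cue: the window has closed, so no reward.
assert not within_limited_hold(cue_time=10.0, response_time=15.0, hold=HOLD_SECONDS)
```

A trainer building quicker responses could shorten the window gradually, though the right pace would depend on the individual dog.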
ESTABLISHING OPERATIONS: THIS REINFORCER LOOKS BETTER TO ME NOW
Dogs, like humans, are more likely to respond at certain times to a given reinforcer than at other times. Food rewards are not as appealing immediately after the dog has eaten a large bowl of dog food. A dog that has been in its crate all day may be much more likely to respond to play as a reinforcer than one that has already had several play sessions that day.
The effectiveness of a reinforcer is often described in terms of motivation, deprivation and satiation. Motivation is at the heart of reinforcement. If a dog is not motivated to get something (or to get away from something), the item or event cannot act as a reinforcer. An example would be offering a piece of cheese to a dog that dislikes cheese.
Deprivation occurs when a reinforcer becomes more powerful after the dog has gone without it for a period of time. An example is a dog being more interested in food when it has missed a meal.
Satiation occurs when the dog has had enough of a particular reinforcer, making it less effective. An example is a show dog that got a lot of liver treats in the breed ring. When it enters the next round of competition, the Groups, liver may have lost some of its appeal.
Establishing operations establish the effectiveness of reinforcers at particular times or in particular situations.
A More Accurate Term
In 1982, Jack Michael suggested using the term establishing operation (first introduced by Keller and Schoenfeld in 1950) rather than the terms satiation and deprivation, because there are some other effects that look like satiation and deprivation but are not. For example, water deprivation results in thirst. However, there are some other things that result in thirst, such as eating too much salt. The results of withholding water and taking in too much salt are the same, but it is not accurate to say that too much salt is related to deprivation.
Using Michael's definition, establishing operations are events that alter the value of a reinforcer. They are relevant in dog training because you must be keenly aware of whether or not a reinforcer is effective at a certain time or in a certain situation.
An example of an establishing operation might be a lengthy play session with a dog's favorite toy just before a training session. That toy would then not be an effective reinforcer because the dog has just had a chance to play with it. Similarly, an active dog that is people-oriented and usually enjoys training may not follow instructions if it has had no exercise during the day and would prefer running over having an obedience lesson. (You'll find more information on how establishing operations are used to deal with problem behaviors in Chapter 17.)
Understanding if reinforcers are effective at a particular time and in a particular situation is one of the most important concepts in dog training. Reinforcement is at the heart of operant conditioning, but you can't reinforce a behavior until you understand what the dog considers reinforcement to be.
Reinforcer Sampling: There's More Where That Came From
Reinforcer sampling is one kind of establishing operation. It entails giving the dog a little taste of the reinforcer before training begins. The idea is to get the dog excited and eager to work for something new. For example, you might give the dog one or two improved tasty liver treats before training begins, in order to whet its appetite for more treats in the training session.
The Premack Principle: Grandma's Rule
The Premack Principle, named after researcher David Premack, states that high-probability behavior reinforces low-probability behavior. The idea here is that a preferred activity can be used to reinforce some less-favored activity.
While Premack's work was done with rats and monkeys, the Premack Principle is often discussed in teacher and parent training workshops. In work with humans, the Premack Principle is often referred to as "grandma's rule," because grandmothers sometimes say, "If you clean your room, then you can go outside."
Dog trainers say they are using the Premack Principle with the dog when they train first and have a play session later. But technically, the Premack Principle as it is most commonly taught in education and training settings would not apply to dogs, because the principle requires self-management and understanding what is in one's best interest. However, the Premack Principle can be applied quite nicely to the dog's trainer, who can develop good habits such as training the dog first and playing later.
JACKPOTTING: LET'S PLAY THE SLOT MACHINES
Schedules of reinforcement tell trainers when reinforcement is to be delivered. But in addition to knowing when, you also need to decide which reinforcers will be used and how much will be given. Will the dog get one tidbit of food, two treats or a whole hot dog?
Most of the time, when food is used as a reward, trainers use a small piece of food for each instance of reinforcement. In the literature on animal learning, there is some research related to how factors such as frequency, size and delay of reinforcement affect the performance of the animal. It seems to suggest that in some situations involving choice, animals will prefer smaller, immediate rewards to larger, delayed rewards.
Despite this, in recent years dog trainers have begun using a reward procedure called jackpotting. Like hitting the jackpot at the slot machines in Las Vegas, the term means giving the dog a large, unexpected reinforcer. For example, instead of a small bit of food, the dog might be reinforced with a large handful of food. Trainers say jackpotting results in an animal that is excited and curious about what might be coming next. However, there is not much in the operant research literature to support this. In fact, the experimental literature suggests that if used regularly, larger rewards can create faster responses in the short term, but the animals stop responding faster when these reinforcers are eliminated.
As early as 1942, the ideal amount of reinforcement was a topic of interest to researchers. Leo Crespi trained different groups of rats to run along a runway. One group received 1 food pellet as a reward, one group received 4 pellets and one group received 16. When they had all learned the behavior and were performing it at about the same rate, he started giving the 1-pellet and 4-pellet groups 16 pellets of food. The result was that the rats ran faster for a few trials. Crespi called this the elation effect, and suggested that it caused more pellets to look better.
In another experiment, when rats were given 256 pellets as a reward and were later given just 16 pellets, they began to underperform. Crespi called this depression.
The use of terms like "elation" and "depression" is a good example of a scientist using emotional interpretations and cognitive explanations for a behavior. Since there was no way for Crespi to really know what the rats were thinking or feeling, he was also not being very accurate. The terms elation and depression are therefore no longer used. Instead, when describing these effects, researchers call them positive and negative contrast effects.
Crespi's work suggests that jackpotting can have some short-term effect. In human terms, jackpotting is like expecting a card on Valentine's Day and getting a big diamond ring instead. While you might be absolutely ecstatic, the implications for your future behavior are not clear. So it is with the ongoing benefits of jackpotting.
When new procedures are introduced in the dog training world, everyone wants to get on the bandwagon. One trainer told us she was having a hard time teaching her dog some of the advanced obedience exercises, so she was using "the jackpotting method." She went on to describe how she was jackpotting the dog with food every day in every training session. We were tempted to ask her how much the dog weighs. But we did suggest that constant jackpotting takes away the elements of delight and surprise--likely diminishing the value of the jackpot.
Dog trainers are quick to say "dogs are not rats" when they see behavioral research, and this is true. Jackpotting is an interesting procedure for which controlled research should be done on dogs being trained. While animals may appear excited when a jackpot is presented, and this excitement makes trainers feel great, the long-term, lasting effects of jackpotting on canine behavior are as yet unknown. Any effect that is observed as a result of jackpotting is not so much operant conditioning as it is the contrast effect. The important thing to remember about jackpotting is that if a trainer chooses to use it, and it is overused, it can diminish the effects of standard reinforcers.
REINFORCER ASSESSMENT: WHAT DOES THIS DOG LIKE?
People are different. We have different preferences about the foods we eat, the cars we drive and the style of homes in which we live. Some of us wouldn't walk across the street to watch a football game, even if the best two teams in the country were playing and the admission was free. Others have no interest in seeing a ballet or visiting an art gallery. As unbelievable as it seems, not everyone loves chocolate. Some people are motivated to earn as much money as possible; for others, having free time is more important than having money. Dogs are no different. Breed differences, learning histories, and specific environmental and physical conditions all play a part in determining which stimuli will work as reinforcers for a particular dog.
When dogs are having learning problems, you may want to conduct a reinforcer assessment. Rather than assuming something is a reinforcer for the dog, you can use a reinforcer assessment to observe and record the dog's actual responses to specific stimuli.
Good trainers experiment with a variety of reinforcers and have an understanding of each individual dog. For dogs that are very people-oriented, the attention that goes along with training may be a big reinforcer. For other dogs that are not so people-oriented, ending the training session may be reinforcing. The reinforcer assessment can be used to identify a dog's preferred reinforcers.
WHAT DOES IT ALL MEAN?
No matter how many times we've heard it, we're still taken aback when someone tells us, "I tried operant conditioning and it didn't work with my dog." Operant conditioning does work. There's no doubt about it. Some trainers may not be using the procedures correctly; consequently, they may not be seeing the results they'd like.
Operant conditioning is a science that comes complete with a difficult body of knowledge to master and apply with skill. One book, a few seminars and a video or two probably will not make anyone an expert in operant conditioning. But when there are problems in training, trainers who at least have a knowledge of the basic principles can begin to analyze their animal's performance problems.
The principles of reinforcement we've discussed in this chapter teach us that for maximum effectiveness, the right reinforcers must be identified for each individual dog. The reinforcers need to be delivered on the appropriate schedule. And finally, training is not finished until reinforcement has been faded and conditioned reinforcers established so that the dog can work reliably, consistently and happily for a long time to come.
Meet the Author
Mary R. Burch, PhD, is a certified applied animal behaviorist. Considered an international expert on the subject of therapy dogs, she is the author of Citizen Canine, among other books.
Jon S. Bailey, PhD, Professor Emeritus of Psychology at Florida State University, is a founding director of the International Behavior Analysis Certification Board. In 2005, he received the Distinguished Service to Behavior Analysis Lifetime Achievement Award from the Society for the Advancement of Behavior Analysis.
Christopher Solimene approaches voice art with a lifetime of experience and passion as a director, producer, performer, and educator. Studying story during a recently earned master's degree from Yale University, Chris appreciates the enriching qualities that story brings to enlightening each individual.