Tactics of Training: Classical and Operant Conditioning
Training Requires Tactics
Sometimes I think of Wally and me as locked in a virtual duel where he's trying to get what he wants and I'm trying to get what I want out of him, with mental maneuvers going on from both sides. Add in the third "player", the environment, and it can seem like a mini battle going on.
Well, in any fight, you need weapons and you need tactics. The tactics of the trainer are the principles of learning and conditioning. Our weapons are the things we use to implement our chosen tactics, be they clickers, training collars, or simply our voices and body language (or all of the above!). What we as trainers have to do is use those weapons to our advantage so that the dog will feel driven to do what we would like, while also choosing the right tactics to ensure (or at least increase the chances of) our success.
Since I'm not knowledgeable about the weapons aside from the ones I use, I'll stick with what I've used with Wally (I've never used tools like training collars or spray bottles on him, so I won't be able to talk about those). But I've developed some knowledge of the tactics in this fight from my "battlefield experience" with Wally.
Our Tactical Options
There are two tactical spheres we have to work with. That might seem like a gross oversimplification, but anything we do as trainers fits into one of these two spheres:
- Classical Conditioning
- Operant Conditioning
Everything else is either a subset or an approach to get at one of these two areas. Modeling? That's a method to get towards operant conditioning (you're "showing the dog how" so he can complete the behavior and be rewarded for doing so - that's operant conditioning). Charging a clicker (or word, or other action)? Classical conditioning. Feeding a dog in the face of a triggering stimulus (person, object, sound)? Classical conditioning.
Positive, Negative, and Punishment in the Dog Trainer's World
"Positive" and "Negative" are words that are loaded with connotation in "regular" language. However, in the dog training world, and the behaviorist world in general, these words mean the following:
- Positive - something is added to the dog's environment.
- Negative - something is taken away from the dog's environment.
"Punishment" also has its share of connotation. It's often considered something physical, like spanking or leash pops/corrections. While, yes, those are punishments (they fall into the Positive Punishment, or +P, category), that's not all there is. Negative Punishment (-P) has no physical component at all in most cases. However, it is still punishment because the goal is to decrease the occurrence of a behavior.
The "trick" to punishment is that it has to be a disliked event in the dog's eyes. If you push a dog down to the ground when he jumps up on you, that could be punishing to the dog, or it could be considered the invitation to a physical game. You have to know the dog and what makes him/her tick rather well in order to be efficient with punishment, both -P and +P. Too "easy" and the dog won't get the point. Too "hard" and you can create other problems.
Tactical Manual: Classical Conditioning
Both Classical and Operant Conditioning were mentioned in Understanding the Basics of Dog Training, but we'll look at them in more detail here.
Classical Conditioning is pairing two events together consistently so that one event predicts the other. I gave the example of "charging the clicker", but that's only one possible use. So how else do I use classical conditioning?
Wally was a fearful dog, or at least had fearful tendencies toward a lot of things. He wouldn't attack, but he would try to stay away (or run away if I'd let him) and look visibly shaken and anxious in the face of a trigger. So, what I did was play a game called "Look at That!". I learned this game from a book called "Control Unleashed". The essence of this game is to reward the dog for looking at whatever is triggering him. The idea is to pair the reward with what used to be a scary whatever, changing his association to an expectation of something good happening. This, at its heart, falls under classical conditioning.
It took some time with him, which is not unusual. Classical conditioning can take patience, especially when you're trying to "overwrite" an existing association. Whatever made him that way had set into his personality and his view of the world. As with any habit, it took time for him to break the old ways and embrace the idea that kids don't mean death is near, but instead could very well mean a treat is coming soon. Fear is a strong motivator. It's instinctive and emotional (often triggering before we even have time to act or begin intervention) and very self-rewarding (either the scary thing went away, or he got away from the scary thing...both forms of negative reinforcement), so it often takes a while to see success, let alone a total reversal.
However, it can work. I noticed Wally improving much more quickly under the "Look at That!" game than in the previous months of trying to "be brave". This game speaks his behavioral language. It's not a concept, but an actual reality. He will probably never understand "bravery", but he can understand "look at kid = treat", "sniff kid = treat", "sit and look at dog = treat". This combines both classical and operant conditioning. Classical in that his association is changing based on what's happening. Operant in that he's performing behaviors, now of his own free will, and getting favorable consequences as a result.
Tactical Manual: Operant Conditioning
Operant Conditioning is more complicated. The principle is that the dog will choose behaviors (the operants) based on what's been rewarded and what's been punished. This is where the dog's learning comes into play more significantly. In essence, behaviors have consequences. There are four possibilities:
- Positive Reinforcement (+R) - The dog gets a reward for choosing that behavior. The reward will then increase the chance the dog chooses that behavior when presented with that stimulus. Example: Trainer says "sit", dog sits, dog gets rewarded.
- Negative Reinforcement (-R) - The dog ends a disliked event by performing a behavior. This will increase the chances of the dog performing the behavior in hopes of avoiding the disliked event. Example: Trainer says "sit" and pinches the dog's ear. The pinching stops when the dog sits. Dog will be more likely to sit when he hears "sit" in hopes of avoiding the ear pinch.
- Negative Punishment (-P) - The behavior causes the dog to lose something he would otherwise have wanted or something he was enjoying. The dog will be less likely to repeat the behavior. Example: Owner is playing fetch with the dog. The dog doesn't return the object. Owner turns away and ends the game and all interaction with the dog.
- Positive Punishment (+P) - The behavior causes the dog to have a disliked event occur to him. The dog will be less likely to perform the behavior in hopes of avoiding the disliked event. Example: Dog is caught chewing on a shoe. Owner bats the dog's nose with a newspaper, causing the dog discomfort. The dog will be less likely to chew shoes.
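If it helps to see the four options side by side, here is a minimal Python sketch of how the labels are built. It is only an illustration: the names Change, Effect, and quadrant are invented for this example and don't come from any training tool or library. It combines the two questions from above: was something added or taken away, and is the behavior meant to become more or less likely?

```python
from enum import Enum


class Change(Enum):
    ADDED = "+"    # "positive": something is added to the dog's environment
    REMOVED = "-"  # "negative": something is taken away from it


class Effect(Enum):
    INCREASES = "R"  # reinforcement: the behavior becomes more likely
    DECREASES = "P"  # punishment: the behavior becomes less likely


def quadrant(change: Change, effect: Effect) -> str:
    """Build the abbreviation (+R, -R, -P, +P) for a consequence."""
    return change.value + effect.value


# The four examples from the list above, labeled by hand (hypothetical data).
examples = {
    "dog sits, dog gets a treat": (Change.ADDED, Effect.INCREASES),
    "ear pinch stops when dog sits": (Change.REMOVED, Effect.INCREASES),
    "fetch game ends when toy isn't returned": (Change.REMOVED, Effect.DECREASES),
    "newspaper bat for chewing a shoe": (Change.ADDED, Effect.DECREASES),
}

for description, (change, effect) in examples.items():
    print(f"{quadrant(change, effect)}: {description}")
```

Notice that the sketch never asks whether the consequence is "nice" or "mean". The sign only says added or removed, and the letter only says which way the behavior is supposed to go.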
My Weapons of Choice
- Marker. Important to me, primarily for the teaching phase. Once a behavior is learned, it's not necessary. He already knows when the behavior is complete. However, whenever teaching a different application for the behavior (a new behavior chain or new context, as examples), the marker returns to help guide him in this new concept. You can also use it at any time to simply reinforce the meaning and keep the connection fresh in his mind. Markers take on many forms. The definition of a marker is any stimulus that predicts a consequence. A click from the clicker can predict a treat coming. So the click becomes the marker. Clickers have advantages, perhaps the best of which is a distinctive sound the dog doesn't hear often, so it will quickly be noticed both initially and during training. However, any repeatable stimulus that can be timed accurately to the completion of a behavior or step in a process is sufficient.
- Treats. No, they don't have to make your dog fat. They don't have to be huge (dogs don't care - size doesn't matter...as long as they can taste it), and you don't have to be destined to a life of pockets full of dried-up, week-old treats. I use them mostly for the learning phase, or when doing things like shaping. I make sure to buy/make only healthy treats and keep the pieces small, both for delivery (easy for me to shove in his mouth and easy for him to eat and stay focused) and to limit calories.
- Voice. This can be powerful, especially if your dog is sensitive to sound and/or your voice. Wally certainly is and I use it to my advantage. A high tone (relative to your normal speaking voice to your dog) can indicate excitement while a deep voice can indicate displeasure or a very sharp command/reprimand. This is my reward when I don't give him treats. A "good boy" and maybe a rub or two. You can also "charge a word" the same way you charged the clicker. Using a simple foreign language word could be good for this. Maybe "Da" (Russian for Yes) or "Si" (Spanish for Yes) - simple syllables a dog can pick up and easy for you to remember and say with consistency.
- Body Language. This is powerful as it's the dog's native language, but it can be hard to master. I know there have been times where I've said "good dog" but my body said "you idiot!" and Wally's like "thanks...I think?" However, nothing says "get back" like a body block (standing in the path of the dog and not moving or walking a little towards him - best applied when the dog can't go around you). There are other signals that dogs can pick up on as well.
- Other Rewards. Games, toys, just running around like fools, letting him bark to his heart's content (not that Wally barks much at all), or just letting him sniff every clover, blade of grass, and dandelion for the next 2 minutes. Anything your dog loves can be a reward.
My Chosen Tactics
(I will use the abbreviations for the four methods in the interest of saving space and encouraging familiarity with them).
Given the way Wally is, my tactical approach is primarily +R/-P, though I used -R in a really odd way when teaching him to walk on a loose leash. The +R approach comes in with the clicker and rewards for performed behaviors. You can go 100 ways with +R methods, and not all 100 of them will be effective. Know what you're reinforcing! If you don't know, how will the dog know? Well, the dog will make a connection, just...maybe not the one you're going for.
Anyway, when Wally does what's required of him, he'll get rewarded in some way, oftentimes just by allowing what he wanted in the first place (remember the Premack Principle?). When he doesn't, I'll just withdraw the chance. Doors close (literally and figuratively), games end, treats vanish, I withdraw. All these things get to him and I can notice a change in attitude pretty quickly. Of course, I only punish known behaviors and contexts - that's only fair after all.
I rarely, if ever, use +P on him as he's already sensitive and can get the message without it. That's not to say I'd never ever use it (I never say never...except in the sentence "I never say never"), but it's the last resort tactic for me.
-R was used by accident. When teaching him to walk on a loose leash, I simply let him "outwalk" me to the end of the leash. He didn't like the feeling of the leash making the collar push on his neck, so he made some slack. Being the opportunist I am, I made the leash shorter. And shorter. And shorter. He kept getting that sensation and wanted it to end. End result? He'll slow down, even stop, at the slightest feel of tension pulling on him. Useful because sometimes the leash gets under his legs and wraps around them - he'll feel the tightening and stop instead of walking more and making it even more of a mess. Also stops me from having to use direct leash corrections on him. He self-corrected based on the feeling.
I consider that -R applied accidentally, with good success! Of course, it could be +P, just passively applied (I didn't jerk the leash, but the leash had to go taut in order to create the tension). A little gray area there: was the sensation added by walking out too far (+P), or did he take action to remove it because he felt it and didn't like it (-R), instead of saying "oh well" and continuing to pull?
To Be Continued
There's more in the Operant Conditioning Manual and ways I've used it on Wally, but there's enough here for this hub. To see examples of Operant Conditioning as I've used it - take a look at the Examples of Operant Conditioning hub.
© 2009 Brian McDowell