Positive and Negative Reinforcement Training
A reinforcer is anything that, when paired with a behavior, tends to increase the chances that the behavior will occur again. There are two main types of reinforcers: positive and negative. A positive reinforcer is something the subject wants, such as food, that increases the probability of the behavior recurring when it is added. A negative reinforcer is something the subject wants to avoid, such as an unpleasant sound, that increases the behavior when it is removed.
Reinforcers also fall into two sub-types: primary (unconditioned) and secondary (conditioned). A primary reinforcer is something the subject doesn't have to learn to like or dislike; food is the usual example. A secondary reinforcer is something the subject has to learn to like or dislike, such as a bell, light, or click, and is sometimes called a bridge.
A positive reinforcer is something the subject wants, such as food, petting, or praise. With positive reinforcement, something the subject likes and perceives as good is added or started. Because the subject wants to receive the positive reinforcer, it repeats the behavior that earned it the 'Good Thing.'
Positive reinforcement can strengthen a behavior that is already occurring. For example, when you call a puppy and it comes to you, you automatically pet it for coming. The petting is the positive reinforcer, and it makes the puppy more reliable about coming when called, even without other training.
Positive reinforcement is the most basic and effective part of training, but it is relative, not absolute. For example, water is a positive reinforcer for ducks but a negative reinforcer for cats, and food or treats given to a subject that has just eaten and is satiated no longer act as a positive reinforcer. One pitfall of positive reinforcement is unintentionally rewarding an animal that is acting out of fear, for example by coddling and babying a shy dog.
Positive vs. Negative
When deciding between positive and negative reinforcement training, you may have a hard time telling the two apart. An easy way to remember: positive reinforcement adds something after the behavior to make it more likely to occur again, while negative reinforcement takes something away to make the behavior more likely to occur again.
Behavior + Treat = Positive
Behavior - Tug on leash = Negative
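The rule of thumb above can be written as a small lookup. This is only an illustrative sketch: the function name and the "added"/"removed" labels are hypothetical and chosen for this example.

```python
def reinforcement_type(stimulus_change: str) -> str:
    """Classify reinforcement by what happens to the stimulus after the behavior.

    stimulus_change is a hypothetical label: "added" means something the
    subject wants is given (a treat); "removed" means something the subject
    wants to avoid is taken away (a tug on the leash).
    """
    if stimulus_change == "added":
        return "positive"
    if stimulus_change == "removed":
        return "negative"
    raise ValueError("stimulus_change must be 'added' or 'removed'")


print(reinforcement_type("added"))    # treat given after the behavior
print(reinforcement_type("removed"))  # leash tug released after the behavior
```

In both cases the behavior becomes more likely; only the direction of the stimulus change differs.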
Negative Reinforcement vs. Punishment
Many people think that negative reinforcement is the same as punishment, but it isn't. With punishment, the aversive stimulus occurs after the behavior it was meant to modify, so it has no effect on that behavior itself. Although the ongoing behavior may stop, there is no way to predict whether or not the behavior will occur again; punishment does not result in predictable changes. Negative reinforcement, by contrast, can be used effectively to train a specific behavior.
A reinforcer is something that increases a behavior, but that doesn't mean it has to be something the subject wants. Avoiding something unpleasant can be just as reinforcing, so a negative reinforcer is something the subject wants to avoid, such as an unpleasant sound or an uncomfortable feeling.
For example, a horse's reins are loosened when the horse slows down, or a choke collar is relaxed when the dog stops pulling or goes in the desired direction.
Negative reinforcement may involve mild or extreme aversive stimuli, meaning any negative stimulus that the subject responds to by avoiding it. The subject can avoid an aversive stimulus by stopping or changing a particular behavior. When the new behavior starts, the negative reinforcer stops, which in turn strengthens the new behavior.
Training can be done entirely with negative reinforcement, and much of traditional training was done that way, but it is not as effective as positive reinforcement. Negative reinforcement is a useful process, but it's important to remember that every use of negative reinforcement also involves a punisher. For example, when a rider pulls a horse's left rein until the horse turns in the desired direction, the rider is punishing going straight. Negative reinforcers should be used with caution because overuse of aversive stimuli can lead to the "fallout" effects of punishment: fear, passivity, resistance, and reduced enthusiasm.
A primary reinforcer is a natural reinforcer: it reinforces the subject the first time it is presented. A trainer does not need to teach the subject how to react to a primary reinforcer because it is something the subject already likes, such as food, or dislikes, such as an uncomfortable sound.
Secondary reinforcers are initially meaningless signals, such as lights, motions, or sounds, that are presented before or during the delivery of a reinforcer. They acquire their value through classical conditioning, in which the subject forms an association between two stimuli. For example, dolphin trainers use a police whistle as the conditioned reinforcer because it is easily heard and leaves the trainer's hands free for signaling and fish throwing. In this case, the whistle is the secondary reinforcer, and the fish is the primary reinforcer.
Secondary reinforcers solve the problem of getting the primary reinforcer to the subject during the particular part of the behavior that the trainer wants to encourage. When training a dolphin, the whistle teaches the animal which heights, arches, and reentries the trainer prefers. Simply throwing fish always involves a delay between the jump and the reward, so it never teaches the dolphin which aspects of the jump the trainer prefers.
Secondary reinforcers can become a very powerful tool in training, and the subject will sometimes work harder and longer for them, especially when the conditioned reinforcer is paired with a variety of primary reinforcers. Once a secondary reinforcer is established, it must not be thrown around meaninglessly, or its force will be diluted.
A reinforcer must occur in conjunction with the behavior it is meant to adjust. The timing of a reinforcer tells the subject exactly what the trainer wants, so timing becomes more important than the reinforcer itself. The biggest problem for a beginner trainer is slow reinforcement. For example, if the dog sits but the trainer says "Good dog" after the dog is standing again, the "Good dog" reinforces standing up, not sitting. Reinforcing too early is also a problem and is just as ineffective.
When training with negative reinforcers, timing is just as important as when training with positive reinforcers. The horse learns to turn left when the rider pulls the left rein, but only if the pulling stops when he turns; the end of the pulling is the reinforcer. If the pulling stops too early or too late, the horse will not be sure what made the pulling stop and will never learn to turn left on cue. Most beginner riders tend to keep kicking to tell the horse to move, but a rider should kick a horse's sides only to get it to start moving forward. Continuous kicking gives the horse no information about what he is supposed to do, and the horse develops 'iron sides' and moves at a slow pace.
Beginner trainers also have trouble deciding on the appropriate size of the reinforcer. The trainer should make the reinforcer as small as he can get away with, because the smaller the reinforcer, the more quickly it will be eaten and the more reinforcers can be given per session.
One of the keepers at the National Zoological Park in Washington, D.C., complained that her training of a panda was proceeding too slowly and that she was only gradually succeeding in shaping a particular body movement. The keeper was giving the panda a whole carrot as a reinforcer, and the panda took precious time to eat it, so she was able to give only three carrots in each training session. After eating three whole carrots, the panda was full and getting tired of them. If a trainer plans only one session a day, the subject can be expected to work well for about a quarter of its usual ration. If the trainer can fit in three or four sessions, the usual ration can be divided into about eighty reinforcers, giving about twenty or thirty in each session.
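The arithmetic behind that rule of thumb can be sketched as follows. The function name is hypothetical, and splitting the pieces evenly is an assumption made for illustration; a trainer may well vary the count from session to session.

```python
from typing import List


def reinforcers_per_session(total_pieces: int, sessions: int) -> List[int]:
    """Split a daily ration of small reinforcers as evenly as possible.

    Assumes an even split; any leftover pieces go to the earliest sessions.
    """
    if sessions <= 0:
        raise ValueError("sessions must be a positive integer")
    base, extra = divmod(total_pieces, sessions)
    return [base + 1 if i < extra else base for i in range(sessions)]


# About eighty pieces over three or four sessions a day:
print(reinforcers_per_session(80, 3))  # [27, 27, 26]
print(reinforcers_per_session(80, 4))  # [20, 20, 20, 20]
```

Both results fall in the range of about twenty to thirty reinforcers per session, far more opportunities to reinforce than the three whole carrots the panda received.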
A jackpot is a reinforcer that is either larger in size or quantity or better than the normal reinforcer and comes as a surprise to the subject. A jackpot may be used to help a subject make a sudden breakthrough with a particular behavior. Jackpots may also aid in improving the behavior of a stubborn, fearful, or resistant subject that may not offer the trainer the desired behavior.
Sometimes what counts is not the size of the jackpot but the type of reinforcer: something the subject prefers over the regular treat. In agility training, for example, a dog can be given a hotdog slice instead of the usual jerky treat as the reward for jumping higher obstacles. Surprised by the better treat, the dog will become more enthusiastic about jumping higher in the hope of receiving another hotdog slice.
Because a jackpot comes at random, it may cause the subject to keep offering a behavior. This makes the subject more excited about the behavior, but it can also create anticipation, so that the subject performs the behavior before being asked.