Corrections In Horse Training
I've spent a lot of time thinking about whether or not corrections have an appropriate place and use in horse training. Positive Reinforcement with Clicker Training is an extremely powerful training tool that eliminates the need to use force or fear in horse training, but does it completely eliminate the need for any and all corrections?
First... a little science...
The scientific terms for the two types of corrections are Positive Punishment (+P) and Negative Punishment (-P). But don't get confused by the word "positive": +P is hardly a positive experience for the horse, and -P is not as severe as it sounds.
In scientific lingo, positive means to "apply", negative means to "take away", and punishment is another word for correction. So positive punishment means to apply a correction and negative punishment means to take away something the horse wants/needs as their punishment.
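If it helps to see the naming convention laid out mechanically, here is a minimal Python sketch. It is only an illustration: the names `Action`, `Effect`, and `quadrant` are made up for this example, and it also includes the two reinforcement labels (+R and -R) that come up later in this article. It assumes the standard usage of the second letter: punishment is meant to make a behavior less likely, reinforcement is meant to make it more likely.

```python
from enum import Enum


class Action(Enum):
    APPLY = "+"      # "positive": something is added
    TAKE_AWAY = "-"  # "negative": something is removed


class Effect(Enum):
    PUNISHMENT = "P"     # meant to make the behavior less likely
    REINFORCEMENT = "R"  # meant to make the behavior more likely


def quadrant(action: Action, effect: Effect) -> str:
    """Return the shorthand label, e.g. '+P' or '-R'."""
    return f"{action.value}{effect.value}"


# Hypothetical examples, paraphrased from this article.
examples = {
    "smack the horse after it reaches out to bite": (Action.APPLY, Effect.PUNISHMENT),
    "walk away with the food when the horse acts aggressively": (Action.TAKE_AWAY, Effect.PUNISHMENT),
    "give a reward after a desired behavior": (Action.APPLY, Effect.REINFORCEMENT),
    "release pressure when the horse responds": (Action.TAKE_AWAY, Effect.REINFORCEMENT),
}

for description, (action, effect) in examples.items():
    print(f"{quadrant(action, effect)}: {description}")
```

The point of the sketch is simply that the "+" or "-" describes what the handler does (apply or take away), while the letter describes what the handler is trying to do to the behavior.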
An example of +P would be if a horse reaches out to bite, you smack it with a whip or hand. You applied a correction following a behavior you did not care for. Think of this as what we commonly consider a "punishment".
Another example of +P: you're chasing your horse around the round pen and the horse kicks out at you. Striking the horse with the whip, or even aggressively pursuing the horse as a sort of "scare tactic", is a positive punishment following an undesired behavior.
An example of -P would be if you had a horse that acted aggressively during feeding time. You could begin approaching the stall with the food in hand and the moment the horse started acting aggressively you would turn and walk away with the food. In a few moments you could try again, continuing to approach if the horse maintained a calm demeanor and gradually work up to being able to put the food into the stall while the horse remains calm.
Another example: if you practice positive reinforcement during your training, you could remove yourself (which would remove the positive reinforcement the horse desires) for a short time as a correction for a specific behavior. You'll be removing what the horse wants, creating -P. You can think of this form of correction as something more along the lines of what we consider a "penalty".
One more example of -P is what positive reinforcement horse trainers call a "no reward marker". A horse is asked to perform a behavior, and when it fails to perform the desired behavior the handler withholds the positive reinforcer (food, scratches etc.) and gives some sort of signal that the horse failed to earn its reward, usually a verbal cue such as "uh oh" or "try again".
+P is far more common in the horse training world than -P, and is the form of correction most under scrutiny. I also believe +P is the most mislabeled, misused, and misunderstood form of training, but -P has also been mislabeled and misused over the years.
There are many trainers, even famous ones, who create step-by-step training programs for their students and horses and use kind or fun-sounding labels to explain their methods. They explain them in ways that often seem so easy and so horse friendly... but... the problem is that programs advertised as negative reinforcement based (-R, aka Pressure and Release, the release of pressure being the reward) often cross over into using excessive amounts of +P or -P without anyone ever recognizing it. Sometimes these programs aren't even recognized as using -R, and people just believe they are speaking the language of the horse. They are completely unaware of what type of training they are using at all! This is a HUGE problem!
I firmly believe anyone handling or working with a horse should know exactly what method they are applying in any given situation. If you are going to use +P or -P you need to know how it works and when to use it. The same goes with +R and -R. Yet I've only ever met one trainer out of the thousands that will sit down with a student and explain to them the science behind the method.... they will tell you all day long why they do what they do, in their own language, but they don't give you the tools you need to see or understand it for yourself.
Being willing to talk about the science and the process is something called "transparency". Trainers and horse handlers who refuse to be transparent are usually in one of three categories... uneducated on the science themselves, not very thorough instructors, or unwilling to have their methods scrutinized by an educated student. I personally believe in being transparent.... I want you to know why I do what I do... and I want you to be able to look at other trainers working with their horses and know what they are doing.
If you've been reading this blog and following me on Instagram you know that I use primarily positive reinforcement (+R aka applying a reward following the performance of a desired behavior) in all my training. Through +R I also teach -R (negative reinforcement/pressure & release), which.... basically looks like using positive reinforcement to shape correct responses to what are common negative reinforcement cues (like moving forward off the leg), but let's talk about that more in-depth later. For now I want to talk about corrections.. and the big question...
Do I use +P or -P and feel it has a place in horse training?
The answer is yes. I know, probably a shock to some of you, but hear me out.
During training I will use small amounts of negative punishment (-P) if the situation requires it, similar to the examples given above. Negative Punishment has rather limited applications in horse training but becomes far more effective and useful when a positive reinforcer is involved... such as meal time (being given their food), using rewards during training, or when the human themselves becomes the positive reinforcer.
Since I use positive reinforcement (+R) during training, I'm able to use no-reward markers when improving on the accuracy of "known behaviors". There are different stages in training each behavior for the horse, one of the final stages being when a horse consistently performs a behavior on cue. At this stage the behavior is considered "known". During this point, or when approaching this point, no-reward markers can be used to help the horse perfect its cue response. However, I caution against using no-reward markers too frequently or too early in training as it can lead to the horse becoming discouraged or frustrated, feeling it's unable to earn a reward. [UPDATE April 2019: I don't use no reward markers anymore, as I find they are unnecessary. I also have made many changes to eliminate as much P- from my training as possible]
Also, with +R as the primary form of training, my presence can become the positive reinforcer as well, since positive reinforcement can't be earned without me being there. This makes removing myself from the environment extremely effective when working with my horses, especially with my filly River, who is extremely social and has been trained almost exclusively with +R. Her drive to be with me and work with me is so strong that removing myself from her is very effective when necessary. Keep in mind though, if your horse does not find your presence to be a positive experience, the effectiveness of -P in that particular application is likely to be rather limited or even counterproductive.
Positive Punishment (+P) can be effective in very specific situations but has a tendency to be grossly overused or misused. It has an extremely small window of time in which it can be effectively used, and only in a very restricted area of training/handling. Its effectiveness depends greatly on the horse already knowing the desired "correct" behavior. If the horse hasn't been patiently and consistently taught an alternative desired/correct behavior then positive punishment will only temporarily halt an incorrect behavior and make the situation worse in the long run. To put it simply, +P corrections will temporarily suppress an incorrect behavior but will not teach the horse the correct behavior, which makes them useful only once a horse completely understands the correct behavior.
The strength and duration of a +P correction are also worth talking about. I will on very rare occasions use Positive Punishment (+P) if the situation calls for it, but the correction has to be entirely emotionless, almost instantaneous, humane, and brief. If for whatever reason I am unable to perform a +P correction with those four criteria met, then it isn't the right time for a +P correction.
For these reasons I avoid the use of +P almost entirely, and find it to be acceptable only in situations where safety is at risk and/or once an alternative correct behavior is well known by the horse. Positive Punishment has absolutely no place in an everyday training program, and is an ineffective training method on its own.
Now that I've walked you through all of that, I want to show you two real-life examples of how I chose to apply corrections, specifically -P and +P. Whenever people argue for or against the use of punishment/correction during horse training we always go straight to the dangerous behaviors, so I'm specifically choosing these examples because potentially dangerous behavior was being displayed. I am not, however, going to walk through all the applications for both +P and -P, as that would take an eternity, but please keep in mind that the method and theory are the same no matter the scenario. Biting, kicking, rearing, bolting, food aggression, sitting back, not standing for grooming, poor trailering manners.. the list goes on.
[UPDATE April 2019: The following example of P- is not something I use intentionally anymore. I've found that with better application of R+ and better management of the environment and training plan it's unnecessary and can encourage food anxiety in the horse. It's still a great example of how P- can be used, so I'm going to leave it in the article, but I wanted to make sure my readers know that I advocate for better behavior modification techniques when possible, to avoid P- ]
River went through a phase where she would become impatient and frustrated easily during leading practice. Her response was to shake her head, agitated, and then to bump her body into me in a very forceful way. Obviously this is an unacceptable and dangerous behavior, especially as she grows and gets bigger, so I had a decision to make...
First, I needed to consider how well she knew the desired behavior. Did she understand what she was supposed to be doing? (Which was to act in a calm safe manner) Had I been absolutely clear, effective, and consistent with my training? Was this incorrect behavior a result of my own mistakes or inconsistency? If I were to correct this behavior would she know what to do instead?
After careful consideration I decided that I had rushed training this behavior and her agitation was due to my mistakes. She did not understand what she was supposed to be doing or how to behave correctly. However.... the behavior was still unsafe and a poor habit for her to develop so I did feel a correction was appropriate.
In this situation -P through removal of my presence was the most effective form of correction. It would offer me personal safety while not escalating River's level of agitation.. but the primary goal was to avoid triggering this agitated behavior entirely by improving my techniques and backing up our training a few steps until she really understood the desired behavior.
We went back to leading practice and I worked on preventing the agitation from starting, but if for any reason the dangerous behavior unexpectedly returned I would simply unclip the lead line and calmly walk a short distance away, shutting the tack room door or stall door behind me to create a barrier. River's immediate response was to stop acting agitated and to follow me up to the door where she would stand like a sad puppy and wait for me to return. After a few minutes I would return and we would try again. At this point we would work towards just one great response and end the session there. This whole scenario played out maybe four times at most, each time more effective than the last until the agitated behavior subsided completely.
In this first scenario River did not know what she was supposed to be doing, so I chose a correction that wouldn't escalate her frustration but would also offer me personal safety. The reason I chose not to use a -P "no-reward marker" or a +P correction is that both of those would have only increased the level of agitation without offering me any personal safety. By removing myself I was able to give us both space and a barrier. River was able to process that "acting unsafe makes human and reward go away" in a calm, non-escalating way, while my relatively fragile human body stayed safe.
If I had chosen to smack her or back her up (+P), I could have very quickly escalated the situation and created an even more agitated horse that was bound to continue in a frustrated manner, especially since she was still attached to me. Potentially she would have taken the correction without becoming reactive, but had she become upset by the correction I would have been forced to continue struggling with her until she gave up the fight, at the same time risking harm to her and/or myself in the process.
Sure, it would have temporarily suppressed the behavior I didn't want, but it wouldn't have taught her how I did want her to behave. All River would have learned was that humans are very confusing and use ropes and fences to keep you trapped next to them while they punish you, and if you react they continue until you stop fighting. She would have learned to fear me.
I could have chosen to use another form of -P and used a "no reward marker" like I mentioned earlier. But this too would have been ineffective and might have exacerbated the level of agitation, since the reason River was frustrated to begin with was a lack of being "successful". By telling her what she already knew, that she wasn't earning a reward, I wouldn't have been helping her understand what it was that I did want so she could earn a reward.
This left the form of -P where I removed myself from the situation when necessary. But like I mentioned before, this was only a small part of what I did to amend the situation. The bulk of the process to eliminate the unsafe behavior happened by being clearer, more consistent, and more patient with my training to help River understand the behaviors I did want when leading. The correction was avoided as much as possible, and the responsibility was on me to be a better trainer.
For scenario number two, we are going to take that same behavior of leading with safe manners but fast forward many weeks. River had been leading like a seasoned pro for some time at this point, we had successfully moved past our earlier hiccups, and she was starting to show me that the behavior was fast approaching a level at which I would consider it a "known" behavior. Meaning, she knew how to act correctly and safely while on lead.
We were leading out to the pasture and her baby play antics got the better of her suddenly. She jumped up in a little playful bouncy half rear, tossing her head around and moving behind me. While I love seeing her play, playing on the lead line is not safe. I jiggled the lead rope fairly firmly, while saying "NO!" firmly (but not yelling, or upset), and then firmly, but patiently, guided her back to being beside me. We immediately followed up with practicing basic handling skills using positive reinforcement, reinforcing all calm and safe leading manners before returning to the pasture where she was allowed to play to her heart's content.
In this second situation I was able to apply a mild +P correction, to let her know that I didn't care for unsafe behavior, because she did know the correct way to act. River was just swept up in the moment and had forgotten about me temporarily. It was emotionless, humane, instantaneous, and brief. The correction met all of the four criteria.
However, the most vital parts of scenario number two that allowed me to choose a +P correction were that she understood what was desired of her from previous patient and consistent training, and that I followed up the correction with positively reinforcing the desired behavior. It's absolutely vital to remember you can't just punish the wrong behavior without offering reinforcement for the correct behavior. A lack of punishment is not a form of reinforcement.. which brings us back to my earlier point about how +P can be so grossly misused and misunderstood.
If all you ever do is correct and apply punishments you will be left with an animal that feels trapped and forced, and very possibly afraid. Just think about this from a parent to child point of view. If all a parent ever did was spank or time out a child but never once told them "THERE! That's it! Good job!" this poor child would feel it could never do anything right. It would live in constant fear of being corrected for even the simplest actions. Eventually the child would become shut down and suppressed.. never speaking, never playing, slave-like, lifeless, robotic.. you would likely forget the child was there for the most part. It's the same with any animal. Positive reinforcement should far outweigh the use of punishments, corrections, and penalties.
At the same time, without some corrections you run the risk of allowing potentially dangerous behaviors to develop and blossom into full-blown behavioral problems. I say this with extreme caution, since the human tendency is to overuse corrections and underuse positive reinforcement, but there are times when a horse will find acting in a potentially dangerous (to us) manner more important than earning a reward. However, these situations are few and far between.
To put into perspective the figurative ratio at which I believe corrections should exist in a training program I'm going to use a percentage example.
Positive Punishment corrections should be present in no more than .01% of the everyday training and handling of your horse. Negative Punishment corrections should take up no more than .10% of the everyday training and handling of your horse. Which means that 99.89% of all training should be based primarily on very patient and consistent use of positive reinforcement and negative reinforcement. (And in my opinion .. most of the 99.89% should be positive reinforcement.)
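To give those percentages a rough sense of scale, here is a small Python sketch of the arithmetic. The 100 hours of total handling time is an assumption chosen only for illustration, and the variable names are made up; the percentages are the figurative ones given above, so treat the result as a ballpark rather than a rule.

```python
from decimal import Decimal

# Figurative ceilings from the paragraph above, in percent.
PLUS_P_MAX = Decimal("0.01")   # positive punishment
MINUS_P_MAX = Decimal("0.10")  # negative punishment

# Whatever is left over is reinforcement (+R and -R).
reinforcement_min = Decimal("100") - PLUS_P_MAX - MINUS_P_MAX
print(f"Reinforcement: at least {reinforcement_min}%")

# Assumption for illustration only: 100 hours of total handling time.
total_seconds = Decimal("100") * 3600

plus_p_seconds = total_seconds * PLUS_P_MAX / 100
minus_p_seconds = total_seconds * MINUS_P_MAX / 100
print(f"+P: at most {plus_p_seconds} seconds")
print(f"-P: at most {minus_p_seconds} seconds")
```

Under that assumption, the ceilings work out to about 36 seconds of +P and about 6 minutes of -P across 100 hours of handling, which is the point: corrections should be a tiny sliver of the whole.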
It is 100% the responsibility of the handler/trainer to be aware of the methods they are applying and to take the responsibility of their horse's behavior upon themselves, even if you didn't do the bulk of your horse's training. A horse's behavior and actions are a direct result of training, environment, and genetics. We cannot change genetics (other than through selective breeding), but we can change the way we train and handle them. Punishing them for being the animals they were created to be is selfish, impatient, and inhumane. We need to educate ourselves and learn to work with the animals that we are blessed to have in our lives.