What is Wins Above Replacement (WAR)? The New, Complicated Baseball Statistic
What is it?
Wins Above Replacement is a relatively new statistic, conceived to holistically evaluate major league baseball players. A simple explanation is possible, but most readers will want to know how all the nuances of baseball can be combined into a single number. A player's WAR value is expressed in wins: wins of baseball games at the MLB level. These wins are measured against the value that the average AAA call-up or waiver wire player would provide. If a player has a 0 WAR, he is performing at exactly the level his average replacement would. If his WAR is negative, he is even worse than the hypothetical replacement. Rarely does a player fall as far as -1, though it occasionally happens. Positive WAR numbers often climb much higher, since you do not get benched for playing well. For instance, FanGraphs' WAR leader in 2012 was Mike Trout with 10; he led second place, Robinson Cano, by over 2 wins. This means Mike Trout made his team 10 wins better than it would have been with a 0 WAR player.
This sounds nice, but how can statisticians possibly determine how many wins a single player is worth? Let's check out the process.
Calculating WAR for Offensive Players
For a complicated statistic like WAR, the process of calculation is not extremely difficult to grasp. Unfortunately, though, it does rely on advanced sabermetrics. There are a couple of different ways to calculate WAR, so I will explain the method used by FanGraphs and indicate later how Baseball-Reference differs. The first and most important statistic to know is wRAA, weighted runs above average.
wRAA constitutes the batting component of WAR. wRAA will sound somewhat similar to WAR in that it is a measure of how many runs a player contributes compared to an average player. Bear in mind that WAR measures against a replacement-level player while wRAA measures against an average MLB player, so there is some sophisticated math involved in making it an apples to apples comparison.
To get a wRAA value, you start with wOBA (weighted on-base average), another advanced metric. wOBA is, basically speaking, a combination of batting average, on-base percentage, and slugging percentage. It assigns a weight to each batting event based on how valuable that event is, and then shows the average value of a given player's plate appearance. wRAA converts wOBA from an average into a run value, based on how many times a player batted and on how much offense there is in the league that season.
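To make the wOBA-to-wRAA step concrete, here is a minimal Python sketch. It assumes the commonly cited form wRAA = (wOBA - league wOBA) / wOBA scale x plate appearances. The event weights, league wOBA, and scale below are illustrative placeholders, not official values (the real ones are recalculated every season), and the wOBA denominator is simplified to plate appearances.

```python
# Illustrative linear weights only; real weights change from season to season.
ILLUSTRATIVE_WEIGHTS = {
    "bb": 0.69,
    "hbp": 0.72,
    "single": 0.89,
    "double": 1.27,
    "triple": 1.62,
    "hr": 2.10,
}


def woba(events, plate_appearances, weights=ILLUSTRATIVE_WEIGHTS):
    """Average weighted value of a plate appearance (simplified denominator)."""
    if plate_appearances <= 0:
        raise ValueError("plate_appearances must be positive")
    total = sum(weights[name] * count for name, count in events.items())
    return total / plate_appearances


def wraa(player_woba, league_woba, woba_scale, plate_appearances):
    """Convert a wOBA rate into runs above (or below) an average hitter."""
    return (player_woba - league_woba) / woba_scale * plate_appearances


# Hypothetical season: a .400 wOBA hitter in a .320 wOBA league over 600 PA.
example_runs = wraa(0.400, 0.320, woba_scale=1.25, plate_appearances=600)
```

With these made-up inputs the hitter comes out about 38.4 runs better than an average batter. The same rate over half the plate appearances would be worth half as many runs, which is why the conversion from an average to a run total matters.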
Then, roughly 10 runs of wRAA are considered 1 win for WAR.
Fielding is the next major component of WAR. Ultimate Zone Rating is the latest and greatest measure of defense, which uses information from Stats, Inc. in which every single defensive play is reviewed by a human and assigned numeric values for range, difficulty, etc. The numeric value for UZR is how many runs are saved or given up by a player versus an average defender at that position. It is, basically, a defensive version of wRAA.
The final aspect of the computation for a position player is baserunning. In the arm portion of UZR, outfielders are given credit or charged value when a baserunner takes an extra base, is thrown out, or holds up (such as a single with a man on first base: does he go to third or not?). While outfielders are given defensive value for this, the opposite value is given to the baserunner. On plays that don't involve outfielders, a baserunner can get baserunning value as well, such as stealing a base or beating the lead throw on a potential double play grounder. This value is converted to runs and contributes to wins much the same way.
The final consideration is position-by-position variance. If a shortstop and a first baseman hit the same way, for instance, WAR values the shortstop more. Good defense is also worth more at the more demanding positions.
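As a rough illustration of how the pieces described so far fit together, the sketch below adds the run components and divides by the 10-runs-per-win rule of thumb. The field names and the example numbers are hypothetical, the positional and replacement-level adjustments are passed in as plain run values rather than derived, and the league adjustment is left out, so treat this as a simplification rather than FanGraphs' exact formula.

```python
from dataclasses import dataclass

RUNS_PER_WIN = 10.0  # rule of thumb from the article; the real value varies by season


@dataclass
class PositionPlayerRuns:
    batting: float      # wRAA
    fielding: float     # UZR
    baserunning: float  # baserunning runs
    positional: float   # assumed adjustment: positive for harder positions
    replacement: float  # assumed gap between an average and a replacement player


def position_player_war(runs, runs_per_win=RUNS_PER_WIN):
    """Sum the run components and convert the total to wins."""
    total_runs = (
        runs.batting
        + runs.fielding
        + runs.baserunning
        + runs.positional
        + runs.replacement
    )
    return total_runs / runs_per_win


# Made-up season for a good-hitting shortstop.
season = PositionPlayerRuns(
    batting=30.0, fielding=5.0, baserunning=3.0, positional=2.0, replacement=20.0
)
war = position_player_war(season)
```

Here the made-up season totals 60 runs above replacement, or 6 WAR. Swapping the positive positional value for a negative one, as would apply to a first baseman, lowers the result even though the hitting line is unchanged.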
Calculating WAR for Pitchers
The value assigned to pitchers is somewhat simpler, since they don't have so many things to measure. They just pitch. The statistic used for this is called FIP, fielding independent pitching.
This statistic filters out exceptionally good or bad fielding from pitching statistics, as well as ballpark effects, so a pitcher who has played in PETCO Park or Coors Field does not ruin the validity of the measure. The weighting takes a good deal of statistical prowess, but FIP is well received in the baseball community.
The resulting number is meant to be a "substitute" for ERA, in that it means roughly the same thing. For WAR, this is converted to a run value based on innings pitched; a pitcher with a 2.50 FIP is worth far more if he pitches 200 innings rather than 50, so converting from an average to runs takes care of that.
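The innings-pitched point can be sketched in Python as well. The FIP function uses the widely published form (13 x HR + 3 x (BB + HBP) - 2 x K) / IP plus a league constant. The constant of 3.10 and the replacement-level figure of 5.00 are assumptions chosen for illustration, and the runs conversion is a simplification that skips the park and league adjustments a real WAR calculation applies.

```python
def fip(hr, bb, hbp, k, ip, constant=3.10):
    """Fielding independent pitching; `constant` is an assumed league value."""
    if ip <= 0:
        raise ValueError("ip must be positive")
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


def runs_above_replacement(pitcher_fip, ip, replacement_fip=5.00):
    """Simplified conversion of a per-nine-innings rate into total runs saved.

    `replacement_fip` is a hypothetical replacement-level rate, not an
    official figure.
    """
    return (replacement_fip - pitcher_fip) / 9 * ip


# The article's comparison: the same 2.50 FIP over 200 innings and over 50.
workhorse = runs_above_replacement(2.50, ip=200)
short_stint = runs_above_replacement(2.50, ip=50)
```

Under these assumptions the 200-inning pitcher saves four times as many runs as the 50-inning pitcher despite the identical FIP, which is the reason the rate is turned into a run total before being converted to wins.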
FanGraphs vs Baseball-Reference
Baseball-Reference debuted the WAR statistic, while FanGraphs brought their own version to the table shortly thereafter. There are important differences in their calculations which lead to them being more and less useful for certain situations. I have explained the calculations so far based on FanGraphs' method. When referring to them, many will use rWAR for Baseball-Reference and fWAR for FanGraphs.
Baseball-Reference does not use FIP for pitchers, instead using a proprietary mixture of fielding and ballpark adjustments to the raw runs allowed stat.
For defense, Baseball-Reference uses Total Zone instead of UZR. TZ is useful because it does not require video review, opening up all of baseball history to WAR values. FanGraphs uses TZ for seasons before video review began (2002).
Baserunning is measured similarly, but with slightly different equations.
The respective systems of weighting for position and making year-by-year league adjustments have subtle differences as well. Conceiving of a replacement player remains an inexact science, since the player does not exist.
Baseball-Reference's system, just by nature of being consistent, is useful for comparing players of all eras. FanGraphs would likely have more supporters when it comes to looking at recent seasons, due to its use of FIP and UZR, well-liked statistics in the sabermetrics community.