Looking At FanGraphs’ Position Player WAR
Really, the last thing I want to do right now is write about the Giants. Heading into today, they were ahead of the Padres by half a game, and this was after being shut out by Randy Frickin’ Wolf. Now they’ve lost 2-1 to Yovani Gallardo (which is, to be honest, somewhat expected) and that lead is gone.
I was over at FanGraphs’ forum earlier tonight, and there was a thread in which a poster asked how WAR is calculated. Since nobody picked it up, I jumped in and gave step-by-step instructions. I thought I’d re-post it here, along with some comments:
I’ll use MLB 2009 as an example to explain WAR for hitters (pitcher hitting is included; sorry for being a bit lazy):
1. Find the league runs per out. This is defined as Runs / (IP * 3). You’ll get .173. This is how you get your run values.
2. The run values are as follows:
NIBB = R/O + .14 = .31
HBP = NIBB + .025 = .34
1B = NIBB + .155 = .47
2B = 1B + .3 = .77
3B = 2B + .27 = 1.04
HR = 1.40
SB = .20
CS = -1*(2*R/O + .075) = -.42
3. Multiply each run value by the league frequency of its event, sum the products, divide by league outs (AB – H + SF), and multiply by -1. This is the value of the out.
4. Apply these coefficients to the player’s batting line. Let’s say we have +65.4 runs for Albert Pujols in 700 PA. Divide the +65.4 by his PA, for +.093 runs per PA. Multiply this by the sum of his intentional walks (44) and his sacrifice hits (0), and then add the result to the original total. This gives you +69.5 runs above the average.
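Steps 1 through 4 can be sketched in a few lines of Python. This is only one reading of the steps above, not FanGraphs’ actual code: the function names are made up for illustration, the league event totals needed for the out value are left as inputs rather than filled in, and the +65.4 figure is taken as given instead of being recomputed from a batting line.

```python
def run_values(runs_per_out):
    """Step 2: approximate run values built up from league runs per out."""
    nibb = runs_per_out + 0.14
    single = nibb + 0.155
    double = single + 0.30
    triple = double + 0.27
    return {
        "NIBB": nibb,
        "HBP": nibb + 0.025,
        "1B": single,
        "2B": double,
        "3B": triple,
        "HR": 1.40,
        "SB": 0.20,
        "CS": -(2 * runs_per_out + 0.075),
    }


def out_value(values, league_counts, league_outs):
    """Step 3: value of the out.

    league_counts maps event name -> league total (hypothetical input);
    league_outs is AB - H + SF.
    """
    total = sum(values[event] * count for event, count in league_counts.items())
    return -total / league_outs


def add_ibb_and_sh(runs_above_avg, pa, ibb, sh):
    """Step 4: credit IBB and SH at the player's own runs-per-PA rate."""
    rate = runs_above_avg / pa
    return runs_above_avg + rate * (ibb + sh)


values = run_values(0.173)
print({event: round(value, 2) for event, value in values.items()})
print(round(add_ibb_and_sh(65.4, 700, 44, 0), 1))  # 69.5
```

Rounded to two places, the dictionary matches the list in step 2, and the last line reproduces the +69.5 for Pujols.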
Right off the bat, you may be thinking “this doesn’t look anything like wRAA” (which is FanGraphs’ run estimator of choice for WAR). That’s because it’s hidden. To see it, add the run value of the out (in this case, 0.27) to the run value of each batting event. We do this in order to absorb the value of the out in the numerator, so we can rewrite it as a rate. This gives us coefficients of .74 for the single, 1.04 for the double, etc. (SB/CS remain untouched, however, because they’re not batting events). Multiply each coefficient by the league frequency of its event, sum, and divide by plate appearances sans the intentional walk. This gives you something like .270. Then solve for the OBP scale: if you want .330 as your average, Scale = .330/.270 = 1.22. Multiply the coefficients by the scale and you have your wOBA equation:
wOBA = (.72*NIBB + .75*HBP + .90*1B + 1.27*2B + 1.60*3B + 2.04*HR + .24*SB – .51*CS) / (PA – IBB)
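Here is a rough Python sketch of how those weights fall out of the run values. It assumes the rounded run values from step 2, the 0.27 out value, and the .270 unscaled league rate quoted above; the function and variable names are hypothetical.

```python
# Rounded run values from step 2 and the out value quoted in the text.
RUN_VALUES = {"NIBB": 0.31, "HBP": 0.34, "1B": 0.47, "2B": 0.77,
              "3B": 1.04, "HR": 1.40, "SB": 0.20, "CS": -0.42}
OUT_VALUE = 0.27
NON_BATTING = {"SB", "CS"}  # baserunning events do not absorb the out


def woba_weights(run_values, out_value, unscaled_league_rate, target=0.330):
    """Shift batting events by the out value, then scale to an OBP-like level."""
    scale = target / unscaled_league_rate
    weights = {}
    for event, value in run_values.items():
        shifted = value if event in NON_BATTING else value + out_value
        weights[event] = shifted * scale
    return weights, scale


weights, scale = woba_weights(RUN_VALUES, OUT_VALUE, unscaled_league_rate=0.270)
print(round(scale, 2))
print({event: round(weight, 2) for event, weight in weights.items()})
```

With these rounded inputs the sketch lands on the same weights as the equation for most events, though the non-intentional walk comes out a hundredth lower (.71), presumably because the equation was built from unrounded values.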
wRAA (Weighted Runs Above Average), then, would be (wOBA – League wOBA) / Scale * PA. In this case, League wOBA is .330 and the Scale is 1.22. I’m actually a bit bothered by the fact that FanGraphs doesn’t point this out: the value of the scale isn’t always the 1.15 that Dave Cameron writes. It’s probably closer to 1.18-1.22. People commonly cite 1.15 because it’s the value listed in The Book. But because it depends so heavily on the run environment and the scale you want to reach, it’s going to vary quite a bit. Anyway, subtracting the league wOBA and dividing by the scale puts us back in runs above average per plate appearance, and when you multiply that by plate appearances, you have total runs above average. What I did in step four is what Terpsfan suggested to me in one of my older posts a while back: we treat the run value of an intentional walk and a sacrifice hit as equal to the player’s overall rate of performance. This is because the player has no control over being intentionally walked (that’s the choice of the opposition) or being asked to lay down a sacrifice (the choice of the manager). So we assume he’ll perform the same in those plate appearances as he would in all others.
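To see how much the choice of scale matters, here is a small Python sketch of the wRAA formula. The hitter (a .400 wOBA over 600 PA) is made up purely for illustration, and the function name is hypothetical.

```python
def wraa(woba, league_woba, scale, pa):
    """Runs above average from wOBA: undo the scale, then multiply by PA."""
    return (woba - league_woba) / scale * pa


# Hypothetical hitter: .400 wOBA over 600 PA in a .330 league.
for scale in (1.15, 1.22):
    print(scale, round(wraa(0.400, 0.330, scale, 600), 1))
```

For this made-up hitter the two scales land roughly two runs apart, which is small but not nothing when ten runs is a win.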
I’m actually not a big fan of the way FanGraphs calculates their batting runs. That’s not because it’s not a sound design- it is. Linear Weights are the ideal method for assessing an individual player’s offense. But the way they generate the weights is exceptionally lazy, in my opinion. The “proper” way of going about this would be to use the “value-added” method, by looking at the average change in run expectancy per event. They have the data at hand. What they’re doing is approximating the weights based on assumptions of what the weights should be. All in all, we’re probably not going to see a huge difference between empirically-derived weights and these- but I’d rather be more theoretically accurate than work with assumptions that won’t always hold up. Also, I’m of the opinion that ROE should be included. For some players, we’re talking about +/- five runs per season, and a guy like Derek Jeter, who has accumulated approximately 35 runs over the course of his career by reaching base on an error, is being shorted about 3.5 wins of career value.
The beautiful thing about WAR is that it’s a framework, and the components can be altered. But for a site like FanGraphs, which is often cited for their WAR, it would behoove them to try and tighten up their run estimator a bit.
5. Apply a park factor. I’m not sure how FanGraphs does it- I use League Runs per PA * (1 – Park Factor) * Player PA. Doing this with a PF of .98 (that’s my guess for STL’s PF, I don’t have FG’s or my PF in front of me) gives us +1.68 runs. Add this to #4, and we get a total of +71.2 runs above the average.
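The park adjustment I described can be written as a one-line Python function. Note the assumption: the league runs per PA of .12 is not a looked-up figure, it is simply the value that reproduces the +1.68 above with a .98 park factor and 700 PA. The names are hypothetical.

```python
def park_adjustment(league_runs_per_pa, park_factor, pa):
    """Runs added (pitcher's park) or removed (hitter's park) for a player."""
    return league_runs_per_pa * (1 - park_factor) * pa


LEAGUE_R_PER_PA = 0.12  # assumption: back-solved from the +1.68 figure
print(round(park_adjustment(LEAGUE_R_PER_PA, 0.98, 700), 2))  # 1.68
```

A park factor under 1 gives the hitter runs back, a factor over 1 takes runs away, and a neutral park leaves the total alone.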
I’m always fascinated by park factors, as some are extremely comprehensive. I can’t comment on their PF, as I don’t know how they go about it. I assume they park adjust the player’s runs above average by dividing their wOBA by the square root of the park factor. The method I use is the same one outlined in The Hidden Game of Baseball.
6. Add a replacement level. FanGraphs sets replacement level at +20 runs per 600 plate appearances (the equivalent of 150 games). Albert had 700 PA; that’s 20 * 700/600 = +23.3 runs of replacement. Add this to step 5 and we have +94.5 runs above replacement. But we’re not done.
Replacement Level varies from WAR to WAR. I believe FanGraphs implements an overall replacement level around a .290 winning percentage; Sean Smith’s WAR over at Baseball-Reference.com is around .350, if I remember right. FanGraphs doesn’t adjust for league quality, which I would recommend. I use the same one Tango suggests- 20 runs per 650 PA for the NL, 25 per 650 for the AL. Again, no complaints here.
7. Add in the player’s positional adjustment. Per 162 games (I don’t know if they do it by innings or by games participated in; I assume it’s the former), we have the following values: +12.5 C, -12.5 1B, +2.5 2B/3B/CF, -7.5 LF/RF, +7.5 SS, -17.5 DH. Albert had a positional adjustment of -11.9 runs for playing first base. Add this to step 6; that’s 94.5 – 11.9 = +82.6 runs.
8. Add the player’s UZR. Albert was a +3.1; that’s +85.7 Runs Above Replacement.
I’ve got no qualms with the positional adjustments, as they seem right to me. I do find it interesting, though, that they’re still using straight UZR rather than averaging it with the Plus/Minus system. I don’t know why they use one and not the other, since that amounts to favoring one system. I’m assuming they keep it to UZR simply for consistency, as that’s what they began WAR with- but there’s nothing wrong with making a change. Also, MGL recently suggested in a post over at The Book Blog that if we’re looking to measure WAR and if we incorporate UZR, we should regress the UZR figure. I wonder if they’ll begin doing that…just some food for thought.
9. Divide Step 8 by 10 to convert from runs to wins. 85.7/10 = 8.6 Wins Above Replacement.
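Steps 6 through 9 are just addition and one division, which a short Python sketch makes easy to check. The inputs are the numbers from the walkthrough (the +71.2 park-adjusted batting runs, the -11.9 positional adjustment, the +3.1 UZR); the function names are made up for illustration.

```python
def replacement_runs(pa, runs_per_600=20.0):
    """Step 6: prorate the replacement-level credit by plate appearances."""
    return runs_per_600 * pa / 600


def wins_above_replacement(batting_runs, replacement, positional, fielding,
                           runs_per_win=10.0):
    """Steps 6-9: sum the components, then convert runs to wins."""
    rar = batting_runs + replacement + positional + fielding
    return rar, rar / runs_per_win


rep = replacement_runs(700)
rar, war = wins_above_replacement(71.2, rep, -11.9, 3.1)
print(round(rep, 1), round(rar, 1), round(war, 1))  # 23.3 85.7 8.6
```

Keeping each component as its own argument is the point of the framework: any one piece can be swapped for a different estimate without touching the rest.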
FanGraphs has Albert at 8.7 WAR. That’s pretty close, as I didn’t make an adjustment for pitcher hitting or for the leagues, and we’re likely using a different park factor (I pulled that number out of my rear).
I hope that helps some.
Again, this is working under an assumption: the conversion of runs to wins isn’t always 10; that’s a rule of thumb. If I remember right, the PythagenPat method of runs to wins is something like (0.75*Runs per Game + 2.75), where Runs per Game counts both teams. For 2009 MLB, that means a 4.6 runs per team per game environment (9.2 for both teams combined) would yield a runs to wins conversion of 9.7. If Albert is 86.6 RAR, that would make him 9.0 WAR, not 8.7.
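As a quick check on that arithmetic, here is the rule of thumb in Python. It assumes that Runs per Game means both teams combined (4.6 per team, so 9.2 total), since that is the reading that reproduces the 9.7 figure; the function name is hypothetical.

```python
def runs_per_win(total_runs_per_game):
    """Rule-of-thumb runs-to-wins converter.

    total_runs_per_game is assumed to count both teams combined.
    """
    return 0.75 * total_runs_per_game + 2.75


rpw = runs_per_win(2 * 4.6)
print(round(rpw, 2))         # 9.65
print(round(86.6 / rpw, 1))  # 9.0
```

Using the unrounded 9.65 rather than 9.7 is what gets 86.6 RAR to 9.0 wins.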
Anyway, those are just some of my thoughts (ramblings) on their WAR. I love FanGraphs, but I think they could tighten things up a little bit. Baseball Prospectus is about to make a pretty big push with all of the changes Colin Wyers is bringing, and FanGraphs might have to up the ante a little bit, at least in my eyes, for me to keep going there as my major source of statistics (aside from Baseball-Reference, of course).