Matt Cain and FIP/xFIP
I’m coming out of my study cave for a bit so I can add a quick two cents to a topic that got some play yesterday.
Paapfly has a post regarding Matt Cain and his consistent ability to outperform his FIP and his xFIP. A number of people- by and large non-Giants fans- have referred to Cain as being “lucky” for owning Earned Run Averages significantly lower than his FIP and his xFIP, and have argued that he’s due to regress. Cain will undoubtedly regress at some point in his career, but it wouldn’t surprise me one bit if he continued this trend for a few more seasons. Personally, I don’t really care for FIP or xFIP due to the way they are handled- and I figure this is as good a time as any to address my issues with the way people are using them.
First of all, what is FIP?
It stands for Fielding Independent Pitching, and it is a DIPS (Defense Independent Pitching Statistic) created by esteemed saberist Tom Tango. The formula is excruciatingly simple:
((13*HR + 3*(BB+HBP) - 2*K)/IP) + C
Where C is a constant designed so that the league FIP equals the league ERA. The constant typically sits around 3.2, but it varies depending on the league, year, and run environment. There seems to be a common misconception that the numbers are “drawn out of thin air” or that they’re based on an extraordinarily complex formula- really, they’re not. If you know Tango, you know he likes to keep things simple. The coefficients are derived from linear weights, which are the average run values of events. The run values for different events in 2010, for example, are as follows: .48 for singles, .77 for doubles, 1.06 for triples, 1.42 for home runs, .33 for walks and hit by pitches (I’m including IBB in these figures), and -.27 for outs. Multiply each run value by the frequency of the event per ball in play, and you’ll find the run value of a ball in play is around -.03 runs. Subtract this from each run value (in other words, add .03) so that balls in play come out to zero. This gives us roughly .36 for BB+HBP, 1.44 for HR, -.25 for non-batted-ball outs, and 0.00 for balls in play. Multiply by nine to put them on a per-nine-innings scale. Now you have 3.2 for BB+HBP, 13.0 for HR, and -2.2 for non-batted-ball outs. And…there you go. Tango rounds the coefficients for simplicity’s sake, but you’re really never going to see a major discrepancy in the coefficients, at least at the Major League level.
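For anyone who wants to check the arithmetic, here is a rough Python sketch of the steps above. It re-centers each run value by subtracting the ball-in-play value (which is the same as adding .03), scales by nine, and then applies the rounded formula. The function names and the sample stat line are made up for illustration, and the run values are just the 2010 figures quoted above, so the intermediate decimals can differ by a hundredth or so from the ones in the text.

```python
# Run values quoted in the post for 2010 (runs per event).
RUN_VALUES = {"BB+HBP": 0.33, "HR": 1.42, "K": -0.27}

# Average run value of a ball in play, per the post (approximate).
BIP_RUN_VALUE = -0.03


def fip_coefficients(run_values, bip_value, innings=9):
    """Re-center run values so a ball in play is worth zero, then scale to nine innings."""
    return {
        event: (value - bip_value) * innings
        for event, value in run_values.items()
    }


def fip(hr, bb, hbp, k, ip, constant=3.2):
    """Rounded FIP formula; the 3.2 default is only a typical constant, not a fixed one."""
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


coefficients = fip_coefficients(RUN_VALUES, BIP_RUN_VALUE)
for event, weight in coefficients.items():
    print(f"{event}: {weight:.2f} (rounded: {round(weight)})")

# Hypothetical stat line, NOT any real pitcher's numbers.
print(f"FIP: {fip(hr=20, bb=60, hbp=5, k=180, ip=200):.2f}")
```

The rounded weights land on 3, 13, and -2, which is the whole point: the familiar formula is just linear weights with balls in play zeroed out.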
That’s all there is to it. It’s a linear run estimator that regresses all balls in play to the league average. 100% regression to the mean, which is what Robert “Voros” McCracken did with his original dERA. Of course, pitching is not that simple- and this is why I’m so bothered by how often it’s used and treated as gospel: it is far from perfect. You see, FIP is merely one component of pitching- sort of like on-base percentage or slugging percentage for hitters. It works better than ERA as a predictor for next-season ERA, yes, but it’s far from ideal and far from definitive. Pitchers have different ball-in-play distributions, and pitchers have different sequencing patterns. FIP just looks at the basic events and makes a reasonable guess as to how a pitcher “should” have performed. It’s essentially a shorthand version of McCracken’s dERA.
And what of xFIP?
xFIP attempts to regress the one batted-ball portion of the equation, home runs, to the league average. This helps predict next-season ERA pretty well, but that doesn’t mean it works the same for all pitchers. Again, it’s largely ignoring batted ball types aside from outfield flies. It wouldn’t surprise me one bit if pitchers like Cain- who induce a lot of popups- have a tendency to suppress their HR/FB rate. Just looking at some pitchers from 2006-2010, the top 10 pitchers in infield fly rate average a 9.4% HR/FB rate, which is about 2 percentage points below the league average (keep in mind this is a crude look; it certainly warrants a deeper investigation). xFIP takes a simple construct and makes it slightly more complex.
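To put a rough number on that, here is a small Python sketch of xFIP as I understand it: the same formula as FIP, except actual home runs are swapped out for fly balls times the league HR/FB rate. The function name, the 11% league rate (my approximation of the league average), the 3.2 constant, and the stat line are all assumptions for illustration, not real data for Cain or anyone else.

```python
def xfip(fly_balls, bb, hbp, k, ip, league_hr_per_fb=0.11, constant=3.2):
    """FIP with home runs replaced by fly balls times an assumed league HR/FB rate."""
    expected_hr = fly_balls * league_hr_per_fb
    return (13 * expected_hr + 3 * (bb + hbp) - 2 * k) / ip + constant


# Hypothetical pitcher: 250 fly balls over 200 innings.
line = dict(fly_balls=250, bb=60, hbp=5, k=180, ip=200)

# What xFIP assumes: a league-average HR/FB rate.
at_league_rate = xfip(**line)

# Same pitcher if his "true" HR/FB rate were the 9.4% seen among the popup-heavy group.
at_popup_rate = xfip(**line, league_hr_per_fb=0.094)

print(f"xFIP at league HR/FB: {at_league_rate:.2f}")
print(f"xFIP at 9.4% HR/FB:   {at_popup_rate:.2f}")
print(f"Gap: {at_league_rate - at_popup_rate:.2f} runs per nine")
```

For this made-up line the gap works out to about a quarter of a run per nine innings, which is the kind of difference that gets chalked up to “luck” if the pitcher really does suppress home runs on fly balls.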
Really, we shouldn’t pay that much attention to FIP and xFIP. Do they have substance? Yes, in that they have some predictive value and they’re one thing to look at among many when evaluating a pitcher. But the simplicity of the formulae prevents them from accurately predicting a more diverse range of pitchers. On average, they work pretty well- but that doesn’t mean that pitchers like Cain fall under that profile. Cain and pitchers like him are not terribly overrated because they outperform their FIP and xFIP- they’re underrated, because the metrics don’t account for a number of things that are a part of pitching.
And another thing: there is really no reason whatsoever why FIP should be used as a value metric. FanGraphs uses it for WAR, and this makes absolutely no sense to me. If we’re looking for a pitcher’s context-neutral value, we must make an adjustment for the types of balls in play he allows. FanGraphs uses a very complex metric in UZR; why not use tERA for pitchers? It’s a simple adjustment, and theoretically speaking, it makes a heck of a lot more sense to use that for pitcher value than something that assumes all balls in play are equal.