# Matt Cain and FIP/xFIP

I’m coming out of my study cave for a bit so I can add a quick two cents to a topic that got some play yesterday.

Paapfly has a post regarding Matt Cain and his consistent ability to outperform his FIP and his xFIP. A number of people, by and large non-Giants fans, have referred to Cain as being “lucky” for owning Earned Run Averages significantly lower than his FIP and his xFIP, and have argued that he’s due to regress. Cain will undoubtedly regress at some point in his career, but it wouldn’t surprise me one bit if he continued this trend for a few more seasons. Personally, I don’t really care for FIP or xFIP because of the way they are handled, and I figure this is as good a time as any to address my issues with the way people are using them.

First of all, what is FIP?

It stands for Fielding Independent Pitching, and it is a DIPS (Defense Independent Pitching Statistic) created by esteemed saberist Tom Tango. The formula is excruciatingly simple:

FIP = ((13 × HR + 3 × (BB + HBP) - 2 × K) / IP) + C

Where *C* is a constant designed so that the league FIP equals the league ERA. The constant typically sits around 3.2, but it varies depending on the league, year, and run environment. There seems to be a common misconception that the numbers are “drawn out of thin air” or that FIP is based on an extraordinarily complex formula. Really, it’s not. If you know Tango, you know he likes to keep things simple. The coefficients are derived from linear weights, which are the average run values of events. The run values for different events in 2010, for example, are as follows: .48 for singles, .77 for doubles, 1.06 for triples, 1.42 for home runs, .33 for walks and hit by pitches (I’m including IBB in these figures), and -.27 for outs. Multiply each run value by the frequency of the event *per ball in play*, and you’ll find the run value of a ball in play is around -.03 runs. Subtract this from each of the other run values (in other words, add about .03 to each) so that balls in play are worth exactly zero. This gives us .36 for BB+HBP, 1.44 for HR, -.25 for non-batted-ball outs (strikeouts), and 0.00 for balls in play. Multiply by nine to put everything on the same per-nine-innings scale as ERA. Now you have 3.2 for BB+HBP, 13.0 for HR, and -2.2 for non-batted-ball outs. And…there you go. Tango rounds the coefficients for simplicity’s sake, but you’re really never going to see a major discrepancy in the coefficients, at least at the Major League level.
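For anyone who wants to check the arithmetic, here is a minimal Python sketch of the steps above. The run values are the 2010 figures quoted in this post; the function names, the league totals, and the pitcher line are all made up for illustration, so treat this as a rough walk-through and not as Tango’s own code. Because the inputs are already rounded, the coefficients come out near 3.2, 13.0, and -2.2 but not exactly on them.

```python
# Run values quoted in the post for 2010 (runs per event).
RUN_VALUES = {"BB+HBP": 0.33, "HR": 1.42, "K": -0.27}
BIP_RUN_VALUE = -0.03  # average run value of a ball in play, per the post


def fip_coefficients(run_values, bip_value, innings=9):
    """Re-center run values so a ball in play is worth zero, then scale to nine innings."""
    return {event: (value - bip_value) * innings for event, value in run_values.items()}


def fip_core(hr, bb, hbp, k, ip):
    """FIP before the constant, using the rounded 13/3/2 weights.

    `ip` must be decimal innings (a box-score 223.1 means 223 1/3 innings).
    """
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip


def fip_constant(lg_era, lg_hr, lg_bb, lg_hbp, lg_k, lg_ip):
    """Constant C chosen so that league FIP equals league ERA."""
    return lg_era - fip_core(lg_hr, lg_bb, lg_hbp, lg_k, lg_ip)


def fip(hr, bb, hbp, k, ip, constant):
    """Fielding Independent Pitching for one pitcher."""
    return fip_core(hr, bb, hbp, k, ip) + constant


coefficients = fip_coefficients(RUN_VALUES, BIP_RUN_VALUE)

# Hypothetical league totals and pitcher line, NOT real data.
c = fip_constant(lg_era=4.00, lg_hr=100, lg_bb=300, lg_hbp=30, lg_k=700, lg_ip=1000.0)
pitcher_fip = fip(hr=20, bb=60, hbp=5, k=180, ip=220.0, constant=c)

print(coefficients)
print(round(c, 2), round(pitcher_fip, 2))
```

Note that the constant is computed from league totals and is not hard-coded, which is why it moves with the league, year, and run environment.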

That’s all there is to it. It’s a linear run estimator that regresses *all* balls in play to the league average. 100% regression to the mean, which is what Robert “Voros” McCracken did with his original dERA. Of course, pitching is not that simple- and this is why I’m so bothered by the frequent usage of it and treatment of it as gospel: it is far from perfect. You see, FIP is merely one component of pitching- sort of like on-base percentage or slugging percentage for hitters. It works better than ERA as a predictor for next-season ERA, yes, but it’s far from ideal and far from definitive. Pitchers have different ball in play distributions, and pitchers have different sequencing patterns. FIP just looks at the basic events and makes a reasonable guess as to how they “should” have performed. It’s essentially a shorthand version of McCracken’s dERA.

And what of xFIP?

xFIP attempts to regress the one batted-ball portion of the equation, home runs, to the league average: instead of a pitcher’s actual home runs, it uses the number he would have allowed if his fly balls had left the park at the league-average HR/FB rate. This helps predict next-season ERA pretty well, but that doesn’t mean it works the same for all pitchers. Again, it’s largely ignoring batted ball types aside from outfield flies. It wouldn’t surprise me one bit if pitchers like Cain, who induce a lot of popups, have a tendency to suppress their HR/FB rate. Just looking at some pitchers from 2006-2010, the top 10 pitchers in infield fly rate average a 9.4% HR/FB rate, which is about 2% below the league average (keep in mind this is a crude look; it certainly warrants a deeper investigation). xFIP takes a simple construct and makes it slightly more complex.
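To make the difference concrete, here is a small Python sketch of xFIP next to FIP. It assumes the usual construction, in which actual home runs are swapped for fly balls times the league HR/FB rate; the pitcher line, the 3.2 constant, and the 11% league rate are placeholder numbers I picked for illustration, not real data.

```python
def fip(hr, bb, hbp, k, ip, constant):
    """FIP with the rounded 13/3/2 weights; `ip` is decimal innings."""
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


def xfip(fly_balls, bb, hbp, k, ip, constant, lg_hr_per_fb):
    """xFIP: FIP with home runs replaced by fly balls times the league HR/FB rate."""
    expected_hr = fly_balls * lg_hr_per_fb
    return fip(expected_hr, bb, hbp, k, ip, constant)


# Hypothetical pitcher who allows 14 HR on 200 fly balls (a 7% HR/FB rate).
line = dict(bb=60, hbp=5, k=180, ip=220.0, constant=3.2)

actual = fip(hr=14, **line)
expected = xfip(fly_balls=200, lg_hr_per_fb=0.11, **line)  # assumed league rate
gap = expected - actual

print(round(actual, 2), round(expected, 2), round(gap, 2))
```

A pitcher who really does suppress home runs on fly balls will carry a gap like this every year, and xFIP will call it luck every year.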

Really, we shouldn’t pay *that* much attention to FIP and xFIP. Do they have substance? Yes, in that they have some predictive value and they’re one thing to look at among many when evaluating a pitcher. But the simplicity of the formulae prevents them from accurately predicting a more diverse range of pitchers. On average, they work pretty well, but that doesn’t mean that pitchers like Cain fit that profile. Cain and pitchers like him are not terribly overrated because they outperform their FIP and xFIP. They’re underrated, because the metrics don’t account for a number of things that are a part of pitching.

And another thing: there is really no reason whatsoever why FIP should be used as a value metric. FanGraphs uses it for WAR, and this makes absolutely no sense to me. If we’re looking for a pitcher’s context-neutral value, we must make an adjustment for the types of balls in play he allows. FanGraphs uses a very complex metric in UZR; why not use tERA for pitchers? It’s a simple adjustment, and theoretically speaking, it makes a heck of a lot more sense to use that for pitcher value than something that assumes *all* balls in play are equal.

That is good stuff and a solid explanation. I was intrigued by this discussion as well and wanted to look further into whether pitchers have some sort of control over their home run rate.

The conclusion after some quick looking around is that maybe certain types of pitchers do. Namely, pitchers who throw a hard, rising fastball have a lower expected home run rate than league average.

The research isn’t perfect, as my data wasn’t exactly what I wanted, but I thought the results were pretty interesting.

Full explanation of the idea:

http://bit.ly/ePGfZQ

Thank you for this, JT! People need to understand that FIP and xFIP are predictive stats. They’re not to be treated as the sole evaluator of a pitcher’s talent. Furthermore, pitcher WAR is meaningless. In the anti-Matt Cain arguments I’ve read, people have often brought up WAR, and it really doesn’t prove anything. I don’t think people understand how “rough” pitcher WAR is at this point…it’s ridiculous that it treats all balls in play the same.

I was actually just discussing how using FIP in any kind of value calculation doesn’t make a lot of sense, glad to see it’s not just me! I’m a big fan of tERA in that regard, and I think it’d make WAR for pitchers a more useful stat. Ultimately all stats need to be measured against each other and a lot of other things though, so it’s always a question of knowing how much weight to put into anything that is distilling a whole lot of factors down to something concrete.

If we have a tight fit for the discussed metrics for 90 percent of pitchers, it seems like a good general metric. If one pitcher falls into outlier status for one year, then perhaps regression to the mean is a reasonable assumption. But if the same pitcher continues to be an outlier in the same direction (i.e., better than predicted in this case), then clearly the metric fails to capture independent variables that contribute to that pitcher’s success. I thought your post did an excellent job of pointing out what some of those factors could be. Thank you for your well-thought-out post.

This page definitely has all the information and facts I needed about this subject and didn’t know who to ask.