
An Announcement

March 13, 2011

Triples Alley may be coming to an end.  Grant Brisbee of McCovey Chronicles has asked me to come on board as his “stats guy,” and I have happily accepted his offer.  I feel guilty for being unable to write here on a consistent basis without anything in between to keep you guys engaged- and moving to a site where there will be constant content, regardless of whether or not I’ll be able to write multiple times a week, works better for my peace of mind and for my readers.

I say the site “may” be coming to an end because 1) I don’t think I could fully say goodbye (yes, I am a sentimental kind of guy), and 2) because I may still post here on occasion.  Since McCovey Chronicles is a Giants-centered site, and THT Live is analysis-centered, this site may remain as an outlet for somewhere in-between.  I don’t know- I haven’t made that decision just yet.  There’s also 3) the possibility that Dylan may come back at some point in time, and I’d like to keep the site open for him should he find the time to write again.

I want to thank everyone that has read and supported our site.  What first began as a simple outlet for Dylan and myself has turned into something larger than I ever expected.

If you’re wondering when my first post will be over at The Chronicles, it’ll likely be sometime in late March.  This is my last week of undergraduate studies- in other words, it’s finals week for me.  I’ll be heading out to Arizona to watch some Spring Baseball, and I should find some time to put together my first post there.

Understanding Linear Weights

March 9, 2011

It’s so nice to have a night where I actually have some free time to write.  I’ll be able to start writing with regularity in about a week and a half or so, right around the time I make my yearly trip to Arizona for a few days of good ol’ Spring Training.  I don’t really have anything in particular to write about, and I’m not really in the mood to do player analysis, so I thought I’d write a little bit about a very important sabermetric principle that’s found its way into essentially every aspect of sabermetrics- linear weights.  And the more I think about it, if you understand linear weights (hereafter referred to as “LW” or “LWTS”), you’ll understand a lot about sabermetrics.

A brief history on LW, and how they are applied

The history of LW begins well before a night custodian by the name of Bill James, when a man named Ferdinand Cole Lane built a weighted system for measuring the impact of hitting events.  This was later picked up by George Lindsey, who recorded detailed play-by-play data on over 1,000 Major League games and produced what is referred to as a run expectancy matrix, which tells us the average number of runs a team can expect to score through the end of the inning from a particular base-out state.  In 2010, for example, a team was expected to score .49 runs from a man on first and no outs until the end of the inning, and 2.4 runs with the bases loaded and no outs.  This merely quantifies what we know- teams are more likely to score more runs in situations like 123_0 (bases loaded, no out) than they are in situations like 001_2 (man on first, two outs).  Lindsey then took the average change in run expectancy produced by each event to find the average value of each event.  And really, that’s all there is to it.  Linear weights are merely the empirical average impact an event has on the run-scoring process.  Pete Palmer expanded upon Lindsey’s work in the 1984 classic The Hidden Game of Baseball, which introduced the Linear Weights System.  What separated Palmer from the rest of the pack is that he included negative events in the equation, so that players were held accountable for the outs they made at the plate, not just credited for the positive outcomes.  Palmer’s original equation (sans outs on bases):

LWTS = .46*1B + .80*2B + 1.02*3B + 1.40*HR + .33*(BB + HBP) + .30*SB – .60*CS – .25*(AB – H)

Singles are worth about .46 runs, doubles about .8, a home run adds about 1.40 runs on average each time, and walks and hit batsmen create about .33 runs each time; slightly less than that of a single.  Later analysis revealed that Palmer’s original SB/CS values were too high; if I remember right, he increased the figures arbitrarily in an attempt to account for basestealing in high leverage (or pressure) situations.  The reason why linear weights works, compared to the traditional statistics, is explained beautifully by Palmer:

“What Linear Weights does is to take every offensive event and treat it in terms of its impact upon the team- an average team, so that a man does not benefit in his individual record for having the good fortune to bat cleanup with the Brewers or suffer for batting cleanup with the Mets.  The relationship of individual performance to team play is stated poorly or not at all in conventional baseball statistics.  In Linear Weights it is crystal clear: the linear progression, the sum, of the various offensive events, when weighted by their accurately predicted run values, will total the runs contributed by that batter or that team beyond the league average.” (67)

Players have absolutely no control over where they hit in the lineup.  Think of it this way- Bengie Molina was the cleanup hitter for the Giants for a number of years.  Had he been on another team, would he have hit in the same spot in the order?  And would he have collected as many RBI?  Remember, RBI opportunities are highly dependent on one’s slot in the lineup.  The same goes for runs scored- yes, good baserunners will score more runs than bad ones.  But the player’s teammates are the ones that have to put the ball in play first.  So it is foolish to rate players based on team-dependent numbers.  Batting average is of little use for player value as well- yes, it tells us the rate of hits by the player, but what about the impact of the hits and the walks?  OPS sure is nice, but it doesn’t tell us the number of runs the player helps generate.  Linear weights provides us with a player’s runs above or below the league average based on the ratio of his positive run output to his outs created.  If a player is +0 LWTS, this simply means he hit at exactly the league average rate.  If a player is -10 LWTS, he provided 10 fewer runs than a league average player in the same number of opportunities, and if he’s +10, he’s provided 10 more runs than a league average hitter.  That’s all there is to it.
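To make that concrete, here is a minimal sketch that applies Palmer's original coefficients to a single stat line.  The coefficients come straight from the equation above; the function name and the stat line itself are made up purely for illustration.

```python
# Palmer's original coefficients, copied from the LWTS equation above.
PALMER = {
    "1B": 0.46, "2B": 0.80, "3B": 1.02, "HR": 1.40,
    "BB_HBP": 0.33, "SB": 0.30, "CS": -0.60, "OUT": -0.25,
}


def palmer_lwts(singles, doubles, triples, hr, bb_hbp, sb, cs, ab):
    """Runs above (+) or below (-) league average for one stat line."""
    hits = singles + doubles + triples + hr
    counts = {
        "1B": singles, "2B": doubles, "3B": triples, "HR": hr,
        "BB_HBP": bb_hbp, "SB": sb, "CS": cs,
        "OUT": ab - hits,  # the (AB - H) term
    }
    return sum(PALMER[event] * n for event, n in counts.items())


# Hypothetical hitter, not a real player's season.
example = dict(singles=100, doubles=30, triples=3, hr=20,
               bb_hbp=60, sb=10, cs=4, ab=550)
print(round(palmer_lwts(**example), 1))
```

A positive result reads as runs above a league average hitter in the same playing time, a negative one as runs below.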

How I generate LW

This is where things get technical, so you may just want to skip ahead.  There are various ways to generate LW values- there is the empirical method, as outlined above (and described in more detail here).  This is the most “correct” method.  But since not everyone is a programming genius (I’m certainly not), there are other methods.  One is to use a Markov model to simulate the impact of each event.  This takes a heck of a lot of calculations, so it might not be for you- but there is one very basic Markov calculator on the internet that will spit out marginal values for you.  The simpler method, and the one that I use, is the “plus-one” method outlined by Brandon Heipp, which extracts the marginal value of each event from a dynamic run estimator- in this case, Base Runs (BsR).  Why BsR?

Because it’s a very simple run estimator, is extremely flexible for the run environment, and works with a true model of run scoring.  The original dynamic run estimator, Bill James’ Runs Created, works as follows:

Runs = (A*B)/C

Where “A” are the times on base, “B” is the advancement factor, and “C” are the opportunities; plate appearances.  The problem with RC is pretty simple- it doesn’t treat home runs correctly.  The simple equation shown above works, yes, but it doesn’t model baseball as well as it could.  RC seems to forget that a home run creates a run every single time- excluding it is taking out a major aspect of the game.  And this is one of the reasons why BsR works so well.  It is constructed as:

Runs = A * (B/(B+C)) + D

Where “A” and “B” are the same as RC, “C” are outs made, and “D” are home runs.  In short, it is essentially:

Runs = Times on Base * Score Rate + Home Runs

And it just so happens that it spits out marginal run values that match up perfectly with the empirical run values.  Anyways, let’s say we use the simplest BsR formula out there to extract run values for the 2010 MLB season:

A = H  – HR + BB + HBP

B = .88*1B + 2.42*2B + 3.96*3B + 2.2*HR + .11*(BB + HBP) + .99*SB – .99*CS

C = AB – H + CS

D = HR

First, we can reconcile the coefficients in the “B” term so that it matches actual league runs scored.  To find our required “B,” we simply use (R – D)*C/(A – R + D) to solve for it.  Divide this by the estimated “B” and we get 0.88, which we multiply all of our coefficients by.  This is for accuracy purposes only.  Once we have our new coefficients, we extract each run value through this (pretty darn intense) formula:

LW = ((B+C)*(A*b + B*a) – (A*B)*(b+c))/((B+C)^2) + d

Where the capitalized terms are the sum of the factor (“B,” for example, would be the frequency of the event times the coefficient in the B term), and the lower case terms are the coefficient of the factor (i.e. .88 for singles, 2.42 for doubles, etc.).  Doing so yields us the following equation:

LWRC = .47*1B + .75*2B + 1.04*3B + 1.40*HR + .33*(BB + HBP) + .18*SB – .28*CS – .09* (AB – H)

You’ll notice that the title and the out terms look a bit different.  The title stands for “Linear Weights Runs Created,” and the out term is -.09 because it is expressed in absolute terms.  In order to make it relative to average, we find the overall runs per out- or runs scored divided by C- and subtract this figure from the coefficients of the events in C (outs and caught stealing).  For 2010, runs per out (excluding pitcher hitting) is .178.  That gives us this:

LW = .47*1B + .75*2B + 1.04*3B + 1.40*HR + .33*(BB + HBP) + .18*SB – .45*CS – .27* (AB – H)
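The whole extraction can be sketched in a few lines of code.  This is only my reading of the steps above, not Heipp's spreadsheet: the function names are invented, and the league totals are round, made-up numbers of roughly league size rather than the actual 2010 figures, so the output will only land in the neighborhood of the coefficients shown.  It also checks the derivative formula against a literal "plus-one" calculation.

```python
# (a, b, c, d): what one occurrence of each event adds to the
# A, B, C and D terms of the simple BsR formula above.
COEF = {
    "1B":  (1, 0.88, 0, 0),
    "2B":  (1, 2.42, 0, 0),
    "3B":  (1, 3.96, 0, 0),
    "HR":  (0, 2.20, 0, 1),
    "BB":  (1, 0.11, 0, 0),    # BB + HBP
    "SB":  (0, 0.99, 0, 0),
    "CS":  (0, -0.99, 1, 0),
    "OUT": (0, 0.00, 1, 0),    # AB - H
}


def terms(totals, m=1.0):
    """Return (A, B, C, D); m rescales every B coefficient."""
    A = sum(COEF[e][0] * n for e, n in totals.items())
    B = m * sum(COEF[e][1] * n for e, n in totals.items())
    C = sum(COEF[e][2] * n for e, n in totals.items())
    D = sum(COEF[e][3] * n for e, n in totals.items())
    return A, B, C, D


def base_runs(totals, m=1.0):
    A, B, C, D = terms(totals, m)
    return A * B / (B + C) + D


def b_multiplier(totals, runs):
    """Required B divided by estimated B, so BsR matches actual runs."""
    A, B, C, D = terms(totals)
    required_b = (runs - D) * C / (A - runs + D)
    return required_b / B


def marginal(event, totals, m=1.0):
    """Run value of one more `event`, using the formula in the text."""
    A, B, C, D = terms(totals, m)
    a, b, c, d = COEF[event]
    b *= m
    return ((B + C) * (A * b + B * a) - A * B * (b + c)) / (B + C) ** 2 + d


def plus_one(event, totals, m=1.0):
    """Literal plus-one: add a single event and re-run BsR."""
    bumped = dict(totals)
    bumped[event] += 1
    return base_runs(bumped, m) - base_runs(totals, m)


# Made-up, league-sized totals (NOT the actual 2010 numbers).
LEAGUE = {"1B": 28000, "2B": 8500, "3B": 870, "HR": 4600,
          "BB": 16000, "SB": 2950, "CS": 1130, "OUT": 122000}
RUNS = 21300

m = b_multiplier(LEAGUE, RUNS)
absolute = {e: marginal(e, LEAGUE, m) for e in COEF}

# Convert to runs above average: subtract runs per out from the C events.
runs_per_out = RUNS / terms(LEAGUE, m)[2]
relative = dict(absolute)
for e in ("CS", "OUT"):
    relative[e] -= runs_per_out

for e in COEF:
    print(f"{e:>3}  absolute {absolute[e]:+.2f}  relative {relative[e]:+.2f}")
```

The derivative and the plus-one difference agree to several decimal places at league scale, which is why either one works in a spreadsheet.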

And that’s all there is to it.  I know it may seem like a lot, but it really isn’t- especially if you have a spreadsheet set up for it.  Heipp has one in the aforementioned link, and the wOBA calculator that I published a while back does all of this for you.  More terms can be added to spice things up a little bit- for example, the LW formula from Tango’s coefficients give us this slightly more complicated equation:

Tango LW = .48*1B + .77*2B + 1.06*3B + 1.41*HR + .49*ROE + .31*NIBB + .34*HBP – .28*(AB – H – ROE – K + SF) – .29*K

And another equation developed from Retrosheet data that spans from 1911 until 2009 gives us the following formula:

Retro LW = .47*1B + .77*2B + 1.05*3B + 1.40*HR + .50*ROE + .31*NIBB + .34*HBP – .27*(AB – H – ROE – K + SF) – .29*K

All slightly different coefficients that give us slightly different results.  It’s not a big deal, but I wanted to show how different datasets and BsR formulae can influence the run values provided.  When all is said and done, though, you’re not going to see a big difference between them.

Applications beyond hitting

LW values have expanded beyond the realm of just offense- they are applied to defense and to pitching metrics as well.  With defense, the run value of a play made above average is the difference between a batted ball and an out, or about .75 runs.  For the outfield, it’s about .85 (more doubles and triples, obviously).  With pitching, FIP takes the basic run values, places them above the value of a ball in play and multiplies by 9 to attain its coefficients.  Uber-stat tERA takes the linear weight value of each batted ball to estimate the pitcher’s defense neutral runs allowed.  So LW doesn’t apply just to offense- it has spread to other aspects of the game, baserunning runs included.

All in all, LW are the best way to measure a player’s offense due to its simplicity and theoretical practicality.  The process to get them is a bit complicated, sure, but it will always provide you with an outstanding overall view of a player’s value provided with the bat.  And it’s a construct that allows you to look at all other aspects of the game, as well.

Productive Outs and Double Plays

March 2, 2011

Expect to see more regular posting from me around late March and early April- I’ve only got a few weeks left before being completely finished with my undergraduate work.  I’ve got a few minutes of down time, so I thought I’d post some data.  I track both productive outs created and double plays avoided per opportunity when I calculate player value (i.e. WAR; Wins Above Replacement), and figure it’d be nice to share it rather than keep it all to myself.  I used to use a static run value for both productive outs and double plays, but have since found a better way to approximate the values- it just so happens that the run value of a productive out is roughly equal to the difference between a sacrifice hit (around -.06 runs in 2010) and a strikeout (around -.27 runs), about +.21 runs.  That’s not a whole lot, mind you, but it is something.  The run value of a double play avoided (above or below average) is approximately the difference between a sacrifice hit and a double play (about -.44 runs), +.38 runs.
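The arithmetic is simple enough to write out.  A quick sketch using the 2010 run values quoted above; the ten-runs-per-win conversion is the usual rule of thumb and is an assumption here, as are the names.

```python
# 2010 run values quoted in the paragraph above.
SAC_HIT = -0.06
STRIKEOUT = -0.27
DOUBLE_PLAY = -0.44

# Rule-of-thumb conversion (an assumption, not a fitted value).
RUNS_PER_WIN = 10.0

# A productive out is worth the gap between a sac hit and a strikeout.
productive_out_value = SAC_HIT - STRIKEOUT

# A double play avoided is worth the gap between a sac hit and a double play.
dp_avoided_value = SAC_HIT - DOUBLE_PLAY


def runs_to_wins(runs, runs_per_win=RUNS_PER_WIN):
    """Convert a run total into approximate wins."""
    return runs / runs_per_win


print(round(productive_out_value, 2), round(dp_avoided_value, 2))
```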

The top five leaders in productive outs:

Elvis Andrus, Julio Borbon, Juan Pierre, Mark Teixeira, and Erick Aybar at +2 runs apiece.  All of these guys, with the exception of Teixeira, are small-ball type of players.  The trailers?  Adrian Beltre, Mike Aviles, Troy Glaus, B.J. Upton, and Aaron Hill at -2.  That said, the overall difference between the best and worst hitters at making productive outs is about four runs; about half a win.  That’s a noticeable difference and something that should be accounted for in player valuation, since we’re always striving to increase theoretical accuracy.  The best team were the Rangers at +8 runs; the worst were the Brewers at -9.  That’s a gigantic difference (about two wins).

When it comes to avoiding double plays, Carl Crawford (+5), Curtis Granderson (+5), Carlos Peña (+4), Jonny Gomes (+4), and Brennan Boesch (+3) lead the pack.  Billy Butler (-7), Ivan Rodriguez (-6), Adrian Beltre (-4), Wilson Valdez (-4) and Michael Cuddyer (-4) were the worst.  The best team at avoiding them were the Rays (+11); the worst were the Giants (-13).

As a whole, the best player was Carl Crawford (+6); the worst Adrian Beltre (-6).  That’s a 12-run gap, a bit more than a win in difference.  Again, this is something that we need to pay attention to in player valuation.  The difference between the best (Rays at +14) and worst team (Twins at -11) was 25 runs; about two and a half wins.  That’s a lot.

Sorry for the rushed post, guys.  I’ve got a lot going on right now.  You can find the whole spreadsheet here; I hope you find it as interesting as I do.

Moving Michael Young

February 21, 2011

My THT Live post.  It’s a good thing the Giants aren’t on his list- although he’d make a pretty decent utility infielder if he didn’t cost so darn much.

Notes: Torres’ Forecast, Sandoval and Situational Hitting

February 20, 2011

It feels really nice to finally have some free time.

Forecasting Andres Torres

I’ve noticed recently that projection systems just aren’t very high on Andres Torres.  If you’re familiar with the way forecasting systems work, it makes perfect sense- they weight multiple years of data (with the most recent year weighted the heaviest), regress to the mean (Marcel regresses to the mean of all non-pitchers as hitters; Oliver and CAIRO to the positional mean, and ZiPS and PECOTA to the players they compare best to historically) and add an aging factor.

Andres Torres doesn’t really have any of these things going for him.  He has relatively little Major League experience (1,025 PA- 740 of which came within the last two years; all previous PA occurred between 2002-2005), which means we have less to work with in terms of making an educated guess about his skill level.  This means we have to regress him more towards the mean, which means there is less certainty about his forecast; and, given that he is 33 years old, he doesn’t have much upside left to him (at least, based on standard aging factors).  These forecasts are completely unaware that Torres has revamped his swing entirely and is taking medication for ADHD.  And really, you can’t blame them for not knowing this.  Forecasting systems use the information that is available to them, and they have no clue as to when a player makes a mechanical adjustment or has a mental breakthrough of some sort.  This is why scouting is so imperative.
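To see why a thin track record drags a forecast toward the mean, here is a rough Marcel-style sketch.  Everything in it is an assumption for illustration: the 5/4/3 season weights and the 1,200 plate appearances of league average are the defaults I recall Marcel using, the aging step is left out, and both players are hypothetical.

```python
def project_rate(seasons, league_rate, weights=(5, 4, 3), regression_pa=1200):
    """Marcel-style rate projection, without the aging adjustment.

    seasons: (rate, plate_appearances) pairs, most recent season first.
    weights and regression_pa are assumed defaults, not official values.
    """
    weighted_pa = sum(w * pa for w, (_, pa) in zip(weights, seasons))
    weighted_sum = sum(w * rate * pa for w, (rate, pa) in zip(weights, seasons))
    # Regress by mixing in `regression_pa` of league average performance.
    return (weighted_sum + league_rate * regression_pa) / (weighted_pa + regression_pa)


LEAGUE_WOBA = 0.330  # hypothetical league average

# Two hypothetical hitters with the same .370 wOBA every year they played.
veteran = [(0.370, 650), (0.370, 650), (0.370, 650)]
late_bloomer = [(0.370, 570), (0.370, 170), (0.370, 0)]

print(round(project_rate(veteran, LEAGUE_WOBA), 3))
print(round(project_rate(late_bloomer, LEAGUE_WOBA), 3))
```

Same observed performance, but the hitter with fewer plate appearances gets pulled further back toward the league average.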

Another factor that has been bringing Torres’ projection down would have to be the appendicitis he dealt with in September.  Through August, Torres was hitting .284/.365/.502; a wOBA of .369, about 28% greater than the league average.  Torres hit .164/.188/.328 in 69 plate appearances in September and October, which dropped his overall line to .268/.343/.479, only ~15% above the average.  Correlation doesn’t equal causation, but it’s pretty clear from the reports and numbers that his major decline in the last two months of the season was most likely due to his health.  Believe it or not, this does affect his forecast.  I can’t speak for the advanced forecasters, but Marcel’s revised projection- if we exclude his last two months- changes from a .349 wOBA (.264/.337/.465) to .355 (.273/.349/.477).  That’s three runs per 600 plate appearances.  I know that might not seem like a lot, but it certainly says something about the sensitivity of a forecast.  Bottom line: take forecasts seriously, but know that they’re certainly fallible, especially in cases like this.  I would expect Torres to regress some in 2011, but not as severely as the forecasts expect him to.
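For anyone wondering where the three runs come from: the gap in wOBA is divided by the wOBA scale and multiplied by plate appearances.  A minimal sketch, assuming a wOBA scale of 1.15 (the scale changes a little from year to year, so treat that number as an assumption):

```python
WOBA_SCALE = 1.15  # assumed; the actual scale varies by season


def woba_gap_to_runs(woba_a, woba_b, pa, scale=WOBA_SCALE):
    """Runs separating two wOBA figures over `pa` plate appearances."""
    return (woba_a - woba_b) / scale * pa


# Marcel with and without the last two months, per 600 PA.
print(round(woba_gap_to_runs(0.355, 0.349, 600), 1))
```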

Pablo’s (Hopeful) Reformation

Speaking of uncertainty in forecasts, Pablo Sandoval is perhaps the most difficult player in the Major Leagues to project.  Sandoval apparently weighed around 280 pounds at the end of last season.  Andrew Baggarly wrote a fantastic article on the matter today, and here are some of the highlights:

He couldn’t take a half-dozen ground balls without panting, hands on knees. His chronically sore hips locked up his swing, especially from the right side.

I actually had no idea that it was that bad.  That’s just terrible.

“He ate in a way that crushed his metabolism,” Banning said. “He’d not eat breakfast, sleep till he got to the ballpark, go out at night and eat a mammoth meal, probably some adult cocktails. That’s the way it went down.”

Sandoval couldn’t do three pull-ups in early November. Now he does sets of 10. His legs shook when he tried to squat 135 pounds. Now he is squatting 400. The first day, Sandoval struggled to complete two reps of an exercise called the inverted row. He maxed out at 26 last week.

His flexibility and range of motion vastly increased, too. Sandoval, a switch-hitter, complained of constant hip pain last season, and now acknowledges that the problems wrecked his right-handed swing. (He hit .379 from the right side in ’09 but just .227 last season.)

“It was bad, my hips,” Sandoval said. “I (couldn’t) even get through to the ball. Now I can swing hard. Now I get loose and nothing is sore.”

Sandoval received chiropractic alignments and deep-tissue rubs — what Banning called “hurt-you” massages — to correct the dysfunction in his hips. Three months ago, he couldn’t touch his fingertips to his toes. Now he palms the floor.

Sandoval now stands at a much-improved 240 pounds.  Given that he’s 5’11”, this isn’t an ideal weight- but geez, talk about an improvement.  It sounds like he’ll have a personal chef with him while he’s in San Francisco; let’s hope that he continues to eat well while on the road.  It also sounds like he’s spent some time talking to Barry Bonds about his free-swinging tendencies.  I guess you could say that one word describes his future: discipline.  If he is able to maintain a strong work ethic not only at the dinner table but show a bit more discipline at the plate, we could be looking at a reformed player.  And with his contact abilities the way they are, he could really become an elite hitter.

Productive Outs and Double Plays

The Giants added +4 runs above the league average when it came to making productive outs last season; about half a win.  The best in the Majors were the Texas Rangers at +8 runs and the worst the Milwaukee Brewers at -9 runs.  The difference between the best and worst teams at making productive outs is approximately two wins.  The Giants were -13 runs below the average at hitting into double plays, tied with the Baltimore Orioles for the worst in the Major Leagues.  The best team at avoiding them were the Tampa Bay Rays at +11 runs.  It would behoove the Giants to avoid double plays in 2011, but that might be a difficult feat- recently signed Miguel Tejada is a double play machine, and early reports of him hitting near the middle of the order have me pretty darn worried.

A Quick Announcement

February 11, 2011

If you got excited, I apologize for the misleading title.  I’m working full-time now in addition to finishing up my undergraduate work- so I haven’t had much time for, well, really anything.  There’s a possibility that I’ll be able to put up a post sometime next week, but I’m not sure just how good those chances are.  I’ve got a bit of a project going that I intend on posting at THT Live that should spark some discussion, and something on the Giants I’d like to do- so you’ll hopefully see that sooner rather than later.

Since I hate putting up posts that have little to no content to them, I thought I’d post a little early Christmas wishlist for FanGraphs (I originally posted this in a thread in their forums):

Some (hopefully) realistic wishes, some probably unrealistic:

1. Use empirical run values for LW; not approximated ones in which the values are held constant from one another. Have the league wOBA set to .330 for every year- it’s far easier to interpret, and it’s a very easy fix.

2. Situational hitting data- knowing how often the player makes productive outs or avoids double plays is really useful. It really makes things more “complete.”

3. Regress UZR. MGL himself has been saying this for quite some time.

4. IF the funds are available, purchase data from STATS and generate sUZR figures to work in tandem with bUZR for fielding values. It’s expensive as heck, though.

5. No more FIP in WAR. If we’re looking for theoretical accuracy, we want to take the pitcher’s batted ball distribution into account. I’d recommend a BsR-derived version of tRA…no linear equations.

6. A baserunning metric incorporating hit location would be wonderful.

7. A first baseman’s “scoop” opportunities. Otherwise, the “scoops” data doesn’t have much meaning to it.

8. A pony would be nice.

I love FanGraphs, but with all the changes BP is bringing to its site to enhance its metrics I’d like to see FG do the same. Some of the stuff on my wishlist is asking a bit much- situational hitting, sUZR (especially that), scoops, and baserunning runs- but the other stuff, I think, should be relatively easy changes.

It sounds like they’ll be implementing one of them- guess which one that is?  Any ideas for other changes?

If I get a bit of down time, I’ll explain why I think FG would benefit from some of these “fixes” down in the comments.

Matt Cain and FIP/xFIP

February 3, 2011

I’m coming out of my study cave for a bit so I can add a quick two cents to a topic that got some play yesterday.

Paapfly has a post regarding Matt Cain and his consistent ability to outperform his FIP and his xFIP.  A number of people- by and large non-Giants fans- have referred to Cain as being “lucky” for owning Earned Run Averages significantly lower than his FIP and his xFIP; that he’s due to regress.  Cain will undoubtedly regress at some point in his career, but it wouldn’t surprise me one bit if he continued this trend for a few more seasons.  Personally, I don’t really care for FIP or xFIP due to the way they are handled- and I figure this is as good a time as any to address my issues with the way people are using them.

First of all, what is FIP?

It stands for Fielding Independent Pitching, and it is a DIPS (Defense Independent Pitching Statistic) created by esteemed saberist Tom Tango.  The formula is excruciatingly simple:

((13*HR + 3*(BB+HBP) – 2*K)/IP) + C

Where C is a constant designed so that the league FIP equals the league ERA.  The constant typically sits around 3.2, but it varies depending on the league, year, and the run environment.  There seems to be a common misconception that the numbers are “drawn out of thin air” or that it’s based on an extraordinarily complex formula- really, it’s not.  If you know Tango, you know he likes to keep things simple.  The coefficients are derived from linear weights, which are the average run value of an event.  The run values for different events in 2010, for example, are as follows: .48 for singles, .77 for doubles, 1.06 triples, 1.42 home runs, .33 for walks and hit by pitches (I’m including IBB in these figures), and -.27 for outs.  Multiply each run value by the frequency of the event per ball in play, and you’ll find the run value of balls in play is around -.03 runs.  Subtract this from the run value of each FIP event, so that each one is expressed relative to a ball in play.  This gives us .36 BB+HBP, 1.44 HR, -.25 non batted ball outs, and 0.00 for balls in play.  Multiply by nine.  Now you have 3.2 for BB+HBP, 13.0 for HR, and -2.2 for non batted ball outs.  And…there you go.  Tango rounds the coefficients for simplicity’s sake, but you’re really never going to see a major discrepancy in the coefficients, at least at the Major League level.
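Here is that derivation as a short sketch, along with the FIP formula itself.  The run values and the -.03 ball-in-play value are the rounded 2010 figures from the paragraph above, so the intermediate numbers can differ from mine in the second decimal; the function names, the default constant of 3.2, and the pitcher's line are just illustrative.

```python
# Rounded 2010 run values from the text; strikeouts use the out value.
RUN_VALUES = {"BB_HBP": 0.33, "HR": 1.42, "K": -0.27}
BIP_VALUE = -0.03  # average run value of a ball in play


def fip_coefficients(run_values=RUN_VALUES, bip_value=BIP_VALUE):
    """Express each event relative to a ball in play, then scale to nine innings."""
    return {event: 9 * (rv - bip_value) for event, rv in run_values.items()}


def fip(hr, bb, hbp, k, ip, constant=3.2):
    """Tango's FIP with rounded coefficients; `ip` is decimal innings."""
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


coefs = fip_coefficients()
print({event: round(value, 2) for event, value in coefs.items()})

# Hypothetical pitcher: 20 HR, 60 BB, 5 HBP, 180 K in 200 innings.
print(round(fip(20, 60, 5, 180, 200.0), 2))
```

Rounded to whole numbers, the derived coefficients come out to the familiar 13, 3 and -2.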

That’s all there is to it.  It’s a linear run estimator that regresses all balls in play to the league average.  100% regression to the mean, which is what Robert “Voros” McCracken did with his original dERA.  Of course, pitching is not that simple- and this is why I’m so bothered by the frequent usage of it and treatment of it as gospel: it is far from perfect.  You see, FIP is merely one component of pitching- sort of like on-base percentage or slugging percentage for hitters.  It works better than ERA as a predictor for next-season ERA, yes, but it’s far from ideal and far from definitive.  Pitchers have different ball in play distributions, and pitchers have different sequencing patterns.  FIP just looks at the basic events and makes a reasonable guess as to how the pitcher “should” have performed.  It’s essentially a shorthand version of McCracken’s dERA.

And what of xFIP?

xFIP attempts to regress the one batted ball portion of the equation, home runs, to the league average.  This helps predict next-season ERA pretty well, but that doesn’t mean it works the same for all pitchers.  Again, it’s largely ignoring batted ball types aside from outfield flies.  It wouldn’t surprise me one bit if pitchers like Cain- who induce a lot of popups- have a tendency to suppress their HR/FB rate.  Just looking at some pitchers from 2006-2010, the top 10 pitchers in infield fly rate average a 9.4% HR/FB rate, which is about two percentage points below the league average (keep in mind this is a crude look; it certainly warrants a deeper investigation).  xFIP takes a simple construct and makes it slightly more complex.
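In code, the only change from FIP is that actual home runs get swapped out for fly balls times the league HR/FB rate.  A minimal sketch; the 11% league rate, the constant, and the pitcher's line are all assumed for illustration.

```python
def xfip(fly_balls, bb, hbp, k, ip, lg_hr_per_fb, constant=3.2):
    """FIP with home runs replaced by league-average HR on the pitcher's fly balls."""
    expected_hr = fly_balls * lg_hr_per_fb
    return (13 * expected_hr + 3 * (bb + hbp) - 2 * k) / ip + constant


# Hypothetical fly-ball pitcher who allowed 14 HR on 200 fly balls (7% HR/FB).
actual_hr, fly_balls = 14, 200
fip_value = (13 * actual_hr + 3 * (60 + 5) - 2 * 180) / 200.0 + 3.2
xfip_value = xfip(fly_balls, 60, 5, 180, 200.0, lg_hr_per_fb=0.11)

print(round(fip_value, 2), round(xfip_value, 2))
```

A pitcher who really does suppress home runs on fly balls will always look worse by xFIP than by FIP, which is exactly the Cain situation.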

Really, we shouldn’t pay that much attention to FIP and xFIP.  Do they have substance?  Yes, in that they have some predictive value and they’re one thing to look at among many when evaluating a pitcher.  But the simplicity of the formulae prevents them from accurately predicting a wider range of pitchers. On average, they work pretty well- but that doesn’t mean that pitchers like Cain fall under that profile.  Cain and pitchers like him are not terribly overrated because they outperform their FIP and xFIP- they’re underrated, because the metric doesn’t account for a number of things that are a part of pitching.

And another thing: there is really no reason whatsoever why FIP should be used as a value metric.  FanGraphs uses it for WAR, and this makes absolutely no sense to me.  If we’re looking for a pitcher’s context-neutral value, we must make an adjustment for the types of balls in play he allows. FanGraphs uses a very complex metric in UZR; why not use tERA for pitchers?  It’s a simple adjustment, and theoretically speaking, it makes a heck of a lot more sense to use that for pitcher value than something that assumes all balls in play are equal.

Zone Rating Runs, 2010

January 30, 2011

As a follower of sabermetrics, I am very much in favor of progressive metrics designed to better value a player’s contributions on the field.  Certain metrics created over the years have fallen out of favor, replaced by newer, more accurate ones.  Runs Created, for example, had a nice run- but it has since been supplanted by David Smyth’s BaseRuns and, when it comes to individual hitters, Linear Weights.  The same holds true for fielding metrics.  We first began with fielding percentage, which was improved upon by range factor, which turned into fielding runs, which turned into Zone Rating, which turned into both Ultimate Zone Rating (UZR) and Defensive Runs Saved (DRS).  The difference and usefulness of using BsR over RC is pretty straightforward- one obvious reason is that BsR counts all home runs as exactly one run; RC doesn’t.  It gets a lot more confusing with defense, though, because we’re not really sure what we’re dealing with.

Basic ZR counts the number of balls hit into and around a fielder’s “zone of responsibility,” tracks the plays made, and divides plays by zone chances to estimate the player’s defensive efficiency.  UZR and DRS take this simple framework and make a multitude of adjustments, accounting for things like batted ball speed, the base/out situation, and using smaller sub-zones in order to get a better idea of just how efficient the player was.  Rather than underrating players with exceptional range (because we’re counting both balls in zone and out of zone plays made as zone chances), we’re getting a much more accurate portrayal of the player’s value provided on the field because of the inclusion of so many more variables.

But this might not be true.  Chances are pretty good that we’re really not getting a better idea of a player’s fielding ability; chances are we’re introducing a bunch of noise.  Think of it this way: if a player is standing on the edge of his zone and takes one step over to make a play considered to be outside of his zone, he gets credit for an out of zone play.  Except…all he had to do was move a foot over.  That’s not flashing exceptional range; it’s receiving false credit.  The difference between “in zone” and “out of zone” is difficult to ascertain; this becomes infinitely more problematic when we break down the zones into smaller ones and add things like batted ball speed.  That’s not the only problem, though- there’s a rather large discrepancy between data providers that supposedly track the same thing.

ZR was originally provided by STATS, Inc. before also being provided by Baseball Info Solutions (BIS).  The major difference between the two, on the surface, is that BIS ZR excludes out of zone plays from the numerator and the denominator; leaving us with both the fielder’s defensive efficiency within the zone and a count of the out of zone plays he made.  This is theoretically more precise, but it is far from practical due to the aforementioned issues with the precise location of the batted balls.  There’s another difference, which makes matters infinitely worse: the two are sometimes miles apart on players.  Much has been made of this when it comes to UZR- you can see some of the differences highlighted in one of Tango’s threads, in which we discover that Andruw Jones was rated +112 runs by BIS UZR (bUZR) and -5 by STATS UZR (sUZR) from 2003-2008.  That’s a remarkable difference.  Carlos Beltran showed 77 runs of difference, and Adam Everett 51 runs.  This is no small difference- remember that approximately ten runs equals a win, and you’re seeing 5-10 win differences over a six-year period.  Defensive metrics cannot be considered reliable when two major information providers, STATS and BIS, give such wildly different outcomes.

I wrote about the state of defensive metrics a while back and stated that I favor ZR over UZR or DRS due to the uncertainty in the metrics.  ZR, by using one large, simple zone, should help minimize the effects of bias- at least, by a little bit.  And to help smooth out the differences between providers, I’ve decided to average the two in order to get a more complete estimate of the player’s defensive efficiency.  Just by looking at the 2010 data, there are some serious discrepancies that worry me- the average error of estimated chances of qualified players (900+ innings afield) is about 22; the average error of plays made (which should be reasonably easy to determine) is 21, and the average error of runs saved or cost is five runs.  And this is comparing apples to apples with the simplest comparison imaginable.  It’s no wonder that sUZR and bUZR are so remarkably different.
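For the curious, here’s a rough Python sketch of what averaging the two providers might look like.  Everything in it is an assumption for illustration: the player names and numbers are made up (they are not the 2010 data), I’m assuming the averaging is done on chances and plays made before computing ZR, and the average absolute gap between providers is just one way to measure the disagreement.

```python
from statistics import mean

# Hypothetical (chances, plays made) from each provider - NOT real 2010 data
players = {
    "Player A": {"stats": (400, 330), "bis": (380, 310)},
    "Player B": {"stats": (350, 280), "bis": (375, 305)},
}


def averaged_zr(stats, bis):
    """ZR from the average of both providers' chances and plays made."""
    chances = (stats[0] + bis[0]) / 2
    plays = (stats[1] + bis[1]) / 2
    return plays / chances


def mean_abs_gap(sample, index):
    """Average absolute STATS-vs-BIS gap; index 0 = chances, 1 = plays made."""
    return mean(abs(p["stats"][index] - p["bis"][index]) for p in sample.values())


for name, p in players.items():
    print(f"{name}: averaged ZR = {averaged_zr(p['stats'], p['bis']):.3f}")

print("average gap in chances:", mean_abs_gap(players, 0))
print("average gap in plays made:", mean_abs_gap(players, 1))
```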

The players that sZR and bZR disagreed on the most in 2010: Hunter Pence (15 runs), Adrian Beltre (14), and Gaby Sanchez, Robinson Cano, and Marco Scutaro (13 runs).  In all, there are eleven players in the sample- approximately 9%- that have a difference of ten runs or greater, and another twenty that have a difference of 7-9 runs.  The systems will most likely show varying levels of agreement/disagreement based on position, as well.  This is something I intend to look deeper into at some point in the future.

The all-ZR team for 2010:

1B: Daric Barton, +10

2B: Mark Ellis and Ian Kinsler, +10

3B: Jose Lopez, +16

SS: Brendan Ryan, +17

LF: Juan Pierre, +7

CF: Denard Span, +6

RF: Jay Bruce, +12

You can find the spreadsheet containing data for all players in 2010, including their STATS and BIS chances and plays made, here.

Down on the Farm

January 22, 2011

Prospecting has always intrigued me.  It’s an excruciatingly inexact science because it revolves solely around the opinions of men that, while experienced in assessing potential performance, are always hindered by bias.  I think that’s one reason why I’m so enamored of the statistical side of the game- yes, bias exists, but it becomes apparent rather quickly (if you know what you’re doing). Multiple scouts, in theory, are better than one- but since they talk amongst themselves, you begin to introduce the biases of others, so you’re not only looking at the bias of one, but the bias of many. Despite all of these issues, scouting places invaluable, irreplaceable and indispensable information in the hands of men that run Major League Baseball organizations.  They find the kids that need a slight mechanical tweak in their swing to turn them from a fringe hitter into a batting champion; they find the kids that have great results but with mechanics that make them a high-risk player, and they find the kids with raw talent that they can turn into a superstar.  An inexact science, yes, but one that produces much better results than the straight numbers would.

Once these players move from high school or college into the pros, a plethora of information about their performance becomes available to us.  We know how well each player does in the minors, and we can see signs that a player might sink or swim in the Majors.  A low walk rate and a high strikeout rate, for example, might indicate that a player has poor command of the strike zone and may not be able to produce consistent contact at the Major League level.  A lot of stolen bases paired with a lot of times caught stealing might indicate that a player still has much to learn on the bases.  And a player that walks a lot and demonstrates good power stands a good chance at developing into a useful player, provided he makes enough contact.  Of course, there are things that the numbers will never catch- mechanical tweaks, a new mental approach, etc.- but if handled properly, the numbers can help lend some insight as to which players might be able to produce in The Show.

Back in 1985, the legendary Bill James introduced Major League Equivalencies, known as MLEs for short (yeesh, MLEs have been around longer than I’ve been alive).  You can read up on some of the basics of MLEs by ZiPS creator Dan Szymborski here. The take-home message:

One thing to remember is that MLEs are not a prediction of what the player will do, just a translation of what the major league equivalence of what the player actually did is. This is useful for predictions however, because like, major league statistics, MLEs have strong predictive value. As strong as major league statistics (which was the goal of this).

I’m actually not sure if MLEs have as strong a predictive value as Major League statistics do.  In any case, the important thing to keep in mind is that these are translations, not projections.  A projection requires something more complex.  It just so happens that one of the publicly available (for a fee of $15, that is) projection systems, Oliver, does just this.  It not only translates based on context of the parks and leagues; it regresses and makes adjustments for aging.  If we want to predict how a player will perform in the Major Leagues based on his Minor League data, this is the way to do it- and Oliver uses a very interesting, theoretically sound (and apparently quite accurate) method:

Oliver calculates league factors by a direct comparison of each player’s performance in each minor league with his performances in the major leagues, while most other systems use a “chaining” process in which only performances in adjacent levels are compared. An example of chaining is when High-A is compared to Double-A, which is compared to Triple-A, which is compared to the major leagues, to get a High-A to majors factor. A problem with chaining is that each additional element in the chain multiplies any selection bias that might be present. A direct comparison, without adjusting for the age of the players at each level, gives a better estimate than chaining of how well the player will perform if and when he gets to the major leagues. Adjusting for age allows an estimate of the player’s true talent now.

Once age and park factors have been accounted for, Oliver calculates the translation factors for each league. Applying these factors to each player’s park-adjusted stats in each league, then summing into a single stat line, produces the player’s MLE for that season.


The values that each player is regressed to are determined by information about the player other than his performance—his age, position, and level played. A 19-year-old in Double-A will be regressed to a higher mean than a 23-year-old at that same level, as the team presumably considered the 19-year-old to have more talent in aggressively promoting him. Similarly, a first baseman will be regressed to a higher home run rate than a shortstop, as we know that, on average, first basemen hit more home runs than shortstops.

Emphasis mine.  Oliver might be a chimp, but he’s an awfully smart one.  Since the Giants have some interesting pieces in the minor leagues, like Brandon Belt and Thomas Neal, I figured it would be worth looking into their projections to see if Oliver likes what he sees.  Since Oliver isn’t free, I won’t be posting any of the raw data- but what I will do is post each player’s projected batting runs per 600 plate appearances (approximately 150 games), his projected fielding value, regressed heavily (also approximately 150 games), and his projected WAR based on FanGraphs’ replacement level- 20 runs per 600 PA.  In other words, a two-win player is exactly average.  Pitcher runs allowed is based on a simple BaseRuns equation, and the replacement level (128% of league average) does not differentiate between roles.  I’m excluding all players in Low A and Mid A ball.  First, the pitchers:

The run environment is 4.53 runs per game, and only a few pitchers project to be slightly better than average.  Jason Stoffel was a fourth-round pick out of the University of Arizona, Alex Hinshaw we’re all familiar with, and 21-year-old sinkerballer Jorge Bucardo projects to have a 4.47 R/G, which would likely translate into a 4.11 ERA.  I know a number of people that are Bucardo fans, and given the Giants’ ability to develop pitchers, it wouldn’t surprise me one bit if he turned into a fine Major League starter.
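Here’s a small Python sketch of the pitcher arithmetic.  The 4.53 run environment, the 128% replacement level and Bucardo’s 4.47 R/G all come from above; the 0.92 ERA-to-runs-allowed ratio is my assumption, chosen because it reproduces the 4.47-to-4.11 conversion, and the 180-inning workload is hypothetical.

```python
LEAGUE_RUNS_PER_GAME = 4.53   # run environment used for these projections
REPLACEMENT_FACTOR = 1.28     # replacement level = 128% of league average
ERA_TO_RA_RATIO = 0.92        # assumption: roughly 92% of runs allowed are earned

# Runs per nine innings allowed by a replacement level pitcher
replacement_rpg = LEAGUE_RUNS_PER_GAME * REPLACEMENT_FACTOR


def estimated_era(runs_per_game):
    """Rough ERA implied by a runs-allowed-per-nine figure."""
    return runs_per_game * ERA_TO_RA_RATIO


def runs_above_replacement(runs_per_game, innings):
    """Runs saved relative to a replacement level pitcher over some innings."""
    return (replacement_rpg - runs_per_game) * innings / 9


bucardo_rpg = 4.47
print(f"replacement level: {replacement_rpg:.2f} R/G")
print(f"Bucardo estimated ERA: {estimated_era(bucardo_rpg):.2f}")
# 180 innings is a hypothetical workload, not a projection
print(f"runs above replacement: {runs_above_replacement(bucardo_rpg, 180):.1f}")
```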

Now, the hitters:

That’s not a bad projection for Kieschnick.  He’s a below average Major Leaguer by these estimates, but not a replacement level player.  Oliver doesn’t like Brandon Crawford’s bat one bit- he’s essentially a replacement level hitter- but over the course of a full season he projects to be slightly below average overall.  Oliver loves his glove- it likes him at +20 runs in about 120 games- but I prefer to be extra conservative with defensive estimates until we have more refined and reliable measurements.  A +8 per 150 games sounds about right to me, given the sterling reports on his glovework.  Toolsy and free-swinging Francisco Peguero makes the list, mostly based on the value of his glove, much like Crawford.  Oliver seems to like Brock Bond quite a bit- it expects him to have an above average walk rate (~10%) with moderate contact skills and virtually no power.  If he’s a roughly average hitter as his projections suggest, the Giants might have a decent backup infielder in their system that may have a bit of upside left to him.  Conor Gillaspie grades out as slightly above average based on his positional value- I’m not sure if his glove is as good as Oliver suggests it is, but he too may prove to be an adequate Major Leaguer.  Oliver seems to think that Johnny Monell would make a fine big league catcher (the position adjustment is too high, so he’ll be overvalued here a bit- catchers rarely play 150 games), but I’m not sure scouts would agree with that assessment.

I think the biggest surprise to me is seeing Thomas Neal rated so highly.  Oliver clearly sees something good in him- it has him as a 115 RC+ Major Leaguer right now, and that’s solidly above average.  I haven’t heard recent scouting reports on him, but he turned in a decent (park-adjusted) batting line of .294/.361/.442 in 2010.  He lost quite a bit of power last year, some of which I assume is due to playing in Richmond, and Oliver expects him to regain some of that in 2011.  If he’s really as good as Oliver thinks he could be- around a 3 win player- they might have their left fielder of the future.

Brandon Belt really comes as no surprise.  If he’s a true talent 3 win player as Oliver suggests, the Giants may benefit from having him start the season in the Majors and playing all season.  A lot relies, of course, on whether or not the scouts think he could make the leap.  Again, none of these projections are gospel- on the contrary, they’re far from it.  But I think they do a decent job at estimating the player’s true talent levels heading into 2011; if so, the Giants might have some interesting pieces for the future.
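For anyone who wants to see how the hitter WAR figures fit together, here’s a minimal Python sketch.  It assumes WAR is simply batting runs plus fielding runs plus a position adjustment plus the 20-run replacement level, all per 600 PA, divided by ten runs per win; the function name and the sample inputs are hypothetical, not Oliver’s numbers.

```python
RUNS_PER_WIN = 10.0                 # approximately ten runs equals a win
REPLACEMENT_RUNS_PER_600_PA = 20.0  # FanGraphs' replacement level, per the text


def projected_war(batting_runs, fielding_runs, position_runs=0.0, pa=600):
    """Rough WAR from runs above average (all inputs scaled to `pa`)."""
    replacement_runs = REPLACEMENT_RUNS_PER_600_PA * pa / 600
    total_runs = batting_runs + fielding_runs + position_runs + replacement_runs
    return total_runs / RUNS_PER_WIN


# An exactly average player over 600 PA comes out as a two-win player
print(projected_war(0, 0))

# Hypothetical hitter: +10 batting runs, average glove, no position adjustment
print(projected_war(10, 0))
```

This also shows why a heavily regressed fielding number matters so much- every ten runs you trust on the glove is a full win on the bottom line.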


*The run environment is based on a weighted average of the last three seasons.  The equation for hitters:

LW = .47*1B + .75*2B + 1.04*3B + 1.41*HR + .33*(BB+HBP) + .18*SB – .46*CS – .28*(AB – H)
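The hitter equation translates directly into code.  Here’s a Python sketch of it; the coefficients are the ones above, while the function names, the made-up stat line, and the shortcut of treating PA as AB + BB + HBP are my assumptions for illustration.

```python
def linear_weights_runs(singles, doubles, triples, hr, bb, hbp, sb, cs, ab, hits):
    """Batting runs above average from the hitter equation above."""
    outs = ab - hits
    return (0.47 * singles + 0.75 * doubles + 1.04 * triples + 1.41 * hr
            + 0.33 * (bb + hbp) + 0.18 * sb - 0.46 * cs - 0.28 * outs)


def per_600_pa(runs, ab, bb, hbp):
    """Scale runs to 600 PA, approximating PA as AB + BB + HBP."""
    return runs * 600 / (ab + bb + hbp)


# Made-up stat line: 100 1B, 30 2B, 3 3B, 20 HR (153 hits) in 550 AB
lw = linear_weights_runs(singles=100, doubles=30, triples=3, hr=20,
                         bb=60, hbp=5, sb=10, cs=4, ab=550, hits=153)
print(f"LW: {lw:.2f} runs above average")
print(f"per 600 PA: {per_600_pa(lw, ab=550, bb=60, hbp=5):.2f}")
```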

For pitchers, the equation can be found here.

Mark Reynolds and Strikeouts

January 15, 2011

I give my quick two cents over at The Hardball Times.  If he implements the changes he wants to achieve, he could see a pretty nice increase in value.