As a follower of sabermetrics, I am very much in favor of progressive metrics designed to better value a player’s contributions on the field. Certain metrics created over the years have fallen out of favor, replaced by newer, more accurate ones. Runs Created, for example, had a nice run- but it has since been supplanted by David Smyth’s BaseRuns and, when it comes to individual hitters, Linear Weights. The same holds true for fielding metrics. We first began with fielding percentage, which was improved upon by range factor, which turned into fielding runs, which turned into Zone Rating, which turned into both Ultimate Zone Rating (UZR) and Defensive Runs Saved (DRS). The advantage of using BsR over RC is pretty straightforward- one obvious reason is that BsR counts all home runs as exactly one run; RC doesn’t. It gets a lot more confusing with defense, though, because we’re not really sure what we’re dealing with.
Basic ZR counts the number of balls hit into and around a fielder’s “zone of responsibility,” tracks the plays made, and divides plays by zone chances to estimate the player’s defensive efficiency. UZR and DRS take this simple framework and make a multitude of adjustments, accounting for things like batted ball speed, the base/out situation, and using smaller sub-zones in order to get a better idea of just how efficient the player was. Rather than underrating players with exceptional range (because we’re counting both balls in zone and out of zone plays made as zone chances), we’re getting a much more accurate portrayal of the player’s value provided on the field because of the inclusion of so many more variables.
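To make the basic ZR arithmetic concrete, here is a minimal sketch of the plays-over-chances ratio described above. The function name and the sample fielder's numbers are made up for illustration; they are not taken from STATS or BIS.

```python
def zone_rating(plays_made: int, zone_chances: int) -> float:
    """Basic ZR: plays made divided by zone chances."""
    if zone_chances <= 0:
        raise ValueError("zone_chances must be positive")
    return plays_made / zone_chances


# Hypothetical fielder: 250 plays made on 300 zone chances.
zr = zone_rating(250, 300)
print(f"ZR = {zr:.3f}")
```

UZR and DRS start from this same ratio and then layer their adjustments on top, which is where the extra uncertainty comes in.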
But this might not be true. Chances are pretty good that we’re really not getting a better idea of a player’s fielding ability; chances are we’re introducing a bunch of noise. Think of it this way: if a player is standing on the edge of his zone and takes one step over to make a play considered to be outside of his zone, he gets credit for an out of zone play. Except…all he had to do was move a foot over. That’s not flashing exceptional range; it’s receiving false credit. The difference between “in zone” and “out of zone” is difficult to ascertain; this becomes infinitely more problematic when we break down the zones into smaller ones and add things like batted ball speed. That’s not the only problem, though- there’s a rather large discrepancy between data providers that supposedly track the same thing.
ZR was originally provided by STATS, Inc. before also being provided by Baseball Info Solutions (BIS). The major difference between the two, on the surface, is that BIS ZR excludes out of zone plays from the numerator and the denominator, leaving us with both the fielder’s defensive efficiency within the zone and a count of the out of zone plays he made. This is theoretically more precise, but it is far from practical due to the aforementioned issues with the precise location of the batted balls. There’s another difference, which makes matters infinitely worse: the two are sometimes miles apart on players. Much has been made of this when it comes to UZR- you can see some of the differences highlighted in one of Tango’s threads, in which we discover that Andruw Jones was rated +112 runs by BIS UZR (bUZR) and -5 by STATS UZR (sUZR) from 2003-2008. That’s a remarkable difference. Carlos Beltran showed 77 runs of difference, and Adam Everett 51 runs. This is no small difference- remember that approximately ten runs equals a win, and you’re seeing differences of roughly 5-12 wins over a six-year period. Defensive metrics cannot be considered reliable when two major information providers, STATS and BIS, give such wildly different outcomes.
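For a sense of scale, here is a quick sketch that converts those run gaps into wins using the rule of thumb of about ten runs per win. The constant and function names are my own labels; the run gaps are the ones quoted above (Jones's 117 is simply +112 minus -5).

```python
RUNS_PER_WIN = 10.0  # rule of thumb, not an exact conversion


def runs_to_wins(runs: float) -> float:
    """Convert a run total to wins at roughly ten runs per win."""
    return runs / RUNS_PER_WIN


# Gap between bUZR and sUZR, 2003-2008, in runs.
gaps = {
    "Andruw Jones": 112 - (-5),
    "Carlos Beltran": 77,
    "Adam Everett": 51,
}

for player, gap in gaps.items():
    print(f"{player}: {gap} runs is about {runs_to_wins(gap):.1f} wins")
```

That works out to about 11.7, 7.7 and 5.1 wins of disagreement respectively.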
I wrote about the state of defensive metrics a while back and stated that I favor ZR over UZR or DRS due to the uncertainty in the metrics. ZR, by using one large, simple zone, should help minimize the effects of bias- at least, by a little bit. And to help smooth out the differences between providers, I’ve decided to average the two in order to get a more complete estimate of the player’s defensive efficiency. Just by looking at the 2010 data, there are some serious discrepancies that worry me- the average error of estimated chances of qualified players (900+ innings afield) is about 22; the average error of plays made (which should be reasonably easy to determine) is 21, and the average error of runs saved or cost is five runs. And this is comparing apples to apples with the simplest comparison imaginable. It’s no wonder that sUZR and bUZR are so remarkably different.
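Here is a rough sketch of one way to blend the two providers and measure how far apart they are. It rests on two assumptions of mine: that "averaging" means averaging chances and plays before taking the ratio, and that "average error" means the mean absolute difference between providers. The three players and all of their numbers are invented, not pulled from the 2010 spreadsheet.

```python
from statistics import mean

# Invented rows: (player, STATS chances, STATS plays, BIS chances, BIS plays)
rows = [
    ("Player A", 400, 330, 420, 350),
    ("Player B", 310, 250, 300, 255),
    ("Player C", 500, 410, 470, 400),
]


def blended_zr(s_ch, s_pl, b_ch, b_pl):
    """Average each provider's chances and plays, then take plays / chances."""
    chances = (s_ch + b_ch) / 2
    plays = (s_pl + b_pl) / 2
    return plays / chances


def mean_abs_diff(pairs):
    """Mean absolute difference between paired provider values."""
    return mean(abs(a - b) for a, b in pairs)


chance_gap = mean_abs_diff([(r[1], r[3]) for r in rows])
play_gap = mean_abs_diff([(r[2], r[4]) for r in rows])
print(f"avg gap in chances: {chance_gap:.1f}, in plays made: {play_gap:.1f}")

for name, *counts in rows:
    print(f"{name}: blended ZR = {blended_zr(*counts):.3f}")
```

Averaging the two providers' ZR ratios directly would be another reasonable reading, and it would give slightly different numbers.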
The players that sZR and bZR disagreed on the most in 2010: Hunter Pence (15 runs), Adrian Beltre (14), and Gaby Sanchez, Robinson Cano, and Marco Scutaro (13 runs). In all, there are eleven players in the sample- approximately 9%- that have a difference of ten runs or greater, and another twenty that have a difference of 7-9 runs. The systems will most likely show varying levels of agreement/disagreement based on position, as well. This is something I intend on looking deeper into at some point in the future.
The all ZR team for 2010:
1B: Daric Barton, +10
2B: Mark Ellis and Ian Kinsler, +10
3B: Jose Lopez, +16
SS: Brendan Ryan, +17
LF: Juan Pierre, +7
CF: Denard Span, +6
RF: Jay Bruce, +12
You can find the spreadsheet containing data for all players in 2010, including their STATS and BIS chances and plays made, here.
Prospecting has always intrigued me. It’s an excruciatingly inexact science due to the fact that it revolves solely around the opinions of men that, while experienced in assessing potential performance, are always hindered by bias. I think that’s one reason why I’m so enamored by the statistical side of the game- yes, bias exists, but it becomes apparent rather quickly (if you know what you’re doing). Multiple scouts, in theory, are better than one- but since they talk amongst themselves, you begin to introduce the biases of others, so you’re not only looking at the bias of one, but the bias of many. Despite all of these issues, scouting places invaluable, irreplaceable and indispensable information in the hands of men that run Major League Baseball organizations. They find the kids that need a slight mechanical tweak in their swing to turn them from a fringe hitter into a batting champion; they find the kids that have great results but with mechanics that make them a high-risk player, and they find the kids with raw talent that they can turn into a superstar. An inexact science, yes, but one that produces much better results than the straight numbers would.
Once these players move from high school or college into the pros, a plethora of information about their performance becomes available to us. We know how well each player does in the minors, and we can see signs that a player might sink or swim in the Majors. A low walk rate and a high strikeout rate, for example, might indicate that a player has poor command of the strike zone and may not be able to produce consistent contact at the Major League level. A player with a lot of stolen bases but a lot of times caught stealing as well might indicate that he still has much to learn. And a player that walks a lot and demonstrates good power stands a good chance at developing into a useful player, provided he makes enough contact. Of course, there are things that the numbers will never catch- mechanical tweaks, a new mental approach, etc.- but if handled properly, the numbers can help lend some insight as to which players might be able to produce in The Show.
Back in 1985, the legendary Bill James introduced Major League Equivalencies, known as MLEs for short (yeesh, MLEs have been around longer than I’ve been alive). You can read up on some of the basics of MLEs by ZiPS creator Dan Szymborski here. The take-home message:
One thing to remember is that MLEs are not a prediction of what the player will do, just a translation of what the major league equivalence of what the player actually did is. This is useful for predictions however, because like, major league statistics, MLEs have strong predictive value. As strong as major league statistics (which was the goal of this).
I’m actually not sure if MLEs have as strong a predictive value as Major League statistics do. In any case, the important thing to keep in mind is that these are translations; not projections. That requires something more complex. It just so happens that one of the publicly available (for a fee of $15, that is) projection systems, Oliver, does just this. It not only translates based on context of the parks and leagues; it regresses and makes adjustments for aging. If we want to predict how a player will perform in the Major Leagues based on his Minor League data, this is the way to do it- and Oliver uses a very interesting, theoretically sound (and apparently quite accurate) method:
Oliver calculates league factors by a direct comparison of each player’s performance in each minor league with his performances in the major leagues, while most other systems use a “chaining” process in which only performances in adjacent levels are compared. An example of chaining is when High-A is compared to Double-A, which is compared to Triple-A, which is compared to the major leagues, to get a High-A to majors factor. A problem with chaining is that each additional element in the chain multiplies any selection bias that might be present. A direct comparison, without adjusting for the age of the players at each level, gives a better estimate than chaining of how well the player will perform if and when he gets to the major leagues. Adjusting for age allows an estimate of the player’s true talent now.
Once age and park factors have been accounted for, Oliver calculates the translation factors for each league. Applying these factors to each player’s park-adjusted stats in each league, then summing into a single stat line, produces the player’s MLE for that season.
The values that each player is regressed to are determined by information about the player other than his performance—his age, position, and level played. A 19-year-old in Double-A will be regressed to a higher mean than a 23-year-old at that same level, as the team presumably considered the 19-year-old to have more talent in aggressively promoting him. Similarly, a first baseman will be regressed to a higher home run rate than a shortstop, as we know that, on average, first basemen hit more home runs than shortstops.
Emphasis mine. Oliver might be a chimp, but he’s an awfully smart one. Since the Giants have some interesting pieces in the minor leagues, like Brandon Belt and Thomas Neal, I figured it would be worth looking into their projections to see if Oliver likes what he sees. Since Oliver isn’t free, I won’t be posting any of the raw data- but what I will do is post the player’s projected batting runs per 600 plate appearances (approximately 150 games), their projected fielding values, regressed heavily (also approximately 150 games), and the player’s projected WAR based on FanGraphs’ replacement level- 20 runs per 600 PA. In other words, a two-win player is exactly average. Pitcher runs allowed is based on a simple BaseRuns equation and the replacement level (128% of league average) does not differentiate between roles. I’m excluding all players in Low A and Mid A ball. First, the pitchers:
The run environment is 4.53 runs per game, and only a few pitchers project to be slightly better than average. Jason Stoffel was a fourth-round pick out of the University of Arizona, Alex Hinshaw we’re all familiar with, and 21-year old sinkerballer Jorge Bucardo projects to have a 4.47 R/G, which would likely translate into a 4.11 ERA. I know a number of people that are Bucardo fans, and given the Giants’ ability to develop pitchers, it wouldn’t surprise me one bit if he turned into a fine Major League starter.
Now, the hitters:
That’s not a bad projection for Kieschnick. He’s a below average Major Leaguer by these estimates, but not a replacement level player. Oliver doesn’t like Brandon Crawford’s bat one bit- he’s essentially a replacement level hitter- but over the course of a full season he projects to be slightly below average overall. Oliver loves his glove- it likes him at +20 runs in about 120 games- but I prefer to be extra conservative with defensive estimates until we have more refined and reliable measurements. A +8 per 150 games sounds about right to me, given the sterling reports on his glovework. Toolsy and free-swinging Francisco Peguero makes the list, mostly based on the value of his glove, much like Crawford. Oliver seems to like Brock Bond quite a bit- it expects him to have an above average walk rate (~10%) with moderate contact skills and virtually no power. If he’s a roughly average hitter as his projections suggest, the Giants might have a decent backup infielder in their system that may have a bit of upside left to him. Conor Gillaspie grades out as slightly above average based on his positional value- I’m not sure if his glove is as good as Oliver suggests it is, but he too may prove to be an adequate Major Leaguer. Oliver seems to think that Johnny Monell would make a fine big league catcher (the position adjustment is too high, so he’ll be overvalued here a bit- catchers rarely play 150 games), but I’m not sure scouts would agree with that assessment.
I think the biggest surprise to me is seeing Thomas Neal rated so highly. Oliver clearly sees something good in him- it has him as a 115 RC+ Major Leaguer right now, and that’s solidly above average. I haven’t heard recent scouting reports on him, but he turned in a decent (park-adjusted) batting line of .294/.361/.442 in 2010. He lost quite a bit of power last year, some of which I assume is due to playing in Richmond, and Oliver expects him to regain some of that in 2011. If he’s really as good as Oliver thinks he could be- around a 3 win player- they might have their left fielder of the future.
Brandon Belt really comes as no surprise. If he’s a true talent 3 win player as Oliver suggests, the Giants may benefit from having him start the season in the Majors and playing all season. A lot relies, of course, on whether or not the scouts think he could make the leap. Again, none of these projections are gospel- on the contrary, they’re far from it. But I think they do a decent job at estimating the player’s true talent levels heading into 2011; if so, the Giants might have some interesting pieces for the future.
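For anyone who wants to see how the win values for hitters fall out of the run values, here is a minimal sketch. It assumes WAR is just batting runs plus fielding runs plus a positional adjustment plus the 20-runs-per-600-PA replacement level, divided by roughly ten runs per win. The function, its parameter names, and the sample inputs are my own, and none of the inputs are Oliver's raw numbers.

```python
def projected_war(batting_runs, fielding_runs, positional_runs=0.0,
                  pa=600, replacement_per_600=20.0, runs_per_win=10.0):
    """Runs above replacement converted to wins (simplified sketch)."""
    # Replacement level scales with playing time.
    replacement_runs = replacement_per_600 * pa / 600
    total_runs = batting_runs + fielding_runs + positional_runs + replacement_runs
    return total_runs / runs_per_win


# A perfectly average hitter and fielder over 600 PA is a two-win player.
print(projected_war(0, 0))

# Hypothetical player: +10 batting runs, average glove, no position adjustment.
print(projected_war(10, 0))
```

The second call comes out to three wins, which is the neighborhood Oliver puts Belt and Neal in.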
*The run environment is based on a weighted average of the last three seasons. The equation for hitters:
LW = .47*1B + .75*2B + 1.04*3B + 1.41*HR + .33*(BB+HBP) + .18*SB – .46*CS – .28*(AB – H)
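The hitter equation translates directly into code. Here is a small sketch using the weights exactly as written above. The function signature and the sample stat line are hypothetical and chosen only to exercise the formula.

```python
def linear_weights(ab, hits, doubles, triples, hr, bb, hbp, sb, cs):
    """Batting runs from the LW equation above (outs valued at -.28 each)."""
    singles = hits - doubles - triples - hr
    outs = ab - hits
    return (0.47 * singles + 0.75 * doubles + 1.04 * triples + 1.41 * hr
            + 0.33 * (bb + hbp) + 0.18 * sb - 0.46 * cs - 0.28 * outs)


# Hypothetical line: 550 AB, 150 H (30 2B, 5 3B, 15 HR), 60 BB, 5 HBP, 10 SB, 4 CS.
runs = linear_weights(ab=550, hits=150, doubles=30, triples=5, hr=15,
                      bb=60, hbp=5, sb=10, cs=4)
print(f"{runs:+.1f} batting runs")
```

The out value is what ties the result to the run environment, so the -.28 would need to change for a different run environment.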
For pitchers, the equation can be found here.
I give my quick two cents over at The Hardball Times. If he implements the changes he wants to achieve, he could see a pretty nice increase in value.
I’ve been pretty darn busy recently, so I’ve been unable to write as often as I’d like. I have something pretty cool/fun in the works that should generate some discussion about the upcoming year, but until that’s finished, I thought I’d link to this lovely article on Giants’ first base prospect Brandon Belt. Some of the highlights:
Seeing Brandon Belt take his cuts last year prompted San Jose Giants hitting coach Gary Davenport to recall another graceful and gifted left-handed batter: Will Clark.
Davenport mentioned this to Clark one day when the six-time All-Star, now a special assistant for the big league Giants, visited the organization’s high Class A affiliate.
Clark quickly disagreed.
“He has a better swing than I did,” Clark said.
Uh…wow. That’s some pretty high praise for a kid that just retooled his swing last year, and if Belt can approach anything near what Clark produced early in his career with the Giants, they’ve got a pretty special player on their hands. Then there’s this:
Belt will be under considerable scrutiny as he enters his first Major League Spring Training next month. The reigning World Series champions believe that he has a legitimate chance to earn a spot in the lineup at either first base or left field (Aubrey Huff almost surely will occupy the spot Belt doesn’t). General manager Brian Sabean has emphasized that Belt will begin the season in the Minors to continue his development if he doesn’t win a starting job with San Francisco.
All that really matters to me is that Belt is playing every day no matter the position or level. He has very little experience in the outfield and he’s supposedly a stellar defender at first, so I imagine shifting Huff to left makes more sense than sticking Belt there. There’s really no need to put a rookie at a position he has little to no experience at while he’s making the adjustments to big league pitching- he has enough on his mind. There’s no doubt in my mind that he has the ability to be a good defender in left- he has good instincts, solid speed and a great arm- but I’d rather start him at the position he’s most comfortable with. And I imagine that’s what the Giants plan on doing. Anyways, I like the way the Giants are handling this. There is a part of me that would prefer he start in Fresno, if only to avoid starting his Super 2 clock early, but if the Giants feel he can make an immediate impact, then it makes sense for him to be in the Bigs. This kid could be a real asset in the lineup- with solid power and fantastic plate discipline, he and Buster Posey could anchor the middle of the lineup for years to come. And if Pablo Sandoval can regain his form, the Giants will have a young, formidable middle of the order.
The projection systems love Belt. CAIRO projects Belt to hit .267/.357/.452 with 15 homers 35 doubles and 10 triples per 550 AB, ZiPS has Belt at .266/.357/.440 with 15/34/9 per 550, and Oliver likes him the best- it projects him to hit .284/.365/.481 with 20/34/7 per 550 AB. That’s a phenomenal projection for a kid that’s yet to hit in the Big Leagues. The scouts love him and the forecasting systems love him too- all that’s left is for Belt to live up to those expectations. And it sounds like he has the mental fortitude to do so.
I present some numbers over at THT Live. Buster was a +4 in 662 innings…that’s somethin’ special.
Courtesy of Dan Szymborski. I have to say, it’s nice to see Brandon Belt get so much love from the projection systems- CAIRO likes him, Oliver likes him, and ZiPS is a fan as well. I just hope that his Minor League performance can translate as well as the systems think it will, and it should be interesting to see what PECOTA thinks of him. Also encouraging is Pablo Sandoval’s projection. I think he may be one of the most difficult players in baseball to project, as a lot of non-baseball variables will determine his performance in 2011. At some point I imagine I’ll try and average all of the projections and calculate estimated win values. I think that would make for an interesting post.
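As a taste of what that averaging might look like, here is a bare-bones sketch that takes an unweighted mean of the three Belt slash lines quoted in the earlier post (CAIRO, ZiPS, Oliver). Equal weighting is my assumption; a real consensus might weight the systems by track record.

```python
from statistics import mean

# (AVG, OBP, SLG) for Brandon Belt, as quoted for each system.
belt = {
    "CAIRO": (0.267, 0.357, 0.452),
    "ZiPS": (0.266, 0.357, 0.440),
    "Oliver": (0.284, 0.365, 0.481),
}


def consensus(projections):
    """Unweighted mean of each slash-line component across systems."""
    avg, obp, slg = zip(*projections.values())
    return mean(avg), mean(obp), mean(slg)


c_avg, c_obp, c_slg = consensus(belt)
print(f"Consensus: {c_avg:.3f}/{c_obp:.3f}/{c_slg:.3f}")
```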
Every once in a while I see people discussing the merits of having a good infield defense to back up the Giants’ pitching staff. I am admittedly one of those supporters- I’m a pretty firm believer in 1) keeping the ball on the ground, no matter the dimensions of the park you’re playing in, and 2) having a solid defense in the infield to convert those ground balls into outs. That’s why I’m a sucker for defensive-minded shortstops- I’ve always really liked Adam Everett, I loved Omar Vizquel, and I currently have a slightly irrational man-crush on Seattle’s Brendan Ryan and Texas’ Elvis Andrus. Sometimes I hear someone mention that the strikeout-flyball tendencies of the Giants’ pitching staff renders their infield support…well, not unimportant, but not something that necessarily has to be prioritized. So I thought I’d delve into the matter and see what I could find.
Using Baseball Info Solutions’ batted ball data (courtesy of FanGraphs), I split each pitching staff’s batted balls in play (defined as TBF – HR – BB – HBP – K) by their ground ball to fly ball ratio and compared this to the league average (both batted balls allowed per batter faced and the league GB/FB tendencies). This is based on three years of data- consider that to be something of an arbitrary cutoff point. I wanted to get a feel for the tendencies of the pitching staffs; obviously, pitchers switch teams- but I think enough pitchers have stayed put to where it will minimize the effect of team-switching. Anyways, this is what I found:
Pretty cool, no? Over the last three years, the Giants’ pitching staff has on average allowed 242 fewer ground balls than the average Major League team. That’s 62 fewer grounders induced than the runner-up Cubs. The Cardinals, as expected (pitching coach Dave Duncan has a reputation for turning pitchers into ground ball machines), have induced a great deal more grounders than the second-best team, the Braves (by 105). Overall, the Giants have induced fewer balls in play than any other Major League team.
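Here is a rough sketch of how a "ground balls versus league average" number like the ones above could be computed. It uses the balls in play definition from earlier (TBF – HR – BB – HBP – K) and makes a simplifying assumption of mine: every ball in play is either a grounder or a fly ball, so the ground ball share is ratio / (1 + ratio). All of the team and league numbers are invented.

```python
def balls_in_play(tbf, hr, bb, hbp, k):
    """Batted balls in play: TBF - HR - BB - HBP - K."""
    return tbf - hr - bb - hbp - k


def ground_balls(bip, gb_fb_ratio):
    """Split balls in play by GB/FB ratio (line drives ignored, an assumption)."""
    return bip * gb_fb_ratio / (1 + gb_fb_ratio)


def gb_vs_league(team, team_gb_fb, league, league_gb_fb):
    """Team ground balls minus what a league-average staff would allow
    over the same number of batters faced."""
    team_gb = ground_balls(balls_in_play(**team), team_gb_fb)
    league_bip_rate = balls_in_play(**league) / league["tbf"]
    expected_gb = ground_balls(team["tbf"] * league_bip_rate, league_gb_fb)
    return team_gb - expected_gb


# Invented single-season totals for a strikeout-heavy, fly ball staff.
team = dict(tbf=6000, hr=150, bb=550, hbp=50, k=1250)
league = dict(tbf=6000, hr=160, bb=540, hbp=50, k=1050)

diff = gb_vs_league(team, 1.0, league, 1.1)
print(f"{diff:+.0f} ground balls relative to league average")
```

With these made-up inputs the staff comes out about 200 grounders short of average, the same flavor of result as the Giants' number.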
That said, I’m not too concerned about the state of the Giants’ infield defense- if any team could afford to have mediocre to below average defense in the infield, it would most certainly be the Giants. This doesn’t mean that the Giants should go out and acquire all-bat no-glove players for their infield positions, but it does mean that the pitching staff effectively minimizes the impact of an infielder’s lack of range. All in all, I don’t think the Giants will be hurt too much with Miguel Tejada at shortstop and Pablo Sandoval at third base.