Down on the Farm

January 22, 2011

Prospecting has always intrigued me.  It’s an excruciatingly inexact science because it revolves solely around the opinions of men who, while experienced in assessing potential performance, are always hindered by bias.  I think that’s one reason why I’m so enamored of the statistical side of the game- yes, bias exists, but it becomes apparent rather quickly (if you know what you’re doing).  Multiple scouts, in theory, are better than one- but since they talk amongst themselves, you begin to introduce the biases of others, so you’re not only looking at the bias of one, but the bias of many.  Despite all of these issues, scouting places invaluable, irreplaceable and indispensable information in the hands of the men who run Major League Baseball organizations.  They find the kids who need a slight mechanical tweak in their swing to turn from a fringe hitter into a batting champion; they find the kids who have great results but whose mechanics make them high-risk players; and they find the kids with raw talent that they can turn into superstars.  An inexact science, yes, but one that produces much better results than the straight numbers alone would.

Once these players move from high school or college into the pros, a plethora of information about their performance becomes available to us.  We know how well each player does in the minors, and we can see signs that a player might sink or swim in the Majors.  A low walk rate and a high strikeout rate, for example, might indicate that a player has poor command of the strike zone and may not be able to produce consistent contact at the Major League level.  A lot of stolen bases paired with a lot of times caught stealing might indicate that a player still has much to learn on the basepaths.  And a player who walks a lot and demonstrates good power stands a good chance of developing into a useful player, provided he makes enough contact.  Of course, there are things that the numbers will never catch- mechanical tweaks, a new mental approach, etc.- but if handled properly, the numbers can lend some insight into which players might be able to produce in The Show.

Back in 1985, the legendary Bill James introduced Major League Equivalencies, known as MLEs for short (yeesh, MLEs have been around longer than I’ve been alive).  You can read up on some of the basics of MLEs by ZiPS creator Dan Szymborski here. The take-home message:

One thing to remember is that MLEs are not a prediction of what the player will do, just a translation of what the major league equivalence of what the player actually did is. This is useful for predictions however, because like, major league statistics, MLEs have strong predictive value. As strong as major league statistics (which was the goal of this).

I’m actually not sure if MLEs have as strong a predictive value as Major League statistics do.  In any case, the important thing to keep in mind is that these are translations, not projections.  A projection requires something more complex.  It just so happens that one of the publicly available (for a fee of $15, that is) projection systems, Oliver, does just this.  It not only translates based on the context of the parks and leagues; it also regresses and adjusts for aging.  If we want to predict how a player will perform in the Major Leagues based on his Minor League data, this is the way to do it- and Oliver uses a very interesting, theoretically sound (and apparently quite accurate) method:

Oliver calculates league factors by a direct comparison of each player’s performance in each minor league with his performances in the major leagues, while most other systems use a “chaining” process in which only performances in adjacent levels are compared. An example of chaining is when High-A is compared to Double-A, which is compared to Triple-A, which is compared to the major leagues, to get a High-A to majors factor. A problem with chaining is that each additional element in the chain multiplies any selection bias that might be present. A direct comparison, without adjusting for the age of the players at each level, gives a better estimate than chaining of how well the player will perform if and when he gets to the major leagues. Adjusting for age allows an estimate of the player’s true talent now.

Once age and park factors have been accounted for, Oliver calculates the translation factors for each league. Applying these factors to each player’s park-adjusted stats in each league, then summing into a single stat line, produces the player’s MLE for that season.


The values that each player is regressed to are determined by information about the player other than his performance—his age, position, and level played. A 19-year-old in Double-A will be regressed to a higher mean than a 23-year-old at that same level, as the team presumably considered the 19-year-old to have more talent in aggressively promoting him. Similarly, a first baseman will be regressed to a higher home run rate than a shortstop, as we know that, on average, first basemen hit more home runs than shortstops.

Emphasis mine.  Oliver might be a chimp, but he’s an awfully smart one.  Since the Giants have some interesting pieces in the minor leagues, like Brandon Belt and Thomas Neal, I figured it would be worth looking into their projections to see if Oliver likes what he sees.  Since Oliver isn’t free, I won’t be posting any of the raw data- but what I will do is post each player’s projected batting runs per 600 plate appearances (approximately 150 games), his projected fielding value, regressed heavily (also per approximately 150 games), and his projected WAR based on FanGraphs’ replacement level- 20 runs below average per 600 PA.  In other words, a two-win player is exactly average.  Pitcher runs allowed are based on a simple BaseRuns equation, and the replacement level (128% of league average) does not differentiate between roles.  I’m excluding all players in Low A and Mid A ball.  First, the pitchers:

The run environment is 4.53 runs per game, and only a few pitchers project to be slightly better than average.  Jason Stoffel was a fourth-round pick out of the University of Arizona, Alex Hinshaw we’re all familiar with, and 21-year-old sinkerballer Jorge Bucardo projects to have a 4.47 R/G, which would likely translate into a 4.11 ERA.  I know a number of people who are Bucardo fans, and given the Giants’ ability to develop pitchers, it wouldn’t surprise me one bit if he turned into a fine Major League starter.

Now, the hitters:

That’s not a bad projection for Kieschnick.  He’s a below average Major Leaguer by these estimates, but not a replacement level player.  Oliver doesn’t like Brandon Crawford’s bat one bit- he’s essentially a replacement level hitter- but thanks to his glove, Crawford projects to be only slightly below average over the course of a full season.  Oliver loves his glove- it likes him at +20 runs in about 120 games- but I prefer to be extra conservative with defensive estimates until we have more refined and reliable measurements.  A +8 per 150 games sounds about right to me, given the sterling reports on his glovework.  Toolsy and free-swinging Francisco Peguero makes the list, mostly based on the value of his glove, much like Crawford.  Oliver seems to like Brock Bond quite a bit- it expects him to have an above average walk rate (~10%) with moderate contact skills and virtually no power.  If he’s a roughly average hitter as his projections suggest, the Giants might have a decent backup infielder in their system who may have a bit of upside left.  Conor Gillaspie grades out as slightly above average based on his positional value- I’m not sure if his glove is as good as Oliver suggests it is, but he too may prove to be an adequate Major Leaguer.  Oliver seems to think that Johnny Monell would make a fine big league catcher (the position adjustment is too high, so he’ll be overvalued here a bit- catchers rarely play 150 games), but I’m not sure scouts would agree with that assessment.

I think the biggest surprise to me is seeing Thomas Neal rated so highly.  Oliver clearly sees something good in him- it has him as a 115 RC+ Major Leaguer right now, and that’s solidly above average.  I haven’t heard recent scouting reports on him, but he turned in a decent (park-adjusted) batting line of .294/.361/.442 in 2010.  He lost quite a bit of power last year, some of which I assume is due to playing in Richmond, and Oliver expects him to regain some of that in 2011.  If he’s really as good as Oliver thinks he could be- around a 3 win player- the Giants might have their left fielder of the future.

Brandon Belt really comes as no surprise.  If he’s a true talent 3 win player as Oliver suggests, the Giants may benefit from having him start the season in the Majors and play there all year.  A lot relies, of course, on whether or not the scouts think he can make the leap.  Again, none of these projections are gospel- on the contrary, they’re far from it.  But I think they do a decent job of estimating the players’ true talent levels heading into 2011; if so, the Giants might have some interesting pieces for the future.


*The run environment is based on a weighted average of the last three seasons.  The equation for hitters:

LW = .47*1B + .75*2B + 1.04*3B + 1.41*HR + .33*(BB+HBP) + .18*SB - .46*CS - .28*(AB - H)

For pitchers, the equation can be found here.
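
As a rough illustration, the hitters’ equation above can be written as a small Python function.  The `BattingLine` class, its field names and the sample stat line are made up for this sketch; only the coefficients come from the equation, and singles are derived as hits minus extra-base hits.

```python
from dataclasses import dataclass


@dataclass
class BattingLine:
    """Counting stats for one hitter (hypothetical container for this sketch)."""
    ab: int
    h: int
    doubles: int
    triples: int
    hr: int
    bb: int
    hbp: int
    sb: int
    cs: int

    @property
    def singles(self) -> int:
        # 1B is not usually listed on its own, so derive it from hits.
        return self.h - self.doubles - self.triples - self.hr


def linear_weights(line: BattingLine) -> float:
    """Linear-weights batting runs, using the coefficients from the equation above."""
    return (
        0.47 * line.singles
        + 0.75 * line.doubles
        + 1.04 * line.triples
        + 1.41 * line.hr
        + 0.33 * (line.bb + line.hbp)
        + 0.18 * line.sb
        - 0.46 * line.cs
        - 0.28 * (line.ab - line.h)  # outs made at the plate
    )


# Invented stat line for illustration: not a real player and not an Oliver projection.
sample = BattingLine(ab=500, h=150, doubles=30, triples=5, hr=20,
                     bb=50, hbp=5, sb=10, cs=4)
print(round(linear_weights(sample), 2))
```

The result is a run total for whatever playing time the stat line covers, so it would still need to be scaled to 600 plate appearances before being compared with the per-600 figures discussed in the post.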

4 Comments
  1. aGIANTman
    January 23, 2011 6:42 AM

    I really enjoyed this post. I have seen many lists of prospect rankings for the Giants, but most of the lists seem to just replicate the consensus ad nauseam. This saber-friendly analysis shows real independence and is grounded in something objective to give it credibility. There are many surprises on this list. Many thanks.

  2. January 26, 2011 4:42 PM

    Very nice post. As the other poster noted, very original.

    Thanks for the explanation about Oliver. Baseball Forecaster does something similar regarding age, not sure about leagues. They also believe that you can’t really use the stats below AA to judge what an equivalent MLB stat line would be, so they don’t do anything for the lower minors. But I find their analysis to be pretty good.

    I must admit that I’m shocked that Belt is not valued more highly for his defense. JT Snow said he had the best glove in the system, and most discussions I’ve seen praise his offense. That would only make him look much much better. :^)

    You seem to know your way around WAR, I have a question that’s been bouncing around my head for a while. WAR assumes a neutral average run environment, which typically is around 10 runs. At least I see 10 runs being used when converting runs added into wins. Perhaps that is a short hand, not sure.

    Now, of course, the Giants offense does not rate out very well relative to the league because they are not that good, though last season they were much closer to average than they were in prior seasons. However, because they play with their pitching, their run environment is much different from teams with poor pitching or even average pitching. Heck, Giants pitching has been 1/2 the past couple of seasons. Shouldn’t that make the Giants hitter that much more valuable, with the lowered run environment, in terms of WAR?

    Now I understand that to compare WAR from player to player, one has to neutralize all the data so that they are comparable and relative to a replacement level player. But still, this point bothers me for some reason. Are you able to shed any light on this?

    • triplesalley
      January 26, 2011 8:00 PM

      Thanks for coming by!

      Re: Belt’s defense- it’s hard to really tell. A player that is aesthetically pleasing in the field will usually receive high praise- diving stops and flashy plays get high marks, even if it’s oftentimes a result of a lack of range. So it actually is possible that Belt isn’t an elite defender at first, despite looking great there.

      That being said, I would still trust the scouting reports on Belt’s defense rather than the numbers due to a few reasons: the utter lack of reliability with batted ball data, ESPECIALLY from the Minor Leagues, and the issue of sample size (which becomes even more prevalent with first base, as it sees the least amount of opportunities of all INF/OF). Plus, the metrics won’t account for things like scoops. So he’s likely being underrated there as well.

      The “10 runs to 1 win” is a rule of thumb; it depends entirely on the run environment. I use a pythagenpat-derived formula to estimate runs per win: .75*RPG+2.75, where RPG is the total runs scored per game (by both teams). If I remember right, the MLB RPW last year was 9.3. For the Giants, who were in a 7.9 RPG environment, it would take approximately 8.7 runs per added win. So if the Giants were to add a +10 run hitter, he would be worth slightly more than one added win to them (10/8.7, or roughly 1.15). In terms of WAR, though, we use the standard league RPW rather than his own team’s. The effect of the run environment on the player is accounted for by the park adjustments.
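
A minimal sketch of that runs-per-win arithmetic in Python.  The function names are made up for the example; the formula and the 7.9 RPG figure are the ones quoted in the reply above.

```python
def runs_per_win(rpg: float) -> float:
    """Estimate runs per win from total runs per game (both teams combined)."""
    return 0.75 * rpg + 2.75


def wins_added(runs: float, rpg: float) -> float:
    """Convert runs above average into wins for a given run environment."""
    return runs / runs_per_win(rpg)


# The Giants' 7.9 RPG environment from the reply above.
giants_rpw = runs_per_win(7.9)
print(f"Runs per win: {giants_rpw:.3f}")
print(f"Wins from a +10 run hitter: {wins_added(10, 7.9):.3f}")
```

Swapping in the league-wide RPG instead of the team’s own gives the league runs per win that WAR uses.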

      I hope that helps!

  3. January 26, 2011 4:45 PM

    Oh, and based on the work by Baseball Forecaster, I at least believe in their way of calculating MLEs as a way to discern future MLB performance. They use that as a guide to who the top prospects are, and I have done well in my fantasy leagues following their advice on up and comers.
