Wednesday, December 03, 2008

Cleveland Indians Sabermetrics 101: BABIP

Inspired by this post at Beyond the Boxscore, here's my first "homework assignment" for Saber-Friendly Blogging 101.

Batting Average on Balls In Play, BABIP, is essentially batting average for everything except strikeouts, walks, and home runs (a home run never gives the fielders a chance to make a play). I'll let the article above, Wikipedia, and the Sabermetric Wiki give you the details.

BABIP for hitters depends on many factors and will vary from player to player. But for pitchers, BABIP almost always seems to fall in the .290 to .310 range. What does this mean? If a pitcher is well outside of that range one year, you can expect them to regress toward it the following year, and their overall performance should follow. (I'm sure there are cases of certain pitchers having consistently high or low BABIP numbers, but I don't know of any offhand.)

So, how did Indians pitchers fare in 2008? To find out, you can check The Hardball Times or Fangraphs. Or, if you'd rather do the work yourself (or, like me, didn't find out about the THT and Fangraphs pages until after doing the work), you need to start with their 2008 stats.

If you're not up to the task of setting up a MySQL database of baseball stats, you can simply copy and paste them from Baseball-Reference's 2008 Indians page into Excel. To determine At Bats, I took BFP (Batters Faced by Pitcher) minus Bases on Balls and Hit By Pitch. (I assumed IBB totals were already included in BB.) I also ignored Sacrifice Flies because that information wasn't readily available. My results were almost identical to those from the Hardball Times. Fangraphs' were a little different, but as the Sabermetric Wiki mentions, there are several variations to the formula.
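For anyone who would rather script it than use Excel, here is a minimal Python sketch of the arithmetic described above. It assumes one common form of the formula, (H - HR) / (AB - SO - HR + SF), with sacrifice flies defaulting to zero to mirror the shortcut I took. The function names and the stat line are made up for illustration; they aren't any real pitcher's numbers.

```python
def at_bats(bfp, bb, hbp):
    """Approximate at bats against: batters faced minus walks and hit batters."""
    return bfp - bb - hbp


def babip(hits, hr, ab, so, sf=0):
    """One common BABIP variant: (H - HR) / (AB - SO - HR + SF).

    sf defaults to 0 because sacrifice flies are ignored here.
    Returns None when there were no balls in play to divide by.
    """
    balls_in_play = ab - so - hr + sf
    if balls_in_play <= 0:
        return None
    return (hits - hr) / balls_in_play


# Hypothetical stat line, invented purely to show the calculation.
ab = at_bats(bfp=500, bb=40, hbp=5)
value = babip(hits=120, hr=10, ab=ab, so=100)
print(ab, f"{value:.3f}")
```

Other sites may handle sacrifice flies, sacrifice hits, or intentional walks differently, which is probably why the numbers don't always match from one source to the next.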

Pitcher               AB   BABIP     ERA
Rich Rundles          19    .385    1.80
Tom Mastny            89    .344   10.80
Edward Mujica        157    .328    6.75
Rafael Betancourt    284    .311    5.07
Zach Jackson         221    .310    5.60
Masahide Kobayashi   229    .306    4.53
Rafael Perez         288    .304    3.54
Cliff Lee            852    .301    2.54
Jeremy Sowers        491    .301    5.58
Jensen Lewis         260    .300    3.82
Aaron Laffey         369    .294    4.23
Fausto Carmona       470    .294    5.44
Jake Westbrook       131    .262    3.12
Anthony Reyes        129    .259    1.83
Scott Lewis           91    .222    2.63
Jonathan Meloan        5    .000    0.00

I included At Bats in this table to illustrate a point: in general, as at bats increased, the pitcher's BABIP moved closer to the .290-.310 range. Rich Rundles and Jon Meloan only faced a handful of batters, so their numbers can largely be ignored. But you could almost argue the same for Scott Lewis and Tom Mastny.
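To put a rough number on that, here is a small Python sketch of how much BABIP can bounce around by chance alone. It rests on two loud assumptions of mine: every ball in play is treated as an independent coin flip with a .300 chance of falling for a hit, and at bats stand in for balls in play (which are actually fewer, so the real spread is a bit wider than this shows). The function name is made up for illustration.

```python
import math


def babip_standard_error(balls_in_play, true_rate=0.300):
    """Standard error of an observed rate under a simple binomial model."""
    return math.sqrt(true_rate * (1 - true_rate) / balls_in_play)


# At-bat totals taken from the table; used as a stand-in for balls in play.
for name, ab in [("Rich Rundles", 19), ("Scott Lewis", 91), ("Cliff Lee", 852)]:
    se = babip_standard_error(ab)
    print(f"{name:<14} {ab:>4} AB  +/- {se:.3f}")
```

Under those assumptions, a pitcher with a couple dozen at bats can land a hundred points away from .300 without it meaning anything, while one with several hundred is pinned down much more tightly.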

Now, this table is good news for Tom Mastny and Ed Mujica, and even Rafael Betancourt and Zach Jackson to some extent. All posted high ERAs and high BABIPs. But their BABIPs should go down in 2009, and their other stats should improve as a result. Conversely, Anthony Reyes and Scott Lewis will probably see their spectacular 2008 numbers fall back to earth. Jake Westbrook will probably see a decline as well, once he's finally healthy.

If the BABIP theory holds true, that middle group should stay about the same. That's great news for Cliff Lee, Rafael Perez, and Jensen Lewis, as well as Aaron Laffey and Masahide Kobayashi to some extent. But it's also bad news for Jeremy Sowers and Fausto Carmona.
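The three groups above can be sketched in a few lines of Python. The numbers come straight from the table; treating .290 and .310 as hard, inclusive cutoffs is my own simplification, and the function name is just for illustration.

```python
LOW, HIGH = 0.290, 0.310  # the typical pitcher range discussed above

# (pitcher, at bats, BABIP, ERA) as listed in the table
pitchers = [
    ("Rich Rundles", 19, 0.385, 1.80),
    ("Tom Mastny", 89, 0.344, 10.80),
    ("Edward Mujica", 157, 0.328, 6.75),
    ("Rafael Betancourt", 284, 0.311, 5.07),
    ("Zach Jackson", 221, 0.310, 5.60),
    ("Masahide Kobayashi", 229, 0.306, 4.53),
    ("Rafael Perez", 288, 0.304, 3.54),
    ("Cliff Lee", 852, 0.301, 2.54),
    ("Jeremy Sowers", 491, 0.301, 5.58),
    ("Jensen Lewis", 260, 0.300, 3.82),
    ("Aaron Laffey", 369, 0.294, 4.23),
    ("Fausto Carmona", 470, 0.294, 5.44),
    ("Jake Westbrook", 131, 0.262, 3.12),
    ("Anthony Reyes", 129, 0.259, 1.83),
    ("Scott Lewis", 91, 0.222, 2.63),
    ("Jonathan Meloan", 5, 0.000, 0.00),
]


def classify(babip):
    """Label a BABIP as above, inside, or below the typical range."""
    if babip > HIGH:
        return "high"
    if babip < LOW:
        return "low"
    return "typical"


groups = {"high": [], "typical": [], "low": []}
for name, ab, babip, era in pitchers:
    groups[classify(babip)].append(name)

for label, names in groups.items():
    print(f"{label}: {', '.join(names)}")
```

With hard cutoffs, Zach Jackson at exactly .310 lands in the typical group, which is why I hedged with "to some extent" on him and Betancourt. The cutoffs also say nothing about sample size, so Rundles and Meloan still get sorted even though their numbers can largely be ignored.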

But BABIP is by no means a be-all, end-all predictor. For example, while Rafael Betancourt was on the edge of the expected BABIP range, his ERA was abnormally high (for him) due to a lingering injury that kept him from throwing his fastball, which is his best pitch. And while Cliff Lee was right in the middle in terms of BABIP, he'll still be hard-pressed to repeat the phenomenal year he had in 2008. Still, his BABIP numbers do show that 2008 wasn't entirely luck, and that Lee should still do very well in 2009.