Tag Archives: Mike Trout

Hit-Ball and the Best Player Present

A few months ago, a brilliant xkcd comic inspired a challenge within the science world: explain a complex scientific topic using only the 1,000 most common words in the English language. My first thought was, of course, “Why not try this with baseball?” Using the Up-Goer Five Editor, I attempted to tell the story of last year’s AL MVP race. Here is the result (originally published on Beyond the Box Score):

Every year, people that write about hit-ball decide who they think was the best hit-ball player. Sometimes it’s easy to decide because there is one player that is better than every other player and everyone knows it. Last year was not one of those times. There was one player that was better than every other player, but many people were confused about this. Many people thought that a different player was better than every other player, even though he was not.

People thought that this player was better than every other player because he beat every other player in Three Numbers: times getting a hit for every time he had a chance to get a hit, times hitting the ball over the wall, and times getting another player to touch the last bag. For many years, these Three Numbers have been very important to people that watch hit-ball, so people really like when someone leads everyone in all three…

Read the rest on Beyond the Box Score


Watching Baseball Without Stats

What would baseball be like without stats? I don’t mean advanced stats – I mean ALL stats. All of them. No batting average. No RBI. No HR. No WAR. Nada. What if we watched baseball the same way that we do now, but no one kept track of results, at least outside of their own minds? What if broadcasts didn’t flash stat lines on the bottom of the screen, announcers didn’t mention them in their commentary, and they didn’t appear in newspapers?

I wonder how well we would be able to judge the value of a player in that case. It might be a little easier with pitchers, since their performance comes in bunches. We can tell when a pitcher is doing badly pretty easily, because we can see the batters getting hits and coming around to score all within a few minutes. But it would still be difficult to compare pitchers over an entire season, wouldn’t it? Sure, we can tell when a pitcher has a bad game, but what if one pitcher had 10 bad games in a season and another had 8 bad games in a season? Would we be able to remember that? Would we be able to remember how bad those games were? Probably not.

It’s even harder for hitters. They only come up every nine plate appearances, not to mention the break for the other team to bat. And they have 600+ plate appearances in a season. If we didn’t keep track of stats, we would have to remember all of these plate appearances. Well, maybe not all of them, but at least the vast majority if we wanted to be confident about how a player performed.

As an analogy, imagine watching someone roll an uneven die (as in, each number does not come up at the same rate) 600 times and then guessing how many times a 1 or 2 came up. But you only see about 4 or 5 rolls a day, and they come every half hour or so. How well do you think you would be able to estimate the percentage of rolls on which a 1 or 2 came up? Could you get within 2%? 5%?

Say you estimated that a 1 or a 2 came up 32% of the time. In actuality, they came up 28% of the time. You would probably be proud of getting that close, especially since you know it’s not an even die. 32% sure doesn’t seem too far away from 28%. Well, you just said that a .280 hitter was a .320 hitter. When given Neil Walker, you said Ryan Braun. And that’s when you were within 4% of being correct. What if you were off by 8%? That’s certainly possible, given how long you had to hold these rolls in your memory. Then you would basically be mistaking the league leader in batting average for Mark Teixeira. You would have absolutely no authority to make a claim about the rolls or players.

The above example was the equivalent of only paying attention to one batter and only caring about hits. When you add in the rest of the players and the multitude of possible outcomes, how could you possibly remember enough to judge one player better than another without stats? Sure, you could probably tell the difference between the pitcher and Miguel Cabrera, but could you tell the difference between Mike Trout and Miguel Cabrera? Maybe you’d be able to tell that Cabrera hit more home runs and you could definitely conclude that Trout stole more bases. You’d likely be able to tell that Trout was a better fielder, but it would be virtually impossible to tell who was a better hitter. After all, if you were off by, say, 2% on both players’ batting average in opposite directions, suddenly there’s a 40 point difference! You’d be saying that Cabrera hit .350 versus Trout’s .306, or Trout hitting .346 to Cabrera’s .310. If either one of those scenarios were true, the MVP race would probably be unanimous.
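The arithmetic behind that swing can be sketched in a few lines of Python. The "true" averages used here (.330 for Cabrera, .326 for Trout) are not stated outright; they are inferred by backing the 2% error out of the figures above. The function name is made up for illustration.

```python
def misjudged_gap(overrated_avg, underrated_avg, error):
    """Gap in batting-average points when one hitter is overrated and
    the other underrated by the same amount."""
    gap = (overrated_avg + error) - (underrated_avg - error)
    return round(gap * 1000)

# Assumed true averages, inferred from the .350/.306 and .346/.310 figures.
CABRERA = 0.330
TROUT = 0.326

print(misjudged_gap(CABRERA, TROUT, 0.00))  # the real gap, in points
print(misjudged_gap(CABRERA, TROUT, 0.02))  # Cabrera overrated, Trout underrated
print(misjudged_gap(TROUT, CABRERA, 0.02))  # Trout overrated, Cabrera underrated
```

Under those assumptions, a real gap of 4 points turns into 44 points one way and 36 points the other, which is where the rough "40 point difference" comes from.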

This is not my way of saying that stats are all that matters. This is my way of saying that stats are, at the very least, absolutely necessary if we want to measure value. The season is just too long to use only your eyes and memory. There’s no way you could possibly distinguish between a .330 hitter and a .320 hitter, or someone with 45 home runs and someone with 40 home runs, or someone with a 3.20 ERA and someone with a 2.80 ERA. Yet all those distinctions are important. If we want to measure value, we have to use stats. And we all do use stats every day during the season. We see a player’s batting average and home run total, at the very least, every game. That influences what we think of a player. I’m not saying that Miguel Cabrera or Mike Trout wouldn’t be impressive to watch without stats, but that we would have no way of knowing just how impressive they are.


A Short and Sweet Argument for Mike Trout

This post is about the AL MVP race. I’m sure you’re tired of reading about this, but I’m going to write about it anyway. I’ll try to make it quick.

Take a look at this table:

Player          1B   2B  3B  HR  SB2  CS2  SB3  CS3
Mike Trout      116  26  8   30  43   3    6    1
Miguel Cabrera  121  40  0   44  3    1    1    0

SB2 are stolen bases of second, and SB3 are stolen bases of third. CS2 and CS3 are the times caught stealing second and third.

That leads to these lines:

Player          AVG    OBP    SLG    OPS
Mike Trout      0.324  0.397  0.561  0.958
Miguel Cabrera  0.331  0.394  0.608  1.002

But stolen bases of second are really just turning singles into doubles, right? I know there are differences, but let’s just assume that that’s what they are. In the same way, stealing third turns a double into a triple. So using those stolen base totals, we’re going to turn 43 of Mike Trout’s singles into doubles (and 3 of his singles into outs), as well as turn 6 of his doubles into triples (and 1 of his doubles into an out). We’ll do the same with Miguel Cabrera.

That leaves us with this:

Player          AVG    OBP    SLG    OPS
Mike Trout      0.327  0.400  0.662  1.062
Miguel Cabrera  0.331  0.394  0.616  1.010

As you can see, Trout now has an OPS that is well above Cabrera’s. Interesting, right?
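For anyone who wants to play with the idea, here is a minimal Python sketch of the conversion rule as described: a steal of second turns a single into a double, a steal of third turns a double into a triple, and a caught stealing turns the hit into an out. It is a literal reading of that rule applied to the counting stats in the first table, with function names invented for illustration. It stops at total bases because the table doesn’t list at-bats or walks, so it makes no attempt to reproduce the slash lines above.

```python
def fold_in_steals(line):
    """Return hit counts after converting steals per the rule above.

    SB2: single -> double    CS2: single -> out
    SB3: double -> triple    CS3: double -> out
    """
    singles = line["1B"] - line["SB2"] - line["CS2"]
    doubles = line["2B"] + line["SB2"] - line["SB3"] - line["CS3"]
    triples = line["3B"] + line["SB3"]
    return {"1B": singles, "2B": doubles, "3B": triples, "HR": line["HR"]}

def total_bases(line):
    """Standard total bases: 1 per single, 2 per double, and so on."""
    return line["1B"] + 2 * line["2B"] + 3 * line["3B"] + 4 * line["HR"]

# Counting stats copied from the first table.
trout = {"1B": 116, "2B": 26, "3B": 8, "HR": 30,
         "SB2": 43, "CS2": 3, "SB3": 6, "CS3": 1}
cabrera = {"1B": 121, "2B": 40, "3B": 0, "HR": 44,
           "SB2": 3, "CS2": 1, "SB3": 1, "CS3": 0}

for name, line in (("Trout", trout), ("Cabrera", cabrera)):
    adjusted = fold_in_steals(line)
    print(name, adjusted, total_bases(line), "->", total_bases(adjusted))
```

On this reading, Trout’s total bases go from 312 to 356 while Cabrera’s go from 377 to 380, which is the whole story in miniature: the adjustment barely touches Cabrera and moves Trout a lot.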

Of course, this is not a great method of taking stolen bases into account. A single and a stolen base is obviously not the same as a double because of the whole driving in runs thing. But I’d say it’s probably pretty close in value. wOBA (the metric that WAR uses) does a much better job of properly valuing stolen bases, but I thought this might appeal to those of you who prefer traditional stats. Even if we drop down the value of stolen bases a bit to account for the slight difference, Trout and Cabrera end up with a very similar OPS. And then, of course, we should take defense into account, which Trout probably wins by a large margin.

What do you think? Is this at all convincing for the Cabrera supporters among you?


Is September Really More Important?

Through all the craziness of the AL MVP debates in the past week, I’ve heard one claim argued over and over without much rebuttal. It is this: Miguel Cabrera has been a monster in September, and because games mean more in September, Miguel Cabrera’s performance during September should be given extra weight.

Now this makes sense intuitively, for the same reason that WPA (Win Probability Added) makes sense intuitively. A win or a loss in September is going to alter a team’s chances of making the playoffs much more than a win or loss in April would. If we had a game-wide equivalent to WPA, games in September would have a higher leverage index than games in April, just like plays in the 9th inning have more significance than plays in the 1st inning.

But wait a second. That’s not entirely true. In blowout games, plate appearances that occur in early innings when the score is close have a higher leverage index than plate appearances in late innings when one team is behind by a lot. For example, the leverage index of the first play of the game is 0.9, which is slightly less important than average. But the leverage index when the away team is down by 4 runs in the top of the ninth is 0.4 (and would be much lower with an 8-run deficit).

Similarly, the games that the Astros play this month mean nothing. They are mathematically eliminated from the playoffs, so a win and a loss have the same effect on their chances of making the playoffs: 0%. However, in the first game of the season, though the result didn’t mean much, it meant something. So for the Astros, a player that performed well in April and was average the rest of the time is more valuable than a player that performed well only in September and was average the rest of the time.

There’s a problem with that logic, though. If the Astros won an extra game in April, it would have the exact same effect as if they won an extra game in September instead. If they won the game in April, they might have the illusion of having a better chance, but assuming the rest of the results stayed the same, it wouldn’t mean any more than a September win would.

The same is true for a single game. Imagine a game in which the home team wins 1-0, the one run being a solo homer. For the purposes of WPA, a homer hit in the first inning would have much less value than a walk-off homer hit in the bottom of the 9th. The 9th inning home run increased the team’s chances of winning by close to 50%, while the first inning home run probably only increased it about 10% (just estimating here, though the exact values are out there somewhere).
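To make the comparison concrete, here is a tiny Python sketch of how WPA credits each homer: it is just the team’s win probability after the play minus the win probability before it. The win probabilities below are placeholders picked to match my rough estimates, not values from a real win-expectancy table, and the function name is my own.

```python
def wpa(win_prob_before, win_prob_after):
    """Win Probability Added: change in a team's win probability on one play."""
    return win_prob_after - win_prob_before

# Placeholder win probabilities, assumed for illustration only.
first_inning_hr = wpa(0.55, 0.65)  # home team goes up 1-0 early
walk_off_hr = wpa(0.55, 1.00)      # tie game ends on the swing

print(f"1st-inning homer: {first_inning_hr:+.2f}")
print(f"Walk-off homer:   {walk_off_hr:+.2f}")
```

Same swing, same final score, but under these assumed inputs the walk-off gets more than four times the credit.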

But did the 9th inning home run really matter more than the 1st inning home run? Either way, the home run made the difference in the game. There may have been more pressure on the batter in the 9th, and more celebration afterwards, but a solo homer would have been equally important as far as the end result is concerned.

The reason WPA doesn’t give as much credit to the 1st inning homer is uncertainty. In the first inning, we don’t know if the game will end 12-2 or 1-0, so we just take the aggregate of those scenarios and calculate the importance of that home run accordingly. But in the 9th, there’s much less uncertainty about how the game will end, and no uncertainty about how a home run will affect the result. Yet looking back on the game afterwards, we can see that a first inning home run would have had the same effect on the game as a walk-off.

With this in mind, doesn’t it seem unfair to reward the player who hit the walk-off more than the player who hit the first inning home run? They both contributed equal value to the team, and should be rewarded accordingly. Similarly, while these September games feel more important because there is less uncertainty, why should we weigh September performance more than April performance?

Yes, Miguel Cabrera has come through in a big way in September, but if we switched his September and May, the Tigers would be in the EXACT same position as they are now. The narrative would be different – people would be saying that Cabrera is choking when the team needs him most – but the end result would be the same. So instead of looking at month by month numbers or WPA (though Trout is winning in WPA), we should look at overall numbers. Yes, the clutch factor should be taken into consideration, but don’t use WPA as the be-all and end-all of clutch hitting (and please, I beg you, don’t use RBI either).
