Evaluating Defense

added 7/10/2005 by Scott Barzilla

There is a common debate in the baseball community about whether you can evaluate defense by watching it or by charting it statistically. Last time, I talked about Brad Ausmus and Adam Everett offensively and said I would table defense for later. I did that because evaluating players defensively is much more difficult than evaluating their offensive numbers.

I’ve had people tell me that evaluating offensive numbers for a catcher is a waste of time because it is a defensive position. Folks, that is by far the silliest thing I’ve ever heard. That would be akin to saying that you shouldn’t evaluate the defense of a third baseman because it is an offensive position. The last time I checked, players hit AND field. Ignoring half of what they do (in this case, hitting is the more significant half) is just plain stupid.

Problem One: Ranking Fielders

Ranking fielders with the naked eye is rife with bias. The first bias we must contend with is the “home town bias”. Yes, some people can overcome this to a certain extent, but there is a very real pressure to do this. If you are an avid fan of the Astros you will watch 162 games a season. When you watch Brad Ausmus, Adam Everett, or Willy Taveras on a daily basis there is a tendency to overvalue what they do defensively. The same can be said for fielders that are perceived as bad.

The important part of the home town bias that we can’t avoid is that we don’t watch the other teams play 162 games a season. The common response to this has been “I played baseball and I’ve watched it for years. I know what I’m talking about.” Really? A few nights ago Brad Ausmus finished a game by making a nifty play in front of home plate. “You see, that proves he is a great defensive catcher. If you played baseball you would understand how difficult that play was.” I did play baseball and I know I would not have been able to make that play.

The problem with that line of thinking is that we aren’t comparing Brad Ausmus to a Little League catcher, high school catcher, college catcher, or beer league softball catcher. We are comparing him to major league catchers. We are talking about the top one or two percent in the world. The question everyone should ask themselves is how many big league catchers could have made that play. Unless you’ve seen every catcher regularly, you can have no idea what the answer to that question is if you rely completely on eyewitness accounts.

The second bias we have to overcome is the “Web gem bias”. If a player makes a spectacular play we tend to think he is a better fielder. If he makes a lot of them then we tend to think he is a great fielder. Many players have won Gold Gloves based on this bias. The baseball beat writers vote for the Gold Glove award, so I would say they are people that “have watched a lot of games and ‘know’ what they are talking about”. Guys like Ken Griffey Jr., Jim Edmonds, Rey Ordonez, and Derek Jeter have all won the award despite having numbers that are less than stellar.

Before you dismiss the numbers, imagine if we evaluated offensive players the same way. Suddenly, players that hit 500-foot home runs would be “great” home run hitters while those that barely cleared the fence would be merely average. Suddenly, the number of times a player hit a home run would become less and less important.

I have been chastised by these folks for relying on numbers. “There are lies, damn lies, and statistics” they often say. What these people are conveniently forgetting is that they use numbers too. The same people that say, “I know great defense when I see it” also quote error and fielding percentage rates to prove their point. So, what am I exactly being chastised for? I simply use different numbers than these folks. The simple fact is that opinions are like noses: everyone has one. Some think using numbers or “proof” to justify an opinion is a sign of weakness. I say it is a sign of understanding that a discussion cannot progress unless evidence is provided.

What numbers should we use? I agree that numbers can be deceptive if you rely on only one, so I use as many useful numbers as I can find. As I promised earlier, I will look at where Brad Ausmus and Adam Everett rank in these different categories as of the end of the San Diego Padres series.

Ausmus                          Everett
FPCT = .995 (9 out of 25)       FPCT = .981 (6T out of 25)
PB   =    3 (10T out of 25)     RF   = 4.21 (21 out of 25)
CERA = 3.30 (2 out of 25)       ZR   = .848 (9 out of 25)
CS%  = .308 (12 out of 25)

Let’s start with Brad Ausmus. I’ve used the four statistics that correspond with what we expect a catcher to do. Fielding percentage is the most obvious statistic, but it isn’t particularly relevant because there isn’t much dispersion among catchers in terms of errors. In other words, catchers don’t make a lot of errors typically. Secondly, we want catchers to block the plate, which is where passed balls (PB) come in. Again, this statistic is not particularly relevant because the dispersion is not there. However, we find that Ausmus is only above average in those categories.

The primary function of a catcher in most people’s minds is to call a good game. Ausmus has the second best catcher ERA among catchers that qualified, but how valid is this statistic? How many other catchers could have a great ERA with Roger Clemens, Roy Oswalt, Andy Pettitte, and Brad Lidge on their pitching staff? Yet, when the pitchers laud him AND the numbers laud him then we should trust the numbers. Finally, Ausmus is almost dead average in the percentage at which he throws out would-be base stealers.

Individually these numbers are not enough to rely on, but as a group they paint the picture that Ausmus is still a good defensive catcher, but he is no longer a great defensive catcher. However, if you listen to the pundits, Ausmus is a great defensive catcher and you shouldn’t worry about his lack of offense.

You’ve noticed that I’ve used different numbers for Everett. That is one of the challenges of evaluating players defensively: you cannot use the same numbers at every position because some of them are irrelevant. For Everett, I’ve used fielding percentage, range factor, and zone rating. Range factor is simply the total number of plays a fielder makes per nine innings. Zone rating is the percentage of balls hit into a player’s fielding zone that he successfully fields.

Range factor and zone rating often times conflict because there are biases in the number of opportunities that fielders have. This is another bias we have to contend with. A shortstop on a team with strikeout/fly ball pitchers will not look as good as a shortstop on a team with groundball pitchers. Zone rating does a good job of factoring out that bias. If we ignore Everett’s range factor then we see that he is in between above average and good defensively.
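To make the opportunity bias concrete, here is a rough sketch of the two definitions above. The function names and every input number are made up for illustration (they are not Everett’s or anyone else’s real totals), and treating “plays” as putouts plus assists is my assumption about how range factor is counted.

```python
def range_factor(putouts, assists, innings):
    """Plays made (putouts + assists) per nine innings."""
    return 9.0 * (putouts + assists) / innings


def zone_rating(plays_in_zone, balls_in_zone):
    """Share of balls hit into the fielder's zone that he turns into outs."""
    return plays_in_zone / balls_in_zone


# Two HYPOTHETICAL shortstops with identical skill (same zone rating),
# each playing 900 innings behind a different kind of pitching staff.
groundball_staff = {"putouts": 170, "assists": 330, "innings": 900,
                    "plays_in_zone": 340, "balls_in_zone": 400}
flyball_staff = {"putouts": 140, "assists": 260, "innings": 900,
                 "plays_in_zone": 272, "balls_in_zone": 320}

for label, ss in (("groundball staff", groundball_staff),
                  ("strikeout/fly ball staff", flyball_staff)):
    rf = range_factor(ss["putouts"], ss["assists"], ss["innings"])
    zr = zone_rating(ss["plays_in_zone"], ss["balls_in_zone"])
    print(f"{label}: RF = {rf:.2f}, ZR = {zr:.3f}")
```

Both invented shortstops convert the same share of their chances, yet the one behind the groundball staff posts a range factor a full play per nine innings higher simply because more balls come his way.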

Problem Two: Assigning Value

Okay, Adam Everett and Brad Ausmus are good fielders at the very least. What exactly does this mean? I would venture to say this is a bigger problem than the problem of ranking fielders. Even if someone is so smart that they can overcome all of the biases I listed earlier, there is no possible way they can accurately assign value to this without using statistics.

There are those that would say that the game is fifty percent fielding and fifty percent hitting. That philosophy is as antiquated as saying the world is flat. First of all, pitchers still have a lot to say about how many runs their team gives up. Secondly, there are eight fielders (not counting the pitcher) on the field and basic common sense dictates that not all of these positions have the same significance or difficulty. However, in order to best answer this I have to reference what I said earlier.

When you look at assigning value to fielding you have to look at either what is called the margin (worst fielder at that position) or norm (average fielder at that position). Personally, I prefer looking at the norm because the margin is more theoretical while the norm is more part of the reality of baseball. Over the past 130 years, players have been moving closer and closer to the norm. As coaching, conditioning, and sheer athleticism have improved, the difference between a bad fielding catcher or shortstop and a great fielding catcher or shortstop has diminished.

Those that laud Ausmus and Everett over almost all others conveniently forget that most of the other regular players at that position are also competent or even better. Just like when people say “we lost 200 RBIs when Kent and Beltran left”, those that say “we shouldn’t worry about offense with catcher and shortstop because they are so important defensively” are misunderstanding a simple concept I call “replacement value”. It’s almost as if they think you’re going to replace Ausmus with Mike Piazza or Everett with Jose Offerman.

With the state of baseball as it is, it is more likely that you will replace them with someone that is around the norm. Since that is likely the case and since we agree that Ausmus and Everett are at least “good” we should look at how two different sabermetricians look at the relationship between offense and defense. I should point out before I move on to these numbers that both of these sabermetricians used radically different approaches. So, it’s not as if I’m just piling on here with people who say the same thing.

Bill James’ win shares approach is based on comparing players to the margin I mentioned earlier. Pete Palmer’s batter runs and fielding runs are based on comparing players to the norm. So, what you are about to see is how two radically different sabermetrical models look at the relative importance of offense and defense for the typical catcher. In order to do this I will look at Ausmus along with the players generally regarded as the best offensive and defensive players at their position. Everett has not been playing long enough to compare him accurately.

                  BR    FR*   OWS    DWS**
Brad Ausmus     -167    82     42     57 		
Mike Matheny    -183    29      9     35	
Mike Piazza      433   -76    207     48
Jorge Posada      98    18     60     23
Ivan Rodriguez    83   164    114     92 
Total            264   217    432    255

* through 2003. ** through 2001

Before I analyze these numbers let me express them in a format that will make more sense. These players have had careers of varying lengths, but it is more important that we recognize what they represent. Piazza and Posada represent catchers that have reputations as all-hit, no-glove. Matheny and Ausmus have the reputation of being all-glove, no-hit. Then, I-Rod represents a catcher that is known for being good at both. Here we see the same numbers on a per season basis.

              BR     FR     OWS    DWS
Ausmus     -15.2    7.5     4.7    6.3	
Matheny    -20.3    3.2     1.3    5.0
Piazza      39.4   -6.9    23.0    5.3	
Posada      14.0    2.6    12.0    4.6
Rodriguez    6.4   12.6    10.4    8.4 
Total       24.3   19.0    51.4   29.6

Some people say they understand the game, but miss the simple fact that evaluating hitting is very different from evaluating fielding. Team defense is 50% of the game, but that 50% has to start from the total team down to the individual. Offense can be evaluated with a great deal of accuracy from the individual on up. When I say offense is more significant than defense we need to look inside the numbers. The gap between the best offensive player on this list and the worst was an average of almost 60 runs a season under Palmer’s model or nearly 22 win shares under James’ model.

On the other hand, the total difference in fielding runs was 19.5 runs while the difference in defensive win shares was 3.8 win shares. That means that the gap in offense is roughly three to nearly six times larger than the gap in defense. Of course, we haven’t yet looked at how the best and worst compare to the norm.

The mean for batter runs is 4.86, and the range for that statistic, from Piazza’s 39.4 down to Matheny’s -20.3, is 59.7. The mean for fielding runs is 3.8, and the total range for that statistic is 19.5. In other words, if we express Palmer’s metrics as they were meant to be expressed we still see that offense is about three times as significant as defense. So, we can all agree that defense is important, but I definitely would not agree that the two have the same significance in evaluating a player.
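For anyone who wants to check the arithmetic, here is a small sketch that takes the per-season table above and measures each gap from the best player on the list to the worst. The variable and function names are mine, and measuring the “gap” as best minus worst is the assumption being illustrated.

```python
# Per-season figures copied from the table above.
per_season = {
    "Ausmus":    {"BR": -15.2, "FR": 7.5,  "OWS": 4.7,  "DWS": 6.3},
    "Matheny":   {"BR": -20.3, "FR": 3.2,  "OWS": 1.3,  "DWS": 5.0},
    "Piazza":    {"BR": 39.4,  "FR": -6.9, "OWS": 23.0, "DWS": 5.3},
    "Posada":    {"BR": 14.0,  "FR": 2.6,  "OWS": 12.0, "DWS": 4.6},
    "Rodriguez": {"BR": 6.4,   "FR": 12.6, "OWS": 10.4, "DWS": 8.4},
}


def values(stat):
    """All five catchers' per-season values for one statistic."""
    return [row[stat] for row in per_season.values()]


def mean(stat):
    """Average of the five catchers for one statistic."""
    return sum(values(stat)) / len(values(stat))


def spread(stat):
    """Gap between the best and worst catcher on the list."""
    return max(values(stat)) - min(values(stat))


# How many times larger the offensive gap is than the defensive gap.
run_ratio = spread("BR") / spread("FR")          # Palmer's runs
win_share_ratio = spread("OWS") / spread("DWS")  # James' win shares

for stat in ("BR", "FR", "OWS", "DWS"):
    print(f"{stat}: mean = {mean(stat):.2f}, spread = {spread(stat):.1f}")
print(f"offense/defense gap, runs: {run_ratio:.1f}")
print(f"offense/defense gap, win shares: {win_share_ratio:.1f}")
```

On these five catchers the run-based ratio lands a little above three and the win-share ratio a little under six.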

This doesn’t seem right at first glance. Catcher is a defensive position. At this point, I would point out two things: the difference in significance only grows when you go to positions that are traditionally “offensive” positions. More importantly, there seems to be this growing sentiment out there that you look at offense based on lineup order instead of position. This is where that logic seriously breaks down. The idea is that Mike Piazza is a good hitter that happens to play catcher and not a catcher that happens to be a good hitter.

Folks, Mike Piazza may not be a great or even average defensive catcher, but make no mistake, he’s a catcher. Mets fans that watched him brutally play first base can attest to that. The “good hitter that happens to play….” concept implies that we could put Cecil Fielder at shortstop to take advantage of his power. I hope we all see this line of thinking as silly. Instead, we look at the catcher universe and rate players offensively and defensively. If we’re lucky, we can agree on a value system where we can start ranking catchers on a complete basis.

Putting it all together

So, we have to compete with the idea that you cannot evaluate offense and defense and combine them. If this is the case then hundreds of scouts in basketball, hockey, and baseball have been wasting their time. They may not use fancy numbers but they all have to combine it somehow to make a final evaluation. So, to say you cannot include Ausmus’ offensive shortcomings in his evaluation is like saying the same thing in the case of Dennis Rodman. This isn’t to say that you don’t want either on your team, but you can’t ignore the offensive shortcomings either.

Let me show you four different overall evaluation systems from the sabermetrical community and let you decide. Two of them you have already seen. Another belongs to a good buddy of mine and the last is my own. You don’t have to know how they work necessarily except to understand that each uses different methodology and makes different assumptions.

        Batter/Fielder Wins
Mike Piazza            42.0
Ivan Rodriguez         34.7
Jorge Posada           16.0
Brad Ausmus             2.9
Mike Matheny           -7.6

                 Win Shares
Mike Piazza             295	
Ivan Rodriguez          265
Jorge Posada            155
Brad Ausmus             127
Mike Matheny             76

Matt Souders
             Adjusted Value
Mike Piazza          155.05	
Ivan Rodriguez       150.80 
Jorge Posada          95.73
Brad Ausmus           71.49
Mike Matheny          55.24

               Career Index
Mike Piazza             750
Ivan Rodriguez          706
Jorge Posada            378
Brad Ausmus             130
Mike Matheny            -13

So, all four systems agree on the order of the five catchers and come reasonably close to agreeing on the gap between the players even though we have four different rating systems based on different assumptions. So, can all of us be wrong? I suppose that’s possible, but think about that for a second. I don’t mind if you say I’m off my rocker here, but you would be saying that all four of us are wrong. I’m sorry to say this, but the chances of that happening are remote.
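If you would rather check that agreement than take my word for it, here is a quick sketch using the four tables above. The rescaling step (stretching each system so the worst catcher is 0 and the best is 1) is only my way of putting different scales side by side; it is not part of any of the four systems, and the names in the code are mine.

```python
# Career totals copied from the four tables above.
systems = {
    "Batter/Fielder Wins": {"Piazza": 42.0, "Rodriguez": 34.7, "Posada": 16.0,
                            "Ausmus": 2.9, "Matheny": -7.6},
    "Win Shares": {"Piazza": 295, "Rodriguez": 265, "Posada": 155,
                   "Ausmus": 127, "Matheny": 76},
    "Adjusted Value": {"Piazza": 155.05, "Rodriguez": 150.80, "Posada": 95.73,
                       "Ausmus": 71.49, "Matheny": 55.24},
    "Career Index": {"Piazza": 750, "Rodriguez": 706, "Posada": 378,
                     "Ausmus": 130, "Matheny": -13},
}


def ranking(scores):
    """Catcher names ordered from best to worst under one system."""
    return sorted(scores, key=scores.get, reverse=True)


def rescale(scores):
    """Map one system onto a 0-to-1 scale (worst = 0, best = 1)."""
    low, high = min(scores.values()), max(scores.values())
    return {name: (value - low) / (high - low) for name, value in scores.items()}


rankings = {name: ranking(scores) for name, scores in systems.items()}
all_agree = len({tuple(order) for order in rankings.values()}) == 1
print("Same order in every system:", all_agree)

for name, scores in systems.items():
    scaled = rescale(scores)
    row = ", ".join(f"{player} {scaled[player]:.2f}" for player in ranking(scores))
    print(f"{name}: {row}")
```

On the rescaled numbers Ausmus sits in the bottom quarter of the Matheny-to-Piazza range in all four systems.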

From here, the usual insult is thrown forth that “I am saying I know more than Astros management.” In essence, that statement usually is followed with a “these people have been doing this for years while you’re just a boring stat geek that needs to get friends and a girlfriend.” That is immaterial. The assumption that we know more than Astros management is based on the assumption that the Astros think the same way as these individuals. That is not necessarily the best of assumptions to make. There are a number of different factors that go into choosing a player.

Besides, Bill James has been consulting for the Red Sox for three seasons now and they have been to an ALCS and have won a World Series. How far off of his rocker can he possibly be? Truth be told, all we are doing is disagreeing with management on some decisions and ways of thinking. I guess that could be arrogant, but if that is the case then the vast majority of us are arrogant.

Scott Barzilla is the author of “Checks and Imbalances” and “The State of Baseball Management.”