Last Saturday, Shawn Goldman made this BCB post arguing that the numbers suggested that Carlos Silva should be the one going to the bullpen when Carlos Zambrano returned.
Shawn wrote his post before Silva's great outing on Saturday was completed (and before we knew that Tom Gorzelanny would be the one moved to the bullpen). Predictably, many of the comments after such a dominant outing were along the lines of "How could you say that after he pitched so well?"
Unfortunately, that made the discussion degenerate into the "stats vs. observation" (for lack of a better term) argument that we've seen all too often on this site. One of the reasons I asked Shawn to make posts on statistical analysis is that I think some have the perception that I am "against" using advanced metrics. Nothing could be further from the truth. There is a place for using advanced statistical analysis in baseball. There is also, I believe, a place for using scouting, inside knowledge, and yes, hunches. These things should work in tandem, not against each other.
With that in mind I thought I'd ask Shawn to have a dialogue with me about this particular instance (the use of the numbers to say that Silva should be the one put in the pen), and also a more general discussion of how statistics are used and should be used. Follow me after the jump for our exchange. And please -- since one of the points of making this post is to try to make dialogue on this topic less contentious -- try to keep your comments on topic, and no personal attacks. Thanks.
AL: You posted recently about the use of statistical measures to determine who should be sent back to the bullpen when Carlos Zambrano was returned to the rotation. Your chart and the numbers posted indicated that Carlos Silva should be the one, primarily based (at least the way I understood it, and correct me if I'm wrong) on the "in-season ZIPS projection" of his season ERA, which stood at the time at 5.16 (I understand it's changed now based on his fine outing on Saturday).
SHAWN: A couple quick clarifications. My decision-making process was largely based on Silva's 2010 numbers. And that's part of where the debate sometimes breaks down. In a case like Silva's, where he is with a new team and a new pitching coach, and has apparently taken a different approach to batters, there may be good reason to doubt the projections. However, just because there may be a reason for a change in performance level doesn't mean all the improvement in his W-L record and ERA is real. Both of those statistics are heavily dependent on luck and the performance of the players around him. So I also posted the FIP and xFIP numbers, because they do as good a job as any at removing the team and luck contexts from his ERA and W-L stats. Both before Silva's latest gem and after it, one thing is consistent: Silva has been quite lucky in terms of getting good defense, avoiding hits on balls in play, and having good run support. Given his reputation for being a poor influence on the clubhouse in Seattle, I didn't put much stock in his ability to improve the performance level of his teammates. So it was based on those numbers (FIP and xFIP, not the projections) that I was suggesting Silva be the one to get the "bullpen demotion". They were highly suggestive that Silva's W-L record and his low ERA were largely luck-driven. However, after Saturday, Silva's FIP and xFIP dropped dramatically, primarily due to his high strikeout totals and lack of any walks. As a result, his 2010 luck- and team-adjusted numbers are now in line with the other starters even if they're higher than his ERA. This means even if his luck is "even" going forward, he's a decent bet to continue pitching well. And, importantly, even though his projections are still worse than the other starters, I'd keep him in the rotation now that his 2010 performance seems more "real".
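For readers who haven't run into FIP and xFIP before, here is a minimal sketch of the commonly published formulas. It only illustrates why strikeouts and a lack of walks move these numbers so quickly. The stat line, the constant of 3.10, and the league home-run-per-fly-ball rate of 0.10 are illustrative assumptions, not Silva's actual figures; the real constant is recalculated each season so that league FIP lines up with league ERA.

```python
def fip(hr, bb, hbp, k, ip, constant=3.10):
    """Fielding Independent Pitching.

    Uses only outcomes the defense does not touch: home runs, walks,
    hit batters, and strikeouts. `ip` is innings as a plain decimal
    (60 1/3 innings is 60.333..., not the box-score style "60.1").
    `constant` is an assumed value; it varies by season.
    """
    if ip <= 0:
        raise ValueError("ip must be positive")
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


def xfip(fb, bb, hbp, k, ip, lg_hr_per_fb=0.10, constant=3.10):
    """Expected FIP.

    Same as FIP, but actual home runs are replaced by the number a
    league-average pitcher would allow on the same number of fly balls.
    `lg_hr_per_fb` is an assumed league rate.
    """
    expected_hr = fb * lg_hr_per_fb
    return fip(expected_hr, bb, hbp, k, ip, constant)


# Hypothetical pitcher: 6 HR, 12 BB, 2 HBP, 40 K in 60 innings.
before = fip(hr=6, bb=12, hbp=2, k=40, ip=60)

# Add one hypothetical 7-inning start with 11 strikeouts and no walks.
after = fip(hr=6, bb=12, hbp=2, k=51, ip=67)

print(round(before, 2), round(after, 2))
```

Because strikeouts carry a negative weight and walks a positive one, a single start with many strikeouts and no walks pulls the figure down noticeably when the season total is still only a few dozen innings.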
AL: What is this particular projection based on? Why would it have projected Silva to be so bad, after he had posted a third of a season's worth of good-to-excellent outings? Is it possible that the projection system doesn't take into account the possibility that a player might have changed his entire approach and might be very different than his past numbers?
SHAWN: Projection systems don't take these sorts of things into account. But what they are able to do is take the entirety of a player's career, including the current season, combine it in some sort of weighted average, usually with more recent years counting more heavily, and essentially try to answer the following question: "Is Silva's 2010 improvement real, or the product of luck and a small sample size?" The longer he keeps up his success, the more his projections will match his 2010 performance. And I'd argue the projections do a much better job of answering questions such as these than anyone on this blog... and probably a much worse job of answering these questions than Lou Piniella, Larry Rothschild, and the other scouts and coaches in the Cubs (or any other) organization.
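To make the "weighted average" idea concrete, here is a toy sketch. It is not how ZiPS or any real projection system is computed; the season weights, the league-average rate, and the amount of regression are all hypothetical values chosen for illustration. It simply shows why a projection moves toward a pitcher's current-season numbers as the innings pile up.

```python
def project_rate(seasons, weights, league_rate, regression_ip):
    """Toy projection of a runs-per-nine-innings style rate.

    seasons       -- list of (rate, innings) tuples, most recent first
    weights       -- one weight per season, larger for recent seasons
    league_rate   -- assumed league-average rate to regress toward
    regression_ip -- innings of league-average pitching blended in;
                     a larger value means more skepticism of the sample
    """
    if len(seasons) != len(weights):
        raise ValueError("need exactly one weight per season")
    numerator = league_rate * regression_ip
    denominator = regression_ip
    for (rate, innings), weight in zip(seasons, weights):
        numerator += weight * rate * innings
        denominator += weight * innings
    if denominator <= 0:
        raise ValueError("no innings to project from")
    return numerator / denominator


# Hypothetical pitcher: a 5.00 rate over 100 innings last year,
# and a 3.00 rate so far this year.
early = project_rate([(3.0, 100), (5.0, 100)], (2, 1), 4.0, 100)
later = project_rate([(3.0, 200), (5.0, 100)], (2, 1), 4.0, 100)

print(early, later)
```

With the same rates but twice as many current-season innings, the second projection sits closer to the current-season rate, which is the behavior described above.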
AL: It appears to me that when you or other statistical analysts post charts and tables of this nature and draw conclusions from them, you treat them as absolutes (i.e. "Silva is at the bottom of this list, thus he should be the one dropped.") Is this true, or would you consider other factors?
SHAWN: I always consider other factors. But I often don't put much stock in them, so they often end up as a "tiebreaker" in my analysis. This is not because I think these other factors are irrelevant or unimportant... rather, it is because I don't trust my own ability to evaluate them. In this particular case, I thought it was a justifiable move to keep him in the bullpen, but based on the only thing I consider myself an expert in (the numbers), the decision was clear. If the team had something else from their scouts suggesting otherwise, I'd probably defer to their judgment (but would be adamant that Silva, however improved he may be, has still been pretty lucky). That decision, even just using numbers, is a little murkier after Silva's outstanding start on Saturday.
AL: I'm interested in starting a discussion of statistical analysis here, not necessarily as "opposed" to other analysis, because there really shouldn't be "sides" in this sort of debate, but I want people who don't understand advanced metrics and their use to understand them better, and also possibly to get those of you who rely on advanced metrics to think about factors other than the numbers. Your thoughts?
SHAWN: These are the types of discussions that drive me here to BCB, because they present an opportunity to clarify things and defuse some of these contentious debates. I think a large part of the community has misperceptions of these stats, and to be honest that's often due to snarkiness and overreactions from those of us that apply them. But it sometimes also results from people assuming we "statheads" think teams should make decisions strictly off a spreadsheet. Nothing could be further from the truth. The truth is something closer to "based on what we know as fans -- which is largely just the numbers -- the most probable outcome is X and thus the best course of action is Y". I hope discussions such as these can help clarify these issues. We don't reject scouts. We just aren't scouts and don't have access to them... thus, geekery.
AL: You said, "I'd argue the projections do a much better job of answering questions such as these than anyone on this blog... and probably a much worse job of answering these questions than Lou Piniella, Larry Rothschild, and the other scouts and coaches in the Cubs (or any other) organization."
Given the fact that many here have different opinions about Lou or Larry or the other scouts and coaches in the organization (or, as you note, any other organization), what would be the best way for the Cubs -- or any -- organization to utilize statistical analysis alongside scouting or personal observation? If you were put in charge of doing this for an actual team, how would you approach the baseball people with your findings?
SHAWN: OK, let's assume for a moment that I was just hired for a statistical position with the Cubs. The first thing I'd do is ask them the most important question: "What is it that you want to know that you do not know or wish you knew better?" And I think that's the key. Stats can answer a lot of questions, but they work best when they're answering the questions the people in the personnel department want/need answered. If they have a REALLY talented numbers-cruncher, they should even be able to ask that person questions that publicly-available stats can't currently get at, but in theory could... and then have that person go about developing the tools and theory needed to answer the question being asked, or maybe find out how to answer it more accurately/precisely.
As far as the Cubs go, they get a lot of flak from time to time for not utilizing advanced metrics. However, I think they've come a long way in that regard. They're not as forward thinking as I'd like them to be, but I think they have a better grasp of some of the main conclusions of the "sabermetric revolution" than they did at the outset of the 2000s. I don't know if that's Lou's influence, a change in Hendry's perceptions of the game, or some combination of both. And I'm a bigger fan of both Lou and Hendry than I think most are at this point. Perhaps more than anything else, I'm extremely anxious to see who the next GM and manager are for the team. I think someone that has the current Cubs resources in terms of talent and payroll could build a tremendous winning tradition with the right combination of scouting and statistical tools.
AL: You said, "I think a large part of the community has misperceptions of these stats, and to be honest that's often due to snarkiness and overreactions from those of us that apply them."
What, then, would be the best way to correct the misperception and to eliminate the "snarkiness and overreactions", as you put it?
SHAWN: I think doing what we're doing now is an excellent start... asking questions and answering them in an honest, open manner. I don't think either side of these debates is ultimately at fault. We're debating things on the internet, which is a place filled with equal parts snark, overreaction, misunderstanding, and bloated ego.