Not all stats are equal. Some of the more ignorant opponents of the sabermetrics or fancy-stats revolutions tend to characterize advanced stats as something like the obscure numbers Twins broadcaster Wally Holland pulls out in the movie Little Big League: “Lou, by the way, has hit .416 lifetime versus Hanley in the month of September in even years, so that certainly bodes well for this at-bat!”
That’s a stat, sure. But it doesn’t bode well for the at-bat, nor is it useful whatsoever beyond illustrating that variance is cuh-raaaaazy.
Proper understanding of sports statistics and analytics means understanding that there are different categories of stats, and that media members will often mislead you if you’re not paying attention.
So what are these categories? On Day 2 of the 365-day BlogForAYear project, I try to parse it out.
1. Trivia Stats
Think about Little Big League or the kind of weird player facts you see on baseball video boards. Real example from last night at PNC Park: “Travis Snider has gone 8-for-21 (.381) with two home runs in his 10 games played on Mondays this season.” That stat is obviously trivial; you will never see Pirates manager Clint Hurdle explain his lineup card for next Monday’s game by saying, “Well, we trust Snider to put up great numbers on Mondays. The guy never gets bummed out that the weekend is over.”
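To put a rough number on how unremarkable a line like 8-for-21 is, here is a minimal Python sketch. It treats each at-bat as an independent coin flip (a simple binomial model, which is my assumption, not anything the video board claimed), and the .250 "true" average is a made-up figure for illustration, not Snider's actual number.

```python
from math import comb

def prob_at_least(hits: int, at_bats: int, true_avg: float) -> float:
    """Chance of `hits` or more hits in `at_bats` tries for a hitter whose
    real per-at-bat hit probability is `true_avg` (binomial model)."""
    return sum(
        comb(at_bats, k) * true_avg**k * (1 - true_avg) ** (at_bats - k)
        for k in range(hits, at_bats + 1)
    )

# ASSUMPTION: .250 is a hypothetical true average chosen for illustration.
p = prob_at_least(8, 21, 0.250)
print(f"P(8+ hits in 21 AB for a .250 hitter) = {p:.3f}")
```

Under those assumptions the chance comes out to roughly one in eight. With seven days of the week and a full roster to pick from, somebody is always going to have a "great on Mondays" line.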
When it’s useful:
It is fine to toss out notes of trivia, especially during television and radio broadcasts of games. Sports are entertainment. They are many other things, but they are entertainment, and finding little nuggets within the numbers adds to the fun of it. Jayson Stark traffics in the strange-but-awesome stats that pop up in baseball, and it’s a very fun way to look at the game.
When it’s harmful:
TV broadcasts very often present trivia stats as if they were evaluative or trend indicators. For example, you might hear during an NHL game this season: “Evgeni Nabokov has great career numbers against Columbus: 20-5-3, .932 save percentage, 1.79 goals against average, better than he has against any other team. He really seems to have the Blue Jackets’ number, doesn’t he?”
(Note: Fake example in that I’ve never heard this said, but the numbers are real.)
The issue here is not one of small sample size per se. That’s 29 games of NHL action to contend with, and lord knows we draw judgments on goalies around Christmas when they are about 29 starts into their season.
Instead, consider the context of the sample: most of these games come from (A) when Nabokov was a better goalie and a Vezina candidate, (B) when the Columbus Blue Jackets were largely locked in the Western Conference basement with no sunlight and everyone put up good numbers against them, and most importantly (C) when the Blue Jackets players and Nabokov’s teammates were completely different individuals than we see today.
These are the trouble spots: the stats that sound like they are indicative of what we will see in tonight’s Lightning-Jackets game, but are really just frivolous or nothing more than “a neat little fact.” Now, I’m not opposed to frivolity; I have more than 52,000 tweets. But fans, and especially sports gamblers, must be wary of broadcasters presenting trivia that could be interpreted as a more substantive stat.
2. Story Stats
These are the box score stats. They show up in the newspaper or the online game recap to tell you how the game was won.
“[Geno] Smith, responsible for 11 turnovers over the first four games, played mistake-free and threw three touchdown passes while completing 16-of-20 passes for 199 yards in the first road victory of his young career.” — Jets-Falcons recap from Monday, Oct. 7
When it’s useful:
There is absolutely a story in that game recap. Geno Smith put up poor stats in the previous games but played better to lead the Jets to victory. Beautiful! Perfect for a game recap. As long as you realize the stats represent “this is how Geno Smith led the Jets to a win” and not “this is why Geno Smith is a good quarterback who is turning things around,” you’re doing it right.
Scoring two goals in a game, going 8-for-13 in a series with 6 RBI, averaging 28 points per game during this postseason… all examples of stats that tell the story of a player having success and being a part of his team’s wins. The numbers construct their own little narrative, and that’s useful.
How did the Lakers win last night? “Oh, Nick Young just went off. 41 points, 14-of-23 from the field, 6-for-11 from beyond the arc. He was insane!” Cool, got it.
Robots aren’t taking your job, sports recap writers, but they’ll try. Robots never sleep.
When it’s harmful:
It only took me until Day 2 of 365 for me to use the xkcd comic.
The problem comes in the post-game shows and the newspaper columns — TV analysts and writers take a one-game performance or stat line and use it to judge a player.
Worst of these are the narratives of “clutch,” and these seem to pop up in every sport. Make a couple late mid-range buckets? Clutch shooter! A pair of game-winning singles? Clutch hitter! Lead a few 4th-quarter comebacks? Clutch quarterback! We as a nation had an honest-to-God national conversation about Tim Tebow because of the flimsy narrative device of “clutch.” Derek Jeter’s brand is built on being “Captain Clutch.” It’s why he has this list of gorgeous ladies notched into his bedpost and you do not.
For years, the line in baseball was that there is no such thing as clutch. Nate Silver wrote in 2008 that “clutch hitting ability exists,” but admitted the data proving it may be better defined as “smart situational hitting” than some sort of mental strength. I haven’t looked too far into the arguments in other sports, but there’s a reason NBA savant Zach Lowe writes about “clutch” in quotation marks.
Yet there is a reason that “clutch” and other story-stats-as-narrative-tools propagate.
“There is a strain of journalism as hero worship, a strain that asks us to believe that sports are tests of character, that those who come through at key moments of the game have reached down deep inside themselves and found the strength and courage to succeed. I don’t want to get into that.” — Bill James, The Hardball Times Annual 2008
The upshot of James’ look at whether a clutch hitter exists or not? “We don’t know.” You should use the same kind of skepticism when a media member presents a story stat as a referendum on a player’s ability in crunch time.
3. Evaluative Stats
When they’re analyzed the right way, advanced metrics can be proper evaluations of a player’s skill level. In the absence of a scouting report, these numbers can indicate that a player is great, above-average, average, below-average or poor. This is analytics.
When it’s useful:
I have this photo from a Nate Silver lecture saved in my phone. It comes from his must-read book The Signal and the Noise.
Break it down. Advanced metrics are strong evaluation tools when they have quantity. The concept of “puck luck” in hockey stems from the idea that a player scoring a goal (or being denied one) is determined largely by unexpected bounces and turns of the puck. It’s not all about skill.
The effects of puck luck can be smoothed out with a large enough sample. Take Jarome Iginla’s stats from this five-year sample.
Year     Goals/Game   Points/Game
05-06    .43          .82
06-07    .56          1.34
07-08    .61          1.20
08-09    .43          1.09
09-10    .43          .92
Iginla’s true talent in that five-year period is not .82 points per game and it’s not .61 goals per game. But when you pull it all together, you have a player you can expect to score about 1.05 points and .48 goals per game. And wouldn’t you know it, in the 2010-11 season, Iginla averaged 1.05 points and .52 goals per game. Take a large sample and your data almost always becomes more reliable.
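If you want to check the "pull it all together" step yourself, here is a minimal Python sketch using the five seasons listed above. One caveat: I'm assuming the 1.05 and .48 figures were weighted by games played, and games played aren't listed here, so this sketch takes a plain unweighted mean of the per-game rates. The names `IGINLA` and `pooled_rates` are mine, purely for illustration.

```python
from statistics import mean

# Per-game rates from the five seasons above: season -> (goals/game, points/game)
IGINLA = {
    "05-06": (0.43, 0.82),
    "06-07": (0.56, 1.34),
    "07-08": (0.61, 1.20),
    "08-09": (0.43, 1.09),
    "09-10": (0.43, 0.92),
}

def pooled_rates(seasons):
    """Unweighted mean of each per-game rate across all seasons.

    ASSUMPTION: every season counts equally. A games-weighted average
    would need games played per season, which is not given here.
    """
    goals = mean(g for g, _ in seasons.values())
    points = mean(p for _, p in seasons.values())
    return goals, points

goals_pg, points_pg = pooled_rates(IGINLA)
print(f"goals/game ~ {goals_pg:.2f}, points/game ~ {points_pg:.2f}")
```

The unweighted version lands at about .49 goals and 1.07 points per game, a hair off the figures in the text but telling the same story: no single season is the "real" Iginla, and the pooled number sits in the middle of them.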
For quality and variety, you want to make sure the player’s stats are being put up:
- against both good and bad opponents (strength of schedule metrics are quite common these days)
- in offensive-friendly and defensive-friendly venues (this mostly applies to baseball and football)
- with different groups of teammates if possible (especially in basketball, hockey and soccer, where the ball and puck flow through many players).
Helpfully, you don’t need a degree in applied mathematics to synthesize all these factors. Guys and gals who do possess such degrees have dumped the numbers into a science machine to spit out a wonderful invention: projections!
I include projections in the evaluative stats category because they are based entirely on the evaluative stats and factors mentioned above. Biff the Sabermetrician doesn’t have a Grays Sports Almanac; all he has is a database of what has happened in the past and some algorithms.
The NFL has KUBIAK projections. MLB has PECOTA and ZiPS and a bunch of others. The NHL has VUKOTA. The NBA has SCHOENE, the folks on Twitter tell me. They aren’t just for forecasts; use these projections as part of your evaluation of a player.
When it’s harmful:
Never! Advanced metrics are the best!
Well, mostly, players don’t want to hear about it. They don’t really care about their WAR or their Corsi or their DVOA, and in most cases they don’t need to. The players themselves are inadvertent data collectors. Yasiel Puig’s job is to hit the ball hard, not to worry about his BABIP. But his general manager should care very much about BABIP and all the other metrics when considering the value of a contract extension.
If you’re a baseball fan, you don’t need to understand or even subscribe to sabermetrics. You can totally enjoy the game without it, and people have been doing so for a century. It’s fine! But fans need to understand that general managers and baseball operations staff do subscribe and use advanced metrics to make decisions. If you want to criticize their moves, start reading up on the evaluative stats or I will chastise you on Twitter. And I’m very good at it. That shirt looks stupid on you.
4. Trends
This last group of stats doesn’t fit too neatly into any of the other three categories. Anyone who has ever played pickup basketball knows the feeling of being “in the zone” like you can’t miss, or on the other side, feeling totally out of sorts. Therefore: trends!
When it’s useful:
A goalie maintaining a 140-minute shutout streak is kind of trivia and kind of a story, but it also indicates that he could be in a groove of goaltending, however much stock you want to put in the streak continuing.
An opposite baseball example: third baseman Pedro Alvarez has committed 23 throwing errors this season (or one throwing error every four games), and no sane person watching his throws would regress those numbers or draw on a larger sample size and expect those error numbers to go down. He simply looks like a player who can’t make a throw from third base.
Just as we recognize slumps, we can see when a player looks better than he usually does. We now theorize that the “hot hand” in basketball really does exist, per a study by three Harvard graduates. When the smarties controlled for the increasingly difficult shots taken by the “hot hand” player (you can read why in the study), a hot shooter sees “from 1.2 to 2.4 percentage points in increased likelihood of making a shot.”
It’s not much, but it’s not nothing.
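For a sense of scale, here is a small Python sketch of what that boost means in made shots. The 45% baseline and the 20-shot night are hypothetical numbers I picked for illustration; only the 1.2 and 2.4 percentage-point figures come from the study quote.

```python
# ASSUMPTION: a 45% baseline make rate is hypothetical, for illustration only.
BASELINE = 0.45
# The 1.2 and 2.4 percentage-point boosts quoted from the study.
BOOST_LOW, BOOST_HIGH = 0.012, 0.024

def extra_makes(shots: int, boost: float) -> float:
    """Expected additional made shots over `shots` attempts from a flat boost."""
    return shots * boost

for boost in (BOOST_LOW, BOOST_HIGH):
    hot = BASELINE + boost
    print(f"hot make rate {hot:.1%}, "
          f"extra makes per 20 shots: {extra_makes(20, boost):.2f}")
```

For that assumed 45% shooter, "hot" means somewhere between 46.2% and 47.4%, or about a quarter to a half of one extra make over 20 shots.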
When it’s harmful:
During my first draft of this post, I included only three kinds of player stats but eventually felt trends were just barely worthy enough to get their own category.
However, we must be careful not to overrate the effects of a hot hand or a hot bat. The Thunder wouldn’t give the last shot to Jeremy Lamb over Kevin Durant just because Lamb made his three previous shots. A hot hand is not an unstoppable hand.
That fact doesn’t stop writers and broadcasters from using too many small-sample-size stats to draw large conclusions. Always be on the lookout for numbers that have arbitrary endpoints like “in the last 63 games” or “since May 5.” Chances are the media member is cutting off at the perfect spot on a game log in order to make his or her point. That’s not a trend; it’s cherry-picking.
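Here is a rough Python sketch of how easy the endpoint trick is. It simulates a player with no trend at all (a fixed 30% chance of scoring in any game, which is my invented setup, as are the 82-game log and the 10-game minimum window), then hunts for the most flattering stretch in the game log. The function name `best_window` is hypothetical.

```python
import random

def best_window(game_log, min_len=10):
    """Return (start, end, average) for the contiguous stretch of at least
    `min_len` games with the highest average. Assumes a non-empty log."""
    # Prefix sums let us get any window's total in constant time.
    prefix = [0]
    for x in game_log:
        prefix.append(prefix[-1] + x)
    # Start from the full log so the result is never worse than the season line.
    best = (0, len(game_log), prefix[-1] / len(game_log))
    for start in range(len(game_log)):
        for end in range(start + min_len, len(game_log) + 1):
            avg = (prefix[end] - prefix[start]) / (end - start)
            if avg > best[2]:
                best = (start, end, avg)
    return best

# ASSUMPTION: simulated player who scores in each game with a flat 30% chance,
# so any "hot stretch" found below is pure noise.
rng = random.Random(2)
log = [1 if rng.random() < 0.30 else 0 for _ in range(82)]

start, end, avg = best_window(log)
print(f"full season: {sum(log) / len(log):.3f}")
print(f"best stretch: games {start + 1}-{end}, {avg:.3f}")
```

Because the search gets to choose both endpoints, the best stretch can never look worse than the full-season rate, and in a random log it will usually look quite a bit better. That is exactly the "since May 5" move.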
A final note on trends: they are usually not as good a signal of future performance as projections are. Mitchel Lichtman studied the reliability of season stats compared to projections, and found that using projections can fight our recency bias. “Until we get into the last month or two of the season, season-to-date stats provide virtually no useful information once we have a credible projection for a player.”
Billy Butler’s having a rough year? He’ll probably come back from it. Nelson Cruz is hitting over his head? He’ll probably come back down to earth. You don’t know much about advanced metrics? Keep reading my blog, I’ll try to help.