One particular finding of a new analysis of baseball fielding prowess will make Red Sox fans gleeful: Yankees captain Derek Jeter, a three-time Gold Glove winner, ranked dead last among major league shortstops.
That gem emerged from an exhaustive dissection of every play during the 2002-05 seasons - almost half a million plays in all.
The statistician behind the work admits he's a "huge Red Sox fan" - he flew up from Philadelphia for the victory parade after the Sox won the 2004 World Series. But, Shane Jensen insists, "I haven't built that into my modeling."
In fact, Jensen's scientific credentials are solid: He earned his PhD from Harvard (he was pulled into the Red Sox Nation vortex while writing his thesis on Statistical Techniques for Examining Gene Regulation). He's now an assistant professor of statistics at the University of Pennsylvania's Wharton School.
The fielding performance model he and his Wharton colleagues developed as a hobby is rigorous enough that it was featured yesterday during the annual meeting in Boston of the world's largest general interest scientific organization, the American Association for the Advancement of Science.
And to show that the results aren't tilted in favor of his favorite team, Jensen notes that the analysis found that Boston's beloved Mike Lowell ranked as the fifth-worst third baseman, contrary to his reputation for fielding proficiency.
"I'm very surprised he would come out so low by our measure," Jensen said in an interview last week. "I watch a lot of Red Sox games and he seems to do relatively well at his position."
Fielding ability is notoriously difficult to quantify. Errors are subjective and, along with assists and putouts, fail to adequately capture differences in players' range and other nuances of the game. So statisticians have tried to devise measures to overcome these shortcomings.
Jensen's method calculates the probability that the average fielder would have caught a fly ball or a line drive, or gloved a grounder, hit to any spot on the field. The analysis results in an overall rating for each player, expressed as the total number of runs that the player saved or cost his team over the course of the entire season, compared with the average player at his position.
In Jeter's case, his subpar fielding cost the Yankees almost 14 runs a year between 2002 and 2005 - the most runs any player at any position cost his team. In contrast, Alex Rodriguez, now Jeter's teammate, was the second-ranked shortstop when he played for Texas in 2002 and 2003, saving his team more than 10 runs a season. But the Yankees shifted A-Rod to third when they acquired him before the 2004 season.
"The Yankees appear to have one of the best shortstops in the major leagues playing out of position in deference to the worst shortstop," Jensen said.
Jeter also was rated the worst shortstop last season using a fielding analysis developed by Baseball Musings blogger David Pinto of Longmeadow, former chief researcher for ESPN's "Baseball Tonight." "Jeter is always at the bottom," Pinto said yesterday at the scientists' meeting, where he discussed his Probabilistic Model of Range. (Pinto's model also shows Coco Crisp was the best center fielder in 2007, Curt Schilling the worst fielding pitcher.)
Jensen's analysis began with raw data obtained from Baseball Info Solutions on the location, speed, and type of every batted ball, and whether it resulted in an out. The company laboriously records information from video of every game and publishes its own player scores in The Fielding Bible.
From the data, Jensen plotted a curve for each fielder, showing, for every spot on the field, the probability that he would make the play on balls hit there. He then mathematically compared each player's curve to the "aggregate" curve for his position - a measure of the overall performance of all fielders. The size of the gap between the curves showed how much better or worse a fielder was than the average player.
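The comparison described above can be illustrated with a rough sketch. It is not Jensen's model: in place of his fitted curves it uses raw success rates per location, and the record layout, the location labels "A" and "B", and all of the data are invented for illustration.

```python
from collections import defaultdict

def success_rates(plays):
    """Return, for each location, the fraction of balls hit there that became outs.

    `plays` is an iterable of (location, made_out) pairs. This record layout
    is a hypothetical stand-in for the play-by-play data.
    """
    attempts = defaultdict(int)
    outs = defaultdict(int)
    for location, made_out in plays:
        attempts[location] += 1
        if made_out:
            outs[location] += 1
    return {loc: outs[loc] / attempts[loc] for loc in attempts}

def gaps(player_plays, all_plays):
    """Player's success rate minus the position-wide rate, per location."""
    player = success_rates(player_plays)
    average = success_rates(all_plays)
    return {loc: player[loc] - average[loc] for loc in player if loc in average}

# Invented example: two locations, eight plays across all fielders at a position.
all_plays = [("A", True), ("A", True), ("A", False), ("A", False),
             ("B", True), ("B", False), ("B", False), ("B", False)]
# The same fielder's four plays, taken from the pool above.
player_plays = [("A", True), ("A", True), ("B", False), ("B", False)]

player_gaps = gaps(player_plays, all_plays)
```

A positive gap means the fielder converts more balls at that spot than the position as a whole, and a negative gap means fewer. Raw rates like these are noisy where few balls are hit, which is one reason a smoothed curve is the more natural choice.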
Finally, Jensen weighted the data to account for how often balls were hit to various locations on the field, and to adjust for the "run consequence" of balls hit to different locations. For example, a ball hit down the line that gets by the first baseman likely will result in a double or triple, while a ball that sneaks by on his right likely will lead to a single.
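One way to picture that weighting step is as a sum over locations of three factors: how many balls were hit there, how far the fielder's success probability sits above or below average, and how many runs a missed play there tends to cost. The sketch below is an assumption about the arithmetic, not the published formula, and every number and name in it is made up.

```python
def runs_saved(cells):
    """Sum the weighted gap between a fielder and the average over all locations.

    Each entry in `cells` is a hypothetical tuple:
        (balls_hit, player_prob, average_prob, run_value)
    where run_value is the assumed run cost of a ball that gets through.
    """
    total = 0.0
    for balls_hit, player_prob, average_prob, run_value in cells:
        # Plays made beyond what an average fielder would have made here.
        extra_plays = balls_hit * (player_prob - average_prob)
        # Convert extra plays into runs using the location's run consequence.
        total += extra_plays * run_value
    return total

# Invented numbers for two locations.
cells = [
    (40, 0.50, 0.25, 0.5),   # busy spot, fielder above average, low run cost
    (8, 0.25, 0.50, 0.75),   # rare spot, fielder below average, high run cost
]

season_total = runs_saved(cells)
```

In this toy case the fielder gains 5 runs at the busy spot and gives back 1.5 at the rare one, so the total is positive even though he is below average where misses are most costly.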
Jensen said his "SAFE" method slices up the field more finely - into a grid of 4-by-4-foot squares - than other fielding analyses, making it easier to discern smaller differences among players. But it has limitations: It assumes that players at each position start each play at the same spot, ignoring defensive shifts for batters like David Ortiz and the fact that some players or their managers make wiser decisions than others about positioning. The system also does not take into account throwing ability or differences among ballparks.
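The grid itself is simple to sketch. Assuming each batted ball comes with a landing point measured in feet from some fixed origin such as home plate (an assumption; the article does not describe the coordinate system), assigning it to a 4-by-4-foot square is a matter of dividing and rounding down. The function name here is hypothetical.

```python
import math

CELL_FEET = 4  # side length of each grid square, per the article

def grid_cell(x_feet, y_feet, cell_feet=CELL_FEET):
    """Return the (column, row) index of the square containing a landing point.

    Coordinates are assumed to be in feet from a fixed origin; math.floor
    keeps the squares evenly sized on both sides of the origin.
    """
    return (math.floor(x_feet / cell_feet), math.floor(y_feet / cell_feet))

first = grid_cell(10.0, 3.9)    # a point 10 feet over and 3.9 feet up
second = grid_cell(-1.0, 8.0)   # a point just left of the origin
```

Smaller squares can separate fielders whose ranges differ by only a step or two, but each square then holds fewer batted balls, so any estimate built on them needs more data or more smoothing.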
Jensen said he is starting to adjust his data for field differences (something Pinto's model already does), but when he factored the Green Monster into the rating for Manny Ramírez - the second-worst left fielder - "he still comes out as terrible."
Findings for some other current and former Red Sox players were mild surprises: Even when his notoriously weak arm was not considered, Johnny Damon turned out to have been the 10th-worst center fielder; and Kevin Millar was just slightly below average as a first baseman, better than his reputation.
For some players, however, Jensen's results were exactly what you would expect. Doug Mientkiewicz was the second-best first baseman, Mo Vaughn the second-worst. And Wily Mo Peña took honors as the least competent right fielder.
"No shock there," said Jensen.
Gideon Gil can be reached at email@example.com.