Category Archives: analysis

Mr. Consistency

Who are the most consistent scorers in the NBA? This is a question of some interest for those who participate in fantasy leagues, as consistency might be a virtue in determining the value of a player on your roster. For various reasons, a player might be worth more to you if they score 20 points every game, rather than alternate between 10 and 30 every other game. Further, some measure of consistency may highlight a player’s ability to impose their will on a game: a player able to get his scoring in, regardless of the opposition, could be said to be more of a game-defining player.

I’ve managed to estimate, for players since the 86-87 season, each individual’s mean points per 48 minutes, as well as the standard deviation of said statistic, and thus the coefficient of variation (sd/mean) and 95% confidence interval. Here’s a spreadsheet of the top (634) players in the league, by mean pts/48, sorted by coefficient of variation. Thus, the players at top could be said, in some way, to be more consistent scorers than those at the bottom.
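For anyone who wants to reproduce the consistency measure, here is a minimal sketch of the coefficient of variation. The two game logs are made-up numbers chosen to mirror the 20-every-night versus 10-and-30 example above, and the function name is only for illustration; it also assumes the sample (n - 1) standard deviation, which may not be the exact variant used for the spreadsheet.

```python
import statistics


def coefficient_of_variation(values):
    """Return sd / mean for a sequence of per-game pts/48 values."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1)
    return sd / mean


# Hypothetical game logs (points per 48 minutes), not real player data.
steady = [20.0, 21.0, 19.0, 20.0]
streaky = [10.0, 30.0, 10.0, 30.0]

cv_steady = coefficient_of_variation(steady)
cv_streaky = coefficient_of_variation(streaky)

# Both players average 20 pts/48, but the steady one has the lower CV.
print(round(cv_steady, 3), round(cv_streaky, 3))
```

A lower value means the player's scoring varies less relative to his own average, which is why the spreadsheet is sorted on it.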

Most consistent scorers, 1986-2008

Below is another way to view the same question. Using each player’s mean and standard deviation of pts/48, along with the sample size, we can construct a 95% confidence interval for our estimate of their true mean. In the graphic linked below, each player is ranked by their mean pts/48, and the x-axis indicates how they fare under this measure of scoring. Each mean is surrounded by a line indicating the 95% confidence interval. This means, essentially, that we can be 95% sure that the player’s true mean lies within the span of their colored line. For players with smaller samples or greater variance, the error bars will be wider.
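The interval itself can be sketched as follows. This assumes the usual normal approximation (mean plus or minus 1.96 standard errors); the exact method behind the graphic is not stated, so treat the multiplier as an assumption. The function name and the game logs are hypothetical.

```python
import math
import statistics


def mean_confidence_interval(values, z=1.96):
    """Return (mean, lower, upper); bounds are None with fewer than two games."""
    mean = statistics.mean(values)
    if len(values) < 2:
        # One observation gives no estimate of spread, hence no error bar.
        return mean, None, None
    half_width = z * statistics.stdev(values) / math.sqrt(len(values))
    return mean, mean - half_width, mean + half_width


# Hypothetical pts/48 logs, not real player data.
few_games = [5.0, 40.0, 12.0]
many_games = [18.0, 22.0, 20.0, 19.0, 21.0, 20.0, 23.0, 17.0]
single_game = [24.0]

for label, games in [("few", few_games), ("many", many_games), ("single", single_game)]:
    print(label, mean_confidence_interval(games))
```

With only three erratic games the lower bound drops below zero, while the eight-game sample gives a much tighter bar around a similar mean.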

NBA Pts/48 min means with error bars

As you can see, some players have no error bars at all–this means that they only have one observation. Others’ error bars go down past zero. This means that we can be 95% sure that their mean pts/48 is in a range that includes zero, which doesn’t tell us very much. Anyway, here is the same graphic, for the 2007-08 season only:

Note that Carl Landry (#73) has a greater variance than most players around him, but he ranks as a better per-48 scorer than Shaquille O’Neal.

Finally, here’s a regular-season 2007-08 graphic for players’ MEV (or model-estimated value, using regression-derived weights like those seen here). Landry does even better here (18th), in terms of his mean, but his confidence interval is very large. This estimate suggests, though, that at worst, he’s about as good as Odom, Andre Miller, and Kirilenko, while at best, he is in rarefied air. Keep in mind that this is still just a 95% confidence interval, so statistically, there’s still a 1 in 20 chance the true mean isn’t even in this interval. All should be taken with a grain of salt. One of the things I like most about this presentation is that it’s a per-minute stat, which controls for playing time (although not pace), but still reminds us that estimates for those players with little playing time should be taken with large grains of salt, and might not really mean much of anything. Josh McRoberts, for example, is probably not the 406th, much less the 6th, most valuable player in the NBA, even though his simple arithmetic mean indicates as much–his confidence interval reminds us of this, while maintaining the simple ordering.

I suppose this is also the public debut of any sort of official MEV ordering for 2007-08. I’d be interested to hear what people thought about this… this is something similar to Berri’s estimates, but I think the weightings are a little more appropriate. Let me know in the comments if they seem, at least, per-minute, to be reasonable estimates and orderings of player value.

Just how bad was the free throw discrepancy in Game 2?

An inquisitive reader asked me to examine just how bad was the home/away free throw attempt discrepancy in game 2. Using regular season data from 1986-2008, I calculated home fta – away fta, finding a maximum of 41 (on 02/06/93 DEN v DAL) and a minimum of -41 (on 11/19/03 NYK v LAL). The mean discrepancy in favor of the home team is 1.364, and the standard deviation is 10.254. Below is a plot of the empirical density, with Game 2 indicated with a red line:

Game 2’s difference, 38-10=28, is 2.598 standard deviations above the mean. If my counting is correct, this puts the game at the 99.4th percentile of games in terms of pro-home free throw attempt discrepancy. What I cannot tell you, unfortunately, is whether it was the officiating or the play that led to the difference.
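The arithmetic behind those two numbers can be sketched like this. The mean and standard deviation are the ones quoted above; the helper names are made up for illustration, and the small sample passed to the percentile function is hypothetical, standing in for the full 1986-2008 list of game-level discrepancies.

```python
def z_score(value, mean, sd):
    """Number of standard deviations that value sits above the mean."""
    return (value - mean) / sd


def empirical_percentile(value, sample):
    """Share of observations in sample that are less than or equal to value."""
    return sum(1 for x in sample if x <= value) / len(sample)


game2_diff = 38 - 10  # home FTA minus away FTA
z = z_score(game2_diff, mean=1.364, sd=10.254)
print(round(z, 3))

# Hypothetical discrepancies; the real calculation would use every game.
toy_sample = [-5, 0, 3, 10, 30]
print(empirical_percentile(game2_diff, toy_sample))
```

Counting games directly, rather than reading the percentile off a normal curve, avoids assuming the discrepancies are normally distributed.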

Credit where credit is due

Last night’s game was awesome–I admit a pro-Celtics bias on account of a pro-Kevin Garnett bias, but also an anti-Kobe Bryant bias. When claiming objectivity, it’s always a good thing to clear potential biases up front, to let the reader know who they’re dealing with… That said, Garnett had a pretty good game last night, and Bryant had a pretty bad game… Garnett would have been even better without that cold streak–the only player who missed more shots was Kobe, who missed 17! Using a huge, awesome linear regression whose results have not yet been made public (although the coefficients are very similar to those seen here), I derive the following from last night’s box scores:

Player min pts MEV PVC PtC Credit
Kevin Garnett 40.65 24 22.33 0.250 25.94 0.279
Paul Pierce 31.07 22 18.49 0.207 21.48 0.231
Pau Gasol 41.47 15 19.39 0.246 20.52 0.221
Derek Fisher 40.82 15 18.98 0.241 20.09 0.216
Ray Allen 43.95 19 16.85 0.189 19.57 0.210
Rajon Rondo 35.03 15 14.92 0.167 17.34 0.186
Lamar Odom 39.02 14 12.90 0.163 13.65 0.147
Kobe Bryant 41.87 24 11.06 0.140 11.71 0.126
Vladimir Radmanovic 17.05 5 9.86 0.125 10.43 0.112
Leon Powe 9.32 4 6.85 0.077 7.96 0.086
Sam Cassell 12.97 8 5.40 0.061 6.28 0.067
P.J. Brown 21.20 2 5.08 0.057 5.90 0.063
Sasha Vujacic 26.52 8 4.17 0.053 4.41 0.047
Ronny Turiaf 12.38 5 1.74 0.022 1.84 0.020
Jordan Farmar 7.18 2 0.86 0.011 0.91 0.010
Kendrick Perkins 23.02 1 0.63 0.007 0.73 0.008
Luke Walton 13.70 0 -0.05 -0.001 -0.06 -0.001
James Posey 22.80 3 -1.39 -0.016 -1.61 -0.017
Totals 480 186 168.05 2.000 187.08 2.012

MEV is the term for model-estimated value or point difference created, using only the regression weights. PVC is percent of valuable contributions, which is each player’s part of total team MEV. PtC is points created, which scales MEV values according to actual team and opponent scoring, to roughly account for those factors unmeasured by the box score, and Credit is, essentially, the amount of a win each player should be credited for. MEV and PtC are intended to account for both offensive and defensive contributions, that is, the player’s contribution to his own team’s scoring, and his defense preventing his opponent’s scoring. Boston’s total team Credit was 1.114, and LA’s was 0.898, based on the number of points each scored. It appears as though Boston was able to do to Kobe what they did in the regular season. Note that Posey, despite a timely three, actually hurt his team some: his two turnovers and three personal fouls effectively cancelled out his two steals, while his two defensive rebounds and three points could not compensate sufficiently for four missed shots. However, this is only based on box scores… he may have had tremendous unmeasured defense which I cannot capture, since his plus/minus was +3. It’s interesting to compare my metrics with plus/minus figures: Kobe was -13 for the game…
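For readers who want to check the PVC column, here is a minimal sketch that recomputes it for Boston from the MEV column in the table. It assumes PVC is simply a player's MEV divided by the sum of his team's MEV, which is how the paragraph above describes it; the function name `pvc` is invented for illustration. PtC and Credit are left out because their exact scaling is not spelled out here.

```python
# MEV values for Boston, copied from the table above.
boston_mev = {
    "Kevin Garnett": 22.33,
    "Paul Pierce": 18.49,
    "Ray Allen": 16.85,
    "Rajon Rondo": 14.92,
    "Leon Powe": 6.85,
    "Sam Cassell": 5.40,
    "P.J. Brown": 5.08,
    "Kendrick Perkins": 0.63,
    "James Posey": -1.39,
}


def pvc(team_mev):
    """Each player's share of the team's total MEV (assumed definition)."""
    total = sum(team_mev.values())
    return {player: value / total for player, value in team_mev.items()}


boston_pvc = pvc(boston_mev)
for player, share in boston_pvc.items():
    print(f"{player:18s} {share:6.3f}")
```

Each team's shares sum to 1, which is why the Totals row shows 2.000 for the two teams combined. A negative MEV, like Posey's, produces a negative share.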

Carrying the burden

There has been some discussion lately as to whether the Lakers are better when Bryant scores a lot versus when he facilitates others’ scoring. I thought I’d look at the game-by-game data to investigate: The correlation between Bryant’s percentage of team total field goals attempted and point differential is -0.539; the correlation between Bryant’s percentage of team total assists and point differential is -0.609. This is inconclusive, but it indicates that when Bryant does a lot of the scoring, or a lot of the passing (i.e. the team relies on him increasingly exclusively), the team does poorly. This is likely because Bryant’s statistical load-carrying results from his teammates playing poorly, and when they do so, the team is more likely to lose. One other interesting finding is that the correlation between Bryant’s assists/field goal attempts and point differential is 0.312–implying that as Bryant’s game shifts more toward facilitating (ignoring his statistics relative to team totals), his team does better. For comparison, the same correlation for Derek Fisher is -0.032 (essentially zero), Gasol is 0.123, Odom is 0.198, Kevin Garnett is 0.302, Paul Pierce is 0.023, Ray Allen is 0.152, and Rajon Rondo is 0.127. To the extent that anything can be gleaned from such simple correlations, we might take away that Bryant’s facilitating is very important to Laker success.
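The correlations above are plain Pearson correlations across games. A minimal sketch, using five made-up games rather than the real game logs (the variable names and values are assumptions for illustration only):

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical games: one player's share of team FGA, and the team's
# final point differential in the same game.
fga_share = [0.20, 0.25, 0.30, 0.35, 0.40]
point_diff = [12, 8, -1, 3, -9]

r = pearson_r(fga_share, point_diff)
print(round(r, 3))
```

A negative value here would only say that heavy-usage games and losing margins tend to occur together; it says nothing about which one causes the other.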

I thought I would also take a look at how different players’ contributions affected team outcomes. To do this I used PVC (percent of valuable contributions) for each game for six different players, and plotted that against team scoring differential (the colors of the dots are what type of game they played):

Gasol gives us a fairly small sample size, but it would appear that he mostly contributes about 15% of his team’s valuable contributions, and their success changes little when he does more.

Odom appears to have a “sweet spot” in the middle of his PVC range–if he has a poor game or plays low minutes, or if he has to carry the burden, the team falters.

It appears that as Bryant carries an increasing amount of the load for the Lakers, they do poorly. This might mean that if Bryant makes it all about himself, his teammates play badly, or it might mean that in games in which the Lakers are faring poorly, Bryant attempts to take over–the causal arrow in this, and all of the other graphs, is very cloudy.

Allen’s PVC appears to have little to do with team success.

The trend is somewhat ambiguous, but it appears that Pierce has bigger games when the Celtics play well.

Like Bryant’s, this is another fairly obvious downward trend. My interpretation is that the high PVC games are those in which Garnett’s teammates fail to show up, and thus he carries the load. With little or no help, he cannot win the game on his own.

Let me know if these graphs hold any more insight for you, or if I’m reading them wrong, or if they mesh with your subjective notions. If I can get the data, I might look at Michael Jordan’s numbers, to see whether or not he really could take over games and lead his team to victory.

Edit: Here’s Jordan’s Chicago years, regular season post-1986-87 (my data only goes back that far). He shows a similar pattern, although his PVC goes substantially higher–he had some big games. The sparseness of the data on the high end makes it tough to make firm conclusions. Note also that one problem with interpreting all of these is that in huge blowouts (either for or against), the starters are often taken out early, leading to a diminished PVC. This could be fairly heavily influencing the trends here.

Double edit: You might be interested in seeing the results of the first game, in terms of individual contributions, which I’ve tabulated here.

Triple edit: Since it’s apparently been lost on one of our commenters, I should mention that the fit lines are loess smoothers, and were not, in fact, drawn in MS Paint.

Hardaway : Mourning : Brown :: Wade : O’Neal : Haslem

Seriously. The first trio, in 1996-97, had the winningest regular season in Heat history. The second trio, in 2004-05, had the second winningest. Each threesome was responsible for exactly 31.2 wins, and each member of the two trios produced in approximately the same proportion to the other two, while their playing styles across the analogous pairings were remarkably similar: Wade and Hardaway? Both perimeter-producing scorers. O’Neal and Mourning? Capable scorers, dominant interior defenders and rebounders. Brown and Haslem? Valuable rebounding/defending bigs.

The parallels continue into 97-98 and 05-06: each team loses about six or seven wins, and each team’s big three maintain the same order, while producing slightly fewer BoxScores. Very interesting… perhaps a knowledgeable Heat fan can tell me whether the latter team was explicitly modeled on the former… I haven’t seen anything like this in reviewing any other BoxScore team histories. Another interesting parallel, perhaps less exciting to Heat fans, is that the most recent incarnation of the team managed only as many wins as the original Heat team in 1989.

Please do comment if you notice anything of interest which I have overlooked, especially along the lines of whether the orderings within years seem to be accurate in terms of value, as well as things like whether or not Hardaway’s 1997 season could really be the most valuable season in Heat history, and if Mourning’s 2000 season was really more valuable than any of Shaq’s years.

Note: Since this post was published, the Winshares formula has undergone some revisions of some substantive import. To see the most current iteration and accurate tables and graphs, please see the BoxScores page.

Choosing the MVP, geometrically

To begin, here is my pick/prediction for the 2008 NBA MVP award: Chris Paul of the New Orleans Hornets. Second most valuable is Kobe Bryant, followed by LeBron James and Paul Pierce. How did I decide this? Read on…

I have discussed the concept of Winshares previously in this space, and I believe that this measure is the most parsimonious and theoretically satisfying way to estimate player value. If you are unfamiliar with the construction, here is the formula:

  • valuable contributions = pts + as*2 + tr + st + bk – to
  • winshares = (valuable contributions / team valuable contributions) * team wins
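The two formulas above translate directly into code. This is a minimal sketch: it reads the abbreviations as assists (as), total rebounds (tr), steals (st), blocks (bk) and turnovers (to), and the season totals in the example are invented round numbers, not any real player's line.

```python
def valuable_contributions(pts, ast, trb, stl, blk, tov):
    """pts + as*2 + tr + st + bk - to, as in the formula above."""
    return pts + 2 * ast + trb + stl + blk - tov


def winshares(player_vc, team_vc, team_wins):
    """Player's share of team valuable contributions, times team wins."""
    return player_vc / team_vc * team_wins


# Hypothetical season totals for one player and his team.
player_vc = valuable_contributions(pts=1500, ast=500, trb=400, stl=100, blk=50, tov=200)
team_vc = 14250
team_wins = 50

print(player_vc)
print(round(winshares(player_vc, team_vc, team_wins), 2))
```

In this made-up case the player accounts for a fifth of his team's valuable contributions, so he is credited with a fifth of its 50 wins.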

The very simple motivating theory is that each player is responsible for some fraction of his team’s success (and here I define success as winning, plain and simple–value is a separate concept from quality or talent, and value in athletics is commonly gauged by game outcomes and the contribution of individuals thereto). The better the player doing the contributing, the more successful the team, and so contributions should be weighted by team success to reward those players whose efforts result in winning.

Picture a team with one player who contributes substantially more than his teammates (say, Minnesota with Al Jefferson, or Cleveland with LeBron James). It stands to reason that win or lose, that player deserves a large share of the credit for that team’s outcomes. Now picture a team for which valuable contributions are more evenly made (say, Chicago, Sacramento, or Boston). It similarly stands to reason that credit for the success of those teams ought to be more evenly attributed to the several players who contribute.

This means that a great player doing all the work for an otherwise very poor team should be worth about the same amount, in terms of wins, as a great player doing a smaller part of the work for an otherwise very good team. This makes sense: both are great players, so both should be able to generate similar levels of success. LeBron James should be approximately as valuable as Kevin Garnett, since although the quality of their teammates is different, so is the amount they are required to contribute to their teams’ success.

So this is how I arrived at my formulation of player value: essentially add up all the good things a player has done for his team, and divide that by the total number of good things his team did. Multiply this percentage by the number of team wins, and there you have it–a per-player number of Winshares.

Now, there are several downsides to this operationalization. It takes no account of intangibles, or anything besides basic boxscore statistics. Kevin Garnett’s incredible intensity and defensive leadership don’t count in this formulation (except as they are expressed in the boxscore–no doubt they contributed to team wins), so Paul Pierce comes through as slightly more valuable. Keep in mind, however, that this (Pierce for MVP) is what Garnett himself has told us all year long, and also keep in mind that this is not a per-minute or per-possession measure. Garnett played 2329 minutes to Pierce’s 2873, a substantial difference. Garnett had less time to add wins, even though he may have been more valuable per-minute than Pierce. However, for the MVP award, the focus ought to be on total value over the season, not player quality or efficiency. I am as big a Garnett fan as anyone, but no one would argue that injured Gilbert Arenas has been more valuable to the Wizards this year than Jamison or Butler, even if he is more valuable in some per-minute sense (though this is questionable).

The other problem with Winshares is that it does not take into account the specific possessions, minutes or games in which the valuable contributions came. I’m working on this, but in the meantime, you’ll want to use something like plus/minus figures if this is what you’re looking for. This disadvantage is most marked in attempting to measure the value of players traded during the season, but let’s face it–it is unlikely that an MVP-level player will be traded in the midst of an MVP-type season, and it’s even more unlikely that a player who was traded in the midst of the season would be in the running for MVP.

Any questions or critiques on this methodology are welcome, so please feel free to leave a comment. I submit, though, that as far as elegance, parsimony, accessibility, and theoretical validity go, Winshares as measured here are an optimal conceptualization of value.

After all that, here is the payoff: I’ve constructed a visualization depicting each player’s value in Winshares: their percent of valuable contributions is depicted on the vertical axis, and team success along the horizontal. Multiplying these two figures together results in Winshares, and each player is listed with their Winshare value and represented as a rectangle, the area of which is exactly proportional to his value. (Color is derived from my favorite way to capture playing type–the RGB scorer/perimeter/interior quasi-trichotomy.)

In a new twist, I’ve got it set up in a Google-Maps-style interface, so you can get as big a picture or as much detail as you’d like. Enjoy! (You’ll probably want to zoom in when the page first loads…)

Winshare Area Graph:

If that’s not the coolest, most straightforward way to envision basketball value, I don’t know what is!

No-brainer of the year

As you might have heard, the Sixth Man of the Year is Manu Ginobili of the San Antonio Spurs. Apparently, it was pretty much unanimous, too. It is interesting–from an “institutions matter” standpoint, if you wanted a player on your team to win Sixth Man of the Year every year, you’d just pick your best player, make him sit for the first minute, and then sub him in and play him for regular starter playing time. This is, in effect, what the Spurs are doing with Ginobili, and given that he’s their second (or possibly third) best player, he’s essentially a shoo-in for the award. I thought it might be interesting to look at best sixth men according to my favorite value metric, Winshares, and so here is a plot of percentage of games started versus Winshares, for all players who started less than 100% of the games in which they played:

Ginobili sticks out like a sore thumb. In fact, the only player higher on the Winshares dimension in that graph is LeBron James, who started merely 74 of his 75 games played. A reasonable criterion for qualifying as a sixth man, say, starting less than half of your games, quickly eliminates James, leaving Ginobili as the no-brainer choice. I’ll leave you with a Winshares ranking of the top ten players who started less than half of their games:

Player Team startpct Winshares
ginobili,manu san 0.311 9.25
terry,jason dal 0.415 6.30
barbosa,leandro pho 0.134 5.97
diaw,boris pho 0.244 5.90
scola,luis hou 0.476 5.72
outlaw,travis por 0.073 5.17
millsap,paul uta 0.024 5.10
maxiell,jason det 0.085 5.09
posey,james bos 0.027 5.07
turiaf,ronny lal 0.269 4.44
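The filter-then-rank step behind this table can be sketched as below. The rows are copied from the table above; the function name and the tuple layout are assumptions for illustration, and the full player list that the original ranking was drawn from is not reproduced here.

```python
# (player, team, share of games started, Winshares), from the table above.
players = [
    ("ginobili,manu", "san", 0.311, 9.25),
    ("terry,jason", "dal", 0.415, 6.30),
    ("barbosa,leandro", "pho", 0.134, 5.97),
    ("diaw,boris", "pho", 0.244, 5.90),
    ("scola,luis", "hou", 0.476, 5.72),
    ("outlaw,travis", "por", 0.073, 5.17),
    ("millsap,paul", "uta", 0.024, 5.10),
    ("maxiell,jason", "det", 0.085, 5.09),
    ("posey,james", "bos", 0.027, 5.07),
    ("turiaf,ronny", "lal", 0.269, 4.44),
]


def top_sixth_men(rows, max_start_pct=0.5, n=10):
    """Keep players who started less than max_start_pct, ranked by Winshares."""
    eligible = [row for row in rows if row[2] < max_start_pct]
    return sorted(eligible, key=lambda row: row[3], reverse=True)[:n]


for name, team, start_pct, ws in top_sixth_men(players):
    print(f"{name:18s} {team} {start_pct:.3f} {ws:.2f}")
```

Tightening the cutoff, for example `top_sixth_men(players, max_start_pct=0.3)`, drops Ginobili, Terry and Scola and puts Barbosa on top, which shows how much the answer depends on where the sixth-man line is drawn.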