# Category Archives: analysis

## Mr. Consistency

Who are the most consistent scorers in the NBA? This is a question of some interest for those who participate in fantasy leagues, as consistency might be a virtue in determining the value of a player on your roster. For various reasons, a player might be worth more to you if they score 20 points every game, rather than alternate between 10 and 30 every other game. Further, some measure of consistency may highlight a player’s ability to impose their will on a game: a player able to get his scoring in, regardless of the opposition, could be said to be more of a game-defining player.

I’ve managed to estimate, for players since the 86-87 season, each individual’s mean points per 48 minutes, as well as the standard deviation of said statistic, and thus the coefficient of variation (sd/mean) and 95% confidence interval. Here’s a spreadsheet of the top (634) players in the league, by mean pts/48, sorted by coefficient of variation. Thus, the players at top could be said, in some way, to be more consistent scorers than those at the bottom.

Most consistent scorers, 1986-2008
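For anyone who wants to reproduce this kind of summary, here is a minimal Python sketch of the calculation described above. It assumes the interval is the usual normal-approximation one (mean ± 1.96 × sd/√n); the post does not say exactly which interval was used. The function name and the toy game logs are made up for illustration.

```python
import math
import statistics


def consistency_summary(pts_per_48):
    """Summarize one player's per-game pts/48 values.

    Returns the mean, sample standard deviation, coefficient of
    variation (sd / mean) and an approximate 95% confidence interval
    for the mean.
    """
    n = len(pts_per_48)
    mean = statistics.mean(pts_per_48)
    if n < 2:
        # One observation gives no estimate of spread, so no interval.
        return {"n": n, "mean": mean, "sd": None, "cv": None, "ci95": None}
    sd = statistics.stdev(pts_per_48)  # sample sd (n - 1 denominator)
    cv = sd / mean if mean != 0 else None
    # ASSUMPTION: normal-approximation interval, not necessarily the
    # interval used for the spreadsheet.
    half_width = 1.96 * sd / math.sqrt(n)
    return {
        "n": n,
        "mean": mean,
        "sd": sd,
        "cv": cv,
        "ci95": (mean - half_width, mean + half_width),
    }


# Hypothetical players: one scores 20 every game, the other
# alternates between 10 and 30.
steady = consistency_summary([20, 20, 20, 20])
streaky = consistency_summary([10, 30, 10, 30])
print(steady["cv"], round(streaky["cv"], 3))
```

Both toy players average 20, but the steady one has a coefficient of variation of 0 while the streaky one comes out at roughly 0.58, so the steady player would sort higher in a list ordered this way.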

Below is another way to view the same question. Using each player’s mean and standard deviation of pts/48, along with the sample size, we can construct a 95% confidence interval for our estimate of their true mean. In the graphic linked below, players are ranked by their mean pts/48, and the x-axis shows the value of that measure. Each mean is surrounded by a line indicating the 95% confidence interval. This means, essentially, that we can be 95% sure that the player’s true mean lies within the span of their colored line. For players with smaller samples or greater variance, the error bars will be wider.

NBA Pts/48 min means with error bars

As you can see, some players have no error bars at all–this means that they only have one observation. Others’ error bars go down past zero. This means that we can be 95% sure that their mean pts/48 is in a range that includes zero, which doesn’t tell us very much. Anyway, here is the same graphic, for the 2007-08 season only:

Note that Carl Landry (#73) has a greater variance than most players around him, but he ranks as a better per-48 scorer than Shaquille O’Neal.

Finally, here’s a regular-season 2007-08 graphic for players’ MEV (or model-estimated value, using regression-derived weights like those seen here). Landry does even better here (18th), in terms of his mean, but his confidence interval is very large. This estimate suggests, though, that at worst, he’s about as good as Odom, Andre Miller, and Kirilenko; while at best, he is in rarefied air. Keep in mind that this is still just a 95% confidence interval, so statistically, there’s still a 1 in 20 chance the true mean isn’t even in this interval. All should be taken with a grain of salt. One of the things I like most about this presentation is that it’s a per-minute stat, which controls for playing time (although not pace), but still reminds us that estimates for those players with little playing time should be taken with large grains of salt, and might not really mean much of anything. Josh McRoberts, for example, is probably not the 406th, much less the 6th, most valuable player in the NBA, even though his simple arithmetic mean indicates as much–his confidence interval reminds us of this, while maintaining the simple ordering.

I suppose this is also the public debut of any sort of official MEV ordering for 2007-08. I’d be interested to hear what people thought about this… this is something similar to Berri’s estimates, but I think the weightings are a little more appropriate. Let me know in the comments if they seem, at least, per-minute, to be reasonable estimates and orderings of player value.

## Just how bad was the free throw discrepancy in Game 2?

An inquisitive reader asked me to examine just how bad the home/away free throw attempt discrepancy in Game 2 was. Using regular season data from 1986-2008, I calculated home FTA minus away FTA for every game, finding a maximum of 41 (on 02/06/93 DEN v DAL) and a minimum of -41 (on 11/19/03 NYK v LAL). The mean discrepancy in favor of the home team is 1.364, and the standard deviation is 10.254. Below is a plot of the empirical density, with Game 2 indicated with a red line:

Game 2’s difference, 38-10=28, is 2.598 standard deviations above the mean. If my counting is correct, this puts the game at the 99.4th percentile of games in terms of pro-home free throw attempt discrepancy. What I cannot tell you, unfortunately, is whether it was the officiating, or the play, that led to the difference.
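The arithmetic behind the standard-deviation figure is easy to check. The sketch below recomputes it from the mean and standard deviation quoted above, and shows one way an empirical percentile could be counted. The function names and the tiny list of differentials are hypothetical stand-ins, not the actual 1986-2008 data.

```python
def z_score(value, mean, sd):
    """Number of standard deviations that value sits above the mean."""
    return (value - mean) / sd


def empirical_percentile(diffs, value):
    """Fraction of games whose differential is at or below value."""
    return sum(d <= value for d in diffs) / len(diffs)


# Figures quoted in the post.
game2_diff = 38 - 10
z = z_score(game2_diff, mean=1.364, sd=10.254)
print(round(z, 3))

# Toy differentials, for illustration only.
toy_diffs = [-5, 0, 3, 10, 28]
print(empirical_percentile(toy_diffs, 10))
```

Counting games directly, rather than reading a percentile off a normal curve, avoids assuming that the differentials are normally distributed.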

## Credit where credit is due

Last night’s game was awesome–I admit a pro-Celtics bias on account of a pro-Kevin Garnett bias, but also an anti-Kobe Bryant bias. When claiming objectivity, it’s always a good thing to clear potential biases up front, to let the reader know who they’re dealing with… That said, Garnett had a pretty good game last night, and Bryant had a pretty bad game… Garnett would have been even better without that cold streak–the only player who missed more shots was Kobe, who missed 17! Using a huge, awesome linear regression whose results have not yet been made public (although the coefficients are very similar to those seen here), I derive the following from last night’s box scores:

| Player | min | pts | MEV | PVC | PtC | Credit |
|---|---|---|---|---|---|---|
| Kevin Garnett | 40.65 | 24 | 22.33 | 0.250 | 25.94 | 0.279 |
| Paul Pierce | 31.07 | 22 | 18.49 | 0.207 | 21.48 | 0.231 |
| Pau Gasol | 41.47 | 15 | 19.39 | 0.246 | 20.52 | 0.221 |
| Derek Fisher | 40.82 | 15 | 18.98 | 0.241 | 20.09 | 0.216 |
| Ray Allen | 43.95 | 19 | 16.85 | 0.189 | 19.57 | 0.210 |
| Rajon Rondo | 35.03 | 15 | 14.92 | 0.167 | 17.34 | 0.186 |
| Lamar Odom | 39.02 | 14 | 12.90 | 0.163 | 13.65 | 0.147 |
| Kobe Bryant | 41.87 | 24 | 11.06 | 0.140 | 11.71 | 0.126 |
| Vladimir Radmanovic | 17.05 | 5 | 9.86 | 0.125 | 10.43 | 0.112 |
| Leon Powe | 9.32 | 4 | 6.85 | 0.077 | 7.96 | 0.086 |
| Sam Cassell | 12.97 | 8 | 5.40 | 0.061 | 6.28 | 0.067 |
| P.J. Brown | 21.20 | 2 | 5.08 | 0.057 | 5.90 | 0.063 |
| Sasha Vujacic | 26.52 | 8 | 4.17 | 0.053 | 4.41 | 0.047 |
| Ronny Turiaf | 12.38 | 5 | 1.74 | 0.022 | 1.84 | 0.020 |
| Jordan Farmar | 7.18 | 2 | 0.86 | 0.011 | 0.91 | 0.010 |
| Kendrick Perkins | 23.02 | 1 | 0.63 | 0.007 | 0.73 | 0.008 |
| Luke Walton | 13.70 | 0 | -0.05 | -0.001 | -0.06 | -0.001 |
| James Posey | 22.80 | 3 | -1.39 | -0.016 | -1.61 | -0.017 |
| Totals | 480 | 186 | 168.05 | 2.000 | 187.08 | 2.012 |

MEV is the term for model-estimated value or point difference created, using only the regression weights. PVC is percent of valuable contributions, which is each player’s part of total team MEV. PtC is points created, which scales MEV values according to actual team and opponent scoring, to roughly account for those factors unmeasured by the box score, and Credit is, essentially, the amount of a win each player should be credited for. MEV and PtC are intended to account for both offensive and defensive contributions, that is, the player’s contribution to his own team’s scoring, and his defense preventing his opponent’s scoring. Boston’s total team Credit was 1.114, and LA’s was 0.898, based on the number of points each scored. It appears as though Boston was able to do to Kobe what they did in the regular season. Note that Posey, despite a timely three, actually hurt his team some: his two turnovers and three personal fouls effectively cancelled out his two steals, while his two defensive rebounds and three points could not compensate sufficiently for four missed shots. However, this is only based on box scores… he may have had tremendous unmeasured defense which I cannot capture, since his plus/minus was +3. It’s interesting to compare my metrics with plus/minus figures: Kobe was -13 for the game…
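To make the PVC definition concrete, here is a small Python sketch that recomputes each player’s share of team MEV from the table. The Credit step is an assumption: multiplying PVC by the team Credit totals quoted above (1.114 and 0.898) happens to reproduce the table’s Credit column, but the post does not spell out that formula. PtC is left out because its scaling is not specified. All function names are illustrative.

```python
# MEV values copied from the table above, split by team.
boston_mev = {
    "Kevin Garnett": 22.33, "Paul Pierce": 18.49, "Ray Allen": 16.85,
    "Rajon Rondo": 14.92, "Leon Powe": 6.85, "Sam Cassell": 5.40,
    "P.J. Brown": 5.08, "Kendrick Perkins": 0.63, "James Posey": -1.39,
}
lakers_mev = {
    "Pau Gasol": 19.39, "Derek Fisher": 18.98, "Lamar Odom": 12.90,
    "Kobe Bryant": 11.06, "Vladimir Radmanovic": 9.86,
    "Sasha Vujacic": 4.17, "Ronny Turiaf": 1.74, "Jordan Farmar": 0.86,
    "Luke Walton": -0.05,
}


def percent_valuable_contributions(team_mev):
    """Each player's MEV as a fraction of the team's total MEV."""
    total = sum(team_mev.values())
    return {player: mev / total for player, mev in team_mev.items()}


def credit(team_mev, team_credit):
    """ASSUMPTION: Credit = PVC * team Credit (inferred from the table)."""
    pvc = percent_valuable_contributions(team_mev)
    return {player: share * team_credit for player, share in pvc.items()}


boston_pvc = percent_valuable_contributions(boston_mev)
boston_credit = credit(boston_mev, team_credit=1.114)
lakers_credit = credit(lakers_mev, team_credit=0.898)

# Compare with the table: PVC 0.250 and Credit 0.279 for Garnett,
# Credit 0.126 for Bryant.
print(round(boston_pvc["Kevin Garnett"], 3))
print(round(boston_credit["Kevin Garnett"], 3))
print(round(lakers_credit["Kobe Bryant"], 3))
```

Because PVC is a share of the team total, each team’s values sum to 1, which is why the Totals row shows 2.000 across both teams.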

## Carrying the burden

There has been some discussion lately as to whether the Lakers are better when Bryant scores a lot versus when he facilitates others’ scoring. I thought I’d look at the game-by-game data to investigate: The correlation between Bryant’s percentage of team total field goals attempted and point differential is -.539; between Bryant’s percentage of team total assists and point differential is -.609. This is inconclusive, but it indicates that when Bryant does a lot of the scoring, or a lot of the passing (i.e. the team relies on him increasingly exclusively), the team does poorly. This is likely because Bryant’s statistical load-carrying results from his teammates playing poorly, and when they do so, they are more likely to lose. One other interesting finding is that the correlation between Bryant’s assists/field goal attempts and point differential is 0.312–implying that as Bryant’s game shifts more toward facilitating (ignoring his statistics relative to team totals), his team does better. For comparison, the same correlation for Derek Fisher is -0.032 (essentially insignificant), Gasol is 0.123, Odom is 0.198, Kevin Garnett is 0.302, Paul Pierce is 0.023, Ray Allen is 0.152, and Rajon Rondo is 0.127. To the extent that anything can be gleaned from such simple correlations, we might take away that Bryant’s facilitating is very important to Laker success.
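For readers who want to try the same thing on their own game logs, a plain Pearson correlation between a player’s assists/field goal attempts ratio and the team’s point differential could be computed along these lines. The game log below is invented purely for illustration, and the function name is hypothetical; this is a sketch of the general technique, not the exact procedure behind the numbers above.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical game log: (assists, field goal attempts, point differential).
games = [(3, 25, -8), (5, 20, 2), (7, 18, 6), (4, 24, -3), (9, 15, 12)]

ast_per_fga = [assists / fga for assists, fga, _ in games]
point_diff = [diff for _, _, diff in games]

r = pearson_r(ast_per_fga, point_diff)
print(round(r, 3))
```

With only a handful of games, a coefficient like this is very noisy, so it is worth keeping the sample size in mind before reading much into it.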

I thought I would also take a look at how different players’ contributions affected team outcomes. To do this I used PVC (percent of valuable contributions) for each game for six different players, and plotted that against team scoring differential (the colors of the dots are what type of game they played):

Gasol gives us a fairly small sample size, but it would appear that he mostly contributes about 15% of his team’s valuable contributions, and their success changes little when he does more.

Odom appears to have a “sweet spot” in the middle of his PVC range–if he has a poor game or plays low minutes, or if he has to carry the burden, the team falters.

It appears that as Bryant carries an increasing amount of the load for the Lakers, they do poorly. This might mean that if Bryant makes it all about himself, his teammates play badly, or it might mean that in games in which the Lakers are faring poorly, Bryant attempts to take over–the causal arrow in this, and all of the other graphs, is very cloudy.

Allen’s PVC appears to have little to do with team success.

The trend is somewhat ambiguous, but it appears that Pierce has bigger games when the Celtics play well.

Like Bryant’s, this is another fairly obvious downward trend. My interpretation is that the high PVC games are those in which Garnett’s teammates fail to show up, and thus he carries the load. With little or no help, he cannot win the game on his own.

Let me know if these graphs hold any more insight for you, or if I’m reading them wrong, or if they mesh with your subjective notions. If I can get the data, I might look at Michael Jordan’s numbers, to see whether or not he really could take over games and lead his team to victory.

Edit: Here’s Jordan’s Chicago years, regular season post-1986-87 (my data only goes back that far). He shows a similar pattern, although his PVC goes substantially higher–he had some big games. The sparseness of the data on the high end makes it tough to make firm conclusions. Note also that one problem with interpreting all of these is that in huge blowouts (either for or against), the starters are often taken out early, leading to a diminished PVC. This could be fairly heavily influencing the trends here.

Double edit: You might be interested in seeing the results of the first game, in terms of individual contributions, which I’ve tabulated here.

Triple edit: Since it’s apparently been lost on one of our commenters, I should mention that the fit lines are loess smoothers, and were not, in fact, drawn in MS Paint.

## Hardaway : Mourning : Brown :: Wade : O’Neal : Haslem

Seriously. The first trio, in 1996-97, had the winningest regular season in Heat history. The second trio, in 2004-05, had the second winningest. Each threesome was responsible for exactly 31.2 wins, and each member of the two trios produced in approximately the same proportion to the other two, while their playing styles across the analogous pairings were remarkably similar: Wade and Hardaway? Both perimeter-producing scorers. O’Neal and Mourning? Capable scorers, dominant interior defenders and rebounders. Brown and Haslem? Valuable rebounding/defending bigs.

The parallels continue into 97-98 and 05-06: each team loses about six or seven wins, and each team’s big three maintain the same order, while producing slightly fewer BoxScores. Very interesting… perhaps a knowledgeable Heat fan can tell me whether the latter team was explicitly modeled on the former… I haven’t seen anything like this in reviewing any other BoxScore team histories. Another interesting parallel, perhaps less exciting to Heat fans, is that the most recent incarnation of the team managed only as many wins as the original Heat team in 1989.

Please do comment if you notice anything of interest which I have overlooked, especially along the lines of whether the orderings within years seem to be accurate in terms of value, as well as things like whether or not Hardaway’s 1997 season could really be the most valuable season in Heat history, and if Mourning’s 2000 season was really more valuable than any of Shaq’s years.

Note: Since this post was published, the Winshares formula has undergone some revisions of some substantive import. To see the most current iteration and accurate tables and graphs, please see the BoxScores page.