Category Archives: graphics

Predicting the future, by analogy

Many times before, I’ve posted network diagrams that, I suggest, highlight objective similarities between athletes based only on their statistical production. I’ve also noted that one of the most common discussions, especially around the draft and its aftermath, is the attempt to identify which current or past professional player each draftee most resembles. This is done, I believe, to convey some idea of playing style, but also to convey some idea of an individual’s potential. If a collegiate player or recent draft pick gets compared to Michael Jordan instead of Zan Tabak, it means that the comparer thinks the rookie is more of a scoring wing player than a non-scoring center type, and that he has the potential to be a very good player in the NBA, rather than a very good player in Europe.

Thus, I thought it would be useful to do this same sort of comparison, but statistically rather than subjectively. The main problem I encountered is that one cannot just add a college player’s statistics to a database of pros, match them, and expect the results to be valid. A player who scores 28 ppg in college could turn out to be a prolific scorer in the NBA, but he may also turn out to be Adam Morrison. Even comparing two players’ statistics across NCAA teams is, I would submit, shaky, given that college teams vary so much in playing style and ability. Nevertheless, that is what I have chosen to do: compare the collegiate statistical profiles of some of this year’s draftees to those of other recent draftees, and suggest the inference, by analogy, that their professional careers will be similar to those of the players whose college careers they match. I understand that this chain of reasoning rests on tenuous connections, but given my personal data limitations and relative lack of patience and time, this is what I’ve come up with:


Statistical Proximity of Selected NCAA Basketball Players [pdf]

Incidentally, player vertices are scaled according to their per-game MEV (Model-Estimated Value, similar to the calculation for BoxScores), and colors follow the Playing Style Trichotomy outlined here. I find it interesting that the algorithm matches Michael Beasley with Kevin Durant, who just had a ROY season. Derrick Rose isn’t directly connected to anyone spectacular, though he is only two degrees of separation from Chris Paul, which is good company. OJ Mayo is tied to Ben Gordon, who is off to a promising start in the NBA, and Rodney Stuckey is most closely matched to Dwyane Wade (perhaps the Pistons used similar methodology in making their pick). Anyway, I’m sure many of you will gain greater insight from the graphic than from my own descriptions, so please fill me in with a comment.
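For readers curious how a proximity network like this might be assembled, here is a minimal sketch. It is not necessarily the procedure behind the graphic above: it assumes each player is reduced to a few per-game statistics, that each statistic is converted to a z-score, and that every player is linked to his single closest match by Euclidean distance. The player labels and numbers are invented for illustration.

```python
import math

# Hypothetical per-game college stat lines (points, rebounds, assists).
# These numbers are invented for illustration, not real player data.
profiles = {
    "Draftee A": (26.0, 12.0, 1.0),
    "Draftee B": (15.0, 4.0, 5.0),
    "Past pick C": (25.0, 11.0, 1.5),
    "Past pick D": (14.0, 3.5, 6.0),
}

def standardize(table):
    """Convert each stat column to z-scores so no single stat dominates."""
    columns = list(zip(*table.values()))
    means = [sum(col) / len(col) for col in columns]
    # Population sd per column; fall back to 1.0 if a column is constant.
    sds = [
        math.sqrt(sum((x - m) ** 2 for x in col) / len(col)) or 1.0
        for col, m in zip(columns, means)
    ]
    return {
        name: tuple((x - m) / s for x, m, s in zip(stats, means, sds))
        for name, stats in table.items()
    }

def nearest_neighbor(table):
    """Map each player to the other player with the closest profile."""
    z = standardize(table)
    return {
        a: min((b for b in z if b != a), key=lambda b: math.dist(z[a], z[b]))
        for a in z
    }

matches = nearest_neighbor(profiles)
for player, match in matches.items():
    print(f"{player} -> {match}")
```

Each printed pair corresponds to one edge in a network diagram. A real version would need many more statistics and some adjustment for pace and competition level, which is exactly the weakness discussed above.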

Mr. Consistency

Who are the most consistent scorers in the NBA? This is a question of some interest for those who participate in fantasy leagues, as consistency might be a virtue in determining the value of a player on your roster. For various reasons, a player might be worth more to you if they score 20 points every game, rather than alternate between 10 and 30 every other game. Further, some measure of consistency may highlight a player’s ability to impose their will on a game: a player able to get his scoring in, regardless of the opposition, could be said to be more of a game-defining player.

I’ve managed to estimate, for players since the 86-87 season, each individual’s mean points per 48 minutes, as well as the standard deviation of that statistic, and thus the coefficient of variation (sd/mean) and 95% confidence interval. Here’s a spreadsheet of the top 634 players in the league, by mean pts/48, sorted by coefficient of variation. Thus, the players at the top could be said, in some way, to be more consistent scorers than those at the bottom.
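As a minimal sketch of the coefficient of variation, here are the two hypothetical players from the example above: one who scores 20 every game, and one who alternates between 10 and 30. The game logs are invented, and the use of the sample standard deviation (rather than the population version) is an assumption.

```python
import statistics

def scoring_consistency(pts_per_48):
    """Return (mean, sd, coefficient of variation) for a list of game values.

    Uses the sample standard deviation, so at least two games are needed.
    """
    mean = statistics.mean(pts_per_48)
    sd = statistics.stdev(pts_per_48)
    return mean, sd, sd / mean

# Invented game logs: same average, very different consistency.
steady = [20, 20, 20, 20, 20, 20]
streaky = [10, 30, 10, 30, 10, 30]

steady_mean, steady_sd, steady_cv = scoring_consistency(steady)
streaky_mean, streaky_sd, streaky_cv = scoring_consistency(streaky)

print(f"steady:  mean={steady_mean}, cv={steady_cv:.3f}")
print(f"streaky: mean={streaky_mean}, cv={streaky_cv:.3f}")
```

Both players average 20, but the steady player has a coefficient of variation of zero while the streaky player’s is above 0.5, so sorting by this value puts the steady player first.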

Most consistent scorers, 1986-2008

Below is another way to view the same question. Using each player’s mean and standard deviation of pts/48, along with the sample size, we can construct a 95% confidence interval for our estimate of their true mean. In the graphic linked below, each player is ranked by their mean pts/48, and the x-axis shows pts/48. Each mean is surrounded by a line indicating the 95% confidence interval. This means, essentially, that we can be 95% sure that the player’s true mean pts/48 lies within the span of their colored line. For players with smaller samples or greater variance, the error bars will be wider.
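The interval construction can be sketched as follows. This is an illustration under assumptions rather than the exact spreadsheet formula: it uses the normal approximation (mean plus or minus 1.96 standard errors), whereas a t-based critical value would give somewhat wider bars for small samples. The function name and the sample values are hypothetical.

```python
import math
import statistics

def mean_confidence_interval(values, z=1.96):
    """Return (mean, lower, upper) for an approximate 95% interval.

    With a single observation the standard deviation is undefined,
    so the interval collapses to the mean (no error bar).
    """
    n = len(values)
    mean = statistics.mean(values)
    if n < 2:
        return mean, mean, mean
    half_width = z * statistics.stdev(values) / math.sqrt(n)
    return mean, mean - half_width, mean + half_width

# Invented pts/48 values for a small sample and a one-game sample.
few_games = [10, 30, 10, 30]
one_game = [25]

mean_few, lower_few, upper_few = mean_confidence_interval(few_games)
mean_one, lower_one, upper_one = mean_confidence_interval(one_game)

print(f"four games: {mean_few} ({lower_few:.2f} to {upper_few:.2f})")
print(f"one game:   {mean_one} ({lower_one} to {upper_one})")
```

Because the half-width shrinks with the square root of the number of games, players with long careers get tight bars even if they are streaky, while a handful of games produces very wide ones.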

NBA Pts/48 min means with error bars

As you can see, some players have no error bars at all–this means that they only have one observation. Others’ error bars go down past zero. This means that we can be 95% sure that their mean pts/48 is in a range that includes zero, which doesn’t tell us very much. Anyway, here is the same graphic, for the 2007-08 season only:

Note that Carl Landry (#73) has a greater variance than most players around him, but he ranks as a better per-48 scorer than Shaquille O’Neal.

Finally, here’s a regular-season 2007-08 graphic for players’ MEV (or model-estimated value, using regression-derived weights like those seen here). Landry does even better here (18th) in terms of his mean, but his confidence interval is very large. This estimate suggests, though, that at worst he’s about as good as Odom, Andre Miller, and Kirilenko, while at best he is in rarefied air. Keep in mind that this is still just a 95% confidence interval, so statistically there’s still a 1 in 20 chance the true mean isn’t even in this interval. One of the things I like most about this presentation is that it’s a per-minute stat, which controls for playing time (although not pace), but still reminds us that estimates for players with little playing time should be taken with large grains of salt, and might not really mean much of anything. Josh McRoberts, for example, is probably not the 406th, much less the 6th, most valuable player in the NBA, even though his simple arithmetic mean indicates as much; his confidence interval reminds us of this, while maintaining the simple ordering.

I suppose this is also the public debut of any sort of official MEV ordering for 2007-08. I’d be interested to hear what people think about this. It is something similar to Berri’s estimates, but I think the weightings are a little more appropriate. Let me know in the comments if they seem, at least per-minute, to be reasonable estimates and orderings of player value.

Improving Brand’s Image

If the BoxScores methodology and the Scorer-Perimeter-Interior playing style trichotomy are new to you, you can read about them here and here.

For the latest in my series of BoxScore team histories, this week we turn to the LA Clippers–a storied franchise whose latest chapters have not been the most gripping. First things first, though, check out 38 illustrious seasons of Clippers franchise history:

Clippers Franchise History

The first thing to notice is just how productive Elton Brand is. Perhaps it is because he is oft-injured, but it seems as though Brand gets very little attention from the national sports media. Given the mostly mediocre level of talent with which he is surrounded, his productivity is impressive. Though Chris Kaman and Maggette are solid players, Brand has been adding 8-12 wins in his healthier seasons. Especially as Thornton develops into a more well-rounded and more productive player (his very pink color in the graphic indicates that his rookie season focused much more on shooting than anything else), a healthy Elton Brand positions the Clippers for a return to the form of 2005-06. Though you would not know it from the amount of media coverage he receives, Brand is 29th among active players in BoxScores per 82 games, and his 05-06 was the second most productive season in Clippers history. One has to go back to Bob McAdoo in 1974-75 to find a more valuable Clippers season.

Other observations that stand out from a cursory overview of this graphical Clippers history: it is interesting how Danny Manning went from a pretty well-rounded game, with somewhat of a passing/stealing bent (1990, 91, 92; note his fairly neutral greenish color, not unlike Quinton Ross this past season), to focusing much more on scoring (1993 especially, when his pinkish hue is much more like Quentin Richardson’s style). Is Manning’s transformation from Quinton-like to Quentin-like entirely a function of Mark Jackson’s strong season of perimeter play? I find it interesting to see how players’ styles evolve over the years, not only as they age, but in response to what their team needs them to do.

Please let me know in the comments whether this portrayal of Clippers history meshes with your more subjective memories of it. Are the players ranked correctly within seasons? (e.g. in 2008, was Maggette more valuable in terms of wins than Kaman, followed by Mobley, Thornton, and Thomas?) Was Brand’s best year really 2005-06? What do you pick up by looking at the colors that I might have missed?

Ray Allen, finals MVP?

He certainly seems to have found his stroke in this series, and has been arguably the most consistent of the Big Six… here are the individual contributions to last night’s game:

tm  Player                  MP  PTS     MEV     PVC     PtC  Credit    G/B
bos Ray Allen            48.00   19   25.69   0.293   29.34   0.312   4.69
lal Lamar Odom           39.00   19   24.86   0.301   26.57   0.283   4.70
bos Paul Pierce          42.17   20   18.07   0.206   20.64   0.220   2.26
lal Kobe Bryant          43.35   17   19.01   0.230   20.31   0.216   2.08
bos Kevin Garnett        37.15   16   16.66   0.190   19.03   0.202   2.23
bos James Posey          25.47   18   12.84   0.146   14.66   0.156   3.07
lal Trevor Ariza          8.72    6   11.65   0.141   12.45   0.132  12.44
lal Pau Gasol            37.98   17   11.20   0.136   11.96   0.127   1.82
lal Derek Fisher         25.33   13    9.98   0.121   10.67   0.113   2.57
lal Vladimir Radmanovic  26.58   10    9.55   0.116   10.21   0.109   2.66
bos Eddie House          24.57   11    8.51   0.097    9.72   0.103   2.50
bos Rajon Rondo          17.02    5    4.90   0.056    5.60   0.060   2.73
bos P.J. Brown           14.53    3    2.07   0.024    2.36   0.025   1.51
bos Leon Powe             9.05    3    0.80   0.009    0.91   0.010   1.15
bos Kendrick Perkins     13.25    2    0.57   0.006    0.65   0.007   1.19
lal Jordan Farmar        21.42    3    0.57   0.007    0.61   0.006   1.11
bos Tony Allen            2.23    0    0.52   0.006    0.59   0.006    n/a
lal Luke Walton           3.78    3   -0.70  -0.008   -0.74  -0.008   0.81
lal Ronny Turiaf         10.02    0   -1.12  -0.014   -1.20  -0.013   0.57
lal Sasha Vujacic        23.82    3   -2.48  -0.030   -2.65  -0.028   0.76
bos Sam Cassell           6.57    0   -2.90  -0.033   -3.31  -0.035   0.00
Totals                     480  188  170.25   2.000  188.38   2.004   2.24

What a game it was! I admit to assuming a Laker win and tuning out. I never expected such a comeback! Allen, Pierce and Garnett all showed up last night, and Bryant missed his usual umpteen shots. Posey and Ariza both had pretty big games, but Vujacic fell off the map.

For games like these, I like putting together a graph of team points over time, which is something you can find elsewhere, but I add some useful information to mine, and calculate what I call “approximate domination” (although I could probably come up with a better name). Essentially, you take the area under the winning team’s curve below, subtract the area under the losing team’s curve, and divide by the length of the game. The result is the winning team’s average lead, accumulated second by second over the course of the game (it is negative if the winner trailed for most of it). To me, it represents how close a game was better than the final score does.
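The calculation described above can be sketched in a few lines. This is a simplified illustration, not the spreadsheet used for the graphic: it assumes the play-by-play has been reduced to (seconds elapsed, team, points) tuples, treats the lead as constant between scoring events, and uses an invented 100-second toy game so the arithmetic is easy to follow.

```python
def average_lead(events, winner, duration=48 * 60):
    """Average lead of `winner` over the game, in points.

    events: iterable of (seconds_elapsed, team, points) scoring events.
    The lead is held constant between events, so the signed area between
    the two score curves is a sum of rectangles.
    """
    lead = 0      # winner's score minus opponent's score
    last_t = 0    # time of the previous scoring event
    area = 0      # running signed area (points x seconds)
    for t, team, pts in sorted(events):
        area += lead * (t - last_t)
        lead += pts if team == winner else -pts
        last_t = t
    area += lead * (duration - last_t)  # stretch from last score to the end
    return area / duration

# Toy game with invented events: "lal" scores first, "bos" comes back late.
events = [
    (10, "lal", 2),
    (50, "bos", 3),
    (90, "bos", 2),
]

toy_result = average_lead(events, winner="bos", duration=100)
print(f"average lead for bos: {toy_result:+.2f} points")
```

In the toy game the winner finishes three points ahead but has a slightly negative average lead, because it trailed for 40 of the 100 seconds and only led by a point or more late. That is the same pattern, in miniature, as a team winning a game it trailed for most of the night.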

As is hopefully evident, the Lakers dominated this game, but the Celtics drastically increased their slope right after LA took their biggest lead. The black trendline is the Lakers’ scoring, and green is the Celtics’. The sparklines at bottom represent individual scores, height-scaled in accordance with how many points were made (ftm, f2m, or f3m). Dotted lines indicate points at which the teams were tied, and solid vertical lines indicate the largest deficit and lead for the winning team. As you can see, despite winning by six, the Celtics trailed by an average of eleven points for the duration of the game. I think this is somewhat informative, although, unfortunately, somewhat time-consuming to calculate. What do you think?

The Grizzlies were good, once

And not too long ago, either. It may seem hard to believe today, but the Memphis team once featured actual good players and won more than half of their games. In 03-04, everybody on the team upped their production by about 50%, and James Posey came to the team, playing the most productive basketball of his career, by far. It’s always interesting to me to find players who are colored gray, indicating that their playing style is very close to the league average (that is, their propensity to score, play perimeter ball, and do interior things is very much in line with some theoretical “average player”). Battier in 02, Posey in 04, and Miller in 06 played in this vein and found success in the role. Also of note is the contrast between a player like Rudy Gay, who in 2007-08 was about as scoring-oriented a player as you will find, and one like the slate-blue versions of Battier and even Gasol, whose color pegs them as essentially opposite to a scoring-oriented player like Gay. Rudy may be good, and have lots of potential (quality younger players often start out with a focus on shooting), but I have always had a soft spot for those scorer’s-opposite types like Battier.

Grizzlies franchise history

What do you think? Was Posey the catalyst for the Grizzlies’ success? Did Battier’s and Jones’ departures spell the end? Does the order of the players within each season appear right? Is Abdur-Rahim one of the best players in Vancouver/Memphis history? Please let me know your expert opinions in the comments.