Category Archives: graphics

Predicting the future, by analogy

Many times before, I’ve posted network diagrams which I suggest highlight objective similarities between athletes, according only to their statistical production. I’ve also noted that one of the most common discussions, especially around the draft and its aftermath, is that which attempts to identify which current or past professional player is most similar to which draftee. This is done, I believe, to convey some idea of playing style, but also, I think, to convey some idea of an individual’s potential. If a collegiate or recent draft pick gets compared to Michael Jordan instead of Zan Tabak, it means that the comparer thinks the rookie is more of a scoring wing player than a non-scoring center type, and that he has the potential to be a very good player in the NBA, rather than a very good player in Europe.

Thus, I thought it would be useful to do this same sort of comparison, but statistically, rather than subjectively. The main problem I encountered is that one cannot just add a college player’s statistics to a database of pros, match them, and expect the results to be valid. A player who scores 28 ppg in college could turn out to be a prolific scorer in the NBA, but he may also turn out to be Adam Morrison. Even comparisons of two players’ statistics across NCAA teams, I would submit, are shaky, given that college teams are so variable in terms of playing styles and abilities. Nevertheless, that is what I have chosen to do: compare the collegiate statistical profiles of some of this year’s draftees to those of other recent draftees, and suggest the inference, by analogy, that their professional careers will be similar to those of the players whom their college careers match. I understand that this rests on tenuous connections and weak inferences, but given my personal data limitations and relative lack of patience and time, this is what I’ve come up with:

Statistical Proximity of Selected NCAA Basketball Players [pdf]

Incidentally, player vertices are scaled according to their per-game MEV (Model-Estimated Value, similar to the calculation for BoxScores), and colors follow the Playing Style Trichotomy outlined here. I find it interesting that the algorithm matches Michael Beasley with Kevin Durant, who just had a ROY season. Derrick Rose isn’t directly connected to anyone spectacular, though he is only two degrees of separation from Chris Paul, which is good company. OJ Mayo is tied to Ben Gordon, who is off to a promising start in the NBA, and Rodney Stuckey is most closely matched to Dwyane Wade (perhaps the Pistons used similar methodology in making their pick). Anyway, I’m sure many of you will gain greater insight from the graphic than from my own descriptions, so please fill me in with a comment.


Mr. Consistency

Who are the most consistent scorers in the NBA? This is a question of some interest for those who participate in fantasy leagues, as consistency might be a virtue in determining the value of a player on your roster. For various reasons, a player might be worth more to you if they score 20 points every game, rather than alternate between 10 and 30 every other game. Further, some measure of consistency may highlight a player’s ability to impose their will on a game: a player able to get his scoring in, regardless of the opposition, could be said to be more of a game-defining player.

I’ve managed to estimate, for players since the 86-87 season, each individual’s mean points per 48 minutes, as well as the standard deviation of said statistic, and thus the coefficient of variation (sd/mean) and 95% confidence interval. Here’s a spreadsheet of the top 634 players in the league by mean pts/48, sorted by coefficient of variation. Thus, the players at the top could be said, in some way, to be more consistent scorers than those at the bottom.

Most consistent scorers, 1986-2008
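For anyone who wants to try the ranking on their own numbers, here is a minimal sketch of the coefficient-of-variation calculation described above. The two players and their per-game pts/48 figures are made up for illustration, and the function name is my own; the spreadsheet itself may have been built differently.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return sd/mean for a list of per-game pts/48 values."""
    if len(values) < 2:
        raise ValueError("need at least two games to estimate a standard deviation")
    m = mean(values)
    if m == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return stdev(values) / m


# Hypothetical per-game pts/48 figures for two made-up players.
games = {
    "steady": [20.0, 21.0, 19.0, 20.0],
    "streaky": [10.0, 30.0, 10.0, 30.0],
}

# Lower coefficient of variation first, i.e. most consistent scorer at the top.
ranked = sorted(games, key=lambda name: coefficient_of_variation(games[name]))
```

Both made-up players average 20 pts/48, so the ordering here comes entirely from the spread of their games.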

Below is another way to view the same question. Using each player’s mean and standard deviation pts/48, along with the sample size, we can construct a 95% confidence interval for our estimate of their true mean. In the graphic linked below, each player is ranked by their mean pts/48, and the x-axis indicates how they fare under this measure of scoring. Each mean is surrounded by a line indicating the 95% confidence interval. This means, essentially, that we can be 95% sure that the player’s true mean is within the span of their colored line. For players with smaller samples or greater variance, the error bars will be wider.

NBA Pts/48 min means with error bars
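A rough sketch of how such an interval can be built from a mean, a standard deviation, and a sample size follows. It assumes the usual normal approximation (mean plus or minus 1.96 standard errors); the post does not say whether a t-distribution was used instead, so treat the multiplier as an assumption. The function name and sample values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

Z_95 = 1.96  # assumption: normal approximation for a 95% interval


def mean_ci(values, z=Z_95):
    """Return (mean, lower, upper) for a list of per-game pts/48 values."""
    n = len(values)
    m = mean(values)
    if n < 2:
        # A single observation has no estimable spread, so no error bar.
        return m, m, m
    half_width = z * stdev(values) / sqrt(n)
    return m, m - half_width, m + half_width


# Hypothetical samples: one game only, a small noisy sample, and a larger one.
one_game = mean_ci([20.0])
two_games = mean_ci([0.0, 40.0])
four_games = mean_ci([10.0, 30.0, 10.0, 30.0])
```

With only two wildly different games the lower bound falls below zero, and with a single game the interval collapses to a point, which is what a player with one observation would look like in a chart of this kind.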

As you can see, some players have no error bars at all–this means that they only have one observation. Others’ error bars go down past zero. This means that we can be 95% sure that their mean pts/48 is in a range that includes zero, which doesn’t tell us very much. Anyway, here is the same graphic, for the 2007-08 season only:

Note that Carl Landry (#73) has a greater variance than most players around him, but he ranks as a better per-48 scorer than Shaquille O’Neal.

Finally, here’s a regular-season 2007-08 graphic for players’ MEV (or model-estimated value, using regression-derived weights like those seen here). Landry does even better here (18th), in terms of his mean, but his confidence interval is very large. This estimate suggests, though, that at worst, he’s about as good as Odom, Andre Miller, and Kirilenko; while at best, he is in rarefied air. Keep in mind that this is still just a 95% confidence interval, so statistically, there’s still a 1 in 20 chance the true mean isn’t even in this interval. All should be taken with a grain of salt. One of the things I like most about this presentation is that it’s a per-minute stat, which controls for playing time (although not pace), but still reminds us that estimates for those players with little playing time should be taken with large grains of salt, and might not really mean much of anything. Josh McRoberts, for example, is probably not the 406th, much less the 6th, most valuable player in the NBA, even though his simple arithmetic mean indicates as much–his confidence interval reminds us of this, while maintaining the simple ordering.

I suppose this is also the public debut of any sort of official MEV ordering for 2007-08. I’d be interested to hear what people thought about this… this is something similar to Berri’s estimates, but I think the weightings are a little more appropriate. Let me know in the comments if they seem, at least, per-minute, to be reasonable estimates and orderings of player value.

Improving Brand’s Image

If the BoxScores methodology and the Scorer-Perimeter-Interior playing style trichotomy are new to you, you can read about them here and here.

For the latest in my series of BoxScore team histories, this week we turn to the LA Clippers–a storied franchise whose latest chapters have not been the most gripping. First things first, though, check out 38 illustrious seasons of Clippers franchise history:

Clippers Franchise History

The first thing to attend to upon first appraisal is just how productive Elton Brand is. Perhaps it is because he is oft-injured, but it seems as though Brand gets very little attention from the national sports media. Given the mostly mediocre level of talent with which he is surrounded, his productivity is impressive. Though Chris Kaman and Maggette are solid players, Brand has been adding 8-12 wins in his healthier seasons. Especially as Thornton develops into a more well-rounded and more productive player (his very pink color in the graphic indicates that his rookie season focused much more on shooting than anything else), a healthy Elton Brand poises the Clippers for a return to the form of 2005-06. Though you would not know it from the amount of media coverage he receives, Brand is 29th among active players in BoxScores per 82 games, and his 05-06 was the second most productive season in Clippers history. One has to go back to Bob McAdoo in 1974-75 to find a more valuable Clippers season.

Other observations that stand out from a cursory overview of this graphical Clippers history: Interesting how Danny Manning went from a pretty well-rounded game, with somewhat of a passing/stealing bent (1990, 91, 92–note his fairly neutral greenish color, not that unlike Quinton Ross this past season), to focusing much more on scoring (1993 especially, his pinkish hue is much more like Quentin Richardson’s style). Is Manning’s transformation from Quinton-like to Quentin-like entirely a function of Mark Jackson’s strong season of perimeter play? I find it interesting to see how players’ styles evolve over the years, not only as they age, but in response to what their team needs them to do.

Please let me know in the comments whether this portrayal of Clippers history meshes with your more subjective memories of it. Are the players ranked correctly within seasons? (e.g. in 2008, was Maggette more valuable in terms of wins than Kaman, followed by Mobley, Thornton, and Thomas?) Was Brand’s best year really 2005-06? What do you pick up by looking at the colors that I might have missed?

Ray Allen, finals MVP?

He certainly seems to have found his stroke in this series, and has been arguably the most consistent of the Big Six… here are the individual contributions to last night’s game:

tm Player MP PTS MEV PVC PtC Credit G/B
bos Ray Allen 48.00 19 25.69 0.293 29.34 0.312 4.69
lal Lamar Odom 39.00 19 24.86 0.301 26.57 0.283 4.70
bos Paul Pierce 42.17 20 18.07 0.206 20.64 0.220 2.26
lal Kobe Bryant 43.35 17 19.01 0.230 20.31 0.216 2.08
bos Kevin Garnett 37.15 16 16.66 0.190 19.03 0.202 2.23
bos James Posey 25.47 18 12.84 0.146 14.66 0.156 3.07
lal Trevor Ariza 8.72 6 11.65 0.141 12.45 0.132 12.44
lal Pau Gasol 37.98 17 11.20 0.136 11.96 0.127 1.82
lal Derek Fisher 25.33 13 9.98 0.121 10.67 0.113 2.57
lal Vladimir Radmanovic 26.58 10 9.55 0.116 10.21 0.109 2.66
bos Eddie House 24.57 11 8.51 0.097 9.72 0.103 2.50
bos Rajon Rondo 17.02 5 4.90 0.056 5.60 0.060 2.73
bos P.J. Brown 14.53 3 2.07 0.024 2.36 0.025 1.51
bos Leon Powe 9.05 3 0.80 0.009 0.91 0.010 1.15
bos Kendrick Perkins 13.25 2 0.57 0.006 0.65 0.007 1.19
lal Jordan Farmar 21.42 3 0.57 0.007 0.61 0.006 1.11
bos Tony Allen 2.23 0 0.52 0.006 0.59 0.006 n/a
lal Luke Walton 3.78 3 -0.70 -0.008 -0.74 -0.008 0.81
lal Ronny Turiaf 10.02 0 -1.12 -0.014 -1.20 -0.013 0.57
lal Sasha Vujacic 23.82 3 -2.48 -0.030 -2.65 -0.028 0.76
bos Sam Cassell 6.57 0 -2.90 -0.033 -3.31 -0.035 0.00
Totals 480 188 170.25 2.000 188.38 2.004 2.24

What a game it was! I admit to assuming a Laker win and tuning out. I never expected such a comeback! Allen, Pierce and Garnett all showed up last night, and Bryant missed his usual umpteen shots. Posey and Ariza both had pretty big games, but Vujacic fell off the map.

For games like these, I like putting together a graph of team points over time, which is something you can find elsewhere, but I add some useful information to mine, and calculate what I call “approximate domination” (although I could probably come up with a better name). Essentially, you take the area under the winning team’s curve below, and subtract the area under the losing team’s curve. Divided by the length of the game, this is the winning team’s average lead, accumulated second by second. To me, it represents how close a game was better than the final score does.

As is hopefully evident, the Lakers dominated this game, but the Celtics drastically increased their slope right after LA took their biggest lead. The black trendline is the Lakers’ scoring, and the green is the Celtics’. The sparklines at bottom represent individual scores, height-scaled in accordance with how many points were made–ftm, f2m, or f3m. Dotted lines indicate points at which the teams were tied, and solid vertical lines indicate the largest deficit and lead for the winning team. As you can see, despite winning by six, the Celtics trailed by an average of eleven points for the duration of the game. I think this is somewhat informative, although, unfortunately, somewhat time-consuming to calculate… what do you think?
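The “approximate domination” calculation described above can be sketched in a few lines. This is only an illustration under my own assumptions: the play-by-play is a list of (elapsed seconds, team, points) tuples, the score is treated as a step function between baskets, and the function name and toy game are hypothetical.

```python
def average_lead(events, winner, game_seconds=48 * 60):
    """Time-weighted average of (winner's score - loser's score).

    events: iterable of (elapsed_seconds, team, points) scoring plays.
    """
    lead = 0      # winner's score minus loser's score right now
    last_t = 0    # time of the previous scoring play
    area = 0      # lead integrated over time, in point-seconds
    for t, team, pts in sorted(events):
        area += lead * (t - last_t)          # lead held since the last basket
        lead += pts if team == winner else -pts
        last_t = t
    area += lead * (game_seconds - last_t)   # lead held until the final buzzer
    return area / game_seconds


# Toy 100-second "game": LA scores early, Boston comes back late and wins 5-2.
toy_game = [(10, "lal", 2), (50, "bos", 3), (90, "bos", 2)]
toy_result = average_lead(toy_game, winner="bos", game_seconds=100)
```

In the toy game Boston wins by three but its average lead is slightly negative, the same kind of gap between final score and average lead that the graph is meant to expose.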

The Grizzlies were good, once

And not too long ago, either. It may seem hard to believe today, but the Memphis team once featured actual good players and won more than half of their games. In 03-04, everybody on the team upped their production by about 50%, and James Posey came to the team, playing the most productive basketball of his career, by far. It’s always interesting to me to find players who are colored gray, indicating that their playing style is very close to the league average–that is, their propensity to score, play perimeter ball, and do interior things is very much in line with some theoretical “average player”–Battier in 02, Posey in 04, and Miller in 06 played in this vein and found success in the role. Also of note is the contrast between a player like Rudy Gay, who in 2007-08 was about as scoring-oriented a player as you will find, and one like the slate-blue versions of Battier and even Gasol, whose color pegs them as essentially opposite to a scoring-oriented player like Gay. Rudy may be good, and have lots of potential (quality younger players often start out with a focus on shooting), but I always have had a soft spot for those scorer’s-opposite types like Battier.

Grizzlies franchise history

What do you think? Was Posey the catalyst for the Grizzlies’ success? Did Battier’s and Jones’ departure spell the end? Does the order of the players within each season appear right? Is Abdur-Rahim one of the best players in Vancouver/Memphis history? Please let me know your expert opinions in the comments.

Carrying the burden

There has been some discussion lately as to whether the Lakers are better when Bryant scores a lot versus when he facilitates others’ scoring. I thought I’d look at the game-by-game data to investigate. The correlation between Bryant’s percentage of team total field goals attempted and point differential is -0.539; between Bryant’s percentage of team total assists and point differential, it is -0.609. This is inconclusive, but it indicates that when Bryant does a lot of the scoring, or a lot of the passing (i.e. the team relies on him more and more exclusively), the team does poorly. This is likely because Bryant’s statistical load-carrying results from his teammates playing poorly, and when they do so, the team is more likely to lose. One other interesting finding is that the correlation between Bryant’s ratio of assists to field goal attempts and point differential is 0.312, implying that as Bryant’s game shifts more toward facilitating (ignoring his statistics relative to team totals), his team does better. For comparison, the same correlation for Derek Fisher is -0.032 (essentially insignificant), Gasol is 0.123, Odom is 0.198, Kevin Garnett is 0.302, Paul Pierce is 0.023, Ray Allen is 0.152, and Rajon Rondo is 0.127. To the extent that anything can be gleaned from such simple correlations, we might take away that Bryant’s facilitating is very important to Laker success.
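For readers who want to run the same kind of check on their own game logs, here is a minimal sketch of a Pearson correlation between a player’s share of team field goal attempts and the team’s point differential. The four games below are invented purely to show the mechanics; they are not Bryant’s numbers, and the function name is my own.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of the same length, at least 2 long")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one series is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical game log: player's share of team FGA, and team point differential.
fga_share = [0.20, 0.25, 0.30, 0.35]
point_diff = [12, 5, -2, -9]

r = pearson_r(fga_share, point_diff)
```

The made-up data are perfectly linear, so r comes out at -1; real game logs will be far noisier, and a correlation on its own says nothing about which way the cause runs.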

I thought I would also take a look at how different players’ contributions affected team outcomes. To do this I used PVC (percent of valuable contributions) for each game for six different players, and plotted that against team scoring differential (the color of each dot indicates the type of game the player had):

Gasol gives us a fairly small sample size, but it would appear that he mostly contributes about 15% of his team’s valuable contributions, and their success changes little when he does more.

Odom appears to have a “sweet spot” in the middle of his PVC range–if he has a poor game or plays low minutes, or if he has to carry the burden, the team falters.

It appears that as Bryant carries an increasing amount of the load for the Lakers, they do poorly. This might mean that if Bryant makes it all about himself, his teammates play badly, or it might mean that in games in which the Lakers are faring poorly, Bryant attempts to take over–the causal arrow in this, and all of the other graphs, is very cloudy.

Allen’s PVC appears to have little to do with team success.

The trend is somewhat ambiguous, but it appears that Pierce has bigger games when the Celtics play well.

Like Bryant’s, this is another fairly obvious downward trend. My interpretation is that the high PVC games are those in which Garnett’s teammates fail to show up, and thus he carries the load. With little or no help, he cannot win the game on his own.

Let me know if these graphs hold any more insight for you, or if I’m reading them wrong, or if they mesh with your subjective notions. If I can get the data, I might look at Michael Jordan’s numbers, to see whether or not he really could take over games and lead his team to victory.

Edit: Here’s Jordan’s Chicago years, regular season post-1986-87 (my data only goes back that far). He shows a similar pattern, although his PVC goes substantially higher–he had some big games. The sparseness of the data on the high end makes it tough to make firm conclusions. Note also that one problem with interpreting all of these is that in huge blowouts (either for or against), the starters are often taken out early, leading to a diminished PVC. This could be fairly heavily influencing the trends here.

Double edit: You might be interested in seeing the results of the first game, in terms of individual contributions, which I’ve tabulated here.

Triple edit: Since it’s apparently been lost on one of our commenters, I should mention that the fit lines are loess smoothers, and were not, in fact, drawn in MS Paint.
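A note on PVC for anyone trying to reproduce the plots: the formula is not spelled out in this post, but the figures in the Ray Allen box score table are consistent with PVC being a player’s MEV divided by his team’s total MEV for that game. The sketch below assumes that definition and uses the Boston MEV column from that table; the function name is my own.

```python
def pvc_by_player(team_mev):
    """Assumed definition: each player's share of his team's total MEV."""
    total = sum(team_mev.values())
    if total == 0:
        raise ValueError("team MEV sums to zero, so shares are undefined")
    return {player: mev / total for player, mev in team_mev.items()}


# Boston MEV values copied from the box score table for that game.
boston_mev = {
    "Ray Allen": 25.69, "Paul Pierce": 18.07, "Kevin Garnett": 16.66,
    "James Posey": 12.84, "Eddie House": 8.51, "Rajon Rondo": 4.90,
    "P.J. Brown": 2.07, "Leon Powe": 0.80, "Kendrick Perkins": 0.57,
    "Tony Allen": 0.52, "Sam Cassell": -2.90,
}

boston_pvc = pvc_by_player(boston_mev)
```

Under this assumption Ray Allen’s share rounds to 0.293 and Pierce’s to 0.206, matching the PVC column in the table. Negative MEV games produce negative shares, as they do there.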

The Road to the NBA Finals

Using a modification of the BoxScores formula, and applying it to single games, I’ve developed an estimate of Points Created (which accounts for both points scored by the player’s team (positive), and points scored by their opponent (negative)) for each player on both NBA finalists for each game through the end of the Conference Finals. The result is a very accurate estimate of player contributions to victory in each game–the big stars often rise to the top, but in an 82-game season, followed by a 15- or 20-game playoff run, role players often make the difference. Bryant, Gasol and Odom often lead their team’s production, but even Walton and Radmanovic have led the Lakers to victory. The Celtics appear even more balanced in this sense–while Garnett, Pierce, and Allen often top the Points Created table, Rondo and Perkins often play huge roles–even leading their team to victory in the fifth game of the Conference Finals against Detroit.

Points Created by Game: Boston

Points Created by Game: LA

One of the more interesting things about this representation is the different types of games each player has–indicated here by coloration. Games in which a player primarily contributes with scoring are redder, those in which he does “interior” things–blocks and rebounds–are bluer, and those with mainly perimeter contributions–assists and steals–are greener. Most games fail to fall easily into a single category, and so color combinations identify the degree to which each player’s performance can be characterized in each of these three ways. Thus, a purple box indicates a scoring performance by an interior player, a yellow box shows scoring/perimeter, etc. It is interesting to note that these colors often reflect commonly held notions about playing style–Bryant is often relatively red, leading his team offensively, and occasionally green, when he passes out assists or locks down on defense. Garnett is every possible color in the graphic, revealing his multifaceted ability to do whatever his team needs.
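To make the color logic concrete, here is one simple way such a mapping could work: treat the scoring, perimeter, and interior components of a game as the red, green, and blue channels, each scaled by its share of the total. This is only a sketch of the idea; the exact scaling used in the graphics is not described here, and the function name and inputs are hypothetical.

```python
def style_to_rgb(scoring, perimeter, interior):
    """Map three style components to an (R, G, B) tuple of 0-255 integers.

    Assumed scheme: red = scoring, green = perimeter, blue = interior,
    each channel proportional to that component's share of the total.
    """
    # Clamp negative components to zero so every channel stays in range.
    parts = [max(0.0, scoring), max(0.0, perimeter), max(0.0, interior)]
    total = sum(parts)
    if total == 0:
        return (0, 0, 0)
    return tuple(round(255 * p / total) for p in parts)


pure_scorer = style_to_rgb(1, 0, 0)        # all scoring: red
interior_scorer = style_to_rgb(1, 0, 1)    # scoring plus interior: purple
perimeter_scorer = style_to_rgb(1, 1, 0)   # scoring plus perimeter: dark yellow
balanced = style_to_rgb(1, 1, 1)           # even mix: gray
```

Under this scheme an evenly balanced game comes out gray, while mixes of two components land on the in-between hues described above.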

Below is a crudely compiled comparison of the teams’ head to head matchups during the regular season (click through to get a top-to-bottom matchup of both teams’ entire seasons). Boston won both meetings decisively, though the Lakers had not yet gained Pau Gasol (though they did have Bynum, who will not play in the finals). Garnett and Pierce had huge second games, creating 37.9 and 34.5 points, respectively. Bryant’s performance in the first was below his average, and in the second was dismal–focusing mainly on scoring (orange/pink coloration), and failing to create much difference between the teams. I would be interested to hear any comments you might have, or things you notice in the graphics. There is a ton of information to glean here, and I haven’t begun yet to absorb much of it.

Both Teams’ Seasons