Estimating team chemistry

I posted recently to introduce a new method of characterizing basketball playing styles, which I call the SPI Style Trichotomy. The advantage of this methodology is, among other things, that it is an objective, performance-based means of characterizing player type that offers substantially more nuance and accuracy than the traditional position adjectives.

Well, today I’m going to take a step back from this seamless, continuous-spectrum perspective and impose some order, so that I can investigate the value of each playing style. Since the SPI characterizations themselves are productivity- and value-independent, it may be of interest to see the degree to which employing a player who plays a given style can add to team success.

My first step was to identify, for each player-season, which of seven arbitrary playing style categories the player most closely matches. A quick look at the SPI Spectrum Graphic indicates that I’ve already “named” six spokes: each of the pure SPI styles, plus their opposites. For this post (and possibly into the future), I will refer to these six spoke-categories as (counter-clockwise from the 3 o’clock position) Pure Scorer, Perimeter Scorer, Pure Perimeter, Scorer’s Opposite (though catchier, “Defender” is too bold, and inaccurate), Pure Interior, and Interior Scorer. Note that I could have made any number of categories here, and that one of the positives of the SPI System is the lack of such arbitrary distinctions; nevertheless, for the purposes of running a regression, I’ve categorized them.

Each player’s SPI numbers were used to identify the spoke to which he is closest, and for a given season, this is the category into which that player is lumped. To the six already mentioned above, I added a seventh identifier, “Mixed,” for those players who were closer to the center of the diagram than to any of the six style archetypes. To give an idea of the results of the sorting, here is a table presenting the top 50 players for each archetype:

Exemplars of each SPI7 Style

The ranking was derived by summing each player’s BoxScores over seasons during which he was classified under a given archetype–thus, this isn’t a “Best-ever” list, necessarily–just a list of familiar players and their categorization.
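
The sorting step can be sketched in a few lines of Python. This is only an illustration under assumptions the post does not state: each player-season is reduced to a 2-D point on the SPI diagram, the six archetypes sit at unit distance from the center at 60-degree intervals (counter-clockwise from 3 o’clock), and “Mixed” is the center itself. The function name `classify_spi7` and the sample coordinates are hypothetical.

```python
import math

# Spoke names in counter-clockwise order from the 3 o'clock position,
# as listed in the post. The unit radius is an assumption.
SPOKES = [
    "Pure Scorer", "Perimeter Scorer", "Pure Perimeter",
    "Scorer's Opposite", "Pure Interior", "Interior Scorer",
]

# Reference points: the center for "Mixed", plus one point per spoke.
ARCHETYPES = {"Mixed": (0.0, 0.0)}
for i, name in enumerate(SPOKES):
    angle = math.radians(60 * i)
    ARCHETYPES[name] = (math.cos(angle), math.sin(angle))


def classify_spi7(x, y):
    """Return the archetype whose reference point is nearest to (x, y)."""
    return min(
        ARCHETYPES,
        key=lambda name: math.dist((x, y), ARCHETYPES[name]),
    )


print(classify_spi7(0.9, 0.1))    # a point near the 3 o'clock spoke
print(classify_spi7(0.1, -0.05))  # a point near the center
print(classify_spi7(-0.8, 0.0))   # a point near the 9 o'clock spoke
```

With the real SPI coordinates the reference points would presumably differ, but the nearest-point logic would be the same.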

Following this categorization process, I calculated, for each team-season, the sum of minutes played by players fitting into each of the seven categories. Thus, the 07-08 Blazers featured 8,307 minutes of playing time from Interior Scorers, coming mostly from Aldridge, Outlaw and Webster, but with contributions from James Jones and Von Wafer. In fact, the team sums are pretty interesting in and of themselves, so I added that table as a second sheet to the Google Doc linked above: SPI7 Team Sums.
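
As a rough sketch of that aggregation, assuming each player-season row carries a team, a season, an archetype label, and a minutes total (the row layout, team codes, and numbers below are made up for illustration, not taken from the data set):

```python
from collections import defaultdict

# Hypothetical player-season rows: (team, season, archetype, minutes).
rows = [
    ("AAA", "2007-08", "Interior Scorer", 2500),
    ("AAA", "2007-08", "Interior Scorer", 1800),
    ("AAA", "2007-08", "Pure Perimeter", 2900),
    ("BBB", "2007-08", "Mixed", 3100),
]


def team_minute_sums(rows):
    """Sum minutes by (team, season) and, within that, by archetype."""
    sums = defaultdict(lambda: defaultdict(int))
    for team, season, archetype, minutes in rows:
        sums[(team, season)][archetype] += minutes
    return sums


sums = team_minute_sums(rows)
print(dict(sums[("AAA", "2007-08")]))
```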

From here, I ran a very basic regression analysis. I was hoping to identify the (relative) value of a minute played by each archetype. Thus, I regressed team win totals on the team minute sums (from the 52-53 season onward, except 1999). This is a very simplistic analysis, but it yielded interesting results (in the variable names, SS is Pure Scorer, SP is Perimeter Scorer, etc.):

Residuals:
     Min       1Q   Median       3Q      Max
-31.4014  -8.4507   0.7324   8.6521  33.1204

Coefficients:
       Estimate Std. Error t value Pr(>|t|)
SSmin 0.0013482  0.0002170   6.213  7.2e-10 ***
SPmin 0.0016794  0.0001653  10.161  < 2e-16 ***
PPmin 0.0024116  0.0001986  12.142  < 2e-16 ***
PImin 0.0027014  0.0002147  12.580  < 2e-16 ***
IImin 0.0026275  0.0002129  12.344  < 2e-16 ***
ISmin 0.0019478  0.0001317  14.785  < 2e-16 ***
MMmin 0.0019719  0.0001493  13.204  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 11.92 on 1170 degrees of freedom
(3 observations deleted due to missingness)
Multiple R-Squared: 0.9217,     Adjusted R-squared: 0.9212
F-statistic:  1967 on 7 and 1170 DF,  p-value: < 2.2e-16
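
The output above looks like an R `lm` summary, and it lists no intercept row, which suggests the fit was forced through the origin. The following is a minimal sketch of that kind of fit in plain Python, not the code actually used for the post: the helper `ols_no_intercept` is hypothetical, and the toy data are synthetic, built from made-up per-minute values so that the recovered coefficients can be checked by eye.

```python
def ols_no_intercept(X, y):
    """Least-squares coefficients for y ~ X with no intercept term.

    Solves the normal equations (X'X) b = X'y by Gaussian elimination
    with partial pivoting. Adequate for a few well-conditioned columns.
    """
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(k)
    ]
    # Forward elimination.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, k):
            factor = A[r][col] / A[col][col]
            for c in range(col, k + 1):
                A[r][c] -= factor * A[col][c]
    # Back substitution.
    b = [0.0] * k
    for i in reversed(range(k)):
        tail = sum(A[i][j] * b[j] for j in range(i + 1, k))
        b[i] = (A[i][k] - tail) / A[i][i]
    return b


# Synthetic team-seasons: minutes for two archetypes, with wins built
# from made-up per-minute values (0.001 and 0.003) and no noise.
X = [[12000, 7680], [9000, 10680], [15000, 4680], [5000, 14680]]
y = [0.001 * a + 0.003 * b for a, b in X]

coefs = ols_no_intercept(X, y)
print([round(c, 6) for c in coefs])
```

One consequence worth keeping in mind: if the model really has no intercept, the reported R-squared is computed against zero rather than against the mean win total, so it is not comparable to the R-squared of a model with an intercept.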

Each coefficient is significant, and we may gain some insight by comparing the magnitudes of these coefficients. The least valuable archetype (in this extremely superficial analysis, which should be taken with several hundred salt grains) is the Pure Scorer, who adds 0.0013 wins per additional minute played. The most valuable (surprisingly?) are the Scorer’s Opposite types (PImin in the output), such as Kevin Garnett, Shane Battier, and Kirilenko, who add roughly double that number of wins per minute played. The rest you can figure out easily from the regression output. As a biased observer, with my own subjective preferences, I like these results a lot: one-dimensional scorers, adored by casual fans but disdained by me, are identified as less valuable than the glue guys, lockdown defenders, and others who focus on things other than scoring (although, as Garnett and Barkley show, they can score, too). Keep in mind that this output is somewhat hastily done and only somewhat less hastily thought-through, but the results are certainly interesting.
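
To put those per-minute coefficients on a more familiar scale, here is a quick back-of-the-envelope calculation. It assumes an 82-game season with no overtime, so 5 × 48 × 82 = 19,680 player-minutes per team, and it takes the fitted values at face value, which the caveats above warn against. Reading PImin as the Scorer’s Opposite variable is an inference from the order of the category names.

```python
# Per-minute coefficients copied from the first regression output.
# SS = Pure Scorer, PI = Scorer's Opposite (inferred), IS = Interior Scorer.
COEF = {"SS": 0.0013482, "PI": 0.0027014, "IS": 0.0019478}

# Assumption: 82 regulation games, five players on the floor for 48 minutes.
TEAM_MINUTES = 5 * 48 * 82  # 19,680

# Implied wins if every minute went to a single archetype.
all_pure_scorer = COEF["SS"] * TEAM_MINUTES
all_scorers_opposite = COEF["PI"] * TEAM_MINUTES

# Implied contribution of the 8,307 Interior Scorer minutes cited
# for the 07-08 Blazers.
blazers_is = COEF["IS"] * 8307

print(f"{all_pure_scorer:.1f} {all_scorers_opposite:.1f} {blazers_is:.1f}")
# prints: 26.5 53.2 16.2
```

On those assumptions, a hypothetical roster made up entirely of Pure Scorer minutes would be credited with about 26.5 wins, versus about 53.2 for one made up entirely of Scorer’s Opposite minutes.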

Another question regarding these playing types might concern the combinations of types which are most effective. From a team-building standpoint, when considering the draft, trades, or free agent acquisitions, such an investigation might prove useful. Using the same set of data as above, I ran another regression, this time using only the interactions of each team’s minutes-by-type sums. In other words, instead of seven independent variables, there are now 21: one for each pair of archetypes (Pure Perimeter/Interior Scorer, Scorer’s Opposite/Pure Scorer, etc.). The interaction means that the minutes for each of the two categories are multiplied together, and this product is the value included in the regression. The output is as follows:

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
SSmin:SPmin -1.956e-07  9.312e-08  -2.100 0.035909 *
SSmin:PPmin -5.331e-08  1.227e-07  -0.434 0.664067
SSmin:PImin  5.307e-07  1.203e-07   4.413 1.11e-05 ***
SSmin:IImin  5.360e-07  7.759e-08   6.908 8.12e-12 ***
SSmin:ISmin  3.934e-07  5.915e-08   6.652 4.46e-11 ***
SSmin:MMmin  1.263e-07  1.066e-07   1.184 0.236560
SPmin:PPmin -1.104e-08  9.102e-08  -0.121 0.903440
SPmin:PImin  5.573e-07  5.492e-08  10.146  < 2e-16 ***
SPmin:IImin  4.618e-07  4.015e-08  11.499  < 2e-16 ***
SPmin:ISmin  3.647e-07  3.967e-08   9.195  < 2e-16 ***
SPmin:MMmin  2.264e-07  6.264e-08   3.614 0.000314 ***
PPmin:PImin  6.224e-07  1.213e-07   5.132 3.37e-07 ***
PPmin:IImin  4.720e-07  9.427e-08   5.007 6.39e-07 ***
PPmin:ISmin  2.489e-07  8.016e-08   3.105 0.001948 **
PPmin:MMmin  4.288e-07  6.213e-08   6.902 8.43e-12 ***
PImin:IImin -3.740e-07  1.114e-07  -3.358 0.000811 ***
PImin:ISmin  1.617e-08  9.902e-08   0.163 0.870291
PImin:MMmin  1.937e-07  9.729e-08   1.991 0.046721 *
IImin:ISmin  1.150e-07  7.674e-08   1.498 0.134291
IImin:MMmin  2.654e-07  8.736e-08   3.038 0.002432 **
ISmin:MMmin  2.622e-07  6.672e-08   3.930 9.00e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 12.15 on 1156 degrees of freedom
(3 observations deleted due to missingness)
Multiple R-Squared: 0.9196,     Adjusted R-squared: 0.9182
F-statistic:   630 on 21 and 1156 DF,  p-value: < 2.2e-16
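
For readers who want to see where the 21 interaction variables come from: with seven minute totals per team-season, every unordered pair of categories gives one product. A small sketch, using the variable names from the output and made-up minute totals (the helper `interaction_terms` is hypothetical):

```python
from itertools import combinations

# Variable names in the order they appear in the regression output.
NAMES = ["SSmin", "SPmin", "PPmin", "PImin", "IImin", "ISmin", "MMmin"]


def interaction_terms(minutes):
    """Return {"A:B": minutes[A] * minutes[B]} for every pair of categories."""
    return {
        f"{a}:{b}": minutes[a] * minutes[b]
        for a, b in combinations(NAMES, 2)
    }


# Made-up minute totals for one team-season (not real data).
example = dict(zip(NAMES, [1500, 3000, 2500, 2000, 3500, 4000, 3180]))
terms = interaction_terms(example)

print(len(terms))            # 21
print(terms["PPmin:PImin"])  # 5000000
```

Products of two minute totals run into the millions, which is why the fitted coefficients are on the order of 1e-07.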

Note: If you thought the above regression was methodologically shaky, this one is even worse! Nevertheless, it’s interesting to look at. Here, the coefficients are much more difficult to interpret, so I would recommend focusing mainly on whether or not they are significant (indicated by *s) and on the sign of each coefficient. It appears as though PP/PI combinations are especially fruitful (the largest positive coefficient), while PI/II is a harmful combination (the most negative coefficient)… Anyway, that’s more than enough for one post, but please feel free to add your own insights, and especially your criticisms!

2 responses to “Estimating team chemistry”

  1. Great work as usual.

    Two questions arise for me:

    1) In looking at this, I also wonder if it would be useful to distinguish between types of pure scorers. For example, Glen Rice and Dominique Wilkins were very different types of pure scorer (3 pt threat vs. slasher). Do you think there would be much difference in value between the two? Could they be distinguished by adding 3pa to the equations?

    2) Since conventional wisdom holds that you need three impact players to win, I also wonder about the effects of adding a third player to the combinations (aside from adding more work for you). ;) For example, scorer’s opposite (Detlef Schrempf/Kareem) and (Shawn Kemp/AC Green) might not be very good pairs, but as a trio with a pure perimeter player (Gary Payton, Magic Johnson) they can be extremely successful.

    I suppose in each case there was a strong pair anyway, pp/ii or pp/pi though….

    A lot to chew on here…

  2. Q: Certainly it is useful to distinguish between scorer types, and adding in three point attempts is something I have considered doing. The problem is that it shifts almost all guard-types waayyy over to the Pure Perimeter position–very few players are left as Scoring types at all. The addition of three-pointers only seems to reinforce differences between “bigs” and “guards” rather than distinguishing amongst scorers very well.

    Also, FWIW, on the SPI graphic, which has a lot more nuance than these seven arbitrary categories, you can see a small difference between, for example, Rice (more on the perimeter side) and Wilkins (more on the interior side), which perhaps attests to their primary function as scorers, and secondary nature as a particular type of scorer.

    As for the three player sets, I did run that regression, and I might post it, but I didn’t find the results all that interesting, to tell you the truth… Thanks for your comments, per normal.
