BoxScores: Player contributions to team success

Note: Since this post was published, the Winshares formula has undergone some substantive revisions, as well as a renaming. For the most current iteration and accurate tables and graphs, please see the BoxScores page.

This post is a lengthy discussion of the theory and methodology behind the Winshares player value metric. If you are already familiar enough with Winshares, or are impatient, read the “In brief” section just below, and then you might want to skip ahead to the payoff graphics at the very end of this post. As always, comments and criticisms are encouraged!

In brief

Winshares are a statistic developed to estimate a player’s value in terms of wins. Combining individual statistics with team performance, Winshares allocate credit for team wins according to each team member’s contributions to team total production. As of the end of the 2007-08 regular season, Winshares are calculated as follows:

winshr = (val / team val) * team wins

val = pts – fgx*0.5603802 – ftx*0.9345311 + as*0.7697530 + or*0.8709732 + dr*0.7111727 + st*0.9190908 + bk*0.9495596 – to*0.8473544 – pf*0.7729732

Motivation

Why create yet another statistic that attempts to reduce all of player value to one number? Especially when there are so many other good and widely accepted measures already in use? Because the theory is sound, the operationalization is elegant, and the results appear valid.

Why use boxscore stats, ignoring plus/minus and everything that modern science now knows about possessions and efficiency, especially since defense is so poorly captured and other statistics, like assists, are arbitrary? Because boxscore stats go back to the beginning of professional basketball. Plus/minus is extremely data-intensive to calculate, and we have no way of getting that kind of data for most historical games. I’m ignoring possessions, and not emphasizing defense, because it is my belief that comparing one player’s boxscore stats to those of his team gives a reasonable estimate of player contributions–sometimes overestimating, other times underestimating, but on average, getting it approximately right. Mostly, though, calculating Winshares is possible as long as the same stats are tracked for all players on a team, and we know how many times the team won–meaning it can be applied very generally.

Why even try to use statistics to measure player value? You can’t capture that with a number! There is much to be said on both sides of this issue. I am of the opinion that statistics ought to be considered within a larger context of other data, qualitative and quantitative. However, I do feel strongly that numbers have a lot to tell us: they allow us the hope of greater objectivity, and therefore possibly more accurate assessments. When applied identically to all players, Winshares will adjudicate “fairly,” paying no attention to max contracts, shoe endorsements, nicknames, or “intangibles.” Intangibles are tricky. They may indeed be part of player value, but they are also, by definition, immeasurable, and may therefore expand to fill whatever role is required of them. Was your favorite player not voted league MVP? Certainly the voters failed to consider his intangibles, which would have easily put him over the top…

Why are Winshares measured in that specific way? Don’t you know that linear weights are no good, or that assists are worth much more than you give them credit for? Read on…

Theory

Imagine a cooperative grocery store, owned by those who work there. At the end of one year, the store’s revenues exceed its expenditures by a large margin, and the workers are to be paid out of this surplus. One concept of fairness might dictate that a worker who worked p% of the total man-hours for that year ought to receive p% of the surplus. Arguably, he contributed p% of whatever effort determined whether or not the store would succeed, and should be rewarded accordingly. A worker working a large number of hours could be said to have contributed more to the store’s success or failure than another who only worked one shift a month–if the store profits by a large margin, that employee should receive a larger share of the windfall, just as if the store loses money, that employee should be held culpable for a larger share of the deficit.

Now imagine another similar store competing in the same market. Its surplus at the end of the year is twice that of the first store. Is it possible to compare the value, in terms of surplus, of employees from the two different stores? I would argue that it is possible: if pay is allocated in the same manner in both stores, with worker i in store j receiving payment in proportion to his labor contribution, the worker who receives the highest paycheck is the most valuable. That is, if pay is equal to worker man-hours over store total man-hours times store surplus, we can compare employees across any two firms in the same market.
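The payment rule in this two-store comparison can be sketched in a few lines of Python. The worker labels, hours, and surplus figures below are invented purely for illustration; only the rule itself (share of man-hours times store surplus) comes from the text.

```python
def pay(hours, total_hours, surplus):
    """Pay = (worker man-hours / store total man-hours) * store surplus."""
    return hours / total_hours * surplus

# Hypothetical stores: store B's surplus is twice store A's.
stores = {
    "A": {"surplus": 100_000.0, "hours": {"a1": 2000, "a2": 500, "a3": 1500}},
    "B": {"surplus": 200_000.0, "hours": {"b1": 1500, "b2": 2500}},
}

paychecks = {}
for store_name, store in stores.items():
    total_hours = sum(store["hours"].values())
    for worker, hours in store["hours"].items():
        paychecks[(store_name, worker)] = pay(hours, total_hours, store["surplus"])

# Every worker is now on one scale, so workers from different stores can be ranked.
ranking = sorted(paychecks, key=paychecks.get, reverse=True)
for key in ranking:
    print(key, paychecks[key])
```

Note that a1 works more hours than b1 yet ranks below him, because b1's hours were contributed to the more successful store. This is the same interaction between individual share and group success that Winshares relies on.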

But wait: what if some employees are more efficient workers than others? What if Alice can generate three times the revenue that Bob can generate in the same number of hours? Doesn’t our payment formula then overpay Bob and under-reward Alice, and doesn’t this complicate yet again the comparison across firms? Yes it does, and so we might try to find better measures of worker contributions to the surplus. Perhaps we could keep statistics on the number of cans shelved, or the number of transactions tendered, or the number of smiles flashed. If we could figure out even just the relative value of each of these things (that is, not necessarily how they each translate into surplus, but whether one smile is worth two cans shelved, etc.), then we are back on track. It doesn’t matter whether or not we can measure exactly how much revenue is brought in by each additional shelf stocked (although this would be interesting and useful); if we know that it’s worth more (by some scalar factor) to clean the bathroom than it is to check receipts at the door, we can still estimate each worker’s contribution to the total amount of valuable work being done at the store.

This analogy carries over very well to sports, and specifically here, to basketball. A player who plays fully 1/5th of total team minutes (that is, 48 minutes per game for 82 games) ought to be credited with approximately 1/5th of his team’s success or failure, both of which can be measured in terms of wins. Using minutes to assess contributions runs into the same problem as in the stores above: minutes say nothing about efficiency. As such, it is useful to find other statistics that more accurately estimate contributions to team success. The statistics employed in Winshares are boxscore stats, such as points, rebounds, assists, missed shots, etc. These are imperfect measures, but to the extent their relative value can be assessed, they may be useful in estimating each player’s contribution.

Calculation

Unfortunately, this relative evaluation is very difficult. It is often claimed by more “sophisticated” observers of the game that most fans fail to look past points-per-game numbers, giving infinitely more weight to scoring than to any other contributions. Yet, it is exceedingly difficult to identify just what the appropriate weights might be. Multiple regression analysis yields somewhat unsatisfactory results when applied in a straightforward manner, typically finding, for example, that offensive rebounds are actually detrimental to team success. Other work, including that done by Berri and Hollinger, is much more thorough, but leaves something to be desired (a topic which has been covered better elsewhere than can possibly be done by this author in this exposition).

As for Winshares, it would be disingenuous to claim that the ideal and true set of values has been found, but it is my belief that the reasoning is sound, and the results pass the “laugh test,” that is, given a subjective assessment of the sport, the relative importance of each boxscore statistic seems to be, at the very least, in the right order.

To identify the weights used, we may begin with a simple but strong assumption: the most valuable “good things” are those that opponents are most resistant to allowing, and thus are relatively rare, while the most detrimental “bad things” are those that a player is most trying to avoid, and thus are similarly relatively rare. With this in mind, I present cumulative totals for each of 10 boxscore counting stats from 1979-80 through 2007-08 (which I call the Modern era, characterized by the introduction of the three-point shot to NBA play):

pts      fgx*     ftx*    as       or      dr       st      bk      to      pf
6384067  2806562  417958  1469912  823716  1843893  516530  322015  974500  1449354

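* fgx and ftx are field goals missed and free throws missed. The remaining columns are points (pts), assists (as), offensive rebounds (or), defensive rebounds (dr), steals (st), blocks (bk), turnovers (to), and personal fouls (pf).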

Dividing each of these totals by the sum of the totals (17,008,507), we arrive at the following frequencies:

pts      fgx      ftx     as       or      dr       st      bk      to      pf
0.37535  0.16501  0.0246  0.08642  0.0484  0.10841  0.0304  0.0189  0.0573  0.08521

Normalizing these frequencies to that of points, we get:

pts  fgx      ftx     as       or     dr       st      bk      to      pf
1    0.43962  0.0655  0.23025  0.129  0.28883  0.0809  0.0504  0.1526  0.22703

Then subtract each of the above from 1, so that the rarer occurrences receive more weight. The points coefficient is the exception: it is fixed at 1 rather than the 0 the subtraction would give, because the ultimate aim of all defense is to prevent scoring, and the ultimate aim of all offense is to score:

pts  fgx      ftx     as       or     dr       st      bk      to      pf
1    0.56038  0.9345  0.76975  0.871  0.71117  0.9191  0.9496  0.8474  0.77297

Assign positivity and negativity according to whether each is helpful or deleterious to team success, and we arrive at a set of scalars for estimating valuable contributions (often abbreviated val):

val = pts – fgx*0.5603802 – ftx*0.9345311 + as*0.7697530 + or*0.8709732 + dr*0.7111727 + st*0.9190908 + bk*0.9495596 – to*0.8473544 – pf*0.7729732
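The steps from raw totals to signed weights can be reproduced directly. The following is a minimal Python sketch, with the totals copied from the table above; the variable names and the `NEGATIVE` set are illustrative choices, not part of the original calculation.

```python
# Cumulative totals, 1979-80 through 2007-08, copied from the table above.
totals = {
    "pts": 6384067, "fgx": 2806562, "ftx": 417958, "as": 1469912,
    "or": 823716, "dr": 1843893, "st": 516530, "bk": 322015,
    "to": 974500, "pf": 1449354,
}
# Stats treated as deleterious to team success enter val with a minus sign.
NEGATIVE = {"fgx", "ftx", "to", "pf"}

grand_total = sum(totals.values())
freq = {stat: n / grand_total for stat, n in totals.items()}  # frequencies
rel = {stat: f / freq["pts"] for stat, f in freq.items()}     # normalized to points

weights = {}
for stat, r in rel.items():
    # Rarer events get heavier weights; points are fixed at 1 by construction.
    magnitude = 1.0 if stat == "pts" else 1.0 - r
    weights[stat] = -magnitude if stat in NEGATIVE else magnitude

for stat, w in weights.items():
    print(f"{stat:>3} {w:+.7f}")
```

Because every frequency is divided by the points frequency, the grand total cancels out: each normalized value is simply that stat's total divided by total points.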

Any player’s val that falls below zero is then set to zero, though val is rarely a large negative number. Compared to the difficulty of assessing valuable contributions, the final steps in the Winshares calculation are extremely simple: merely find each player’s percent contribution to his team’s total sum of valuable contributions from all players, and multiply this by team wins:

winshr = (val / team val) * team wins
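Put together, the two formulas amount to a weighted sum, a floor at zero, and a proportional split of team wins. Here is a minimal Python sketch under those assumptions; the three-player roster and its stat lines are invented for illustration, and the function names are hypothetical.

```python
# Signed weights copied from the val formula above.
WEIGHTS = {
    "pts": 1.0, "fgx": -0.5603802, "ftx": -0.9345311, "as": 0.7697530,
    "or": 0.8709732, "dr": 0.7111727, "st": 0.9190908, "bk": 0.9495596,
    "to": -0.8473544, "pf": -0.7729732,
}

def val(line):
    """Weighted sum of a season stat line, with negative results set to zero."""
    raw = sum(w * line.get(stat, 0) for stat, w in WEIGHTS.items())
    return max(raw, 0.0)

def winshares(roster, team_wins):
    """Split team_wins among players in proportion to each player's val."""
    vals = {player: val(line) for player, line in roster.items()}
    team_val = sum(vals.values())
    if team_val == 0:
        return {player: 0.0 for player in roster}
    return {player: v / team_val * team_wins for player, v in vals.items()}

# Invented season totals for a hypothetical three-player team.
roster = {
    "star": {"pts": 2000, "fgx": 700, "ftx": 100, "as": 500, "or": 80,
             "dr": 400, "st": 120, "bk": 30, "to": 250, "pf": 200},
    "role": {"pts": 800, "fgx": 350, "ftx": 50, "as": 150, "or": 100,
             "dr": 300, "st": 50, "bk": 40, "to": 100, "pf": 180},
    "bench": {"pts": 10, "fgx": 30, "to": 12, "pf": 20},  # raw val is negative
}

ws = winshares(roster, team_wins=50)
for player, share in ws.items():
    print(f"{player:>5} {share:.2f}")
```

By construction the shares add up to the team's win total, and a player whose raw val is negative receives exactly zero rather than a negative share.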

We are left with an estimate of individual player value that combines individual contributions and team success, and allocates the most credit to those players who did the most to win the most. There is just one adjustment made to allow comparisons across all NBA seasons: for seasons prior to the official distinction between offensive and defensive rebounds, the formula is adjusted to incorporate total rebounds in their stead.

Discussion

The first thing to note is that as we apply the formula increasingly further back in time, we might become somewhat less certain of its absolute accuracy as the boxscore statistics on which it is based drop from the official record. Thus, for the very earliest years of the BAA, we might not be as confident in our estimate as for most years since, but the results are still very compelling, and seem to hold up to scrutiny despite the relative dearth of data. One of the merits of Winshares as a measure is that it is relatively flexible across a variety of situations, relying as it does on player percent contributions, which can almost always be measured in some manner.

Another caveat to bear in mind is that Winshares is a season-cumulative statistic, and so the ceiling varies by the number of games played in a season. Winshares for the lockout-shortened season of 1998-99 are much lower than for other contemporary seasons, because all teams won fewer games than they normally would have. Adjustments can easily be made, however, by finding per-game or per-minute Winshare rates, and making comparisons at that level. This helps, too, in determining the impact of an injured player, given that he has played fewer games. However, the initial impetus for constructing Winshares was to estimate player value in terms of wins, and this is best done on a season-cumulative scale.
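The rate adjustments mentioned above are simple divisions. The sketch below uses hypothetical function names and an invented player total; the `prorate` helper, which rescales a short-season total to an 82-game schedule, is one possible adjustment and is an assumption rather than something described in the post.

```python
def per_game(winshr, games_played):
    """Winshares per game played (0.0 if the player never appeared)."""
    return winshr / games_played if games_played else 0.0

def per_minute(winshr, minutes_played):
    """Winshares per minute played (0.0 if the player never appeared)."""
    return winshr / minutes_played if minutes_played else 0.0

def prorate(winshr, season_games, full_season_games=82):
    """Assumed adjustment: scale a short-season total to a full schedule."""
    return winshr * full_season_games / season_games

# Invented example: 10 Winshares earned over the 50-game 1998-99 schedule.
short_season_total = 10.0
print(per_game(short_season_total, 50))   # rate per game
print(prorate(short_season_total, 50))    # 82-game equivalent
```

Rates make short and full seasons comparable, but as the text notes, the cumulative total remains the quantity that is directly interpretable as wins.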

One thing done relatively poorly by Winshares in its current iteration is measurement of the value of players traded during the season. To do this completely accurately, it would be useful to isolate only the games the player appeared in for each of his several teams, looking at individual statistics and team wins within those sub-season units. However, this sort of analysis requires data not generally available in convenient form, and truly, the logical extension of this idea is fairly well captured by the plus/minus statistic. As it stands, Winshares still does a relatively good job (subjectively assessed) in measuring traded players’ value, but it is something worth noting.

Winshares in application

Often understanding is best achieved through application, and so I present

The Top 1,000 Winshare Seasons

covering the NBA, ABA, and BAA from 1946-2008. Keep in mind the above caveats about data availability, especially for seasons prior to 1951-52. In a similar vein, here is a list of

The Top 100 Winshare Careers

again, this is cumulative across the entirety of each player’s career, and so players with longevity are advantaged. I have included games played in this listing, to allow the reader to make his or her own adjustments.

Finally, every player, every team played for, 2007-08 season.

Geometric representation

One of the more useful ways to conceptualize Winshares is as player percent valuable contributions * team success. This has a particularly interesting expression in geometric terms, where Winshares can be thought of as the area of the rectangle created by multiplying valpct by team wins. The following series of visualizations depicts Winshares as a geometric comparison of player value. The color scheme is based on playing style–more detail on this classification may be found here.

2007-08 NBA: Chris Paul edges out Kobe Bryant as most valuable player according to Winshares, Kevin Garnett and Paul Pierce turn in stellar seasons for the Celtics, and LeBron James carries a huge load for his team, and is rewarded in terms of Winshares, if not in post-season success.

1986-87 NBA: A season featuring more all-time greats than perhaps any other (as noted here), we see Larry Bird and Magic Johnson at the height of their rivalry, Michael Jordan and Hakeem Olajuwon coming into their own, and too many other star players to even mention.

1971-72 NBA & ABA (combined): Classic Lakers and Celtics teams, a young Dr. J, Kareem’s greatest year, an almost-as-great year from Artis Gilmore, and countless other NBA past greats.

Sacramento Kings Franchise History: This storied franchise didn’t quite make the playoffs in a very competitive 2007-08 Western Conference, but its history is littered with greats such as Oscar Robertson and Chris Webber.

3 responses to “BoxScores: Player contributions to team success”

  1. Chris Lawnsby

    I’m aware of many of the criticisms levied against Hollinger’s PER, but I haven’t read anything criticizing Berri.

    I’d be interested in knowing what the specific critiques of Berri’s work are, and where I could find them.

  2. Pingback: Reading is Great! Tuesday’s NBA Rumors, Breaking News, and Blog Links - EmptyTheBench.com

  3. The Berri Wins Produced community and the +/- supporters have both championed their own rating systems, at times at the expense of the other. Many of the Berri criticisms can be found at the APBRmetrics forum.

    One of the simplest, most intuitive criticisms of his work, especially for people not intimately familiar with regression analysis, is that he overvalues rebounding. Think of some good rebounders, go over to his site, and I think you will find that on the whole they are overvalued in comparison to “equal” players that don’t board as well.
