I have previously discussed in this space how to measure how similar any two players are, based solely on their boxscore statistics, and attempted, to some extent, to justify the approach theoretically. Now, to unveil the results: from my dataset of all modern (1979-2007) NBA players, I took the top 500 according to the formula (min^(10/9))/gp, a kind of weighted minutes-per-game statistic that values both playing time and longevity. This let me pick up some of the best younger players (admittedly measured poorly, by playing time) along with a good number of veterans. I summed each player's career statistics across the entire period and ran them through the distance-finding algorithm discussed in the previous post. The result is a matrix of distances, which I offer to you here as a 501 x 501 cell .csv file, zipped to about 1.3 MB:
However, because of size considerations, I’ve also posted a selected subset of the comparisons to Google Docs; it should be sortable but not editable:
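For anyone who wants to reproduce the selection step, here is a minimal sketch of the (min^(10/9))/gp ranking. Note that the formula is the same as minutes per game multiplied by min^(1/9), which is why it rewards longevity as well as playing time. The function names and the three sample players are hypothetical, invented purely for illustration; they are not drawn from my dataset.

```python
def selection_score(minutes, games):
    """Weighted minutes-per-game score: minutes^(10/9) / games played."""
    return minutes ** (10 / 9) / games


def top_players(career_totals, n=500):
    """Return the n player names with the highest selection score.

    career_totals maps a player name to (career minutes, career games played).
    """
    ranked = sorted(
        career_totals,
        key=lambda name: selection_score(*career_totals[name]),
        reverse=True,
    )
    return ranked[:n]


# Hypothetical career totals, made up for illustration only.
totals = {
    "veteran_starter": (36000, 1000),  # 36 mpg over a long career
    "young_starter": (3600, 100),      # 36 mpg over a short career
    "veteran_reserve": (20000, 1000),  # 20 mpg over a long career
}

print(top_players(totals, n=2))
```

The two starters play identical minutes per game, but the veteran ranks higher because of the extra min^(1/9) factor; the young starter still outranks the long-serving reserve.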
Now, for the punchline: a method such as this can give us new insights. If we accept that the comparisons it makes are valid in general, then we may be able to accept the ones that surprise us. For example, if the matching algorithm tells us that among the players most statistically similar to Michael Jordan are Kobe Bryant, LeBron James, Tracy McGrady, Dwyane Wade, Vince Carter, Clyde Drexler, and Paul Pierce, I would be tempted to accept the validity of such comparisons. By the same token, I would argue that I should be willing to accept its conclusion that the player most similar to Jordan is none of these but rather Chris Mullin (who is of course frequently compared to Larry Bird, seeing as they are both Caucasian, but to whom I have never heard Jordan compared).
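Reading a player's closest matches off the distance matrix amounts to sorting one row. The sketch below assumes that the first row and first column of the .csv hold player names (which would account for the 501 x 501 cells) and that a smaller distance means a more similar player; check both against the actual file. The toy matrix and its player names are made up for illustration.

```python
import csv
import io


def load_distance_matrix(handle):
    """Read a square distance matrix whose first row and column hold names."""
    rows = list(csv.reader(handle))
    names = rows[0][1:]  # header row, skipping the empty corner cell
    matrix = {}
    for row in rows[1:]:
        distances = (float(value) for value in row[1:])
        matrix[row[0]] = dict(zip(names, distances))
    return matrix


def most_similar(matrix, player, k=5):
    """Return the k players with the smallest distance to `player`."""
    others = [
        (distance, name)
        for name, distance in matrix[player].items()
        if name != player  # a player is trivially at distance 0 from himself
    ]
    return [name for distance, name in sorted(others)[:k]]


# Toy 3-player matrix standing in for the real file (values are invented).
toy_csv = """,Player A,Player B,Player C
Player A,0.0,1.5,0.4
Player B,1.5,0.0,2.0
Player C,0.4,2.0,0.0
"""

matrix = load_distance_matrix(io.StringIO(toy_csv))
print(most_similar(matrix, "Player A", k=2))
```

To run this on the full matrix, pass an opened file handle for the unzipped .csv in place of the `io.StringIO` object.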
To conclude, I urge you to play around with both the Google Spreadsheet and the entire .csv matrix on your own. Please let me know whether you find that the comparisons generally ring true, and if so, whether any of them surprised you.