Hall of Fame Probability

Introduction

What statistics or accomplishments have the Hall of Fame voters deemed to be most important? This question can be answered using a technique called logistic regression. The logistic regression model is a binary response model where the response is classified as either a "success" (in this case, being elected to the Hall of Fame) or a "failure" (not being elected to the Hall of Fame). One or more predictor variables are selected and the resulting model can be used to predict the probability of a success given certain values of the predictor(s).
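In symbols, the model described above can be sketched as follows. The notation is an assumption added for illustration: x_i are the predictor values, beta_i are the fitted coefficients, k is the number of predictors, and eta is their weighted sum. Logistic regression models commonly include an intercept term as well; none is shown here because the coefficient table on this page does not list one.

```latex
P(\text{HoF election}) = \frac{e^{\eta}}{1 + e^{\eta}},
\qquad
\eta = \sum_{i=1}^{k} \beta_i x_i
```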

Building the Model

For the Hall of Fame problem, I considered as many predictor variables as I could think of, but I excluded statistics that have not been kept for most of the NBA's history (e.g., steals). My player pool consisted of players who had played a minimum of 400 NBA games and had been eligible for at least one Hall of Fame election. After trying numerous models, my final model had eight predictor variables:

  1. height (in inches)
  2. last season indicator (1 if 1959-60 or before, 0 otherwise)
  3. NBA points per game
  4. NBA rebounds per game
  5. NBA assists per game
  6. NBA All-Star game selections
  7. NBA MVP award shares
  8. NBA championships won

All of the predictors listed above were significant at the 0.05 level except for NBA MVP award shares, which had a P-value of 0.07. However, every NBA MVP award winner who is eligible for the Hall of Fame has been elected, so I thought it was important to keep that term. Other than height, all of the predictors had positive coefficients. ABA statistics, honors, and championships were not important predictors of Hall of Fame status, which is why I only used NBA statistics in my final model. I don't like ignoring the ABA statistics, but that's what the voters have apparently done. Keep in mind that my goal was not to determine who should be in the Hall of Fame, but rather who is likely to be in the Hall of Fame.

The table below gives the parameter estimates of the coefficients for each of the eight predictors:

height                          -0.20518
last season indicator            4.21609
NBA points per game              0.45098
NBA rebounds per game            0.37523
NBA assists per game             0.39329
NBA All-Star game selections     0.48684
NBA MVP award shares             3.18416
NBA championships won            1.03335

Example

The parameter estimates given in the previous section can be used to obtain the predicted probability of Hall of Fame election for a particular player. I will go through an example using Jo Jo White. Find the values of the eight predictor variables for White, multiply them by the coefficients given in the table above, and find the sum of the products:

height                        -0.20518 * 75      = -15.3885
last season indicator          4.21609 *  0      =   0
NBA points per game            0.45098 * 17.2031 =   7.7583
NBA rebounds per game          0.37523 *  3.9964 =   1.4996
NBA assists per game           0.39329 *  4.8925 =   1.9242
NBA All-Star game selections   0.48684 *  7      =   3.4079
NBA MVP award shares           3.18416 *  0.0073 =   0.0232
NBA championships won          1.03335 *  2      =   2.0667
-----------------------------------------------------------
                                                     1.2914
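The multiply-and-sum step above can be reproduced in a few lines of Python. This is only a sketch: the dictionary keys and the function name linear_score are made up for illustration, while the numbers are copied from the tables on this page.

```python
# Coefficients copied from the parameter-estimate table (no intercept is listed there).
COEFFICIENTS = {
    "height": -0.20518,
    "last_season_indicator": 4.21609,
    "points_per_game": 0.45098,
    "rebounds_per_game": 0.37523,
    "assists_per_game": 0.39329,
    "all_star_selections": 0.48684,
    "mvp_award_shares": 3.18416,
    "championships": 1.03335,
}

# Jo Jo White's predictor values, copied from the worked example.
JO_JO_WHITE = {
    "height": 75,
    "last_season_indicator": 0,
    "points_per_game": 17.2031,
    "rebounds_per_game": 3.9964,
    "assists_per_game": 4.8925,
    "all_star_selections": 7,
    "mvp_award_shares": 0.0073,
    "championships": 2,
}


def linear_score(player, coefficients=COEFFICIENTS):
    """Return the sum of coefficient * value over every predictor."""
    return sum(coefficients[name] * player[name] for name in coefficients)


score = linear_score(JO_JO_WHITE)
print(f"{score:.4f}")
```

The printed score can differ from the table's 1.2914 in the last decimal place, because the table adds up products that were already rounded to four decimals.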

To find the predicted probability of Hall of Fame election, do the following:

P(HoF election) = e**1.2914 / (1 + e**1.2914)
                = 0.784

Based on Jo Jo White's statistics and accomplishments, the model's predicted probability that he has been elected to the Hall of Fame is 0.784.
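The conversion from a summed score to a probability can be checked with a short Python sketch. The function name logistic is chosen here for illustration, and the input 1.2914 is the sum from the worked example.

```python
import math


def logistic(score):
    """Map a linear score to a probability: e**score / (1 + e**score)."""
    return math.exp(score) / (1.0 + math.exp(score))


probability = logistic(1.2914)
print(round(probability, 3))  # 0.784
```

A score of 0 maps to a probability of exactly 0.5, so positive scores correspond to players the model considers more likely than not to have been elected.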

Summary

Hall of Fame probabilities are presented for all players with a minimum of 400 NBA games played. Although it can be risky to make predictions for active players, you can think of these probabilities as answering the question "If this player retired today, what is the probability he would be elected to the Hall of Fame?"

The model was built using a pool of 668 players. One method to assess classification accuracy is to compare each player's estimated Hall of Fame probability to the actual result. Of the 668 players, 78 had been elected to the Hall of Fame and 590 had not. If a player's predicted probability of election was greater than or equal to 0.5, I predicted that he was in the Hall of Fame. Of the 78 players in the Hall of Fame, 63 were correctly classified (80.8%) and 15 were not (19.2%). Of the 590 players not in the Hall of Fame, 583 were correctly classified (98.8%) and 7 were not (1.2%). Overall, 646 of the 668 players (96.7%) were correctly classified by the model.
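The classification percentages above follow directly from the four reported counts, as this short Python sketch shows. The variable names are illustrative, and "sensitivity" and "specificity" are the standard statistical terms for the two per-group rates, not terms used by the model's author.

```python
# Counts reported above, using a 0.5 probability cutoff.
hof_correct, hof_missed = 63, 15            # players in the Hall of Fame
non_hof_correct, non_hof_wrong = 583, 7     # players not in the Hall of Fame

hof_total = hof_correct + hof_missed
non_hof_total = non_hof_correct + non_hof_wrong
total = hof_total + non_hof_total

# Share of Hall of Famers the model classified correctly.
sensitivity = hof_correct / hof_total
# Share of non-Hall of Famers the model classified correctly.
specificity = non_hof_correct / non_hof_total
# Share of all players the model classified correctly.
accuracy = (hof_correct + non_hof_correct) / total

print(f"Hall of Famers correct:     {sensitivity:.1%}")
print(f"Non-Hall of Famers correct: {specificity:.1%}")
print(f"Overall correct:            {accuracy:.1%}")
```

Because only 78 of the 668 players are in the Hall of Fame, the overall figure is dominated by the non-Hall of Famers, so the two per-group rates are worth reading alongside it.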