What statistics or accomplishments have the Hall of Fame voters deemed to be most important? This question can be answered using a technique called logistic regression. The logistic regression model is a binary response model where the response is classified as either a "success" (in this case, being elected to the Hall of Fame) or a "failure" (not being elected to the Hall of Fame). One or more predictor variables are selected and the resulting model can be used to predict the probability of a success given certain values of the predictor(s).

For the Hall of Fame problem, I tried to use as many predictor variables as I could think of, but I did not use statistics that have not been kept for most of the NBA's history (e.g., steals). My player pool consisted of players who had played a minimum of 400 NBA games and had been eligible for at least one Hall of Fame election. After trying numerous models, my final model had seven predictor variables:

- height (in inches)
- last season indicator (1 if 1959-60 or before, 0 otherwise)
- NBA points per game
- NBA rebounds per game
- NBA assists per game
- NBA All-Star game selections
- NBA championships won

All of the predictors listed above were significant at the 0.05 level. Other than height, all of the predictors had positive coefficients. ABA statistics, honors, and championships were not important predictors of Hall of Fame status, which is why I only used NBA statistics in my final model. I don't like ignoring the ABA statistics, but that's what the voters have apparently done. Keep in mind that my goal was not to determine who should be in the Hall of Fame, but rather who is likely to be in the Hall of Fame.

The table below gives the parameter estimates of the coefficients for each of the seven predictors:

| Predictor | Estimate |
| --- | --- |
| height | -0.1771 |
| last season indicator | 3.1498 |
| NBA points per game | 0.3433 |
| NBA rebounds per game | 0.4193 |
| NBA assists per game | 0.3327 |
| NBA All-Star game selections | 0.5626 |
| NBA championships won | 0.9151 |

The parameter estimates given in the previous section can be used to obtain the predicted probability of Hall of Fame election for a particular player. I will go through an example using Jo Jo White. Find the values of the seven predictor variables for White, multiply them by the coefficients given in the table above, and find the sum of the products:

| Predictor | Coefficient | Value | Product |
| --- | --- | --- | --- |
| height | -0.1771 | 75 | -13.2825 |
| last season indicator | 3.1498 | 0 | 0 |
| NBA points per game | 0.3433 | 17.2031 | 5.9058 |
| NBA rebounds per game | 0.4193 | 3.9964 | 1.6757 |
| NBA assists per game | 0.3327 | 4.8925 | 1.6277 |
| NBA All-Star game selections | 0.5626 | 7 | 3.9382 |
| NBA championships won | 0.9151 | 2 | 1.8302 |
| Sum | | | 1.6951 |

To find the predicted probability of Hall of Fame election, do the following:

P(HoF election) = 1 / (1 + e**(-(1.6951))) = 0.845

Based on Jo Jo White's statistics and accomplishments, the probability that he has been elected to the Hall of Fame is 0.845.
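The Jo Jo White calculation can be reproduced with a few lines of Python. This is only a minimal sketch: the coefficients and player values are copied from the tables above, while the dictionary keys and the function name `hof_probability` are illustrative names chosen here, not part of the original model. Like the worked example, it uses no intercept term, since none is listed.

```python
import math

# Coefficients copied from the parameter-estimate table above.
# No intercept is listed in the source, so none is used here.
COEFFICIENTS = {
    "height": -0.1771,
    "last_season_indicator": 3.1498,
    "points_per_game": 0.3433,
    "rebounds_per_game": 0.4193,
    "assists_per_game": 0.3327,
    "all_star_selections": 0.5626,
    "championships": 0.9151,
}


def hof_probability(player):
    """Return the predicted probability of Hall of Fame election."""
    # Linear predictor: sum of coefficient * predictor value.
    score = sum(COEFFICIENTS[name] * player[name] for name in COEFFICIENTS)
    # Logistic function maps the score to a probability between 0 and 1.
    return 1.0 / (1.0 + math.exp(-score))


# Predictor values for Jo Jo White, as given in the worked example.
jo_jo_white = {
    "height": 75,
    "last_season_indicator": 0,
    "points_per_game": 17.2031,
    "rebounds_per_game": 3.9964,
    "assists_per_game": 4.8925,
    "all_star_selections": 7,
    "championships": 2,
}

probability = hof_probability(jo_jo_white)
print(round(probability, 3))  # about 0.845, matching the worked example
```

Swapping in another player's seven predictor values gives that player's predicted probability under the same coefficients.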

Hall of Fame probabilities are presented for all players with a minimum of 400 NBA games played. Although it can be risky to make predictions for active players, you can think of these probabilities as answering the question "If this player retired today, what is the probability he would be elected to the Hall of Fame?"

The model was built using a pool of 750 players. One method to assess classification accuracy is to compare each player's estimated Hall of Fame probability to the actual result. Of the 750 players, 89 had been elected to the Hall of Fame and 661 had not. If the player's predicted probability of election was greater than or equal to 0.5, I predicted that he was in the Hall of Fame. Of the 89 players in the Hall of Fame, 74 were correctly classified (83.1%) and 15 were not (16.9%). Of the 661 players not in the Hall of Fame, 651 were correctly classified (98.5%) and 10 were not (1.5%). Overall, 725 of the 750 players (96.7%) were correctly classified by the model.
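The classification percentages quoted above follow directly from the four counts. The short sketch below recomputes them. The counts come from the text, while the variable names and the labels "sensitivity" and "specificity" (standard terms for the two per-group rates) are additions made here for illustration.

```python
# Counts reported in the text for the 750-player pool.
hof_correct, hof_missed = 74, 15            # Hall of Famers: p >= 0.5, p < 0.5
non_hof_correct, non_hof_missed = 651, 10   # non-members: p < 0.5, p >= 0.5


def predicted_in_hall(probability, threshold=0.5):
    """Apply the rule from the text: predict election if p >= 0.5."""
    return probability >= threshold


hof_total = hof_correct + hof_missed                # 89 Hall of Famers
non_hof_total = non_hof_correct + non_hof_missed    # 661 non-members
pool_total = hof_total + non_hof_total              # 750 players

# Share of actual Hall of Famers the model classifies correctly.
sensitivity = hof_correct / hof_total
# Share of non-members the model classifies correctly.
specificity = non_hof_correct / non_hof_total
# Share of all players classified correctly.
accuracy = (hof_correct + non_hof_correct) / pool_total

print(f"{sensitivity:.1%} {specificity:.1%} {accuracy:.1%}")
# prints: 83.1% 98.5% 96.7%
```

Because only 89 of the 750 players are in the Hall of Fame, the overall rate is dominated by the non-members, so the two per-group rates are worth reading alongside it.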