HHOF Probability?

BlueBull

Habby Man
Oct 11, 2017
1,699
1,436
Vancouver Island
On Basketball-Reference.com, there is a statistic known as "Hall of Fame Probability". It shows how likely an NBA player is to make the Naismith Basketball Hall of Fame, and it's pretty cool.
(Link to the Explanation here: Hall of Fame Probability | Basketball-Reference.com)
To know what the statistic is and how to calculate it, click the link I have put here (^), but to put it simply, it's a very cool stat and I think there should be a hockey equivalent of it.
I've been working on the "Hockey Hall of Fame Probability" stat over the past few hours, and I have done 3 NHL players, which have shown interesting results: Sidney Crosby 99.78%, Patrick Kane 99.54%, Carey Price 79.25%.

[Note: I translated the NBA variables into NHL variables. Height stays as Height. NBA Titles change to Stanley Cups. Leaderboard Points stays mostly the same, except I have 5 eligible stats for skaters (Goals, Assists, Points, +/- and Penalty Minutes) and 3 for goalies (Wins, Goals Against Average and Shutouts). The Peak Win Shares variable changes to Peak Point Shares. The All-Star Game variable has its multiplier split in half between All-Star Games (ASG) and All-Star Teams (AST). Ex.: Ans = (Answer to Previous Calculations) + ((x+x)*(ASG)) in the NBA variable; Ans = (Answer to Previous Calculations) + (x*(ASG)) + (x*(AST)) in the NHL variable.]
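The all-star split in the note can be written out as a small sketch. The value of x below is a made-up placeholder, not the real Basketball-Reference coefficient, and the function names are only for illustration.

```python
# Sketch of the all-star term from the note above.
# X is a hypothetical placeholder multiplier, NOT the real coefficient.
X = 0.5


def nba_all_star_term(previous: float, x: float, asg: int) -> float:
    """NBA version: the full multiplier (x + x) applies to All-Star Games."""
    return previous + (x + x) * asg


def nhl_all_star_term(previous: float, x: float, asg: int, ast: int) -> float:
    """NHL version: half the multiplier goes to All-Star Games (asg),
    half to year-end All-Star Teams (ast)."""
    return previous + x * asg + x * ast


# A player with 4 All-Star Games and 4 All-Star Team selections gets the same
# bump in the NHL version as an NBA player with 4 All-Star Games.
nba_value = nba_all_star_term(0.0, X, asg=4)
nhl_value = nhl_all_star_term(0.0, X, asg=4, ast=4)
```

The two versions only agree when ASG and AST counts are equal; a player with many game appearances but few year-end team selections gets less credit under the split.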

I'm thinking this variable looks pretty good for a few hours' work, but I'm wondering how I can polish it and fully bring it over to the NHL/Hockey-Reference. For example, I made all of these on the assumption that the example integer they gave me had a formula of "x*(Career GP) = Int.", and the Int. is used to multiply the base numbers for each variable (in which case x = approx. -.000216, as Tony Parker had 940 GP by 2014), which would be kind of redundant. I have not been able to find a base formula for the stat(s) for the NBA version, so if someone knows said base stat(s) for it, that would help.

I would like to get feedback on this. This is a work in progress, so it ain't perfect.

-BlueBull
 

Hockey Outsider

Registered User
Jan 16, 2005
9,197
14,635
I've tried to come up with a HOF probability stat a few times, but never had much luck. A few thoughts:

The biggest challenge is figuring out what statistics are actually relevant. For example, the basketball metric appears to be using height - which might make sense in that sport. I'm doubtful height either would or should be relevant in hockey. (I mean, generally taller players have an advantage, but if you're looking back at what they actually accomplished over the course of their careers, I can't imagine that the Hall is more likely to induct the taller player, all things being equal).

Wouldn't it be redundant to have each of goals, assists and points? It might make sense to look at goals and assists separately, if you think the HOF values one more than the other, but it seems unnecessary to include points as well.

I don't really follow basketball but my understanding (someone correct me if I'm wrong) is there's way less variation in offense from season to season, compared to the NHL. So the model probably should use, somehow, era-adjusted stats (then again, in recent years the HOF has inducted a lot of "compiler" type players, like Andreychuk and Ciccarelli, so maybe they either don't understand or don't care that the NHL in the 1980's to mid 1990's was much higher-scoring).

I'm surprised the NBA model doesn't look at the MVP award. I'd imagine that any reasonable NHL model would need to consider the Hart, Art Ross, Smythe, Norris and Vezina trophies (possibly the Calder, Selke and Byng, but my guess is none of those would prove to be relevant).

Garbage in, garbage out - it's been well established that point shares, as formulated by Hockey-Reference.com, isn't a very good statistic. I'd be hesitant to use a known, flawed statistic as a key input into the model. See the previous point - you can probably capture peak value by looking at trophy cases instead.

I'd be inclined to scrap the all-star game statistic altogether. I don't want to say it has zero value, but it's not much higher than that. Using the year-end all-star team would be much more relevant - but the problem there is, due to positional imbalance, some players are at a disadvantage.
 

Hockey Outsider

Registered User
Jan 16, 2005
9,197
14,635
For example, I made all of these on the assumption that the example integer they gave me had a formula of "x*(Career GP) = Int.", and the Int. is used to multiply the base numbers for each variable (in which case x = approx. -.000216, as Tony Parker had 940 GP by 2014), which would be kind of redundant. I have not been able to find a base formula for the stat(s) for the NBA version, so if someone knows said base stat(s) for it, that would help.

My guess is they created this formula using a mathematical technique known as "logistic regression". The inputs would be all of the variables in their formula (ie height, points, etc.), and the output would be a binary variable - yes the player made the Hall of Fame, or no they didn't.

In terms of where the multiples for each variable come from - if they did, in fact, use logistic regression, they would have used a data analysis program to calculate the weighting assigned to each variable. I believe it's theoretically possible to do this in Microsoft Excel, but it's not easy.
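To make the idea concrete, here is a minimal sketch of logistic regression fitted by plain gradient descent. Everything in it is an assumption for illustration: the two inputs (Stanley Cups and year-end all-star selections), the ten toy rows, and the learning rate are invented, and this is not the method Basketball-Reference actually used. A real attempt would use a stats package and real career data.

```python
import math


def sigmoid(z: float) -> float:
    """Map any real number to a probability between 0 and 1."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(rows, labels, lr=0.1, epochs=5000):
    """Fit an intercept and one weight per input by gradient descent."""
    weights = [0.0] * len(rows[0])
    intercept = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            p = sigmoid(intercept + sum(w * xi for w, xi in zip(weights, x)))
            err = p - y  # gradient of the log-loss w.r.t. the linear score
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        intercept -= lr * grad_b / n
    return intercept, weights


def predict(intercept, weights, x):
    """Estimated induction probability for one player's inputs."""
    return sigmoid(intercept + sum(w * xi for w, xi in zip(weights, x)))


# Invented toy data: (Stanley Cups, year-end all-star selections).
ROWS = [(0, 0), (1, 0), (0, 1), (2, 1), (1, 3),
        (3, 2), (4, 5), (0, 2), (2, 0), (5, 4)]
# Binary output: 1 = inducted, 0 = not inducted (also invented).
LABELS = [0, 0, 0, 0, 1, 1, 1, 0, 0, 1]

INTERCEPT, WEIGHTS = fit_logistic(ROWS, LABELS)
print("intercept:", round(INTERCEPT, 3))
print("weights:", [round(w, 3) for w in WEIGHTS])
print("P(HOF | 5 Cups, 5 all-star teams):",
      round(predict(INTERCEPT, WEIGHTS, (5, 5)), 3))
```

The fitted weights play the role of the "multiples" in the published formula: each one says how much a unit of that input moves the log-odds of induction.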

The other nice thing is any stat program would tell you whether the variables are statistically valid. For all we know, they started their model with ten variables, and dropped the weakest variables one by one, until they were only left with variables that were statistically relevant.

A stat program would also be able to highlight something called "multicollinearity". Aside from being a great word for Scrabble, this means that there are two (or more) variables that are correlated with each other. In this case, there's clearly multicollinearity between points and both goals and assists. Generally it's undesirable to have this in the model (someone with a more academic math background can explain this better than me, but I know the model is likely to be less accurate overall as a result of this).
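A quick sketch of why points are a problem alongside goals and assists: points are exactly goals plus assists, so the third column adds no new information. The career totals below are invented for illustration, and the Pearson correlation is hand-rolled so the snippet has no dependencies.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Invented career totals for six hypothetical skaters.
GOALS = [120, 310, 450, 95, 600, 250]
ASSISTS = [200, 420, 500, 310, 750, 180]
POINTS = [g + a for g, a in zip(GOALS, ASSISTS)]

print("corr(goals, points):  ", round(pearson(GOALS, POINTS), 3))
print("corr(assists, points):", round(pearson(ASSISTS, POINTS), 3))

# Points are an exact linear combination of the other two columns, which is
# the extreme case of multicollinearity: the leftover is zero for every player.
LEFTOVER = [p - g - a for p, g, a in zip(POINTS, GOALS, ASSISTS)]
print("points - goals - assists:", LEFTOVER)
```

With an exact dependency like this, a regression cannot uniquely split credit between the three columns, which is one reason to keep goals and assists and drop points (or the reverse).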

For this reason, I would hesitate to use the variables in their formula. The numbers (presumably) have been optimized for the NBA, but there's no reason to think they'd be accurate for the NHL.
 
Reactions: Bear of Bad News

Hockey Outsider

Registered User
Jan 16, 2005
9,197
14,635
I found an article about HOF probabilities for forwards - NHL Forward Hall of Fame Probabilities — Queen's Sports Analytics Organization

The results generally make sense, but there are some conceptual problems to point out:
  • The formula uses a combination of era-adjusted and unadjusted statistics. This doesn't make much sense. I'm not convinced that the Hall of Fame voters really understand the necessity to use era-adjusted stats - but it seems even less likely that they're using era-adjusted numbers for some statistics, and unadjusted for others.
  • The formula uses hockey-reference.com's point shares. As has been discussed in depth on HFBoards, this is a poorly-constructed formula and shouldn't be used in player comparisons.
  • The formula only looks at the number of Stanley Cups won, ignoring everything else about a player's personal contribution. You can argue that the Hall voters are simplistic Cup-counters themselves (and that's partly true). But it fails to differentiate between players who were generally good playoff performers stuck on bad teams and ones who were legitimately disappointing. (Cam Neely never won a Stanley Cup, but his playoff heroics were instrumental in getting him into the Hall). I also found it surprising that the Conn Smythe wasn't factored in.
  • The formula looks at the Hart and Selke trophies, but only as a binary outcome - you either win it or you don't. Therefore Jagr (1x Hart, 4x second place, 2x third or fourth place) and Taylor Hall (1x Hart, zero votes the rest of his career) get equal credit.
  • The weightings seem off. They walk through an example with Jagr. The fact that he scored 1.11 unadjusted points per game (impressive, but behind players like Federko, Savard and Nilsson) is apparently ten times more valuable to his HOF case than him scoring 2,080 era-adjusted points (something only topped by Gretzky and Howe). That weighting makes no sense.
  • Maybe the response to each of these points is "the formula works". My experience (in trying to make my own HOF prediction tool) is there are a ton of inputs that can be used, and many of them are highly correlated with each other. I don't think it's tremendously difficult to make a formula that "works" - the challenge is finding a model that also has some conceptual validity.
This isn't intended as an attack on the study or the author. But it highlights the challenges in developing a model that's both 1) accurate and 2) makes sense.
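The Hart trophy point in the list above can be sketched in a few lines. The finish counts come from the post; the placement weights are purely hypothetical and only there to show how a graded input separates two players that a win/no-win input treats as identical.

```python
# Hypothetical weights per Hart finish; these are illustrative assumptions,
# not values taken from any published model.
PLACEMENT_WEIGHTS = {"win": 1.0, "second": 0.6, "third_or_fourth": 0.35}

# Finish counts as described in the post above.
JAGR = {"win": 1, "second": 4, "third_or_fourth": 2}
HALL = {"win": 1}


def binary_credit(finishes):
    """What the article's model sees: 1 if the player ever won, else 0."""
    return 1 if finishes.get("win", 0) > 0 else 0


def graded_credit(finishes):
    """Alternative input: weighted sum over every top finish."""
    return sum(PLACEMENT_WEIGHTS[place] * count
               for place, count in finishes.items())


print("binary:", binary_credit(JAGR), binary_credit(HALL))
print("graded:", round(graded_credit(JAGR), 2), round(graded_credit(HALL), 2))
```

Whether a graded input actually predicts inductions better is an open question; the sketch only shows that the information is lost under the binary coding.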

It also would have been interesting to see what the formula says about past players. The formula would presumably show players like Messier and Yzerman (let alone Gretzky and Lemieux) as locks - how would they rate other players? Who does the formula see as the most deserving eligible player? More controversially, who's the least-deserving player currently in the Hall?
 

The Macho King

Back* to Back** World Champion
Jun 22, 2011
48,863
29,472
Isn't a huge part of the issue that the HHOF is among the least open in pro sports about its process? I know the Athletic did a few articles about the procedure last year, but it still basically comes down to "voters nominate players, then vote on those players, then whittle it down until they end up with five names." I think other sports are both more open and more strict.
 
Reactions: sabremike and DaveG

Fantomas

Registered User
Aug 7, 2012
13,356
6,743
You just need to come up with a stat that approximates the thinking process of the Canadian-American old boys club.
 
Reactions: abo9

jcs0218

Registered User
Apr 20, 2018
7,968
9,881
Isn't a huge part of the issue that the HHOF is among the least open in pro sports about its process? I know the Athletic did a few articles about the procedure last year, but it still basically comes down to "voters nominate players, then vote on those players, then whittle it down until they end up with five names." I think other sports are both more open and more strict.
Baseball publishes its complete voting results. You get 75% of the vote or you don't make it.

Both the fact that baseball publishes its results and the fact that there is a 75% cutoff make it better. Sometimes (like in 2021), nobody gets inducted.

The problem with hockey is that they are overly concerned with making the Hall of Fame induction an event. They feel they need to have 5 inductees every year to fill-out the event, and ensure they have enough material to stretch out the entire weekend and ceremony.

This ends up resulting in less than Hall of Fame worthy players making it in.

In baseball, players like Curt Schilling haven't gotten in yet after 9 years on the ballot, and he was one of the top 10 best pitchers for about 10-12 seasons. Shows you how much tougher it is to make the baseball Hall of Fame.
 
Reactions: morehockeystats

trentmccleary

Registered User
Mar 2, 2002
22,228
1,103
Alfie-Ville
This ends up resulting in less than Hall of Fame worthy players making it in.

If you look at it by era, they did a reasonable job. Where they blew it has been with the video game stats era (1980-1995). That's the era where they let standards collapse and started inducting everybody and his linemate. They aren't going to do that for the Deadpuck era or the current era. Heck, they're still probably going to be inducting mediocre 1980's players 10 years from now.
 

ChiTownPhilly

Not Too Soft
Feb 23, 2010
2,106
1,391
AnyWorld/I'mWelcomeTo
Baseball publishes its complete voting results. You get 75% of the vote or you don't make it.

Both the fact that baseball publishes its results and the fact that there is a 75% cutoff make it better. Sometimes (like in 2021), nobody gets inducted.

The problem with hockey is that they are overly concerned with making the Hall of Fame induction an event. They feel they need to have 5 inductees every year to fill-out the event, and ensure they have enough material to stretch out the entire weekend and ceremony.

This ends up resulting in less than Hall of Fame worthy players making it in.

In baseball, players like Curt Schilling haven't gotten in yet after 9 years on the ballot, and he was one of the top 10 best pitchers for about 10-12 seasons. Shows you how much tougher it is to make the baseball Hall of Fame.
I always find comparisons/contrasts between the Hockey Hall of Fame & the Baseball Hall of Fame really interesting. I'd venture to say they are the two sports whose history is most familiar to me.

Yes, I'd agree that the idea of inducting five into the HHoF each year does no particular favors to the standard. That said, the bigger problem is that the 3+ male professional players they select every year have lately included some wrong'uns like Housley & Kevin Lowe. Still, it's not like the Baseball Hall of Fame isn't capable of coughing up a hairball like Harold Baines every now-and-then. Back when Frankie Frisch had a seat at the head of the table on the BHoF Veterans Committee, they set up a virtual hairball factory- ralphing up inductions like Chick Hafey, Burleigh Grimes, Travis Jackson...

So- the players like Baines, Grimes, and so on got in through a path other than the Sportswriters Vote. Does this mean that we should trust the Sportswriters- exclusively? Well- they have their issues too- and head of the table among them is their treatment of Curt Schilling.

At the risk of a digression...

Curt Schilling is the best Pitcher not named Roger Clemens who's not in the Hall of Fame. Curt Schilling, by objective measures, is superior to the 'average' Hall of Fame pitcher. [A reading of the statistics suggests that 'average' Hall of Fame pitcher breaks around Hal Newhouser level. Jim Palmer is a little better, Juan Marichal not quite-as-good (but close!)- about that range.] That's how the advanced statistics sort them... and Schilling (whose legacy is deeply enhanced by what he's done in the PostSeason) is UNDERvalued by the advanced metrics, as they do not take the PostSeason into account.

The long and short of it is the following human failing:

1) At least 75% (if not more) of the ballot-submitting Sportswriters are repulsed by Schilling's political views.
2) At least a third of those (if not more) are actually all right with allowing that aversion to influence their voting decision.

There really is no parallel- in any sport- to the stall of Schilling's Hall of Fame candidacy. Assuming that Schilling's trust in a Special Committee induction down-the-road is rewarded, the closest hockey parallel I can think of is Clint Benedict- whose two clear strikes against him were a mid-career fondness for the bottle and (perhaps no less significantly) the fact that a number of contemporary observers didn't like his unconventional playing style. HHoF Goalie standards are tough, notoriously so- but Benedict's résumé is well north of an 'average' HHoF netminder. Still, he waited until age 72 before being inducted- watching good, but not as great, Goaltenders like Hugh Lehman, Tiny Thompson and Alec Connell get the call before he did. Really says more about the mentality of the selectors than it says about Benedict.

 

Scandale du Jour

JordanStaal#1Fan
Mar 11, 2002
62,407
29,209
Asbestos, Qc
www.angelfire.com
I mean, it is VERY obvious why Curt Schilling is not in and likely won't ever be in... and it has NOTHING to do with his abilities or accomplishments. He is rather a bad example to show how "tough" the baseball hall of fame is compared to the NHL.

He would have been a shoo-in if not for... well... you all know :laugh:
 
Reactions: MarotteMarauder

93LEAFS

Registered User
Nov 7, 2009
34,051
21,149
Toronto
I mean, it is VERY obvious why Curt Schilling is not in and likely won't ever be in... and it has NOTHING to do with his abilities or accomplishments. He is rather a bad example to show how "tough" the baseball hall of fame is compared to the NHL.

He would have been a shoo-in if not for... well... you all know :laugh:
Naw, he doesn't have the stats to be a shoo-in. Baseball is insanely tough. The benchmark that guaranteed entry for starters was 300 wins. That's been lessened over the years. Halladay got in with 203 but also had 2 Cy Youngs. The only modern starters who got in with fewer wins were Halladay (who debuted later) and John Smoltz (who had a dominant run as a reliever).
 

The Panther

Registered User
Mar 25, 2014
19,351
16,000
Tokyo, Japan
If you look at it by era, they did a reasonable job. Where they blew it has been with the video game stats era (1980-1995). That's the era where they let standards collapse and started inducting everybody and his linemate. They aren't going to do that for the Deadpuck era or the current era. Heck, they're still probably going to be inducting mediocre 1980's players 10 years from now.
Not that I ever took the Hall of Fame inductions very seriously, but having Housley, Andreychuk, and Lowe all inducted in a few years of each other killed off any admiration I had for a "Hall of Fame career".

Housley scored a bunch of points on mediocre teams. And that's all he did. He's basically the defence equivalent of Tony Tanti, except that he played a long time. His coaches didn't trust him (a defenceman) to kill penalties or be on the ice with a minute left in close games. He won nothing.

Andreychuk at least was really good at what he did, but he wasn't as good at what he did as, say, Tim Kerr. He just had a longer career. He was basically Mike Bullard with a longer career.

Lowe was a player I watched, as a kid, win Stanley Cups. At least he has his 6 Cups. But he was never a top-five player on his team for any of them. He likely was never a top-10 player on any of them (arguably he was in 1984, but that's it). From about 1981 to 1987, Lowe was an above-average defensive defenceman, yet never at All Star level. He was a very slow skater, and possibly the worst shot-blocker I've ever seen.

These choices can be defended in certain quarters, but the point, for me, is that a sports Hall of Fame needs to have stricter standards.
 

Weztex

Registered User
Feb 6, 2006
3,118
3,731
If you think Gary had nothing to do with the decision...

The rule that an inductee must be retired from the category in which they're being inducted was, IMHO, unspoken but absolutely obvious and natural.

The rule doesn't apply to the builders category. Peter Karmanos, Jeremy Jacobs, Gary Bettman, Jim Rutherford, Jerry York and Ken Holland are all HoF inductees who were active in 2021.

Bettman is the most successful president/commissioner in NHL history and was the biggest Hall of Fame shoo-in outside of Crosby, Ovechkin and Jagr. Why would he ever try to force his way in?
 

morehockeystats

Unusual hockey stats
Dec 13, 2016
617
296
Columbus
morehockeystats.com
The rule doesn't apply to the builders category. Peter Karmanos, Jeremy Jacobs, Gary Bettman, Jim Rutherford, Jerry York and Ken Holland are all HoF inductees who were active in 2021.

Bettman is the most successful president/commissioner in NHL history and was the biggest Hall of Fame shoo-in outside of Crosby, Ovechkin and Jagr. Why would he ever try to force his way in?

First of all, the fact that everyone on that list was inducted doesn't make the list right. Well, for owners, maybe, at age 75+, make an exception.

Second, you should compare the commissioner not to history, but to the contemporary commissioners of the three other major sports. I haven't seen anything exceptional from Gary.

Moreover, IMHO, Campbell was more successful than Bettman.
 

abo9

Registered User
Jun 25, 2017
9,110
7,220
If you look at it by era, they did a reasonable job. Where they blew it has been with the video game stats era (1980-1995). That's the era where they let standards collapse and started inducting everybody and his linemate. They aren't going to do that for the Deadpuck era or the current era. Heck, they're still probably going to be inducting mediocre 1980's players 10 years from now.

That actually would make it simpler for a mathematical model. You would not need to adjust points per era, because the humans in charge of voting are looking at things like Cups, individual awards, point totals (not necessarily relative to peers).

Some players would be missed by an algorithm for sure, but these could be considered/identified as HHOF outliers and it's a little more difficult to numerically assess things like "Leadership, competitiveness, camaraderie, etc."
 

Hockey Outsider

Registered User
Jan 16, 2005
9,197
14,635
I found a series of articles on a blog where the author makes a HOF probability model:

- 1st article
- 2nd article
- 3rd article
- 4th article
- 5th article

The author hasn't released the full model, but he's hinted at what parameters are included. (It appears to be some combination of career scoring totals (adjusted for era), Stanley Cups, major awards, and year-end all-star teams).

The model doesn't care if the parameters are "fair", it only looks at what's correlated with HOF inductions. For example, the model really likes John LeClair's chances because he was a five-time all-star, but it doesn't "know" that he was getting those nods as a LW, which was clearly the weakest position during the DPE.

The author also understands that the choice of which parameters are included would impact the end result. (His model has Brad Marchand as having a much higher likelihood of being inducted than Patrice Bergeron. That, of course, makes no sense, and he acknowledges it's because, for whatever reason, he isn't including the Selke trophy in the calculations).

The second article linked above was a deep dive into the challenges in building the model, and talking about the risk of over-fitting the data. This is what I struggled with when (unsuccessfully) trying to build a similar system. I can build a model that predicts HOF inductions with 100% accuracy based on historical data - but the model would consist of dozens of arbitrary rules, and it would be (mostly) useless in predicting future inductions.
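The over-fitting risk described here can be shown with a toy comparison. All players and numbers below are invented; the "memorizer" stands in for a model built from dozens of arbitrary rules, and the threshold rule stands in for a simple one-parameter model.

```python
# Each row: (player label, year-end all-star selections, inducted?). Invented.
TRAIN = [("A", 6, True), ("B", 0, False), ("C", 4, True),
         ("D", 1, False), ("E", 3, True), ("F", 5, False)]
TEST = [("G", 5, True), ("H", 0, False), ("I", 4, True), ("J", 1, False)]


def accuracy(model, players):
    """Share of players whose predicted outcome matches the real one."""
    hits = sum(model(name, stars) == inducted
               for name, stars, inducted in players)
    return hits / len(players)


def fit_memorizer(players):
    """One arbitrary 'rule' per historical player: perfect on the past,
    but it has nothing to say about anyone it has not seen."""
    table = {name: inducted for name, _, inducted in players}

    def model(name, stars):
        return table.get(name, False)

    return model


def fit_threshold(players):
    """Single rule: inducted if selections >= t, with t chosen to maximize
    accuracy on the historical data."""
    best_t = max(range(0, 8),
                 key=lambda t: accuracy(lambda _n, s: s >= t, players))

    def model(name, stars):
        return stars >= best_t

    return model


memorizer = fit_memorizer(TRAIN)
threshold_rule = fit_threshold(TRAIN)

for label, model in [("memorizer", memorizer), ("threshold", threshold_rule)]:
    print(label,
          "history:", round(accuracy(model, TRAIN), 2),
          "new players:", round(accuracy(model, TEST), 2))
```

The memorizer is perfect on the historical rows and no better than a coin flip on the new ones, while the simpler rule misses one historical case but carries over to players it has not seen. Holding out some inductees when fitting is the usual way to catch this.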

According to the model, the most worthy players are Howe (the only player at 100%), Gretzky (who comes in at 1/500 trillionth of a percent below 100%), Bourque, Richard, and Ovechkin. The best eligible players not currently in the Hall are John LeClair and Paul Thompson (who both have about a 95% probability), Keith Tkachuk and Brad Marchand (low 90's), and Flash Hollett (the model doesn't "know" that he peaked during the talent-depleted WWII years).

The HOF inductees with the lowest probabilities are, not surprisingly, players who spent most of their best years out of the NHL (Igor Larionov, Slava Fetisov, Sergei Makarov, Vaclav Nedomansky). Bob Gainey and Guy Carbonneau both had a 3% probability (but remember, the model doesn't take the Selke trophy into account). Manually scanning the results, some HOF'ers with low probabilities include Pat Lafontaine (18%), Harry Howell (18%), Rod Langway (15%), Bernie Federko (12%), Kevin Lowe (8%), Clint Smith (10%), Gerry Cheevers (5%), Leo Boivin (4%), and Edgar Laprade (1%).

I know it's not fair to criticize a model based on cherry-picking one observation, but it was surprising to see Dick Duff, Dino Ciccarelli and Clark Gillies (25-32%) rank higher than Peter Stastny (22%) - who also ranks behind Doug Weight, Pat Verbeek, and Justin Williams!

The author's work is very impressive (I've been meaning to do a project like this, but never found the time). There are a few surprises, but for a purely objective, statistical system, the results seem to be about as accurate as one could hope for.
 

Mickey Marner

Registered User
Jul 9, 2014
19,758
21,551
Dystopia
Something based on The Hockey News Yearbook top-50 players would be a good place to start. It would provide an annual weighting of stats, trophies, eye test, fame etc. that closely reflects the HHoF committee. A list like that would be able to rectify the Marchand-Bergeron situation that Hockey Outsider mentioned. We know Bergeron has been better than Marchand, but it’s difficult to demonstrate that accurately with stats and trophies. The top-50 list should give you just the right amount of subjectivity to plug the holes found with stats and trophies.
 
