Aging Curves in Baseball
So I’m working on a research project on aging curves in baseball with two of my students here at Loyola. While quite a bit of work has been done on aging curves in many sports, including baseball, the question we are interested in is this: What would the aging curve look like if players played every season from the age of 22 through 40? What we actually observe are only the players who “survive”. What WOULD have happened if a player who was forced out of the league at the age of 30 had played until they were 40? We view this as a missing data problem and are currently using multiple imputation with a hierarchical structure to impute the missing seasons, then estimating the aging curves from the imputed data. I’d like to do the aging curve estimation using functional data analysis, but... we’ll see.
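To make the “survivor” problem concrete, here is a minimal simulation sketch. It is not our imputation model. The quadratic curve, the talent and noise scales, the cutoff, and the dropout rule are all made-up assumptions, chosen only to show how the average over observed seasons can drift away from the average you would get if everyone played from 22 through 40.

```python
import numpy as np


def simulate_survivor_bias(n_players=2000, cutoff=-0.08, seed=0):
    """Compare the complete-data mean curve with the mean over observed seasons."""
    rng = np.random.default_rng(seed)
    ages = np.arange(22, 41)  # ages 22 through 40

    # Hypothetical quadratic aging curve peaking at 28 (illustrative numbers only).
    curve = -0.0005 * (ages - 28) ** 2

    # Player-level talent plus season-level noise on an arbitrary performance scale.
    talent = rng.normal(0.0, 0.05, size=(n_players, 1))
    noise = rng.normal(0.0, 0.03, size=(n_players, ages.size))
    perf = talent + curve + noise  # what we would see if everyone played every season

    # Assumed dropout rule: a season is observed only if the player has had
    # no earlier season below the cutoff.
    below = (perf < cutoff).astype(int)
    earlier_bad = np.cumsum(below, axis=1) - below
    observed = earlier_bad == 0

    complete_mean = perf.mean(axis=0)
    n_observed = observed.sum(axis=0)
    observed_mean = (perf * observed).sum(axis=0) / n_observed
    return ages, complete_mean, observed_mean, n_observed


if __name__ == "__main__":
    ages, complete_mean, observed_mean, n_observed = simulate_survivor_bias()
    for a, c, o, n in zip(ages, complete_mean, observed_mean, n_observed):
        print(f"age {a}: complete {c:+.3f}  observed {o:+.3f}  n observed {n}")
```

In this toy setup the observed-only curve looks flatter at older ages than the complete-data curve, because the players still around at 38 or 40 are disproportionately the high-talent ones. That gap is what the imputation is meant to address.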
Anyway, I’ve started doing some lit review for this and I figured I’d post some of the interesting articles that I’ve found related to the topic:
Albert (1992) looks at estimating models for home run rates and, as part of this, incorporates an aging curve into his model. A quadratic form is assumed for the aging curve.
Berry et al. (1999) incorporate an aging curve into their analysis, but instead of a quadratic form they use a nonparametric model. They look at hockey, golf, and baseball. (Albert (1999), in a comment, argues against the aging model presented in Berry et al. (1999).)
Wakim and Jin (2014) take a functional data analysis approach to the problem and look at MLB and NBA. This is probably the most sophisticated statistical analysis that I have seen so far with regard to aging curves.
Dendir (2016), in the Journal of Sports Analytics, looks at when soccer players peak and finds that players in top leagues peak somewhere between the ages of 25 and 27.
Vaci et al. (2019) look at aging curves in NBA players.
This is clearly not an exhaustive list of papers related to aging curves in sports, but these are some of the interesting ones that I’ve come across so far.
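As a footnote to the quadratic form mentioned in the Albert (1992) entry: with a curve of the form performance = b0 + b1*age + b2*age^2 and b2 < 0, the peak age is -b1 / (2*b2). The sketch below is a plain least-squares fit on synthetic data, not Albert's actual model. The function names and the numbers used to generate the data are hypothetical.

```python
import numpy as np


def fit_quadratic_peak(ages, performance):
    """Least-squares quadratic fit; returns (b0, b1, b2) and the implied peak age."""
    # np.polyfit returns coefficients from the highest degree down.
    b2, b1, b0 = np.polyfit(ages, performance, deg=2)
    peak_age = -b1 / (2.0 * b2)  # vertex of the parabola (a maximum when b2 < 0)
    return (b0, b1, b2), peak_age


def make_synthetic_seasons(n_players=200, seed=1):
    """Synthetic seasons from an assumed curve peaking at 28 (made-up numbers)."""
    rng = np.random.default_rng(seed)
    ages = np.tile(np.arange(22, 41), n_players)
    performance = -0.0005 * (ages - 28) ** 2 + rng.normal(0.0, 0.02, size=ages.size)
    return ages, performance


if __name__ == "__main__":
    ages, performance = make_synthetic_seasons()
    coefs, peak_age = fit_quadratic_peak(ages, performance)
    print("coefficients (b0, b1, b2):", coefs)
    print("estimated peak age:", round(peak_age, 2))
```

One known limitation of the quadratic form is that it forces the curve to be symmetric around the peak, which is part of why the nonparametric and functional approaches in the list above are appealing.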