Building an NCAA men’s basketball prediction model
Last spring, Loyola statistics professor Greg Matthews and I won the March Machine Learning Mania contest run by Kaggle, which involved submitting win probabilities for every possible matchup in the 2014 NCAA men’s basketball tournament.
Recently, we co-wrote a paper that motivates and summarizes the prediction model we used. In addition to describing our entry, we simulated the tournament 10,000 times to help quantify how likely it was that our submission would have won the Kaggle contest.
The paper has been submitted to a journal, and we are crossing our fingers that it gets accepted. A preprint of the paper is up on arXiv (linked here).
Quick summary: to estimate the probability for each game, we merged two probability models: one using point spreads (Rd. 1) and estimated point spreads (Rd. 2 to Rd. 6) set by sportsbooks, and the other using team efficiency metrics from Ken Pomeroy’s website.
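To make the idea of merging two probability models concrete, here is a rough sketch in Python. It is not the method from the paper. It assumes (1) the game margin is roughly normal around the point spread with a standard deviation of 11 points, (2) a logistic curve maps an efficiency difference to a win probability, and (3) the two probabilities are combined with a simple weighted average. The function names, the value of `sigma`, the value of `scale`, and the equal weighting are all illustrative assumptions.

```python
import math


def spread_to_prob(spread, sigma=11.0):
    """Win probability for a team favored by `spread` points.

    Assumption: final margin ~ Normal(spread, sigma). The value of
    sigma is illustrative, not taken from the paper.
    """
    return 0.5 * (1.0 + math.erf(spread / (sigma * math.sqrt(2.0))))


def efficiency_to_prob(eff_diff, scale=0.1):
    """Win probability from a difference in team efficiency ratings.

    Assumption: a logistic curve with a hypothetical `scale`; in
    practice this would be fit to historical game results.
    """
    return 1.0 / (1.0 + math.exp(-scale * eff_diff))


def merge_probs(p_spread, p_eff, weight=0.5):
    """Weighted average of the two model probabilities (assumed rule)."""
    return weight * p_spread + (1.0 - weight) * p_eff


# Example: a team favored by 5 points with a +8 efficiency edge.
p1 = spread_to_prob(5.0)
p2 = efficiency_to_prob(8.0)
p = merge_probs(p1, p2)
```

Both component models return 0.5 for evenly matched teams, so the merged probability does too. Averaging is only one way to combine models; the weights could also be tuned on past tournaments.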