Last year my first round submission to Stat Geek Idol was “Predicting the Sweet Sixteen with a Classification Tree.” My two big takeaways were 1.) don’t get overly excited about Florida State or Michigan and 2.) the 14 seeds looked good. Looking back, neither Florida State nor Michigan made it to the Sweet Sixteen, but none of the 14 seeds won any of their games. (Though just about every other low seed did, including 9, 10, 11, 12, 13, and 15. Nearly all the lower seeds EXCEPT for 14.)
So, I was right about some things and wrong about others. You can’t win them all. Though I did get one strong endorsement from Tweeter @ClevTA, who claims that my classification tree helped him win his pool, besting about 350 other entries. Let’s look at what the classification tree predicts this year.
The first split is based on an RPI of 0.6169. Teams above this threshold will be the R groups (right-hand side of the tree image below) and teams below the threshold will be the L groups. Overall, in the years used to build the model, teams in the R group advanced to the Sweet Sixteen about 67 percent of the time, whereas teams in the L group advanced just less than 10 percent of the time.
The R group teams this year are Duke, New Mexico, Louisville, Miami (FL), Kansas, Gonzaga, Florida, Indiana, Michigan State, Georgetown, Ohio State, Marquette, Memphis, Syracuse, Arizona, North Carolina, Michigan, Kansas State, Belmont, Saint Louis. All the other teams are in the L group.
The R teams
The R group teams can be broken down into 4 more sub-groups, R1-R4. Teams in the R1 group qualify for the Sweet Sixteen about 91% of the time, and every single team in the R2 group has qualified for the Sweet Sixteen in the years used to build the model (2007-2011). On the other hand, teams in the R3 group have only qualified about 51% of the time, and no team in group R4 has qualified for the Sweet Sixteen between 2007 and 2011. So, who’s in each group?
R1 (91.18%): RPI > 0.643
(2)Duke, (3)New Mexico, (1)Louisville, (2)Miami (FL), (1)Kansas, (1)Gonzaga
R2 (100%): RPI>0.6169 & RPI <0.643 & Opp.Effective.Poss.Ratio<0.9147
(3)Florida, (6)Arizona, (11)Belmont, (4)Saint Louis
Of course, all four of these teams can’t make it to the Sweet Sixteen, as Arizona plays Belmont in the Second (née First) round of the tournament.
R3 (51.61%): RPI>0.6169 & RPI <0.643 & Opp.Effective.Poss.Ratio>0.9147 & Avg.2nd.Half.Margin>2.998
(1)Indiana, (3)Michigan State, (2)Georgetown, (2)Ohio State, (6)Memphis, (4)Syracuse, (4)Michigan
R4 (0%): RPI>0.6169 & RPI <0.643 & Opp.Effective.Poss.Ratio>0.9147 & Avg.2nd.Half.Margin<2.998
(3)Marquette, (4)Kansas State, (8) North Carolina [Correction: In the original post, I had Marquette and Kansas State in the R3 group. They should be in the R4 group.]
The L teams
L1 (5.34%): RPI < 0.6169 and Assists.Turnovers<1.317
Notables: (5) Oklahoma State, (5) VCU, (5) UNLV, (6) Memphis, (6) Butler, (7) Illinois, (7) San Diego State
Butler and VCU have both advanced to the Sweet Sixteen out of this group before.
L2 (17.86%): RPI < 0.6169 & Assists.Turnovers>1.317 & Opp Pct Pts From 2 >=.5133
(7)Notre Dame, (6)UCLA, (11)Bucknell, (7)Creighton, (9)Temple, (5)Wisconsin
L3 (66.67%): RPI < 0.6169 & Assists.Turnovers>1.317 & Opp Pct Pts From 2 <.5133
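For anyone who wants to drop their own numbers into the tree, the seven leaf rules listed above can be written as a chain of if statements. This is only a minimal sketch in Python: the function and argument names are hypothetical, the thresholds and percentages are copied from the group definitions above, and how to treat a team sitting exactly on a threshold is my assumption (the listing only spells out ">=" for the L2 rule).

```python
# Historical Sweet Sixteen rate for each leaf, as listed in the post.
LEAF_RATE = {
    "R1": 0.9118, "R2": 1.0000, "R3": 0.5161, "R4": 0.0000,
    "L1": 0.0534, "L2": 0.1786, "L3": 0.6667,
}


def sweet_sixteen_group(rpi, opp_eff_poss_ratio, avg_2nd_half_margin,
                        assists_turnovers, opp_pct_pts_from_2):
    """Return the leaf label (R1-R4 or L1-L3) for one team.

    Argument names are hypothetical stand-ins for the statistics named
    in the post. Ties on a threshold fall to the "else" side, which is
    an assumption rather than something the post specifies.
    """
    if rpi > 0.6169:                      # first split: the R groups
        if rpi > 0.643:
            return "R1"
        if opp_eff_poss_ratio < 0.9147:
            return "R2"
        if avg_2nd_half_margin > 2.998:
            return "R3"
        return "R4"
    # RPI at or below 0.6169: the L groups
    if assists_turnovers < 1.317:
        return "L1"
    if opp_pct_pts_from_2 >= 0.5133:
        return "L2"
    return "L3"


# Made-up statistics for an imaginary team, purely to show the call.
group = sweet_sixteen_group(rpi=0.63, opp_eff_poss_ratio=0.95,
                            avg_2nd_half_margin=3.5,
                            assists_turnovers=1.2,
                            opp_pct_pts_from_2=0.50)
print(group, LEAF_RATE[group])
```

Note that the L-side statistics are never looked at for a team with an RPI above 0.6169, and vice versa, which is exactly how the group definitions above are nested.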
Teams that Made the Sweet Sixteen in Bold
R1 (5/6): Syracuse, Michigan State, Kentucky, North Carolina, Kansas, Duke
R2 (1/1): Ohio State
R3 (3/7): Marquette, Baylor, Indiana, Missouri, Georgetown, Wichita State, Memphis
L1 (6/45): Wisconsin, Cincinnati, Louisville, Xavier, Ohio, North Carolina State, The remaining 39 teams.
L2 (1/9): Florida, St. Mary’s (CA), Notre Dame, Creighton, Purdue, California, South Dakota State, Belmont, Iona
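To put last year's tallies on the same footing as the percentages quoted for each group, the fractions above can be converted to rates. A minimal sketch in Python, using only the counts listed above (the variable names are mine):

```python
# (teams that made the Sweet Sixteen, teams in the group) from last year's tallies
last_year = {
    "R1": (5, 6),
    "R2": (1, 1),
    "R3": (3, 7),
    "L1": (6, 45),
    "L2": (1, 9),
}

# Observed rate per group
rates = {group: made / total for group, (made, total) in last_year.items()}
for group, rate in rates.items():
    made, total = last_year[group]
    print(f"{group}: {made}/{total} = {rate:.1%}")

# Sanity check: the groups should account for all 16 Sweet Sixteen teams
total_made = sum(made for made, _ in last_year.values())
total_teams = sum(total for _, total in last_year.values())
print(f"All groups: {total_made}/{total_teams}")
```

The totals come out to 16 of 68, so the five groups listed cover the whole field and every Sweet Sixteen team.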