My gentle criticism of the RPI
Lately, I’ve been chirping about how bad the RPI is. I had some free time this morning, so I thought I’d dig into it and write down exactly why I dislike the RPI. Below you’ll find details on how to calculate the RPI, my criticisms of the formula, and, finally, an example of how RPI can go haywire.
There are three components to the RPI:
- Winning Percentage (WP)
- Opponents’ Winning Percentage (OWP)
- Opponents’ Opponents’ Winning Percentage (OOWP)
Winning percentage (WP)
This is calculated by taking the number of wins a team has and dividing it by the number of games that team has played. However, since 2005, home wins count as 0.6 of a win and away wins count as 1.4 wins (neutral-site wins count as 1).
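The weighting can be sketched in a few lines of Python. The function name `weighted_wp` and its parameters are made up for illustration. One assumption to flag: the description above only gives weights for wins, so this sketch also weights losses inversely (a home loss counts as 1.4, an away loss as 0.6), which is how the post-2005 formula is usually described.

```python
def weighted_wp(home_w=0, home_l=0, away_w=0, away_l=0, neutral_w=0, neutral_l=0):
    """Weighted winning percentage: home win 0.6, away win 1.4, neutral win 1.0."""
    wins = 0.6 * home_w + 1.4 * away_w + 1.0 * neutral_w
    # Assumption: losses are weighted inversely to wins (home loss 1.4, away loss 0.6).
    losses = 1.4 * home_l + 0.6 * away_l + 1.0 * neutral_l
    total = wins + losses
    # A team with no games gets 0.0 here; that convention is also an assumption.
    return wins / total if total else 0.0


print(round(weighted_wp(home_w=1, home_l=1), 3))  # split two home games
print(round(weighted_wp(away_w=1, away_l=1), 3))  # split two away games
```

Under these assumptions, a team that splits two home games and a team that splits two away games end up with different winning percentages, even though both are 1-1.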
Opponents’ winning percentage (OWP)
For a given team, calculate the winning percentage of each of its opponents, excluding games against that team from the calculation. When calculating OWP, wins are not weighted as they are in the calculation of WP. Once each opponent’s winning percentage is calculated, take the average of all these winning percentages to get the OWP component for the team.
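Here is a minimal sketch of the exclusion step. The four results in `games` are hypothetical and only there to exercise the code, and the helper names `win_pct` and `owp` are made up. It assumes an opponent is counted once per game played against it, and that an opponent with no other games contributes 0.

```python
# Hypothetical results as (winner, loser) pairs.
games = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "A")]


def win_pct(team, games, exclude=None):
    """Unweighted winning percentage of `team`, ignoring games involving `exclude`."""
    wins = losses = 0
    for winner, loser in games:
        if exclude in (winner, loser):
            continue  # drop games against the team whose OWP is being computed
        if winner == team:
            wins += 1
        elif loser == team:
            losses += 1
    total = wins + losses
    return wins / total if total else 0.0


def owp(team, games):
    """Average of the opponents' winning percentages, one entry per game played."""
    opponents = [l if w == team else w for w, l in games if team in (w, l)]
    return sum(win_pct(o, games, exclude=team) for o in opponents) / len(opponents)


print(owp("A", games))
print(owp("B", games))
```

For team A, the opponents are B, C, and C. With A's games removed, B is 1-0 and C is 0-1, so the average is (1 + 0 + 0)/3.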
Opponents’ opponents’ winning percentage (OOWP)
For a given team, calculate the OWP for each of its opponents. Include games against the team in this calculation. Take the average of all these OWPs to compute the OOWP for the team.
For full details of the RPI see Ken Pomeroy’s explanation.
My criticisms of the RPI
- Ad hoc weighting of the games in calculation of WP: Away wins are worth 1.4 wins whereas home wins are only counted as 0.6. This makes an away win more than TWICE as important as a home win. This doesn’t sound right to me. Does anyone know where these numbers (i.e. 1.4 and 0.6) came from? I’d love to know.
- Weighting not done in OWP or OOWP: When OWP or OOWP is calculated, all games are worth 1 win again. Where does the weighting go? This seems like another totally arbitrary decision.
- Averaging averages: I didn’t realize this at first because I couldn’t believe the formula would actually do this, but the formula takes the average of the opponents’ winning percentages. That’s different from the winning percentage of the opponents. Here is an example: imagine there are three teams, A, B, and C. Team A is 1-9, and teams B and C are both 1-0. The combined winning percentage of these teams is 3/12 = 0.25. But if we take the average of the averages, as RPI does, we get (0.1 + 1 + 1)/3 = 0.7. This does not make sense to me unless all teams play exactly the same number of games.
- Excluding the team from OWP but not OOWP: To calculate the OWP for a team, that team is excluded from the calculation. But that same team is added back in when calculating the OOWP. Why?
- Arbitrary linear weighting: Where did the 0.25, 0.5, 0.25 numbers come from? This again seems entirely arbitrary. (UPDATE: Thanks to the wonderful people of the internet, the answer can be found here.)
- OWP gets the most weight: Why is opponents’ winning percentage more important than your own winning percentage? Boosting your RPI is simple: just play good teams. It doesn’t even matter if you lose, as long as your opponents just keep winning. (See my example below.)
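The "averaging averages" point is easy to check numerically. This short sketch uses the three records from that bullet (A at 1-9, B and C at 1-0); the function names are made up for illustration.

```python
# (wins, losses) for the three teams in the averaging example
records = {"A": (1, 9), "B": (1, 0), "C": (1, 0)}


def pooled_pct(records):
    """Winning percentage of the group as a whole: total wins / total games."""
    wins = sum(w for w, _ in records.values())
    played = sum(w + l for w, l in records.values())
    return wins / played


def mean_of_pcts(records):
    """Average of each team's own winning percentage, which is what RPI uses."""
    return sum(w / (w + l) for w, l in records.values()) / len(records)


print(f"pooled: {pooled_pct(records):.2f}")
print(f"average of averages: {mean_of_pcts(records):.2f}")
```

The two numbers only agree when every team has played the same number of games, because the plain average gives a 1-0 team the same say as a 1-9 team.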
A silly example of the RPI gone crazy
I spent this morning trying to come up with silly examples of the RPI. If I’ve coded everything correctly (if you find a mistake please let me know), here’s one example of the RPI gone wild:
Let’s say there are 5 teams: A, B, C, D, and E. A beats C twice, B beats D twice, C beats D twice, A beats E twice, and B beats E twice. (In each set of two games, one game was home and one was away for each team.)
- A: 4-0 (Home 2-0, Away 2-0)
- B: 4-0 (Home 2-0, Away 2-0)
- C: 2-2 (Home 1-1, Away 1-1)
- D: 0-4 (Home 0-2, Away 0-2)
- E: 0-4 (Home 0-2, Away 0-2)
Before you look below, try to make a reasonable ranking of these teams in your head. Write this down and come back to it.
The winning percentages are 1 for A and B, 0.5 for C, and 0 for D and E. The OWPs are 1 for E, 0.5 for A, C, and D, and 0 for B. The OOWPs are 0.875 for B, 0.750 for A, 0.5 for C, 0.250 for D, and 0.125 for E.
When we apply the 0.25, 0.5, and 0.25 linear weights to WP, OWP, and OOWP, respectively, we get the following RPI results:
- A: 0.6875
- E: 0.53125
- C: 0.5
- B: 0.46875
- D: 0.3125
Team A ranked first makes sense. They were 4-0 and beat C and E. D ranked last also makes sense. They were 0-4 and lost to B and C. But the three teams in the middle make no sense. Team E is ranked 2nd with NO wins. They are above C, who is 2-2 with both losses coming against team A. Further, and here is the big finish, team E, at 0-4, is rated above undefeated team B in spite of the fact that team B beat E twice! That makes no sense. You could go on constructing these scenarios all day. It’s really not that difficult to make crazy scenarios happen with the RPI formula. What’s happening here is that team E is being inflated by their OWP, which is the largest component of the RPI, more important than even their own winning percentage. The RPI makes no sense.
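For anyone who wants to check the numbers, here is a rough Python sketch of the five-team example. It is not the code originally used for this post, and every name in it is made up. Two assumptions: losses are weighted inversely to wins (home loss 1.4, away loss 0.6), and the OOWP step averages, for each opponent, the unadjusted winning percentages of that opponent's opponents. That reading of "include games against the team" reproduces the OOWP values listed above.

```python
# Weight of a game by the winner's site: home win 0.6, away win 1.4.
WEIGHT = {"H": 0.6, "A": 1.4}


def series(winner, loser):
    """Two games, with the winner at home once and away once."""
    return [(winner, loser, "H"), (winner, loser, "A")]


games = (series("A", "C") + series("B", "D") + series("C", "D")
         + series("A", "E") + series("B", "E"))
teams = sorted({t for w, l, _ in games for t in (w, l)})


def opponents(team):
    """Opponents of `team`, listed once per game played."""
    return [l if w == team else w for w, l, _ in games if team in (w, l)]


def plain_wp(team, exclude=None):
    """Unweighted winning percentage, optionally ignoring games against `exclude`."""
    played = [w for w, l, _ in games if team in (w, l) and exclude not in (w, l)]
    return sum(w == team for w in played) / len(played) if played else 0.0


def weighted_wp(team):
    wins = sum(WEIGHT[s] for w, _, s in games if w == team)
    # Assumption: a loss carries the same weight as the winner's win
    # (winner at home means an away loss, 0.6; winner away means a home loss, 1.4).
    losses = sum(WEIGHT[s] for _, l, s in games if l == team)
    return wins / (wins + losses) if wins + losses else 0.0


def mean(values):
    return sum(values) / len(values)


def owp(team):
    return mean([plain_wp(o, exclude=team) for o in opponents(team)])


def oowp(team):
    # Assumption: no exclusion at this level, so games against `team` count.
    return mean([mean([plain_wp(x) for x in opponents(o)]) for o in opponents(team)])


def rpi(team):
    return 0.25 * weighted_wp(team) + 0.5 * owp(team) + 0.25 * oowp(team)


for team in sorted(teams, key=rpi, reverse=True):
    print(team, rpi(team))
```

Sorting by `rpi` puts the teams in the order A, E, C, B, D, matching the list above.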
What is my point?