Silver’s predictions for each state are made in terms of probabilities. So sometimes he’ll get them right and other times he’ll get them wrong, purely by random chance. But how much should we expect from him? Should we expect him to get every single one of his predictions correct?
Let’s assume that the probabilities he reports are 100% accurate (that’s a huge assumption, but I’m making it) and simulate the election thousands of times. Based on these simulations, how many of Silver’s predictions do we expect him to get wrong?
Using Silver’s 56 state presidential win probabilities from last Friday, we can take Silver’s prediction in each state to be the candidate with the higher probability. Then, using those same probabilities, we can simulate an election and compare the simulated results to Silver’s predictions. Running this simulation thousands of times, we can count how many states Silver gets wrong in each simulated election. According to these simulations, Silver gets every state correct only about 5% of the time. So it’s very likely that he gets at least one state wrong. It’s to be expected. At the other end of the spectrum, there is only about a 1.6% chance that Silver gets more than 5 states wrong.
This means we should expect that Silver, about 93.4% of the time, will get somewhere between 1 and 5 states incorrect. In 55% of the simulations, more than half, Silver gets either 2 or 3 states incorrect, and the average number of incorrect predictions in my simulations was 2.4522.
So, what should we expect from Silver? We should expect him to miss at least one state, and most likely to miss 2 or 3 states.
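The simulation described above can be sketched in a few lines of Python. This is a minimal sketch, not the code behind the numbers in this post: the probabilities in `example_probs` are made-up placeholders rather than Silver’s actual figures, the function names are illustrative, and each state is treated as independent of the others, which is an assumption.

```python
import random


def expected_misses(win_probs):
    """Exact expected number of wrong calls when the favorite is picked in every state."""
    return sum(min(p, 1.0 - p) for p in win_probs)


def prob_all_correct(win_probs):
    """Exact chance that every favorite wins, ASSUMING states are independent."""
    result = 1.0
    for p in win_probs:
        result *= max(p, 1.0 - p)
    return result


def simulate_miss_counts(win_probs, n_sims=10000, seed=0):
    """Simulate n_sims elections; return the number of missed states in each one."""
    rng = random.Random(seed)
    counts = []
    for _ in range(n_sims):
        misses = 0
        for p in win_probs:
            favorite_prob = max(p, 1.0 - p)  # chance the predicted winner actually wins
            if rng.random() >= favorite_prob:
                misses += 1
        counts.append(misses)
    return counts


# HYPOTHETICAL win probabilities for 56 contests (NOT Silver's real numbers).
example_probs = [0.99] * 40 + [0.90] * 8 + [0.80] * 4 + [0.65] * 3 + [0.52]

counts = simulate_miss_counts(example_probs)
average_misses = sum(counts) / len(counts)
share_perfect = counts.count(0) / len(counts)
print("average misses:", average_misses)
print("share of simulations with no misses:", share_perfect)
```

With the real probabilities plugged in for `example_probs`, `average_misses` and `share_perfect` would play the roles of the 2.4522 average and the roughly 5% perfect-record figure quoted above. The two exact helpers give a quick sanity check on the simulated values.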
Many people may be aware of the so-called Redskins Rule for predicting presidential elections. Basically, the rule states:
If the Redskins win their last home game before the election, the party that won the previous election wins the next election; if the Redskins lose, the challenging party’s candidate wins.
This basically means that Romney is a virtual lock to win the presidency after the Redskins fell 21-13 to the Carolina Panthers at home. The Redskins Rule has successfully predicted the results of the presidential election 17 out of 18 times since 1937, failing only once, in 2004. 17 out of 18 is pretty good, but the rule has also been right only 50% of the time in the last 2 elections. Not so great recently. Football has had its run as the go-to sport for projecting presidential elections, but it’s time for it to step aside and let America’s pastime take over predicting political outcomes. After all, who knows America better than baseball?
Consider this: Boston has won 75 or more games AND Philly has won 91 or fewer games in the same season in 11 of the last 16 presidential election years (1948, 1952, 1956, 1968, 1972, 1980, 1984, 1988, 1996, 2000, and 2004). In 9 of these 11 years, the Republican party won the presidency. The two exceptions were Clinton in 1996 and Truman in 1948. Conversely, Boston has won fewer than 75 games OR Philly has won 92 or more games in 5 of the last 16 presidential election years (1960, 1964, 1976, 1992, and 2008). The Democratic party has won all five of those elections. So, if either Boston is terrible or Philly is very good, the Democrat wins. Otherwise the Republicans win with high probability. Using just Boston and Philadelphia, the model correctly chooses the winner 14 out of 16 times.
But let’s take a closer look at what went wrong in the two years that were misclassified: 1948 and 1996. What happened the year Clinton ran for re-election? The Indians won 99 games. And what happened in 1948? The Indians won 97 games. In fact, in the past 16 presidential election years, the Indians have won more than 93 games only twice. And both of those years were misclassified using only Boston and Philadelphia as predictors. By adding Cleveland to the model, we can now correctly classify all of the past 16 presidential elections. In hindsight it seems obvious that we need to add the Indians to our model, as everyone knows how important it is to include Ohio in any model predicting the presidential race. (I wonder if “Dewey defeats Truman” could have been avoided entirely if the newspapers had relied on the number of Indians regular-season wins rather than polls based on biased samples.) The point here is that Ohio really is the most important state in the electoral college.
So, let’s review the three events that need to occur for a Republican to win the presidency and see how they apply to this year’s race:
1. The Phillies need to win fewer than 92 games.
This is great for Romney as the Phillies were a completely average 81-81.
2. The Indians need to win fewer than 97 games.
The Indians were atrocious in 2012, finishing with a record of 68-94, more than 25 games below the 97-game barrier. This is the second step toward a Romney victory.
3. The Red Sox need to win at least 75 games.
Now, as every Red Sox fan knows, the Red Sox did not win 75 games this year. In fact, they finished 69-93, so we have to predict an Obama victory, even in the face of the Redskins rule.
So, for the record: I’m predicting an Obama victory.
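The three conditions above amount to a simple decision rule, which can be written out as a short Python sketch. The function name `baseball_rule` is made up for illustration; the thresholds and the 2012 win totals are the ones given in this post.

```python
def baseball_rule(red_sox_wins, phillies_wins, indians_wins):
    """Apply the three-team rule: all three conditions must hold for a Republican win."""
    republican = (
        red_sox_wins >= 75       # Boston wins at least 75 games
        and phillies_wins < 92   # Philadelphia wins fewer than 92 games
        and indians_wins < 97    # Cleveland wins fewer than 97 games
    )
    return "Republican" if republican else "Democrat"


# 2012 win totals quoted in this post: Red Sox 69-93, Phillies 81-81, Indians 68-94.
prediction_2012 = baseball_rule(red_sox_wins=69, phillies_wins=81, indians_wins=68)
print(prediction_2012)
```

Only the Red Sox condition fails for 2012, which is why the rule lands on the Democrat.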
However, this is only where things start to get interesting. For a team to win 76 games, they need a winning percentage of 46.91%, and the Red Sox finished with a winning percentage of 42.6%, missing the mark needed to ensure victory for Romney. Would anyone care to hazard a guess what the Red Sox winning percentage was at the end of August? The Red Sox were 62-70, for a winning percentage of 46.97%, on August 31. They were on pace to win exactly 76 games and ensure a Republican triumph. However, they went on to finish the season 7-23, missing the win total needed to ensure a Romney presidency. Now, clearly, an organization as sophisticated as the Red Sox would be aware of such an obvious statistical relationship as this one; Bill James wasn’t born yesterday, after all. Further, the Red Sox and everyone else were aware at the beginning of the 2012 season that neither the Phillies nor the Indians were likely to reach the magic win marks for the Democratic Party. Therefore, it seems likely, even probable, that the Red Sox goal in 2012 was to lose at least 88 games in order to ensure a second Obama term.
Does that sound crazy? Well, here are two key pieces of evidence to support my argument: Bobby Valentine and “the trade”. Clearly, if they were trying to win baseball games, a baseball organization wouldn’t hire Bobby Valentine. Secondly, the Red Sox traded away some of the core players of their team. Specifically, on August 25, the Red Sox traded away Adrian Gonzalez, Josh Beckett, and Carl Crawford to the Los Angeles Dodgers. Sure, the Red Sox front office will claim that it was to “dump salary”, but isn’t it possible that the Red Sox front office realized that on August 25 their record was 60-66, on pace to win 77 games, and made the trade to affect the outcome of the election in November?
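For anyone who wants to check the pace arithmetic, here is a small Python sketch. The helper name `projected_wins` is made up for illustration, and it assumes a 162-game season with the current winning percentage holding for the rest of the year.

```python
def projected_wins(wins, losses, season_games=162):
    """Project a full-season win total from the current winning percentage."""
    return wins / (wins + losses) * season_games


aug_31_pace = projected_wins(62, 70)  # record on August 31, per the post
aug_25_pace = projected_wins(60, 66)  # record on the day of the trade, per the post
final_pct = 69 / 162                  # final 69-93 record

print(round(aug_31_pace, 1), round(aug_25_pace, 1), round(100 * final_pct, 1))
```

Rounded to whole games, the two paces come out to 76 and 77 wins, matching the figures in the text.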
Just think about the two parties involved in the trade: Boston, which is the capital of one of the most liberal states in the country, involved in a trade with Los Angeles, which is in a state that could not be more blue if it tried. So, like I said, were the Red Sox JUST a terrible baseball team in 2012 who hired a manager who has never won anything and traded away some of its best players to get some expensive contracts off the books? Maybe. But isn’t that also exactly the story you’d use to cover your tracks if you were the owner of a baseball team who was intentionally trying to lose at least 88 games in order to guarantee a victory and second term for Barack Obama? I’m just saying.
See you back in reality on November 7.
From Deadspin: Nate Silver’s Braying Idiot Detractors Show That Being Ignorant About Politics Is Like Being Ignorant About Sports
The article also pointed me to the Princeton Election Consortium, which is also fantastic. They have the probability of an Obama win at 99.0% and predict an electoral college win of 315-223. Below are some of the graphs they have produced about the election. I especially like the 2012 Electoral College Map with each state’s area displayed proportional to its electoral votes.
The tables below are for the Google and Yahoo search “Michele Bachmann ” (including a space after the last name) for the various dates indicated in the table. Each column has the date of the search and the top five Google or Yahoo auto-complete terms for the search.
Michele Bachmann – Google
| 8-17-2011 | 8-22-2011 | 8-23-2011 | 8-29-2011 | 8-31-2011 – present |
|---|---|---|---|---|
| elvis | corn dog | slavery | slavery | husband gay |
| bio | slavery | husband gay | husband gay | hot |
Michele Bachmann – Yahoo
| hot | hurricane | hurricane | new hair | campaign manager |
| for president | sarasota | irene | margaret thatcher | hurricane irene |
| bio | for president | for president | for president | for president |
What does all this mean? I have no idea, but I suspect it will be difficult to win a Republican party nomination and then a general election with terms like “slavery” and “husband gay” attached to your name.
Another thought: I wonder if the political affiliations of users are constant across the three major search sites or are there a greater percentage of liberals on Bing than on Google, for instance. Could you use auto-complete terms to gain any insight into this? Or is this type of information perhaps already available?
If you don’t want to read this whole thing, just check out the graph: Multidimensional Scaling: Republican Candidates – 8/16/2011
I was having a conversation with some friends today and someone mentioned that Rick Perry might have problems in the election because there were rumors he was gay. So I went to Google and typed in “Rick Perry is”, and Google kindly offered me the following auto-complete options: “gay”, “an idiot”, “a rino”, “evil”, “not a conservative”. This got me thinking about how this compared with the other candidates’ Google auto-completes. For instance, if you google “Mitt Romney is” you get suggestions like “a mormon” and “an idiot” as well as three other suggestions. I did this for all of the major candidates (sorry Thaddeus) and recorded the five Google auto-complete suggestions.
Then I created a vector for each candidate based on the Google auto-complete words. Each candidate was an observation and each word was a variable. The candidate would get a 5 if the word was first on their list, a 4 if it was second, and so on, with a 0 if the word was not mentioned in their auto-complete.
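The scoring scheme can be sketched in Python as follows. This is an illustrative sketch, not the original code: the function name `score_vector` is made up, Rick Perry’s list is the one quoted above, and the second candidate’s list is a hypothetical placeholder.

```python
def score_vector(suggestions, vocabulary):
    """Score 5 for the first suggestion, 4 for the second, ..., 1 for the fifth, else 0."""
    scores = {word: 5 - rank for rank, word in enumerate(suggestions[:5])}
    return [scores.get(word, 0) for word in vocabulary]


autocompletes = {
    # Taken from the post.
    "Rick Perry": ["gay", "an idiot", "a rino", "evil", "not a conservative"],
    # HYPOTHETICAL list, for illustration only.
    "Candidate B": ["an idiot", "hot", "stupid", "a mormon", "awesome"],
}

# One variable (column) per distinct word seen across all candidates.
vocabulary = sorted({word for words in autocompletes.values() for word in words})

# One observation (row) per candidate.
vectors = {name: score_vector(words, vocabulary) for name, words in autocompletes.items()}

for name, vector in vectors.items():
    print(name, vector)
```

Each row has one entry per word in `vocabulary`, so candidates who share a word at similar ranks end up with similar vectors.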
I then used multidimensional scaling (the cmdscale function in R) to visually display the positions of the candidates relative to each other. This all led to this graphic: Multidimensional Scaling: Republican Candidates – 8/16/2011. The location of each circle is based on multidimensional scaling, the size of each circle reflects the candidate’s standing in a national poll taken from fivethirtyeight.com, and the top five Google auto-completes are displayed in or near the appropriate circle.
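For readers without R, here is a rough pure-Python sketch of classical multidimensional scaling, the technique that cmdscale implements. It is not the code used for the graphic: the score vectors are hypothetical, the use of Euclidean distance between vectors is an assumption, and the eigenvectors are found by simple power iteration rather than a proper linear algebra library.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classical_mds(dist, dims=2, iters=1000, seed=0):
    """Classical MDS: double-center the squared distances, keep the top eigenvectors."""
    n = len(dist)
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in sq]
    total_mean = sum(row_mean) / n
    # B = -1/2 * J * D^2 * J, where J is the centering matrix.
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + total_mean)
          for j in range(n)] for i in range(n)]
    rng = random.Random(seed)
    coords = [[0.0] * dims for _ in range(n)]
    for d in range(dims):
        # Power iteration for the dominant remaining eigenvector of B.
        v = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        for _ in range(iters):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        eigval = sum(v[i] * sum(b[i][j] * v[j] for j in range(n)) for i in range(n))
        scale = math.sqrt(max(eigval, 0.0))
        for i in range(n):
            coords[i][d] = scale * v[i]
        # Deflate B so the next pass finds the next eigenvector.
        for i in range(n):
            for j in range(n):
                b[i][j] -= eigval * v[i] * v[j]
    return coords


# HYPOTHETICAL score vectors (one per candidate), for illustration only.
score_vectors = {
    "Candidate A": [5, 4, 3, 0, 0],
    "Candidate B": [4, 5, 0, 3, 0],
    "Candidate C": [0, 5, 0, 0, 4],
}
names = list(score_vectors)
dist = [[euclidean(score_vectors[a], score_vectors[b]) for b in names] for a in names]
coords = classical_mds(dist)
for name, (x, y) in zip(names, coords):
    print(name, round(x, 3), round(y, 3))
```

The resulting coordinates are only defined up to rotation and reflection, so a plot made this way may be flipped or turned relative to one produced by cmdscale while showing the same relative positions.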
- Every single candidate has the term “an idiot” as either the first or second auto-complete term
- 3 candidates were listed as “hot” (Palin, Bachmann, and Romney)
- “stupid” was only used to describe women
- Perry and Santorum (who has a much bigger Google problem than anything I’ve listed here) had “gay” listed in their auto-completes and Pawlenty had “definitely not gay”
- Bachmann’s and Palin’s circles are nearly identical in size (11.7% and 11.4%, respectively) and words (they share “an idiot”, “hot”, and “stupid”)
- “a douchebag” appears in auto-completes for Santorum, Gingrich, and Pawlenty. I imagine it will be hard to win with this word attached to your name. (John Kerry couldn’t do it.)
- The only overwhelmingly positive Google auto-complete was for Herman Cain, whose fifth auto-complete option was “awesome”