## Silver wins Gold: Ranking the poll aggregators in the 2016 presidential election

It’s taken me a few days to write this because I’ve been basically unconscious for the last 3 days.  To recap, Hillary Clinton, certainly a weak candidate, but also clearly the most qualified candidate to ever run for president, got hundreds of thousands (and when it’s all said and done likely millions) more votes than Donald J. Trump, a racist, sexist xenophobe who doesn’t understand the Constitution.  But the latter will be the president because we choose presidents based on a system that was created in a time when England had a king, some people owned other people, and nobody had scientific evidence of germs.

The silent majority isn’t a majority; it’s just an arbitrarily, geographically well-located minority.  I know this makes you want to scream into a pillow or punch a wall, but if you want to do something productive instead, here are some suggestions.

Anyway, the point of this post is to review the six poll aggregators that made numeric predictions for each state and were compiled by the New York Times’ The Upshot: New York Times (NYT), FiveThirtyEight (538), Huffington Post (HuffPost), PredictWise (PW), Princeton Election Consortium (PEC), and Daily Kos (DK).  (The raw data can be found on my GitHub page in the repo Statsinthewild.)

Below is a sweet plot that I made comparing the predictions of these six aggregators.  I’ve ordered the states from most red to most blue based on the average of the six predictions from November 5, three days before the election.  Then for each state I plotted a boxplot for the distribution of the 6 predictions and overlaid the individual predictions on top.  The boxplots are colored blue if the state (or district) went to Clinton and red if it went to Trump.

What immediately stands out for me on this plot is how much lower the Clinton win probabilities were for 538 compared to the other five sets of predictions for states starting at Nevada and moving right on the plot towards bluer states.  Other notable outlying predictions include Huffington Post’s predictions for Florida and North Carolina, which were 97% and 89%, respectively.  The New York Times had some outlying probabilities that were high for Clinton in states like Mississippi and Missouri as well as Utah and Georgia.  FiveThirtyEight had many outlying probabilities for the “blue” states, but their most notable outlier for the red states was Alaska, where they gave Clinton a 26% chance of winning.  The next highest probability for Clinton in Alaska was 10%.

So now let’s analyze who was the best.  I’m going to do this in two ways: Brier Score and Logarithmic Loss.  I computed results based only on the 50 states and Washington, D.C., ignoring the weird districts in Maine and Nebraska.  Results are below:

| Average Rank | Poll Aggregator | Brier Score | Log Loss |
|---|---|---|---|
| 1 | FiveThirtyEight | 0.066 | 0.216 |
| 2 | PredictWise | 0.074 | 0.259 |
| 3.5 | New York Times | 0.088 | 0.281 |
| 3.5 | Princeton Election Consortium | 0.089 | 0.272 |
| 5 | Daily Kos | 0.091 | 0.402 |
| 6 | Huffington Post | 0.104 | 0.446 |

The worst of the poll aggregators was the Huffington Post.  This looks to be because of their overconfidence in Clinton in several states that Trump won.  For example, they had Pennsylvania and Wisconsin at >99% and 98% for Clinton, both of which she lost.  Daily Kos comes in 5th with a Brier score similar to those of Princeton Election Consortium and the New York Times, but a much worse Log Loss.  Log loss punishes you heavily for being overconfident and wrong, and with predictions like Michigan and Wisconsin at >99% and 99%, respectively, the Daily Kos got crushed by Log Loss.  Next up we have the New York Times and Princeton Election Consortium, who finished 3rd and 4th, respectively, using Brier score (0.088 vs. 0.089).  However, they flip-flop rankings when using Log Loss.  Next up, and claiming the Silver medal, is the market site PredictWise with a Brier score of 0.074 and a Log Loss of 0.259.
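To make the difference between the two scores concrete, here is a minimal Python sketch.  It assumes the usual binary definitions (mean squared error for the Brier score, mean negative natural-log likelihood for Log Loss); the exact variant behind the table isn't spelled out in this post, and the probabilities below are made up for illustration, not taken from any aggregator.

```python
import math

def brier_score(probs, outcomes):
    """Mean squared difference between forecast probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)

def log_loss(probs, outcomes, eps=1e-15):
    """Mean negative log-likelihood; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for p, y in zip(probs, outcomes):
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(probs)

# Hypothetical Clinton win probabilities for three states (illustrative only),
# and whether she actually won (1) or lost (0) each one.
cautious = [0.55, 0.70, 0.90]
confident = [0.99, 0.98, 0.90]
outcomes = [0, 0, 1]

for name, probs in [("cautious", cautious), ("confident", confident)]:
    print(name, round(brier_score(probs, outcomes), 3), round(log_loss(probs, outcomes), 3))
```

Both scores get worse for the confident forecaster, but Log Loss gets worse by a much larger factor, because the penalty for a miss grows without bound as the stated probability approaches 100%.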

So who was the big “winner” of this Election?  Nate Silver.  A few days before the election I said that for him to look good in this election he needed it to be close or have Trump actually win.  Well, Trump won, and Silver was the only person who really gave Trump any chance of winning.  On top of that, his state-by-state predictions outperformed all of the other poll aggregators, and I’m crowning Nate Silver the champion of poll aggregators for the 2016 presidential election.  What Silver did better than any of the other models was that when a state was truly a toss-up, his model reflected it.  He had North Carolina, for instance, at 50% and Florida at 51% FOR the Republicans on November 5.  The only other set of predictions to get close to those numbers was PredictWise, which had North Carolina at 63% and Florida at 53% for Democrats.

Finally, here is a plot of the six poll aggregators with their Log Loss score on the x-axis and their Brier score on the y-axis.  Scores that are on the lower left are best and scores on the upper right are the worst.

P.S. Here is a list of articles critical of Nate Silver before the election from the Huffington Post, Fortune, Vox, the Washington Post, Huffington Post again, Mashable and Wired.  Whoops.

Cheers.

## NFL Rankings – Week 13

Rank Team Prosp Retro PredMargin Change SeasonHigh/Low Rank W L
1 DAL 56 73 0.684  – 1/13  11  1
2 ATL 55 67 0.492  – 2/5  7  4
3 OAK 45 64 -0.939  +2 3/27  9  2
4 NE 64 62 3.589  +2 2/13  9  2
5 KC 54 62 1.253  +3 4/25  8  3
6 DEN 63 62 2.466  -2 3/9  7  4
7 WAS 53 61 -0.751  -4 3/18  6.5  4.5
8 SEA 71 56 3.683  -1 7/19  7.5  3.5
9 SD 49 56 -0.507  +4 4/17  5  6
10 NYG 40 55 -0.312  – 10/20  8  3
11 PHI 50 55 0.084  -2 1/11  5  6
12 BUF 57 54 0.197  -1 4/26  6  5
13 NO 60 54 0.847  +4 13/27  5  6
14 PIT 58 52 1.844  +8 6/25  6  5
15 MIA 39 51 -1.207 15/30  7  4
16 DET 49 51 -0.415 15/25  7  4
17 MIN 50 50 0.192  -5 1/17  6  6
18 TEN 39 50 -2.117  +2 16/29  6 6
19 GB 58 49 2.259  +6 7/25 5 6
20 BAL 52 48 0.152  +4 8/26  6  5
21 CAR 63 47 2.270  – 3/27  4  7
22 HOU 51 47 -0.166  -4 14/22  6 5
23 ARI 55 47 1.847  -4 1/23  4.5  6.5
24 TB 44 47 -1.705  +2 24/32  6  5
25 IND 40 46 -0.794  -11 14/29  5  6
26 LA 36 43 -2.004  -3 6/26  4  7
27 CIN 55 41 1.442  – 17/28  3.5  7.5
28 NYJ 48 37 -0.577  – 11/30  3  8
29 JAC 31 34 -3.275  +1 24/32  2  9
30 CHI 45 34 -1.939  -1 24/31  2  9
31 SF 41 24 -2.671  – 28/32  1  10
32 CLE 31 24 -3.922  – 20/32  0  12

NFL Strengths by week:

## Dallas at Minnesota

Prediction: Vikings 22-21 (53.6%)

Pick: Vikings +3.5

Total: Under 43.5

## Washington at Arizona

Prediction: Cardinals 26-21 (62.2%)

Pick: Cardinals -2.5

Total: Under 49.5

## Kansas City at Atlanta

Prediction: Falcons 24-23 (52.9%)

Pick: Chiefs +3.5

Total: Under 49.5

## Miami at Baltimore

Prediction: Ravens 22-19 (58.8%)

Pick: Dolphins +3.5

Total: Over 41

## San Francisco at Chicago

Prediction: Bears 23-20 (57.2%)

Pick: Bears -2.5

Total: Under 43.5

Prediction: Bengals 25-22 (58.9%)

Pick: Bengals +2.5

Total: Over 41.5

## Houston at Green Bay

Prediction: Packers 24-20 (61.7%)

Pick: Texans +4.5

Total: Under 46.5

## Denver at Jacksonville

Prediction: Broncos 23-19 (61.1%)

Pick: Jaguars +5

Total: Over 42

## Los Angeles (nee St. Louis) at New England

Prediction: Patriots 26-18 (70.1%)

Pick: Rams +13.5

Total: Under 44.5

## Detroit at New Orleans

Prediction: Saints 27-24 (58.6%)

Pick: Lions +5

Total: Under 53.5

## Buffalo at Oakland

Prediction: Raiders 22-21 (51.8%)

Pick: Bills +3

Total: Under 49.5

## NY Giants at Pittsburgh

Prediction: Steelers 25-21 (61.1%)

Pick: Giants +6

Total: Under 49.5

## Tampa Bay at San Diego

Prediction: Chargers 25-22 (58.4%)

Pick: Buccaneers +4

Total: Under 47.5

## Carolina at Seattle

Prediction: Seattle 22-19 (58.8%)

Pick: Panthers +7

Total: Under 44.5

## Indianapolis at NY Jets

Prediction: Jets 23-21 (55.6%)

Pick: Jets +1

Total: Under 49.5

## AFC

AFC East

New England: 12-4 (9-2) 11.82

Buffalo: 9-7 (6-5) 8.949

Miami: 8-8 (7-4) 8.549

NY Jets: 6-10 (3-8) 5.542

AFC North

Pittsburgh: 8-8 (6-5) 8.401

Baltimore: 7-9 (6-5) 7.045

Cincinnati: 6-9-1 (3-7-1) 6.112

Cleveland: 1-15 (0-12) 0.788

AFC South

Houston: 7-9 (6-5) 7.457

Indianapolis: 7-9 (5-6) 7.401

Tennessee: 7-9 (6-6) 7.348

Jacksonville: 3-13 (2-9) 3.52

AFC West

Kansas City: 10-6 (8-3) 10.101

Oakland: 10-6 (9-2) 9.916

Denver: 9-7 (7-4) 9.362

San Diego: 8-8 (5-6) 7.698

## NFC

NFC East

Dallas: 13-3 (11-1) 12.759

NY Giants: 9-7 (8-3) 9.145

Washington: 8-7-1 (6-4-1) 8.120

NFC North

Detroit: 8-8 (7-4) 8.139

Minnesota: 8-8 (6-6) 7.942

Green Bay: 7-9 (5-6) 7.394

Chicago: 4-12 (2-9) 3.982

NFC South

Atlanta: 10-6 (7-4) 9.896

New Orleans: 8-8 (5-6) 7.802

Carolina: 7-9 (4-7) 7.149

Tampa Bay: 6-10 (6-5) 6.350

NFC West

Seattle: 10-5-1 (7-3-1) 10.232

Arizona: 7-8-1 (4-6-1) 7.46

Los Angeles (nee St. Louis): 6-10 (4-7) 5.565

San Francisco: 1-15 (1-10) 1.479

### Projected Playoffs

AFC

1. New England
2. Kansas City
3. Pittsburgh
4. Houston
5. Oakland
6. Denver

NFC

1. Dallas
2. Seattle
3. Atlanta
4. Detroit
5. NY Giants
6. Washington

Wildcard Round

(6) Denver over (3) Pittsburgh

(4) Houston over (5) Oakland

(3) Atlanta over (6) Washington

(4) Detroit over (5) NY Giants

Divisional Round

(1) New England over (6) Denver

(2) Kansas City over (4) Houston

(1) Dallas over (4) Detroit

(2) Seattle over (3) Atlanta

Conference Championships

(1) New England over (2) Kansas City

(2) Seattle over (1) Dallas

Super Bowl

(2) Seattle over (1) New England

## NFL Playoff Probabilities – Week 13

Team WinDivision MakePlayoffs MakeSuperBowl WinSuperBowl
ARI 6.8 20.9 1.4 0.7
ATL 86.5 94.0 17.4 8.2
BAL 17.1 17.5 2.1 0.8
BUF 0.9 43.7 2.9 1.4
CAR 2.0 11.3 0.5 0.3
CHI 0.1 0.1 0.1 0.1
CIN 5.4 5.7 0.7 0.2
CLE 0.0 0.0 0.0 0.0
DAL 98.5 100.0 36.2 18.3
DEN 20.0 62.0 9.2 4.9
DET 44.8 52.9 6.5 2.8
GB 21.4 26.8 3.2 1.8
HOU 36.2 36.7 4.1 2.1
IND 33.8 34.3 3.6 2.0
JAC 0.0 0.0 0.0 0.0
KC 46.0 83.1 15.2 7.6
MIA 2.8 25.6 1.2 0.3
MIN 33.7 43.5 4.9 2.2
NE 96.3 99.2 34.4 19.8
NO 10.9 29.8 2.0 1.0
NYG 1.3 76.5 4.6 1.4
NYJ 0.0 0.0 0.0 0.0
OAK 32.8 76.3 13.7 4.0
PHI 0.0 5.0 0.1 0.1
PIT 77.5 77.8 9.0 4.6
SD 1.2 7.6 0.5 0.2
SEA 93.2 96.8 20.1 12.2
SF 0.0 0.0 0.0 0.0
STL 0.0 0.0 0.0 0.0
TB 0.6 3.2 0.5 0.2
TEN 30.0 30.5 3.4 1.5
WAS 0.2 39.2 2.5 1.3

## Arizona at Atlanta

Prediction: Falcons 24-23 (50.9%)

Pick: Falcons +4

Total: Under 50.5

## Cincinnati at Baltimore

Prediction: Ravens 22-21 (50.4%)

Pick: Bengals +4.5

Total: Over 40.5

## Jacksonville at Buffalo

Prediction: Bills 24-18 (65.6%)

Pick: Jaguars +7.5

Total: Under 45.5

## Tennessee at Chicago

Prediction: Bears 23-20 (57.9%)

Pick: Bears +3.5

Total: Over 43

## NY Giants at Cleveland

Prediction: Giants 23-22 (52.7%)

Pick: Browns +7

Total: Over 44.5

## Washington at Dallas

Prediction: Dallas 26-22 (60.4%)

Pick: Washington Football Team +7

Total: Under 51

## Kansas City at Denver

Prediction: Broncos 23-19 (59.6%)

Pick: Broncos -3.5

Total: Over 39.5

## Minnesota at Detroit

Prediction: Lions 22-20 (55.4%)

Pick: Vikings +2.5

Total: Under 43

## San Diego at Houston

Prediction: Texans 23-21 (54.2%)

Pick: Texans EVEN

Total: Under 45.5

## Pittsburgh at Indianapolis

Prediction: Steelers 24-23 (51.0%)

Pick: Colts +3.5

Total: Under 49.5

## San Francisco at Miami

Prediction: Dolphins 23-19 (59.0%)

Pick: 49ers +7.5

Total: Under 45.5

## Los Angeles at New Orleans

Prediction: Saints 26-21 (63.0%)

Pick: Rams +7

Total: Over 45.5

## New England at NY Jets

Prediction: Patriots 24-22 (56.9%)

Pick: Jets +8

Total: Under 47

## Carolina at Oakland

Prediction: Panthers 23-21 (55.1%)

Pick: Panthers +4

Total: Under 50

Prediction: Packers 24-23 (50.2%)

Pick: Packers +3.5

Total: Under 47.5

## Seattle at Tampa Bay

Prediction: Seahawks 24-19 (62.3%)

Pick: Buccaneers +5.5

Total: Under 45

## Fun with Benford’s law: Election 2016 edition: What’s up with Iowa and Mississippi

Before I begin this post, I need to make it clear to the conspiracy lunatics out there that this is not evidence that the 2016 election was rigged.  As of right now, there is basically no evidence that this election was anything other than a massive, but fair, fuck up by the American people.  Could the election have been rigged?  Sure.  Anything, no matter how unlikely, is possible in the world we live in now.  (I mean, Donald Trump, a racist, xenophobic misogynist, was elected president of the United States of America, which was basically an impossibility like 3 weeks ago.)  But let me again say there is no evidence that the election was rigged.  And that includes this post.

Ok, now that we have that out of the way, let’s talk about Benford’s Law (tip of the hat to @mulderc for this idea).  Benford’s law states that in many real-world lists of numbers the leading digits are not uniformly distributed.  The digit 1 is expected to be first about 30% of the time, while the digit 9 is expected to be first only about 4.6% of the time.  Specifically, for a digit d between 1 and 9, the probability that the digit appears first is given by the following formula:

$p(d) = \log_{10}\left(1+\frac{1}{d}\right)$
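As a quick check on those percentages, here is a small Python sketch of the formula.  The function names are mine, chosen for illustration.  The same expression also covers the first-two-digits version if you let d run from 10 to 99.

```python
import math

def benford_prob(d):
    """Expected share of numbers whose leading digit(s) equal d under Benford's law."""
    return math.log10(1 + 1 / d)

def leading_digits(n, k=1):
    """First k digits of a positive integer n, or None if n has fewer than k digits."""
    s = str(n)
    return int(s[:k]) if len(s) >= k else None

# Expected distributions for the first digit and for the first two digits.
first_digit = {d: benford_prob(d) for d in range(1, 10)}
first_two = {d: benford_prob(d) for d in range(10, 100)}

for d, p in first_digit.items():
    print(d, round(100 * p, 1))
```

Each of the two dictionaries sums to 1, which is a handy sanity check that the digit ranges are right.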

So let’s apply this to the 2016 election.  I downloaded data on the 2016 election at the county level from here.  Using all of the data for each of the candidates I get the following two plots.  The height of the bar is what is actually observed and the red dots are what is to be expected by Benford’s law.  It really is amazing how well this distribution fits the data.

And if we go back to 2012, we see exactly the same thing.  Amazing.  Benford’s Law seems so counterintuitive, but it’s observed in so many different places.

Next I wanted to look at individual states.  This is problematic for at least a few reasons.  Most notably, there are some states that have very few counties in them (e.g. Massachusetts, Alaska, etc.).  So I went ahead and tested each state with at least 30 counties individually to see if their votes followed Benford’s Law.  I also went a step further, as Benford’s law can be extended to the first two digits, three digits, etc.  Below are the results of individual state goodness-of-fit tests for Benford’s law for 1-4 digits using a Bonferroni correction to control the family-wise error rate (FWER).  States where the null hypothesis is rejected for Trump only or for Clinton only are colored red and blue, respectively.  States where the null is rejected for both Clinton and Trump are colored purple.  I’m pretty sure I shouldn’t even be doing a Benford’s goodness-of-fit test on the first 3 or 4 digits when I only have 30 observations.  But I did it anyway.  I’d pay more attention to the plots for the digits 1 and 2.  On those plots we see that the null hypothesis for Trump’s totals was rejected in both Iowa and Mississippi for 1 digit, and in Mississippi only for 2 digits.  Let’s go look at Iowa and Mississippi in more detail.
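For anyone who wants to try this at home, here is a rough Python sketch of a first-digit test with a Bonferroni threshold.  It assumes a Pearson chi-square goodness-of-fit test, which may not be the exact test behind the plots, and `n_tests` is a placeholder for however many state-by-candidate tests you run.  The p-value uses the closed form for a chi-square with 8 degrees of freedom, so this version only covers the 1-digit case.

```python
import math
from collections import Counter

def benford_chi_square(counts_by_digit):
    """Pearson chi-square statistic of observed first-digit counts vs. Benford's law."""
    n = sum(counts_by_digit.values())
    stat = 0.0
    for d in range(1, 10):
        expected = n * math.log10(1 + 1 / d)
        observed = counts_by_digit.get(d, 0)
        stat += (observed - expected) ** 2 / expected
    return stat

def chi_square_sf_8df(x):
    """Upper-tail probability of a chi-square with 8 degrees of freedom (closed form for even df)."""
    h = x / 2
    return math.exp(-h) * (1 + h + h ** 2 / 2 + h ** 3 / 6)

def first_digit_test(vote_totals, n_tests, alpha=0.05):
    """Return (p_value, rejected) using the Bonferroni threshold alpha / n_tests."""
    digits = Counter(int(str(v)[0]) for v in vote_totals if v > 0)
    p = chi_square_sf_8df(benford_chi_square(digits))
    return p, p < alpha / n_tests
```

Calling `first_digit_test(county_totals, n_tests=50)` on a list of county vote totals (a hypothetical variable here) returns the p-value and whether it clears the corrected threshold.  With only 30 or so counties, several expected counts fall below 5, so the chi-square approximation is shaky, which is the same caveat as above.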

If we look at Trump’s and Clinton’s vote totals in Iowa, we get the plot below.  This is significantly different than Benford’s Law with a p-value of 0.0205.

Next I looked at each candidate’s vote totals individually.  The departure from Benford’s Law is entirely driven by Trump’s vote totals.  Trump has way fewer 1’s than expected and more than expected for 2 through 5.

Before you go flipping out about how this is evidence of election fraud, you should look at Iowa from 2012.  Basically we see the same thing with the Republican candidate.  Too few 1’s and more 2’s and 4’s than we expect.  My guess as to what is happening here is that the types of counties that Republicans are winning in Iowa are not expected to follow Benford’s law?  Is that plausible?   But I’d love to hear other ideas as to what is happening in Iowa.

Now let’s look at Mississippi.  When we look at Clinton and Trump together, there is nothing significant.  Though we do see far fewer 1’s than expected, just like in Iowa.

When we look at Trump and Clinton individually, we see that Clinton’s vote totals are not significantly different than Benford’s Law expects, but Trump’s are very different again with far too few 1’s and way too many 4’s.

Finally, here are the plots for the tests for 2012 using a Bonferroni correction to control the FWER.  Iowa shows up again in the 1-digit test, but Oregon shows up in the 2-digit test.

In conclusion, Benford’s law is fun and there’s something weird about Iowa and Mississippi.

Cheers.

## 2016 Presidential Election Maps

Well, the 2016 presidential election is over and Donald Trump is going to be the president of the United States.  (How do I feel about that?  Here are my open letters to Donald J Trump and the American people.)

You’ve probably seen a map that looks like this a bunch of times showing which candidate won each state.  It tells an interesting story about the United States in 2016 in that it is a nice summary of the idea that there are two Americas.  One exists largely on the east and west coasts and the other exists basically in the middle of the United States.

What I dislike about this plot is that it is far too simplistic.  A better way to look at it would be to shade in each state on a scale of red to blue.  That would look like this.  America isn’t really red and blue, it’s very purple.  One problem with this plot is that we don’t get any idea of the population in each of these states.

I added in the population by using the opaqueness and that plot looks like this.  Again, America isn’t really red and blue, it’s purple.

These plots are nice, but I’d like to drill down further.  So I downloaded data from Kaggle’s Data sets for the 2016 election at the county level to make this plot.  I much prefer this plot with a color scale for percentage of votes in each county, rather than giving each county to one candidate or the other.  The percentages here are computed by only considering votes cast for the two major party candidates and then calculating what percentage each of those two candidates received.  You’ll notice that the west coast and east coast are, not surprisingly, mostly blue and the middle of the country is red.  Some interesting exceptions are New Mexico, Arizona, and Colorado, which are much bluer than the other states that surround them.  There is also an interesting band of blue in the south that runs from North Carolina west through South Carolina, Georgia, Alabama, and Mississippi.  I can’t explain that, but I’d love to hear theories about that.  It’s also interesting to see just how purple the upper midwest is in states like Wisconsin, Iowa, and Minnesota.
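The two-party percentage is simple enough to write down.  Here is a small Python sketch of that calculation plus one possible way to turn it into a red-to-blue shade.  The linear color blend is my assumption for illustration, not necessarily the scale used in the map, and the county numbers are made up.

```python
def two_party_share(dem_votes, rep_votes):
    """Democratic share of the two-party vote, ignoring third-party ballots."""
    return dem_votes / (dem_votes + rep_votes)

def share_to_rgb(dem_share):
    """Blend linearly from pure red (share 0) to pure blue (share 1)."""
    return (1.0 - dem_share, 0.0, dem_share)

# Hypothetical county: 6,000 Democratic, 4,000 Republican, 500 third-party votes.
share = two_party_share(6000, 4000)   # the 500 third-party votes are left out
color = share_to_rgb(share)
print(share, color)
```

A county that splits 50/50 comes out as an even mix of red and blue, which is where all the purple comes from.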

However, there are problems with this map too, as it’s difficult to see the population of these counties since they are all presented the same way.  So in the next plot, I’ve used opaqueness to show population: more populated counties are more opaque and less populated counties are more translucent.  That plot looks like this below.  You can see from this plot not only the percentage of votes in each county, but also the population in those counties.  The most notable aspect of this plot is how much less red there is in this plot.  That’s a result of the red counties having much smaller populations than other counties.  I really, really like this plot.
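Here is a minimal Python sketch of the opacity idea.  It assumes opacity scales linearly with county population and that there is a small floor so tiny counties don’t vanish entirely.  Both choices are mine for illustration; a log scale would be an equally reasonable option.

```python
def population_alpha(population, max_population, min_alpha=0.1):
    """Scale opacity linearly with population, keeping small counties faintly visible."""
    return min_alpha + (1.0 - min_alpha) * (population / max_population)

def county_rgba(dem_share, population, max_population):
    """Red-to-blue color from the two-party share, with population-driven opacity."""
    alpha = population_alpha(population, max_population)
    return (1.0 - dem_share, 0.0, dem_share, alpha)

# Hypothetical counties: a large blue-leaning one and a small red-leaning one.
big_county = county_rgba(dem_share=0.65, population=1_000_000, max_population=1_000_000)
small_county = county_rgba(dem_share=0.25, population=5_000, max_population=1_000_000)
print(big_county, small_county)
```

The small red county ends up nearly transparent, which is why the map looks so much less red once population is factored in.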

Next I wanted to look at third party candidates by county and you get this.  You’ll see that Utah was the predominant state that voted for a third party candidate (Evan McMullin) as well as New Mexico (Gary Johnson) and parts of Idaho (McMullin).

Drilling down into individual third parties, you can see who was voting for Gary Johnson.  This was primarily centered in New Mexico, where Johnson is a former governor.  Johnson had very little support in the south.

Jill Stein did well in a few places in the United States; however, she did not make it onto the ballot in all 50 states.  The big pockets of Stein support are the northern coast of California and parts of Colorado.  I really like the juxtaposition of Vermont and New Hampshire next to each other.

Finally, here is a plot of McMullin support that was primarily in Utah and southern Idaho, which are heavily Mormon areas of the country.  There also seems to be moderate support for McMullin in Minnesota.

If you are interested in the code, you can find it here.  And the data is from here.

Cheers.

## Mad about Trump winning the electoral college, but losing the popular vote? Check out John Quincy Adams and Rutherford B. Hayes.

I’m still nauseous from the giant screw up America just made electing an entitled, unqualified man baby to be the leader of the free world.  The only thing that settles my stomach is looking at data.  (Or should I say daTUMS. Pun very much intended.)

Next I looked at vote margin against the percentage of the electoral college.  No one has ever lost the popular vote by more votes than Trump and it’s not even close.  Of course, it’s not really a fair comparison to compare Trump to Hayes, but when Bush lost the popular vote to Gore, it was at least relatively close.  Trump is going to end up losing the popular vote by possibly 2 million votes.  That’s a lot of people.

Here is what that plot looks like if you zoom in on the origin.  There is Donald Trump way out to the left by himself nearly 2 million votes behind Clinton.

Cheers.

## Gun Violence Data

Guns are a public health problem.

After digging around the internet looking for data on gun violence for a few minutes,  I found Gun Violence Archive which has a ton of great information on gun violence in the US. You can search for incidents by date, location, age of victim, kind of gun, and much more.  On top of that, you can download CSV files directly from the website.

Here’s what I made with 32 days worth of data:


## New Orleans at Carolina

Prediction: Panthers 27-23 (59.7%)

Pick: Panthers -3.5

Total: Under 51.5

## Buffalo at Cincinnati

Prediction: Bengals 23-20 (59.2%)

Pick: Bengals -3

Total: Under 47

## Pittsburgh at Cleveland

Prediction: Steelers 25-21 (60.0%)

Pick: Browns +9

Total: Under 49

## Baltimore at Dallas

Prediction: Cowboys 23-21 (55.7%)

Pick: Ravens +7

Total: Under 45

## Jacksonville at Detroit

Prediction: Lions 25-20 (59.7%)

Pick: Jaguars +6.5

Total: Under 47

## Tennessee at Indianapolis

Prediction: Colts 25-21 (59.7%)

Pick: Colts -3

Total: Under 52.5

## Tampa Bay at Kansas City

Prediction: Chiefs 24-19 (63.1%)

Pick: Buccaneers +7.5

Total: Under 44

## Arizona at Minnesota

Prediction: Cardinals 21-20 (50.1%)

Pick: Cardinals EVEN

Total: Over 41

## Chicago at NY Giants

Prediction: Giants 24-21 (58.2%)

Pick: Bears +7

Total: Under 47.5

## Houston at Oakland

Prediction: Raiders 21-20 (51.3%)

Pick: Texans +6

Total: Under 46

Prediction: Seahawks 24-19 (64.4%)

Pick: Eagles +6.5

Total: Under 44.5

## New England at San Francisco

Prediction: Patriots 25-21 (60.0%)

Pick: 49ers +13.5

Total: Under 51

## Miami at Los Angeles

Prediction: Rams 21-20 (53.2%)

Pick: Rams -1

Total: Over 40.5

## Green Bay at Washington

Prediction: Packers 25-23 (56.0%)

Pick: Packers +2.5

Total: Under 50.5