Pick: Patriots 27-26
Spread: Rams +2.5
Total: Under 56.5
Unrelated to this post: What time does the Super Bowl start? 5:30pm Central.
Moving on, below is a 2d histogram of frequencies of the last digits of the final score of every NFL game from 1920 through last year’s Super Bowl.
If I only use games from 2000 through the 2018 Super Bowl the 2d histogram looks like this.
Here are my picks for the NFL playoffs. Also, Ravens-Chargers shouldn’t be a first round game.
Texans over Colts, 26-22
Chargers over Ravens, 22-21
Cowboys over Seahawks, 21-20
Bears over Eagles, 27-18
Chiefs over Chargers, 28-25
Patriots over Texans, 24-21
Saints over Cowboys, 30-18
Rams over Bears, 29-22
Chiefs over Patriots, 29-27
Saints over Rams, 29-27
Saints over Chiefs, 30-27
The Bears just clinched the NFC North for the first time in, I want to say, 100 years, by beating the Green Bay Packers last weekend at Soldier Field. Their week 15 meeting was the second time these division rivals have played this season and their first meeting came way back in week 1 when Chicago blew a 20 point lead and they looked well on their way to a 5-11 season, while Aaron Rodgers looked like Superman. But that was a long time ago and everyone seems to have caught up to the idea that the Bears are good this year and Green Bay is not. And you can see this in the spreads for the two games.
In the first meeting at Lambeau Field in week 1, the Packers were 6.5 point favorites over the Bears, who covered despite losing in crushing fashion. Fourteen weeks later, the spread for the Bears-Packers game at Soldier Field was Bears -5.5. That is a shift of 12 points.
Now some of this has to do with home field advantage. If two teams were essentially equal on a neutral field, you’d expect this difference to be about 6-ish (-3 at home and +3 away). But 12 seemed rather large to me, and I wondered if that was the largest shift in spreads in a rematch this year. While it is not, in fact, the largest, it is close. There were two matchups that had a larger shift in spreads. Stop reading and try to guess what those match-ups were.
Ok. You ready now? The largest difference this year was the Titans and Jaguars. In September the home Jaguars were favored by 10 over the Titans in week 3. In week 13, the Titans at home were favored by 5.5, for a 15.5 point swing. Coming in at number 2 were Atlanta and New Orleans. In their first meeting in week 3, the Falcons were favored by 2. In week 11, the Saints were favored by 12.5. The previously mentioned Bears and Packers came in at 3rd largest with a shift of 12.0.
The only other two double digit shifts were Buffalo-NY Jets and Dallas-Philadelphia. The Bills vs Jets shift happened in only 4 weeks. In week 10, the Jets were favored by 7, then in week 14 the Bills were favored by 4.5. Rounding out the top five were the Cowboys and Eagles. In week 10, the Eagles were favored by 7 points, then by week 14 the Cowboys were favored by 3.5. Here is the list of all of the shifts of at least 5:
- Jacksonville – Tennessee: 15.5
- Atlanta – New Orleans: 14.5
- Chicago – Green Bay: 12.0
- Buffalo – New York Jets: 11.5
- Dallas – Philadelphia: 10.5
- Kansas City – LA Chargers: 7
- Dallas – Washington: 6
- Miami – New York: 6
- San Francisco – Seattle: 5.5
- Baltimore – Cincinnati: 5.5
- Cleveland – Pittsburgh: 5.5
- LA Chargers – Oakland: 5.0
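For anyone who wants to check the arithmetic, here is a rough sketch of how the top five shifts can be reproduced from the spreads quoted above. The function name and the tuple layout are my own for illustration. It assumes each line is written as the number of points the favorite is laying, so when the favorite flips between meetings the two lines add, and when the same team is favored both times the shift is just the difference.

```python
def spread_shift(first_favorite, first_line, second_favorite, second_line):
    """Point shift between two meetings of the same pair of teams.

    Lines are positive numbers: points by which the named favorite is favored.
    (Hypothetical helper, not taken from any library.)
    """
    if first_favorite == second_favorite:
        # Same favorite both times: the shift is the change in the line.
        return abs(first_line - second_line)
    # Favorite flipped: the line crossed zero, so the two lines add.
    return first_line + second_line


# (first favorite, line, second favorite, line), as quoted in the post.
rematches = {
    "Jacksonville - Tennessee": ("Jaguars", 10.0, "Titans", 5.5),
    "Atlanta - New Orleans": ("Falcons", 2.0, "Saints", 12.5),
    "Chicago - Green Bay": ("Packers", 6.5, "Bears", 5.5),
    "Buffalo - New York Jets": ("Jets", 7.0, "Bills", 4.5),
    "Dallas - Philadelphia": ("Eagles", 7.0, "Cowboys", 3.5),
}

shifts = {name: spread_shift(*lines) for name, lines in rematches.items()}

for name, shift in sorted(shifts.items(), key=lambda item: -item[1]):
    print(f"{name}: {shift}")
```

Note that this says nothing about home field. Under the rough 3 points per venue mentioned earlier, about 6 points of any flipped line would be explained by the change of stadium alone.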
I’ll follow up on this when the season ends, and I also want to go back and look at past seasons.
The Ringer published an article today entitled “The NFL’s Analytics Revolution Has Arrived” by Kevin Clark. The first section of the article is a relatively interesting overview of the state of advanced analytics in the NFL. But then everything goes downhill. And where does it start to go downhill? Right here:
“It is amazing,” Warren Sharp said, “how many teams anonymously follow me on Twitter.” Sharp is an engineer with his own analytics site and has been playing around with football statistics for about 20 years. He is among the top minds in football not working full time for a team.
Ok. First of all, why does this read as a press release promoting Warren Sharp? Second, let’s talk for a second about who Warren Sharp is. You might remember him from this blog post (which was picked up by Slate, the Wall Street Journal, and Huffington Post) about how “The New England Patriots Prevention of Fumbles is Nearly Impossible”. It turns out that the analysis was highly flawed. A colleague and I detailed the problems with the “analysis” over at Deadspin, and Neil Paine over at FiveThirtyEight.com did a great job summarizing the whole kerfuffle.
Sharp then basically claimed that he had been redeemed by the Wells Report, but that was not true either. In fact, in 2015, immediately after the league implemented stricter ball handling procedures to prevent footballs from being deflated, the Patriots still had the lowest fumble rate in the league. As Mike Lopez explains in Sports Illustrated:
In any case, the 2015 season makes for an excellent out-of-sample test with respect to New England’s fumble tendencies. Although the Patriots have been accused of going crazy lengths to gain a winning edge, it seems safe to assume that any suspect ball routine could not have been a part of the game-day preparation process this season. (The NFL implemented new procedures for inspecting game balls.) As a result, if one initially made the link between the Patriots low fumble rates and deflated footballs, the natural follow-up would be to assume that New England’s fumble rates would revert toward the league average in 2015.
So what happened in 2015?
• The Patriots had the fewest fumbles of any NFL offense.
• The Patriots had the best fumble rate of any NFL offense.
• The Patriots had one of their best fumble rates of the past decade.
Based on only this, it is my opinion that Warren Sharp is really not that great of a statistical analyst. And look, I make mistakes. Everyone makes mistakes. It’s basically impossible to do statistics without ever making a mistake. Humans are human after all. But what bothers me so much about Sharp is that he just seems to ignore the legitimate criticisms and doubles down.
But wait, there is more! In addition to this, Warren Sharp is a tout. While The Ringer generously promotes his site, Sharp Football Stats, they don’t seem to mention his other site, Sharp Football Analysis, where Sharp sells football picks to gamblers. (You can buy a season long membership for the low, low price of $250….) According to Sharp, his record, shown below, is a 59% winning percentage over 12 years, with a whopping 77% win percentage in Overs (which is somehow different than “Over Leans”).
When something seems too good to be true, it usually is. There is absolutely no way he’s correctly picked 59% of games against the spread over the course of 12 years. And here’s how you can tell this isn’t real: If he was picking 59% correctly over the course of 12 years, he wouldn’t be selling the picks. He wouldn’t need to because he’d be extremely wealthy and wouldn’t need your $250 membership fee. There are a few very good professional gamblers, but you’ve probably never heard of them (like Bill Benter, for example), and they certainly wouldn’t be selling their picks if their picks were any good because they could be making way more money betting on them (Benter made a BILLION dollars….with a “B”!). So his numbers are probably not the most truthful……
In fact, Game Advisers, which tracks handicappers’ plays, has Warren Sharp as 16-23-1 for a negative 23.41% ROI. Not quite the same as what Warren claims.
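To put a rough number on how good 59% would be, here is a back-of-the-envelope sketch. It assumes the usual -110 pricing on spread bets (risk 110 to win 100). That price is my assumption and not something from Sharp’s site. The function names are made up for illustration, and pushes are ignored.

```python
def break_even_win_rate(price=110, payout=100):
    """Win rate at which a bettor risking `price` to win `payout` breaks even."""
    return price / (price + payout)


def roi_per_bet(win_rate, price=110, payout=100):
    """Expected profit per unit risked at the given win rate (pushes ignored)."""
    expected_profit = win_rate * payout - (1 - win_rate) * price
    return expected_profit / price


print(f"Break-even win rate at -110: {break_even_win_rate():.4f}")
print(f"Expected ROI per bet at 59%: {roi_per_bet(0.59):.4f}")
```

Under those assumptions, break-even is a little over 52%, and a true 59% hit rate would return roughly 12 to 13 cents on every dollar risked, bet after bet, for 12 years. That is the kind of edge you bet on yourself instead of renting out for $250 a season.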
Also, apparently, he pissed someone off enough for them to start http://sharpfootballanalysistruth.blogspot.com. The blog has exactly one post:
One of the links in that blog post links to an entire thread about how Warren Sharp is a scam. A poster named Dr. H refers to him as a “sleazeball hack”……..his words, not mine.
And finally, a public service announcement from one of the covers.com forums:
So anyway, my point is that Sharp is a tout who does, at best, sloppy statistical analysis. And yet these major media outlets are touting (see what I did there…?) him as this genius. He’s not.
Anyway, back to that quote from The Ringer article. That paragraph continues:
In fact, when you talk to people inside the league, some think he might be the top mind, period. Though he’s been writing on the internet for many years, he said it wasn’t until 2018 that teams started reaching out to him to discuss analytics. He says he’s heard from at least five and has done work as a consultant.
While I haven’t personally asked anyone I know who works for an NFL team, I would bet everything I own that exactly 0% of the data scientists/statisticians working for NFL teams would consider this guy to be the “top mind, period”. And if I’m wrong about that, I can just take a page out of Warren Sharp’s playbook and lie about my record……..
P.S. They also mention my old friend Bill Barnwell (who is still blocking me on Twitter) in this article. I actually enjoy reading Barnwell’s stuff, but he also wrote this article once, which was a really poorly done statistical analysis for Grantland. You can read all about the shortcomings of that analysis here and here.
In my original post I fixed a few parts of the code (the white bishop was missing…derp) and I made the border lines thicker. I’ve also found that these look way better when only using the first 40-50 moves or so. Beyond that they get really boring. So here are all 12 games and the 3 tie break games using only the first 50 moves (25 white, 25 black):
Kathy Explains all of Statistics in 30 Seconds and “How to Succeed in Sports Analytics” in 30 Seconds
@causalKathy explains all of statistics in 30 seconds.
I spent the weekend of October 19-21 in Pittsburgh at the 2018 CMU Sports Analytics Conference. One of the highlights of the weekend was Sam Ventura asking me to explain causal inference in 15 seconds. I couldn’t quite do it, but it morphed into trying to explain all of statistics in 30 seconds. Which I then had to repeat a few times over the weekend. Figured I’d post it so people can stop asking. I’m expanding slightly.
Kathy Explains all of Statistics in 30 Seconds
Broadly speaking, statistics can be broken up into three categories: description, prediction, and inference.
- Mapping inputs to outputs
- Predicting outcomes and distributions
- Inference/Causal Inference
- Prediction if the world had been different
- Counterfactual/potential outcome prediction
I’ll give an example in the sports analytics world, specifically basketball (this part is what I will say if I only have 30 seconds):
- Slicing your…
This morning Yahoo Sports published a piece titled “Op-ed: How one flawed study and irresponsible reporting launched a wave of CTE hysteria,” written by Merril Hoge, a former NFL running back and ESPN analyst, and Dr. Peter Cummings, an Assistant Professor with the Boston University School of Medicine. In it, they criticize the methods of an article that was published in the Journal of the American Medical Association (JAMA) in 2017 entitled “Clinicopathological Evaluation of Chronic Traumatic Encephalopathy in Players of American Football.” The JAMA article, as they point out, has been an important part of widespread discussions about football and brain injury.
Hoge and Cummings make three major points against the article (which they call strike 1, 2, and 3. Get it. It’s a sports reference!):
- There was no control group
- There was selection bias
- There was a failure to control for external factors
And you know what? The authors are exactly correct.
What kind of moron would design a study without a control group? How can you design a randomized control trial (RCT) without a control group?!?! It’s right there in the name! It’s insane. Amateur hour. Next, as anyone with half a brain knows, you have to select your subjects randomly in an RCT. It’s also right in the name!! Randomized. Control. Trial. And then, get this, they didn’t even control for external factors! As Cummings and Hoge point out, “nearly half the players had a history of substance abuse, suicidal thinking or a family history of psychiatric problems.” Can you imagine not controlling for that stuff in a randomized control trial? How could anyone be so stupid?
Ok, seriously. How could someone do a randomized control trial without having a control group, randomization, and controlling for external factors?
Well…they didn’t. The JAMA study wasn’t a randomized control trial. It was a case series. The original authors made no attempt to do a randomized control trial. If you look at the original study, the authors note in the very first line of the findings section of the abstract that they are using a “convenience sample.” This means that they know they aren’t doing a randomized control trial. Which means they aren’t even attempting to show a causal link between playing football and CTE. In fact, in the Wikipedia entry for case series it explicitly mentions that “unlike studies that employ an analytic design (e.g. randomized control trials)…case series do not…look for evidence of cause and effect.” This is how the actual authors summarize their findings in the conclusions section of the paper:
In a convenience sample of deceased players of American football, a high proportion showed pathological evidence of CTE, suggesting that CTE may be related to prior participation in football.
They NEVER claim a causal link. Which makes this statement by Hoge and Cummings all the more odd:
Then we took a closer look at the study that led to the Times story — apparently something few journalists had bothered to do. When we dug into the methodology, we were floored. The study was so badly flawed that it was nearly worthless. But that’s not what had been reported in practically every major media outlet in the world. Thanks to the barrage of sensationalist coverage, the “110 out of 111 brains” story had turned into a wildfire, and we were standing around with a couple of garden hoses, telling everybody to calm down.
They criticize the authors of the Times story for not taking a closer look at the original paper. But I wonder if Hoge and Cummings actually read the original paper. Their article on Yahoo Sports criticizes three aspects of the original study that literally aren’t a part of the original study.
The cynic in me thinks maybe all of this is just a ploy to sell more books. Because as you’ll notice, Hoge and Cummings are promoting a book called “Brainwashed: The Bad Science Behind CTE and the Plot to Destroy Football”, which I’m not going to link to because, based on this Yahoo Sports article, I’m guessing it’s trash*.
Side Note 1
I will admit that Hoge and Cummings do have a point that the media coverage of this research was fairly skewed. The media plays up the potential link between football and CTE and maybe doesn’t fairly state that as of yet no causal relationship has been established between playing football and CTE. But even if the media coverage was fair, what Hoge and Cummings should be doing is calling for more high-quality research into the relationship between football and CTE instead of trashing a case series for not being a randomized control trial. Because right now the truth is that we simply don’t know if there is a causal relationship. But we also don’t know that there isn’t a causal relationship.
Side Note 2
Smoking and lung cancer is a famous example where there was clearly a correlation known for a long time, and for decades people were talking about how the relationship wasn’t shown to be causal. Eventually, scientists were able to show a causal link. But it was all the demonstrated correlation between lung cancer and smoking that led people to study the causal relationship between the two. So, this CTE study is important not because it shows a causal link—which, again, no one is trying to demonstrate here—but because it is consistent with what we would expect if the relationship between CTE and football were causal, and will lead to more work on studying that relationship. This is an important part of how science works.
And finally, remember, “correlation does not even imply correlation”
*I need to be clear that I HAVE NOT read the book. The book could be amazing. I’m ONLY commenting on the Yahoo Sports Op-Ed (added 10/24/2018 – 8:03am).
Wins are a shitty statistic in baseball. You can be a great pitcher on a terrible team and end up with a win-loss record that isn’t impressive at all. For instance you could be Jacob deGrom and be an outstanding pitcher on an awful team (The Mets were 77-85). deGrom ended the season with a win-loss record of 10-9 and the Mets were 14-18 in games that he started. He also ended the season with an ERA of 1.70 and only gave up more than 3 earned runs in an outing once during the entire season (He gave up 4 earned runs on April 10 against Miami). From May 18th through the end of the season he had 24 quality starts in a row (6+ IP, 3 or fewer earned runs). 24 QS IN A ROW!
So I wanted to look at how someone so dominant could end up with only 10 wins and their team going 14-18 when they started. So the first thing I looked at was the scores of these games. Maybe the Mets weren’t scoring a ton of runs. In 21 of deGrom’s 32 starts (65.625%) the Mets scored three or fewer runs and the Mets were 4-17 in these games. (The league average for runs per game in 2018 was 4.45.) And in only 7 games (21.875%) did the Mets score 6 or more runs. They were 6-1 in these games. Below is the scatterplot for the scores of Mets games that deGrom started in the 2018 season.
Next I looked at the number of earned runs that deGrom gave up in each start and how many runs the Mets’ opponent ended up with at the end of the game. The plot below shows the number of earned runs allowed by deGrom in an outing versus the total number of runs allowed by the Mets. In only 26 of deGrom’s 32 starts, the Mets managed to give up at least one run that wasn’t credited to deGrom (This could either be the bullpen giving up runs, or unearned runs because of errors. Either way, not deGrom’s fault).
Think about this. In deGrom’s 32 starts he pitched a total of 217 innings and gave up 41 earned runs. That’s about 75.35% of the innings in those games (assuming all games went a full 9 innings). Between unearned runs and runs given up by other Mets pitchers, the Mets allowed their opponents 63 more runs. Think about that. In games when deGrom started he pitched 217 of the total innings and the Mets needed to only get through 71 more innings. 63 runs that weren’t deGrom’s fault in 71 innings. While this isn’t exactly a runs per game calculation, 63 runs in 71 innings is almost 8 runs per 9 innings. Did I mention the Mets were bad?
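The arithmetic in that paragraph is simple enough to check by hand, but here is a quick sketch of it anyway. Everything comes from the numbers quoted above. The only assumption is the one already stated: every game went exactly 9 innings. The variable names are mine.

```python
STARTS = 32            # deGrom starts in 2018
DEGROM_INNINGS = 217   # innings deGrom pitched in those starts
OTHER_RUNS = 63        # runs allowed that were not deGrom's earned runs
INNINGS_PER_GAME = 9   # assumption: no extra innings, no shortened games

total_innings = STARTS * INNINGS_PER_GAME
degrom_share = DEGROM_INNINGS / total_innings
remaining_innings = total_innings - DEGROM_INNINGS
other_runs_per_nine = OTHER_RUNS / remaining_innings * INNINGS_PER_GAME

print(f"deGrom's share of innings: {degrom_share:.2%}")
print(f"Innings left for everyone else: {remaining_innings}")
print(f"Non-deGrom runs per 9 innings: {other_runs_per_nine:.2f}")
```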
So how many games should the Mets have won with deGrom starting this year? I sort of checked the answer to this question by looking at how many wins the Mets would have had in deGrom starts if the bullpen was simply league average. To do this, I computed the league average runs per out rate and then took n draws from a Poisson variable with this mean, where n is the number of outs left in the game. I then added up the number of runs given up by the bullpen and added them to deGrom’s ER for that game. I then counted how many times the Mets would have won a game (with ties going 50-50 to each team). The mean is 18.85 with a median of 19 wins and 95% of the simulations had a win total between 16 and 22. Basically this means that, in my crude calculations, the Mets’ bullpen cost the Mets somewhere between 2 and 8 wins in deGrom’s starts.
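The real code for this is in R. What follows is only a minimal Python sketch of the idea just described, not a port of that script. It assumes a league average runs per out rate of 4.45 / 27 (the 2018 runs per game figure divided by 27 outs), which may not be exactly how the rate was computed. The three games in the example are made up purely to show the input format, and the function names are hypothetical.

```python
import math
import random

LEAGUE_RUNS_PER_GAME = 4.45                 # 2018 league average quoted earlier
RUNS_PER_OUT = LEAGUE_RUNS_PER_GAME / 27    # assumption: 27 outs per team per game


def poisson_draw(mean, rng):
    """One Poisson draw via Knuth's multiplication method (fine for small means)."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_runs_allowed(earned_runs, outs_left, rng):
    """deGrom's earned runs plus one Poisson draw per out left for the bullpen."""
    bullpen_runs = sum(poisson_draw(RUNS_PER_OUT, rng) for _ in range(outs_left))
    return earned_runs + bullpen_runs


def simulate_wins(games, rng):
    """Count simulated Mets wins; ties are settled by a 50-50 coin flip."""
    wins = 0
    for mets_runs, earned_runs, outs_left in games:
        allowed = simulate_runs_allowed(earned_runs, outs_left, rng)
        if mets_runs > allowed:
            wins += 1
        elif mets_runs == allowed and rng.random() < 0.5:
            wins += 1
    return wins


# Made-up games for illustration only: (Mets runs, deGrom ER, outs left to the bullpen).
HYPOTHETICAL_GAMES = [(2, 1, 6), (4, 0, 9), (1, 2, 3)]

rng = random.Random(2018)
win_totals = [simulate_wins(HYPOTHETICAL_GAMES, rng) for _ in range(1000)]
print("Mean simulated wins:", sum(win_totals) / len(win_totals))
```

Feeding in the actual 32 starts in place of the made-up games and repeating the simulation many times gives a distribution of win totals like the one summarized above.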
The histogram of the simulated values of the number of Mets wins if deGrom had a league average bullpen can be seen below. Almost never is it 14, the actual win total for the Mets in deGrom’s starts.
So should deGrom win the Cy Young Award with a 10-9 record? Well his ERA was 1.70. The next best in the National League was Nola at 2.37. And the only other pitcher in the entire league to end the season with an ERA under 2 was Blake Snell (1.89) who is basically a lock to win the AL Cy Young after going 21-5. So would I give it to deGrom? Probably. But I wouldn’t be that upset if Nola won it in the NL.
The one thing we can all agree on is the Mets suck.
Finally, you can see my code here. It’s a mess, but there it is: https://github.com/gjm112/StatsInTheWild/blob/master/deGrom.R