I have now officially lived in Chicago-land for over a year, and I’m beginning my second academic year at Loyola University Chicago. After one full academic year, I can say that I still absolutely love it: I love the department, I love the students, I love Chicago. I’ve also learned quite a bit in the past year both academic and non-academic. So here it is. My advice to someone and thoughts on the first year of the tenure track:
- Find out who can help you with administrative stuff. Every university everywhere is going to have some arcane system of paperwork for getting reimbursed or for purchasing something. Then you have to send it to the right person. And you’ll always send it to the wrong person. I’ve worked at two state schools (UMass and UConn) and, while it’s better at a private school, you’ll never escape the paperwork. So find out who knows what they are doing and let them help you. We have an amazing administrative assistant (I love you Agnes!) who helps me with all of my reimbursements and purchasing paperwork. I didn’t realize this for about 6 months, which is why (at least in my mind) I got nothing accomplished the first semester. Paperwork is the worst.
- Advertise what you do for research to students. No matter what you do, some student will be interested in it (well maybe not all areas of research, but most). And the good ones will come ask if they can help. Just because they are interested. Let them help. It’s a win-win for everyone as long as the student is good. The real keys are (1) figuring out which students are good and (2) picking an appropriate project for the level the student is at. These are skills that you won’t learn in grad school or at a post-doc. You just sort of have to figure this out as you go along (like so many other things in academics.)
- Go to as many department/college/university events as you can. Even if you think some of these events are corny (and some of them will be), just go. Go and just meet people. You never know who you’ll meet at these things. Or who you’ll be introduced to. In my day-to-day life in the department there are countless professors who I never get to interact with for whatever reason. They may have a totally opposite schedule from mine, they may be in an entirely different field (e.g. analysis, number theory, any of the other math fields), they may just be trying to avoid other people. But this might be your only opportunity to meet the really interesting people in your department. And everyone likes meeting really interesting people. I met a collaborator who I am currently writing papers and a grant with at a joint math/anthropology department event called “bacon and booze”. (Science seems to be based on booze. Data and booze. Quote me on that: The two most important ingredients in science are data and booze. Is there an event called “Data and Booze”? If not, I’m starting one. #ramblingOver)
- Wherever you end up, find the local R/Python/etc. users group and attend meetings. For me, this is the Chicago R Users Group (CRUG or ChicagoRUG). I’ve been to a handful of meetings and I’ve met a bunch of really interesting people from both academics and industry. And I’ve learned a lot about R. I’ve even been asked to present twice (#linesForTheCV).
- #lifeAdvice “I can’t do that, I’m not [blank]” This doesn’t really have to do with my job, but it’s something I learned in the last year. I remember when I was a kid, I’d always think “I can’t do that, I’m not [blank]”. The blank could be a baseball player or a musician or a skateboarder or a programmer or a business major. But the thing is, no one IS anything. If you want to be something, just do it even if you aren’t that thing. For me this is most relevant in my art in the last year. Since I’ve moved to Chicago I’ve been submitting my art to shows and I keep getting accepted. I was even invited to do an entire show of my work, which was up for a weekend this past summer. So I’m an artist because I say I’m an artist. And whatever you want to do, you are that just because you say you are that (I mean don’t be delusional about this, you’re not the president of the USA.) But if you want to be an author, don’t let the fact that you aren’t an author stop you from being an author. If you want to be a musician, don’t let the fact that you can’t play music stop you. If you want to be a statistician, don’t let the fact that you don’t have a statistics degree stop you from doing that. The internet will teach you everything you need to know. You just need to practice. Everything is made up. Be whatever you want.
- I had the opportunity of being on two search committees my first year. I probably wouldn’t advise this if you had a choice, but I was grateful that I was able to participate on two search committees in my first year. I was on a computer science search committee and a statistics search committee. Being involved in such important decision making in my first year really made me feel like my department (and the CS department) valued my opinion. I also cherished the opportunity to have input into the direction the department would move in over the next decade. There are currently 4 tenured/tenure track statisticians in the department and three of them have been hired in the last two years. This is an incredible opportunity for me to really have a lot of input in the direction that a statistics program will move in over the next few years. (I hope I don’t screw it up.) But I immediately have the chance to try and fix all of the things that I thought were wrong or not perfect about my undergraduate/graduate experience (which overall was excellent!) Usually a new professor does not have these opportunities to change a program until many years into their academic career. I’m really excited to have this much influence at the beginning of my career. (And again, I’ll try not to blow it.)
- You are going to be tired on Friday night. When I was a grad student, time was absolutely unlimited. At least that’s how it felt. Want to spend a week learning about web scraping? No problem! That dissertation can wait! As a post-doc, I had a little bit less time as there was a project that always needed to be worked on, but I also didn’t have any teaching or service related responsibilities. Now, as an assistant professor, I still have to do research, but just throw in teaching two classes a semester and departmental service (like two search committees!), and you can see that time is not only no longer unlimited, it’s almost non-existent. (I’m writing this in a rare free moment during the semester.) Mike Lopez (a.k.a. @statsbylopez) had this to say about being tired: “My major issue was getting enough rest, especially the first semester. If I had to do things over again, I would’ve recognized that you can’t keep up the same research pace you had when you were in grad school or a post-doc. As a grad student or a post doc, you have the time nearly every day to devote to major projects. That time just doesn’t exist when you also have to teach and advise. Time for faculty should be a zero-sum game; roughly fix the hours, and let things fall where they fall. To interrupt sleep, social life, or family responsibilities is a major mistake that many make in their first year (me being one of them).”
- Finally, students have lots of personal problems and sometimes you might need to help. Remember what it was like being 20? It’s not easy. Girl problems, boy problems, family problems, money problems, school problems, life problems. When you are 18/19/20/21/22, you’ve got these problems. And sometimes you might be the person that a student feels comfortable talking to about it. Or, because a student either misses class or assignments, you may end up hearing about their troubles. I was not prepared for this, and I’m still not exactly sure how to deal with it. I guess my best advice is to just sit and listen. Sometimes that’s what people need. Just someone to listen. Other times students tell me about problems that they are going through, and I went through the exact same thing. Without going into details here, I think it’s really helpful to know that someone went or is going through what you are going through. I know it made me feel better when I was in college, and I hope that I can do that for someone else.
Originally posted on StatsbyLopez:
With interest in statistical applications to sports creeping from the blogosphere to the mainstream, more writers than ever are interested in metrics that can more accurately summarize or predict player and team skill.
This is, by and large, a good thing. Smarter writing is better writing. A downside, however, is that writers without a formal training in statistics are forced to discuss concepts that can take more than a semester’s work of undergraduate or graduate training to flesh out. That’s difficult, if not impossible and unfair.
One such topic that comes up across sports is the concept of regression toward the mean. Here are a few examples of headlines:
Regression to the mean can be a bitch! (soccer)
Clutch NFL teams regress to the mean (football)
Beware the regression to the mean (basketball)
30 MLB players due for regression to the mean (baseball)
Retro is based only on games played in the 2014-2015 season and heavily weighs strength of schedule.
Prosp is based on 4 years of data weighted for recency. It’s based on expected points.
Both Retro and Prosp are framed in terms of the probability of defeating an average team.
Team – (Median wins) – Expected wins
New England – (11-5) 11.31
Miami – (8-8) 8.06
Buffalo – (7-9) 7.20
NY Jets – (6-10) 6.24
Baltimore – (9-7) 9.31
Cincinnati – (9-7) 9.00
Pittsburgh – (8-8) 7.80
Cleveland – (5-11) 5.35
Indianapolis – (10-6) 9.53
Houston – (9-7) 9.18
Tennessee – (6-10) 5.54
Jacksonville – (3-13) 2.73
Denver – (12-4) 12.16
San Diego – (8-8) 8.10
Kansas City – (8-8) 7.70
Oakland – (3-13) 3.10
Philadelphia – (10-6) 10.25
Dallas – (9-7) 8.74
NY Giants – (7-9) 6.58
Washington – (6-10) 6.29
Green Bay – (12-4) 11.54
Detroit – (9-7) 8.38
Chicago – (7-9) 7.03
Minnesota – (6-10) 5.49
New Orleans – (12-4) 11.70
Carolina – (9-7) 8.64
Atlanta – (8-8) 7.99
Tampa Bay – (5-11) 4.88
Seattle – (12-4) 12.05
San Francisco – (11-5) 11.21
Arizona – (6-10) 5.81
St. Louis – (5-11) 5.09
2. New England
2. New Orleans
3. Green Bay
5. San Francisco
Projected Wild Card Round
Indianapolis beats Cincinnati 24-22
Baltimore beats Houston 23-20
Green Bay beats Dallas 27-22
Philadelphia beats San Francisco 23-22
Projected Divisional Round
Denver beats Baltimore 27-21
New England beats Indianapolis 29-23
Seattle beats Philadelphia 26-21
New Orleans beats Green Bay 27-26
Projected Conference Round
Denver beats New England 29-25
Seattle beats New Orleans 26-22
Denver beats Seattle 25-23
Season Long Bets
Arizona Under 8.5 -105
Buffalo Under 8.5 -130
Denver Over 10.5 Even
Jacksonville Under 5.5 -115
Minnesota Under 7.5 +200
New Orleans Over 8.5 -145
New York Jets Under 7.5 -140
Oakland Under 6 -115
San Francisco Over 6.5 Even
St. Louis Under 8 -165
Seattle Over 11 -135
Green Bay -250
New Orleans +220
San Francisco +2000
New England -140
New Orleans +2200
Crazy Long Shot Super Bowl Match-Up
New Orleans vs Houston +50000
Originally posted on andrea cirillo's blog:
That got me to move my Shiny App onto an Amazon AWS instance.
Well, it was not so straightforward: even if there are plenty of tutorials around the web, every one seems to miss a part: upgrading the R version, removing shiny-server examples… And even with all the info, it is still quite a long, error-prone process.
All this pain is removed by ramazon, an R package that I developed to take care of everything needed to deploy a shiny app on an AWS instance. An early disclaimer for Windows users: only Apple OS X is supported at the moment.
As one would expect, using ramazon is a very pleasant experience, given that you just have to run a function, ramazon(), passing to it the EC2 instance public_DNS and…
I got back from #JSM2015 in Seattle yesterday. While I was there I compiled this list of links of interesting things (talks, R packages, etc.) that I took away from JSM2015. There are a ton of slides that I would love to add to this list (e.g. the rest of the talks from the session that @statsbylopez organized), so if you have a link to anything like that, please send it my way and I’ll add it.
Interesting talks at JSM:
- “Automatic Forecasting at Scale” Sean Taylor (@seanjtaylor)
- “Relax, I’m a Data Scientist” Jenny Bryan (@JennyBryan)
- “Interactive graphics for high-dimensional genetic data” Karl Broman (@kwbroman)
- “intRo: Statistical Analysis Software for Teaching”, Eric Hare and Andee Kaplan
- “Data Wrangling for the Lahman”, Ben Baumer (@baumerben)
- “Recent Advances in Interactive Graphics for Data Analysis”, Carson Sievert
Interesting talks at JSM about Sports (with some R packages):
- “Building an NCAA men’s basketball predictive model and quantifying its success” My slides. (@statsinthewild)
- “Refs – They’re Just Like Us! Adversarial and Social Pressures in the NFL”, Mike Lopez (@statsbylopez)
- The full working paper about referees’ behavior under pressure, Mike Lopez
- “Acquiring, Visualizing, and Modeling MLB Umpire Strike/Ball Decisions with PITCHf/x Data”, Carson Sievert (@cpsievert)
- pitchRx, Carson Sievert
- The Deuce package in R for tennis data, Stephanie Kovalchik
- Paper: “A Machine Learning Strategy for Predicting March Madness Winners”, Jordan Gumm, Andrew Barrett, and Gongzhu Hu
Interesting links from JSM:
Originally posted on TIME:
Say this for the anti-vax clown car: it never seems to run out of new punchinellos to climb inside. If it’s not scientific fabulist Andrew Wakefield, he of the fraudulent study that got the whole vaccine-autism myth started, it’s Jenny McCarthy, she of the supposedly vaccine-injured son whose autism was cured in part by—yes!—a gluten-free diet because, um, gluten is bad, very bad.
After McCarthy, there was Saturday Night Live alum Rob Schneider—because when you’re looking for guidance on the wisdom of vaccines, who are you going to trust: the World Health Organization, the Centers for Disease Control and the National Institutes of Health, or the man who gave us Deuce Bigelow, Male Gigolo? I mean, hello, the movie was huge.
Now, to this group of board-certified jesters add Jim Carrey—the ex-Mr. Jenny McCarthy—who rose on July 1 in all his orange-wigged, floppy-shoed, seltzer-down-the-pants fury to condemn California…
Originally posted on God plays dice:
C. Liam Brown has built a Battleship probability calculator, which (roughly speaking) works by finding the square which is the most likely to yield a hit given the set of hits and misses so far. You can play against it if you want. A lot of this might be said to be a web-friendly implementation of Nick Berry’s analysis of the game, although analysis and implementation are two different beasts. (Funny, that keeps coming up in my day job…)
I woke up this morning to a twitter comment about the “CDC Whistleblower Saga” from last year from one of my favorite twitter followers. This obviously led to a conversation explaining to me that: vaccines aren’t effective, the idea of herd immunity has been debunked, they are making kids sick, and they cause autism (Vaccines Don’t Cause Autism). I should note that none of these claims have any scientific backing to them. Other twitterers also told me that vaccines aren’t 100% effective (true; MMR is about 93% and 97% effective for 1 and 2 doses, respectively) and vaccines have side effects (also true, though side effects are rare). But neither of these is a logical argument against vaccination. I think we often forget (or in my case, never saw (thank you vaccines!)) how bad the measles really is (the measles are horrible).
And I know deep down, that no matter what I say, I’m not going to change someone’s mind on twitter. So why do I engage in “discussions” with people like this? I guess first, I can’t help myself. If someone engages me first, and they are wrong, I’m going to tell them that they are wrong. I’m not sure this is the best way to effectively deal with the anti-vaccine crowd (Here is how Jamelle Bouie of Slate suggests dealing with them), but I can’t help myself. I do try not to insult or attack people, but rather their arguments. But I find this difficult to do sometimes when I believe that these people are actively causing harm by trying to spread their anti-vaccine beliefs. (So if I insulted you today, I apologize to you. I should be better than that. But I still think your ideas are pseudo-scientific cray-ball wackadoo stuff).
But, secondly, I am absolutely fascinated that people think this way. It’s so foreign to how I think about the world. I know people who are espousing these beliefs actually believe them in spite of the mountains of evidence against their claims. To this end, The Atlantic wrote a really interesting article last fall about the psychology of anti-vaxers. It’s a fascinating read. And a bit sad with quotes like this:
Dr. Douglas Hulstedt, a pediatrician in Monterey, California, shares Smoot’s preference for personal stories over scientific evidence. Hulstedt accepts patients who are not vaccinated. He goes even further, and recommends refusing vaccinations if a patient has a family history of autism, lupus, Crohn’s disease, or Type 1 diabetes.
“Why do I need a medical study?” he says. “If 80 percent of the parents of children with regressive autism in my practice say their child reacted after the MMR [measles, mumps, and rubella] shot, why do I need a medical study?” Hulstedt says that studies showing no link between the MMR vaccine and autism or showing that vaccines are safe and effective might have “fraud in the reportage.”
This is a medical doctor posing the question: “Why do I need a medical study?”. That is absolutely appalling and evidence of why I believe medical doctors need more statistical training before, during, and after medical school. Statistics is a complicated subject. Statistics is hard. I find it is constantly difficult, and I’m supposed to be the “expert”. But it’s just a difficult subject to tackle. Statistics is hard. But we need it as part of the scientific method to objectively answer medically important questions. Like: do vaccines work? (Yes.)
But it’s so easy to make mistakes. As an illustration, let’s consider the plot below which shows measles deaths per 100,000 people over time. This was sent to me by my favorite twitter follower with the (sarcastic) text:
They are arguing (I believe) that the death rate from measles was dropping for decades prior to the introduction of the measles vaccine, and the measles vaccine did little to lower the death rate of measles. So it follows that the measles vaccine isn’t as effective as science makes it out to be, therefore CONSPIRACY! #tinfoilhat #jadehelm
In all seriousness though, if you have no statistical knowledge, this might seem like a convincing argument. And I’m sure there are a lot of smart people (and not so smart people) who could be convinced by this plot. The problem with this is that this “analysis” is inherently trying to isolate the effect of vaccines on death rates without controlling for any other factors that are related to the death rate. Medicine advanced quite a bit from 1840 to 1940 and the probability of dying from measles dropped considerably. Even with no vaccines. But that’s all this plot is demonstrating. And it’s offering almost no evidence as to the effectiveness of the vaccine and is a case study in confounding.
I’d also argue that the graph is potentially misleading the viewer with scales. By the time the vaccine is introduced in that graph, the line is so close to 0/100,000 that it’s hard to see the relative effect of the vaccine. The death rate could have dropped 10% or 90% (it does drop some amount) and the viewer wouldn’t be able to tell. The graph would be much stronger if it was zoomed in on the years 1948 to 1978. But that doesn’t seem to be the narrative that is trying to be passed on with that graphic.
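To put some rough numbers on the scale problem, here is a minimal sketch in Python. Every number in it is hypothetical and made up for illustration (the axis maximum, the death rate when the vaccine arrives); none of them are read off the actual plot. The only point is that a 10% drop and a 90% drop both move the line by a sliver of the total axis height when the rate is already near zero.

```python
def fraction_of_axis(rate_before: float, relative_drop: float, axis_max: float) -> float:
    """How far the line moves, as a fraction of the full y-axis height."""
    absolute_drop = rate_before * relative_drop
    return absolute_drop / axis_max


# Hypothetical values, NOT taken from the graph discussed above.
AXIS_MAX = 15.0        # assumed top of the y-axis, deaths per 100,000
RATE_AT_INTRO = 0.25   # assumed death rate per 100,000 when the vaccine arrives

for drop in (0.10, 0.90):
    moved = fraction_of_axis(RATE_AT_INTRO, drop, AXIS_MAX)
    print(f"A {drop:.0%} relative drop moves the line {moved:.2%} of the axis height")
```

Under these made-up numbers, both drops move the line by less than 2% of the axis height, which is about the thickness of the line itself on most plots.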
To really get a handle on the effectiveness of vaccines, we should be looking at cases of measles rather than the measles death rate. The graph below shows cases of measles in the US from 1954 through 2008. The first vaccine was introduced in 1963 and a second version was released in 1968. Notice the large and immediate drop from 1963 to 1969. It’s possible that there could be some huge confounding effect that explains this drop, but I think it would be difficult to present a reasonable confounding effect here that would dwarf the effect of the introduction of vaccines. The decline in measles cases was immediate and rapid. So are vaccines effective in reducing disease? Yes. Yes. Yes. Yes. Yes. and finally Yes.
Finally, I’ll close with this advice from the World Health Organization (WHO) on trying to persuade the anti-vaxxers:
How one addresses the anti-vaccine movement has been a problem since the time of Jenner. The best way in the long term is to refute wrong allegations at the earliest opportunity by providing scientifically valid data. This is easier said than done, because the adversary in this game plays according to rules that are not generally those of science. This issue will not be further addressed in this paper, which aims to show how vaccines are valuable to both individuals and societies, to present validated facts, and to help redress adverse perceptions. Without doubt, vaccines are among the most efficient tools for promoting individual and public health and deserve better press.
You can (and should) read the whole paper here.
Originally posted on r4stats.com:
[Since this was originally published in 2013, I’ve collected new data that renders this article obsolete. You can always see the most recent data here. -Bob Muenchen]
Learning to use a data analysis tool well takes significant effort, so people tend to continue using the tool they learned in college for much of their careers. As a result, the software used by professors and their students is likely to predict what the next generation of analysts will use for years to come. I track this trend, and many others, in my article The Popularity of Data Analysis Software. In the latest update (4/13/2012) I forecast that, if current trends continued, the use of the R software would exceed that of SAS for scholarly applications in 2015. That was based on the data shown in Figure 7a, which I repeat here:
For the last nine years I’ve gone on a golfing trip to Vermont with my dad. After the first year we started a competition complete with plaque and names engraved for the champions. Here are the rules. We play 7 rounds over the course of 4 days and you take your lowest score on each hole from any of the rounds and fill out one score card. Whoever has the lowest “master card” wins. (Ties are broken by looking at who had the lower score on the highest handicapped hole, second highest handicapped hole, etc.)
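For anyone who wants the scoring rules spelled out, here is a minimal sketch in Python. The function names, the toy 3-hole scores, and the `tiebreak_order` argument (hole indices listed in the order the tiebreak checks them, starting with the highest handicapped hole) are all made up for illustration; this is not the code behind the plots.

```python
from typing import List


def master_card(rounds: List[List[int]]) -> List[int]:
    """Lowest score on each hole across every round played so far."""
    return [min(scores_on_hole) for scores_on_hole in zip(*rounds)]


def compare_cards(card_a: List[int], card_b: List[int], tiebreak_order: List[int]) -> str:
    """Return "A", "B", or "tie". The lower master card total wins.

    On equal totals, walk the holes in tiebreak_order (0-based indices,
    highest handicapped hole first) and the lower score on the first
    hole that differs wins.
    """
    total_a, total_b = sum(card_a), sum(card_b)
    if total_a != total_b:
        return "A" if total_a < total_b else "B"
    for hole in tiebreak_order:
        if card_a[hole] != card_b[hole]:
            return "A" if card_a[hole] < card_b[hole] else "B"
    return "tie"


# Toy example: two rounds on a hypothetical 3-hole course.
me = master_card([[5, 4, 6], [4, 5, 5]])
dad = master_card([[4, 6, 5], [5, 4, 4]])
print(me, dad, compare_cards(me, dad, tiebreak_order=[2, 0, 1]))
```

The real thing is the same idea with 18 holes and 7 rounds, and the master card can only get better as more rounds are played.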
Currently, the series is 5-2 in favor of my dad. Here is a plot of what this competition looks like when I’m playing well. I’m the orange line and my dad is the green line. This is from 2013 when my father started off very, very slowly. Through 3 rounds we were tied, and then I took the lead at the end of the 4th round and held it through the middle of the 6th round, when my dad went bonkers and made a bunch of birdies and picked up pars on holes he hadn’t gotten yet. So what does it look like when someone is getting crushed in this competition?
After two full rounds, my father is 10 full strokes ahead of me. It’s an absolute blowout through 2 rounds. That black dot is a birdie, which pops made on the 18th hole of the second round today. An early dagger. Though I have plenty of holes that are currently sitting at double bogey, so I should make up ground fairly quickly (hopefully, anyway). I’ll update this tomorrow after we’ve completed our rounds.
Update: Through three rounds: I’m down 75-81.
Update 2: Through 4 rounds I’m down 75-78 (and I do not have the tie breaker).