## Lisa Goldberg and the Hot Hand in Basketball

So Lisa Goldberg gave a talk at Loyola Chicago on Monday afternoon about the hot hand in basketball, where she presented a paper showing that there is no statistical evidence that the hot hand exists. While we didn’t film her talk, this Numberphile video is basically the same as what she presented on Monday:

The basic argument in the paper is that the probability of making the next shot given you made the previous shot is not statistically different from the probability of making the next shot given you missed the previous shot. So I have a lot of thoughts on this, but first let’s talk about the really interesting history of this topic.

In 1985, Gilovich, Vallone, and Tversky published what can very accurately be called the seminal paper on the topic, which defined the hot hand and concluded that there was no statistical evidence for it. For years this was considered orthodoxy in most of the sports statistics world, even though almost everyone in basketball feels that the effect is real.

Years after that paper was originally published, in 2015, Joshua Miller and Adam Sanjurjo published a paper where they pointed out that the way Gilovich et al. (1985) went about looking for the hot hand was slightly flawed: conditioning on streaks introduces a bias that was not accounted for, and that bias matters when the sequence of shots being considered is short.

What they pointed out in this paper is really, really interesting. So let me talk about it for a second. Let’s say you have a finite string of coin flips from a fair coin. Call 0 tails and 1 heads. So you might have a string of flips like this: 001101001. Now, for a fixed number of flips, what is the expected proportion of 1’s occurring immediately after a 1 in a finite sequence? It’s 50% right? Right?!?! It has to be 50%!

Turns out, it’s not 50%. Andrew Gelman has an excellent explanation of the issue here.
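The coin-flip claim can be checked by brute force. Below is a minimal sketch in Python (used here purely for illustration; the function name is made up for this example). It enumerates every equally likely sequence of fair flips, computes the proportion of 1’s that immediately follow a 1 within each sequence, skips the sequences where that proportion is undefined, and averages the rest.

```python
from fractions import Fraction
from itertools import product


def expected_prop_heads_after_heads(n_flips):
    """Average, over all equally likely sequences of n_flips fair coin flips,
    of the within-sequence proportion of 1's that immediately follow a 1.
    Sequences with no 1 among the first n_flips - 1 flips are skipped,
    because the proportion is undefined for them."""
    total = Fraction(0)
    n_defined = 0
    for seq in product((0, 1), repeat=n_flips):
        # Outcomes of the flips that come right after a 1.
        followers = [seq[i + 1] for i in range(n_flips - 1) if seq[i] == 1]
        if not followers:
            continue
        total += Fraction(sum(followers), len(followers))
        n_defined += 1
    return total / n_defined


for n in (3, 4, 10):
    value = expected_prop_heads_after_heads(n)
    print(n, value, float(value))
```

For 3 flips the average works out to 5/12 and for 4 flips to 17/42, both below 1/2. The bias comes from averaging proportions within short sequences, not from anything being wrong with the coin.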

Following the Miller and Sanjurjo paper, in 2017 Daks, Desai, and Goldberg published a paper where they updated Gilovich’s original analysis using permutation testing to account for the bias that Miller and Sanjurjo pointed out. Daks et al. find that even when using permutation tests and accounting for the bias, there is still no evidence of a hot hand.
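To make the permutation idea concrete, here is a rough Python sketch. It is not the authors’ code, and the statistic they actually use may differ. The assumption here is a simple statistic: the make rate after a make minus the make rate after a miss. Because the shuffled sequences have the same length and the same number of makes, they carry the same finite-sample bias as the observed sequence, which is why comparing against them accounts for it. The function names and the toy sequence are made up for illustration.

```python
import random


def make_rate_difference(shots):
    """P(make | previous make) - P(make | previous miss) within one sequence.
    Returns None when either conditional proportion is undefined."""
    after_make = [shots[i + 1] for i in range(len(shots) - 1) if shots[i] == 1]
    after_miss = [shots[i + 1] for i in range(len(shots) - 1) if shots[i] == 0]
    if not after_make or not after_miss:
        return None
    return sum(after_make) / len(after_make) - sum(after_miss) / len(after_miss)


def permutation_p_value(shots, n_perm=5000, seed=0):
    """One-sided permutation p-value: how often does a random reordering of
    the same makes and misses look at least as streaky as what was observed?"""
    rng = random.Random(seed)
    observed = make_rate_difference(shots)
    if observed is None:
        raise ValueError("statistic is undefined for this sequence")
    shuffled = list(shots)
    n_defined = 0
    n_extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        stat = make_rate_difference(shuffled)
        if stat is None:
            continue
        n_defined += 1
        if stat >= observed:
            n_extreme += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (n_extreme + 1) / (n_defined + 1)


# Toy sequence (not real shooting data): ten makes followed by ten misses.
streaky = [1] * 10 + [0] * 10
print(permutation_p_value(streaky))
```

An extremely streaky toy sequence like this one gives a very small p-value. Real shooting sequences are the interesting case.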

This paper led to that Numberphile video about the hot hand up at the beginning of this post, though Miller and Sanjurjo don’t agree with the findings in that paper.

So what are my thoughts on this? I don’t think that any of these papers are looking for the hot hand in the correct way. I think you need to look at building some sort of mixture model or a hidden Markov model with two states representing hot and regular. Once you fit that model you can look to see if there is a significant difference between the states, compare this model to a one-state model, and see which one gives you a better fit. I’ve written about this type of thing before in baseball with Rob Arthur.
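As a rough sketch of what that comparison involves, the Python below evaluates the log-likelihood of a make/miss sequence under a one-state model (a single make probability) and under a two-state hidden Markov model, using the scaled forward algorithm. Everything here is an assumption made for illustration: the parameter values are hypothetical, the toy sequence is not real data, and the fitting step (for example by EM) is not shown.

```python
import math


def bernoulli_loglik(shots, p):
    """Log-likelihood under the one-state model: one constant make probability."""
    makes = sum(shots)
    return makes * math.log(p) + (len(shots) - makes) * math.log(1 - p)


def two_state_hmm_loglik(shots, p_make, stay, start):
    """Log-likelihood under a two-state hidden Markov model, computed with
    the scaled forward algorithm.

    p_make[s]: make probability in state s (0 = regular, 1 = hot)
    stay[s]:   probability of remaining in state s from one shot to the next
    start[s]:  probability of beginning in state s
    """
    trans = [[stay[0], 1 - stay[0]], [1 - stay[1], stay[1]]]

    def emit(state, shot):
        return p_make[state] if shot == 1 else 1 - p_make[state]

    alpha = [start[s] * emit(s, shots[0]) for s in (0, 1)]
    loglik = 0.0
    for shot in shots[1:]:
        # Rescale to avoid underflow, accumulating the log of each scale factor.
        scale = sum(alpha)
        loglik += math.log(scale)
        alpha = [a / scale for a in alpha]
        alpha = [
            sum(alpha[r] * trans[r][s] for r in (0, 1)) * emit(s, shot)
            for s in (0, 1)
        ]
    return loglik + math.log(sum(alpha))


# Toy sequence and hypothetical parameter values, for illustration only.
shots = [1, 1, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0]
p_hat = sum(shots) / len(shots)
print(bernoulli_loglik(shots, p_hat))
print(two_state_hmm_loglik(shots, p_make=[0.35, 0.65], stay=[0.9, 0.9], start=[0.5, 0.5]))
```

When both states have the same make probability, the two-state likelihood collapses to the one-state likelihood, which is a handy sanity check. A fair comparison of fitted models also has to account for the extra parameters in the two-state model, for example with AIC or BIC.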

I also think, specifically in basketball, you absolutely cannot be viewing the data as a string of 0’s and 1’s. If you are only looking at makes and misses, you are ignoring so many other factors that affect the probability of making a shot and need to be controlled for, such as shot distance, game situation, distance to nearest defender, etc. What’s nice about studying the hot hand idea in baseball for pitchers is that there are relatively few factors that need to be controlled for (runners on base, score, pitch type, etc.). And it’s also easier to look at pitchers because there are no opposing players who are trying to hinder the pitcher’s ability to do what they are doing, and the pitch is always coming from the same distance away. (This is why I think bowling would be a nice place to look for the hot hand. Someone get me the data!) In basketball, everything is different from shot to shot.

So am I convinced that the hot hand exists? My answer is really, truly, I don’t know. I haven’t seen anything that convinces me it does or does not exist. And also, it depends. It depends on exactly how you define the hot hand.

Anyway……..

After Dr. Goldberg’s talk, I was lucky enough to get invited to dinner with her because, dammit, I’m important……….(Also at the dinner were John and Sue Dewan and Lisa Goldberg’s daughter.)

While at dinner our department head introduced me as the director of our Data Science program and Dr. Goldberg asked me the following question: What are the three things you want students to take away from your program?

I stalled for a bit, and then just straight up said I’m going to avoid answering that, but I’ll tell you what I think a Data Science student should know how to do.

Towards the end of dinner, I just had to ask Dr. Goldberg the same question she asked me. What did she think the three most important things were (I hope I remember these at least somewhat accurately):

- Statistics does better with more data, but more data is harder for computers. Dealing with this issue is fundamental to doing data science.
- Remove your personal biases from the analysis.
- Design your experiment beforehand.

Pretty good answers. But I’ve been thinking about this question for a few days now. So what are the three most important things that I want our Data Science students to know?

After thinking about this for a while here are my answers (in no particular order):

- Always try to do the thing you are trying to do. (For example, if you just want to build the best classifier model, you aren’t that interested in interpreting parameters.)
- Data Science consists of two major parts: managing the data and analyzing the data. Neither part is more or less important.
- You can manipulate statistics to say many different things. Be ethical. (Present data to others the way you would want others to present data to you.)

And of course, I want all of my students to know that the answer to virtually every single question in statistics is “It Depends”.

Ok. Good night.

Cheers.

## NFL Super Bowl Squares Distribution

Here is the distribution of the last digits of the final scores of NFL games all-time:

And here is the distribution for recent games, which I believe I defined as since 2000.
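For anyone who wants to tabulate this themselves, here is a minimal Python sketch of the calculation. The scores in it are made-up placeholders, not real NFL results, and the function name is hypothetical. The last digit of a score is just the score modulo 10.

```python
from collections import Counter


def last_digit_distribution(final_scores):
    """Relative frequency of each last digit (0-9) among a list of final team scores."""
    counts = Counter(score % 10 for score in final_scores)
    n = len(final_scores)
    return {digit: counts[digit] / n for digit in range(10)}


# Made-up placeholder scores, NOT real NFL results.
toy_scores = [27, 24, 17, 20, 31, 10, 23, 20, 14, 7]
for digit, share in last_digit_distribution(toy_scores).items():
    print(digit, f"{share:.2f}")
```

Restricting the list to games since a given season would give the "recent games" version of the same table.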

Go Bears.

Cheers.

## Statsinthewild Official Super Bowl Prediction

Prediction: 49ers, 26-25

Spread: 49ers +1

OU: Under 52.5

## NFL Playoff Predictions – Divisional Round

**Divisional Round**

Ravens (51.84%) over Titans, 23-22

Chiefs (75.8%) over Texans, 33-18

Packers (52.41%) over Seahawks, 26-24

49ers (60.57%) over Eagles, 26-21

## The tentative syllabus for my “radical” redesign of Intro Stat

Here is my tentative syllabus for my radical redesign. The structure of the course follows roughly the 9 goals put forth in the GAISE report. Please comment.

Oh also this: I’m getting rid of slides. I’ll have a marker for board work and a computer to do the analysis and simulations. But no slides!

- Week 1-1: Intro class. Go over syllabus. Discuss the 9 goals put forth in the GAISE report. Talk about ethics (IRB, informed consent, etc.)
- Week 1-2: Software: Introduction to R. Syntax. Getting data in/out of R. Basic structures (e.g. data.frames, matrices, vectors, etc.), etc. Reproducible documents (i.e. R Markdown)

- Week 2-1: Critical consumers: Assign students to read this paper over the weekend. Spend a full day of class discussing pros and cons.
- Week 2-2: Collecting data activity. I am going to make rectangular cards whose length, width, area, labels, and colors have statistical properties that I design. I’m going to hand them to the class and make them decide what questions we should ask and what we should measure. We will come back to this data many times throughout the semester.

- Week 3-1: Graphical Displays and Numerical Summaries:
    - Types of data
        - continuous
        - categorical
        - time-to-event data
    - Univariate summaries for continuous data:
        - mean
        - median
        - variance
        - IQR
        - percentiles
    - Tables for categorical data
    - Univariate dataviz
        - histograms
        - boxplots
        - barplots
        - violin plots
        - maps!
- Week 3-2: Graphical Displays and Numerical Summaries:
    - Bivariate summaries for continuous data
        - correlation
            - pearson
            - spearman
            - kendall contingency
        - simple linear regression
        - two-way tables
            - odds
            - odds ratio
    - Bivariate dataviz
        - scatter plots
        - mosaic plots
        - stacked bar plots
        - side by side boxplots
        - side by side histograms

(Example data: Hospital General Information.csv https://data.medicare.gov/data/hospital-compare)

- Week 4-1: Variability:
    - Intro to probability
    - Describing Distributions (shape, center, variability, outliers)
    - Expectation and Variance
- Week 4-2: Variability
    - Bayes Theorem
    - Specific Distributions
        - normal
        - binomial

- Week 5-1: Variability
    - Sampling Distributions
        - Lots of simulations!
        - Emphasize the difference between data distribution and sampling distribution
    - Bootstrapping
- Week 5-2: Variability
    - Central limit theorem (CLT)
        - Lots of simulations

- Week 6-1: Randomness
    - Sampling
        - Discuss famous cases where sampling was poorly done (e.g. Dewey defeats Truman)
        - Talk about the Census!
        - Selection bias
        - Discuss sampling strategies (probability vs non-probability sampling)
            - SRS
            - Stratified
            - Cluster
        - Discuss population vs sample
- Week 6-2: Statistical Models:
    - Simpson’s paradox
    - Very simple models (i.e. X ~ N(mu, sigma))
    - Simple Linear Regression (no inference…..yet)

- Week 7-1: Exam 1
- Week 7-2: Statistical Inference
    - What is statistical inference?
    - Ideas of point and interval estimation
        - Explain correct interpretation of confidence intervals!
    - Idea of hypothesis testing
        - Type I and Type II errors
    - Multiple testing problems (FWER and FDR)

- Week 8-1: Statistical Inference
    - Hypothesis testing of one mean.
        - parametric tests (Z and t-test)
        - non-parametric test (sign test, permutation test)
- Week 8-2: Statistical Inference
    - Interval estimation of one mean
        - parametric (Z and t-interval)
        - non-parametric (bootstrap intervals)

- Week 9-1: Statistical Inference
    - Two dependent samples hypothesis testing
        - parametric (Z and t-test)
        - non-parametric (Wilcoxon signed rank test, permutation test)
    - Interval estimation
        - parametric (Z and t-intervals)
        - non-parametric (bootstrap intervals)
- Week 9-2: Statistical Inference
    - Two independent samples hypothesis testing
        - parametric (Z and t-test, Welch’s test, pooled variance)
        - non-parametric (Wilcoxon Rank Sum/Mann Whitney U, permutation test)
    - Interval Estimation
        - parametric (Z and t-intervals)
        - non-parametric (bootstrap intervals)

- Week 10-1: Statistical Inference
    - Simple Linear Regression
        - parametric (t-tests)
- Week 10-2: Statistical Inference
    - k-sample problems
        - parametric (ANOVA) (It’s just regression with categorical predictors!!!!!)
        - non-parametric (Kruskal-Wallis)

- Week 11-1: Statistical Inference/Statistical Models
    - Multiple Regression
- Week 11-2: Statistical Inference/Statistical Models
    - Multiple Regression

- Week 12-1: Statistical Inference
    - Categorical Data
        - Inference for proportions
            - parametric (using CLT)
            - non-parametric (permutation test)
        - Chi-square tests
            - parametric (using CLT)
            - non-parametric (permutation test)
- Week 12-2: Statistical Inference/Statistical Models
    - Simple Logistic Regression

- Week 13-1: Statistical Models
    - Survival Analysis
        - Motivate with example why we can’t just use mortality rates (in 100 years everyone is dead!)
        - Censoring
        - Truncation
        - K-M Curves (comparing two K-M curves)
- Week 13-2: Statistical Inference
    - Intro to Missing Data
        - Examples where ignoring missing data is bad
        - Why is the data missing?
        - Missingness mechanisms
        - Really simple multiple imputation?

- Week 14-1: Statistical Inference
    - Introduction to Bayesian statistics
        - Motivate Why?
        - Define prior, likelihood, posterior
        - Estimating a proportion example
        - Credible Intervals
- Week 14-2: Statistical Inference
    - Introduction to Bayesian statistics (continued)
        - Bayesian Hypothesis testing
        - Bayes Factor

- Week 15-1: Case study
    - Case study from start to finish.
    - We are going to start with this data set and analyze it from start to finish.
    - We are going to do it “Data Fest Style”: There is no specific question. We are just looking for interesting stories to tell from the data.
- Week 15-2: Case Study
    - Case study continued
- Week 16: Final Exam

## More thoughts on my “radical redesign” of Intro Stats (part 2)

Here is my first set of thoughts on my “radical redesign”.

More thoughts:

- I think we need to introduce non-parametric statistics in intro stats. I basically had no idea about non-parametric statistics until I taught a course called non-parametric statistics in my first semester as a professor at Loyola. I’m totally sold on them. I think we do a disservice teaching all these parametric procedures, which were useful 50 years ago (I mean they are still useful) because the extra assumptions were greatly simplifying. But we have computers and don’t really need that extra level of simplification all the time now.
- I think we should mention the t-test as basically an afterthought. My plan is to introduce hypothesis testing using simulation and directly examine the distribution of the test statistic with this simulation. Once you do that the fact that it’s a t-distribution (or whatever distribution it is) doesn’t even really matter as long as you have the distribution. Students get way too hung up on using t-tests like they are the be-all and end-all of hypothesis testing. It should be presented as ONE test among many.
- I’m going to completely get rid of slides. I’m going to go into every class with a plan and a data set. All theory will be written on the board while trying to get students involved as much as possible. I will write simulations on the spot to show students examples of code. (I will keep the simulations simple). I will then do all data analysis on the spot. NO SLIDES.
- Something that I am on the fence about that I read in the GAISE report: dropping probability theory. In the section with a list of topics to potentially drop they include this. The more I think about it though, the more it makes sense. We already have other classes that will cover probability in much more detail and we don’t need much probability to actually do a lot of data analysis (we do need some though).
- Another note the GAISE report makes that I have been screaming about for years is getting rid of the F@#$ing tables. Students in my class aren’t allowed to use Z/T/whatever tables to look up probability. It’s an antiquated skill and it has been for like 30 years. Yet it’s still taught in so many intro stat classes. If I see students using a table, I reserve the right to rip up the table on the spot and throw the pieces into the air while yelling about how it’s 2020 not 1920.
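On the tables point, any statistical software returns these probabilities directly. Here is a minimal sketch using `statistics.NormalDist` from Python’s standard library (Python is used purely for illustration; in R, `pnorm` and `qnorm` do the same job).

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0.0, sigma=1.0)

# What a Z table lookup gives you: P(Z <= 1.96), roughly 0.975.
print(std_normal.cdf(1.96))

# Two-sided tail probability for an observed z of 1.96, roughly 0.05.
print(2 * (1 - std_normal.cdf(1.96)))

# Going the other way (a critical value from a probability), roughly 1.96.
print(std_normal.inv_cdf(0.975))
```

No interpolating between rows, and it works for any z, not just the ones printed in the back of the book.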

## An incomplete list of people who have positively influenced my life. With anecdotes.

Well, I’m probably going to get tenure in the Spring, which is a fairly big accomplishment. And it’s the end of a decade. So I’ve decided to look back on some people who have had a positive influence on me in my life. Here is a VERY INCOMPLETE list, with amusing anecdotes.

Roughly in chronological order:

**Michael Kinsley**: When I was a little kid, Mike lived next door. We were best friends. He moved to Pennsylvania when I was like 8. Devastating. Since then I’ve only seen him a handful of times (one weird day when we watched an XFL game in West Springfield, his wedding, once in Springfield at Sophia’s, etc), but most recently we met up in Chicago. We went to a barbecue place across from Wrigley, and he managed to get us a 50% employee discount because the new job he was starting was somehow tangentially related to something. It was truly masterful work.

He also once tried to play the word “re-re” against me in Scrabble when we were like 10. “re-re”. As in short for retard (it was a different time).

**Dad**: My father gave me two pieces of advice when he dropped me off at college: 1) Don’t drink anything that you didn’t pour yourself and 2) When a woman asks how old she looks, always answer 22.

I wrestled in high school. At some point I wanted to box. My father wouldn’t let me. I am so thankful he never let me box.

**Mom**: My mom let me do everything that I wanted. She asked me if I wanted to play an instrument, and I told her maybe the drums. Just based on that I took lessons for like 10 years as a kid. Basically anything I wanted to try, I was encouraged to do. That’s awesome. Ellen Improv

**Mark Franczyk:** From basically the time I was born, our families were going on vacations together and doing various other things. We went to school together from K-8 but for some reason we were never in the same class. Year after year it was like some joke that the teachers were playing on us. I think in like 5th or 6th grade we ended up in the same class finally. Anyway, basically my entire childhood was spent with Mark.

Another random thought: Remember ski club? We used to do this thing in elementary school where like on Tuesday nights we would go to Mount Tom and ski. Mount Tom has been closed for like 20 years at this point. It’s hard to even imagine that there was a ski area in Holyoke at one point.

**Mrs. Vosburgh:** I had Mrs. Vosburgh in 3rd grade. She was the best teacher I had in elementary school.

**Mrs. Lussier:** She taught me math in 8th grade and I’ll always associate FOIL with her.

Also, she had us do this thing once because kids weren’t being nice to each other, and we had to write a nice word about everyone in class. Then she would compile the words for everyone. I remember using the word “kinky”, not fully understanding what it meant, to describe someone. I had to do my list over. So, I guess you could say that I learned the definition of “kinky” from my 8th grade teacher. Which is a very strange thing to say out of context.

**Shaun McGrady:** I met Shaun in high school. He was two years ahead of me. We had a study hall together I think. Years after we graduated we ended up playing poker together a lot. In fact, I met my wife, who was roommates at the time with his current wife, at one of his poker tournaments. Which leads me to….

**Barenaked Ladies, the band:** Shaun knew my wife because he ran some sort of BNL fan club. So I’m currently happily married to my fanatic wife as a direct result of the band Barenaked Ladies. I don’t like it that much, but what are you going to do.

**Rob Higney:** This guy was the best man at my wedding. I once threatened to beat him up in high school before we were friends. He is, without a doubt, the smartest person I know who also thinks that everything a 12 year old finds funny is still funny.

**Dan McCarthy:** My greatest achievement in life is when we won that midnight beer pong tournament on Christmas Eve with you drinking 75% of the beer. You are truly an inspiration.

**JOC:** Nearly all of my political views are based on conversations with this guy. Not sure he actually knows that. Specifically, my views on taxes.

**Joanna:** I didn’t know what her last name was for about 10 years. She’s the only person on earth I don’t think I have ever been annoyed with.

**Ann Kellner**: I took AP Calculus from Mrs. Kellner my senior year of high school. To this day it was the best class I have ever taken with the best teacher I have ever had. It was unbelievably well organized, and I still remember basically everything from that class.

**Bonnie Moriarty**: Dr. Moriarty was my English teacher junior and senior year of high school. My sophomore year of high school I was in “regular” English and I kept getting B’s and C’s because I was totally uninterested. I wanted to do honors and she let me in even though my grades weren’t the best. This is where I learned to write. I still use the process that she showed us in that class to this day. Also, fun fact: I once told this teacher that I didn’t need to know how to write because I was good at math. That is maybe the dumbest thing I have ever said. I write every single day.

Another thing that I will always remember from this class is this poem called “The Unknown Citizen”. When we were discussing the poem in class I said something like, “This guy’s life seems pretty good. He had everything he wanted”. She responded, “Oh, Greg. I hope you are joking”. I think about this basically every day.

**Mike Cecere**: Coach Cecere was my wrestling coach my first three years of high school. My junior year of high school I had just finished 4th at sectionals and the top 4 from Western Mass made it to states. So I was basically the last person in. On the day before the tournament after our last practice while I was waiting to get picked up he told me that there was no reason that I couldn’t place at this tournament. I didn’t really believe him. But I ended up finishing 4th. It’s the first time in my life I did something that I didn’t think was possible.

**Mike Maynard:** When I started wrestling my freshman year of high school, Mike was also a freshman. We basically beat the shit out of each other for 4 years. And the only reason I was any good at wrestling is because I had someone better than me to practice with every single day.

**Dennis Fenton**: Coach Fenton was my coach my senior year. His practices were unbelievable. I was never a better wrestler than I was at the end of my senior year of high school. And I’ll always remember the lazy man’s fireman’s carry (In my last victory in high school, I beat a guy who beat me at states using this move), the Peterson series, and the Penn State ride.

**My wife**: I mean I can’t say enough about this lady. She’s just the best. I love art because of her. It took her a decade, but she convinced me that art isn’t just scribbling with paint on canvas. I’ve basically had an entire art school education thanks to being married to her. She’s just great.

**Anna Foss:** Anna was probably my first friend in college. We lived on the same floor freshman year. I once went to her house in a car that she had borrowed from someone else. She told me that if her dad asked, I had to tell him it was my car. Her dad did ask and I told him it was my car. I said, “Oh. I don’t know. I’m not that good with cars.”

**Scot Junkin**: Scot lived next door to me freshman year of college. Freshman year of college was very fun. I want to be Scot when I grow up.

I went to visit Scot in Utah a few months before my first kid was born. He took me downhill mountain biking. It is one of the best experiences of my life. I think about it all the time.

**Matt Houde:** My roommate in college my Senior year. We used to set our alarms for 7am ready to attack the day and then snooze for like 4 hours before lying in bed for another hour and insulting each other. College was pretty good.

**Jon Cahill:** Jon is the nicest person I have ever met. I met Jon at college and now he hangs out with a bunch of my friends from high school without me.

**Tejal Patel:** At WPI there was a fraternity that almost all of the wrestlers joined. I ended up joining a different fraternity in large part because of Tejal who wrestled, but wasn’t in the wrestling fraternity. So many of my best friends today I met in that fraternity.

**Carlos Morales:** Professor Morales was my undergraduate major project advisor and basically got me hooked on statistics. We did a project where we used a Bradley-Terry model to rank tennis players. I wanted to (and still want to) do a project where we analyzed wrestling data, but that data is not so easy to get. This is what got me hooked and I stayed at WPI after I graduated to do an M.S. in Applied Statistics.

**Jayson Wilbur:** Professor Wilbur took over my Master’s project after Professor Morales left to take a job in industry. He let me do a project on predicting NFL point spreads! That’s pretty awesome. I also had some really important conversations about doing a Ph.D. with Jayson that I will always be thankful for. I was a not so great master’s student, but he still wrote me a recommendation to do a Ph.D., and I’ll always appreciate that. I hope I haven’t let him down.

**Andrew Swift:** Professor Swift taught me Bayesian statistics at WPI. I also took a class called life contingencies with him because I was an actuarial science major (the best thing to ever happen to me was failing the actuarial exam. twice. Otherwise I might be an actuary……). I had a lot of conversations with him about statistics, and he was another professor who helped lead me toward statistics. Unfortunately, he’s a Dolphins fan……..

**Peter Cook:** My first job after school was at Brookstone in their direct marketing department segmenting catalog mailing lists. If that sounds awful to you, you are correct. I didn’t really like anything about this job except for Peter. He was my first boss, and I couldn’t have had a better experience with him. I learned so many practical things from him.

Story about him: For whatever reason, we were talking about something called a “death pool”. It’s not important why we were talking about this, but on his way out of work one day he mentioned that I should take “that alligator hunter” guy. Literally the next work day that guy got stung by that stingray, and Peter just walked into work in the morning and said “I told you you should have taken him”. I’m sure that’s not a totally accurate telling of that story, but that’s the version I’m sticking with.

He’s also the reason that I sign all my emails with “Cheers”. Cause apparently that’s what British people say. And he was very British.

**Ofer Harel:** Ofer was my Ph.D. adviser. Basically my entire professional academic career is thanks to him. It’s hard to overstate the influence that he has had on my professional life.

**Elijah Gaioini:** Elijah taught me that everyone has a different loss function. Which basically changed the way I viewed the world whether he meant it to or not.

**Brien Aronov**: I’ve never met a person who baffles me so much. In grad school, I used to have really interesting conversations with Brien. No one views the world quite like this guy. He is genuinely one of the most interesting people I have ever met. I’ll never, ever understand him, but that’s the best part. And f$&^ing MOVE!

**Paddy Harrington**: I wouldn’t have made it through grad school without Paddy. We were two of the three Americans in the “remedial class” when we started our Ph.D. together. I believe that every single class I ever took, Paddy was in. I hope I helped him half as much as he helped me in getting through those classes. I wish I talked to him more. Paddy, if you are reading this, CALL ME!

**Mike Lopez**: I “met” Mike Lopez online and through Twitter because we were both writing sports statistics blogs. We finally met in person for the first time (I think) at JSM in Montreal. Since then we won a Kaggle contest and have written two papers together: one about the Kaggle contest and the other with Ben Baumer about competitive balance. The latter paper is probably my favorite paper that I have ever written. Now Mike works for the NFL, and he likes to remind me how much more successful he is than I am.* When I first started writing about sports, I always felt like I had to sort of hide it because it wasn’t “real statistics” or “real research”. But getting to work with Mike has helped me embrace pursuing sports statistics as a “real” research topic.

*That’s completely untrue. Anyone who has ever met Mike knows he is incapable of this. He’s so nice it’s painful.

**Andrea Foulkes:** Dr. Foulkes was my post-doctoral adviser at UMass. I have my current job today because she took me on as a post-doc and I’m eternally grateful for that. I didn’t know anything about genetics when I started working with her, and it was an incredible opportunity to learn something completely new. I’m not sure I will ever have another opportunity where I have just total time to focus on learning a new field.

**Nick Reich:** Nick started as a professor in the School of Public Health at UMass the same year (the year before?) I started as a post-doc. I found it really helpful to get to watch him on the tenure track. I also felt comfortable enough around him to ask him “stupid questions”. I remember in my third year as a post-doc I started teaching, and I asked him “what do I even wear to teach?” I know it’s hard to imagine me asking that question because I dress so well now, but as a post-doc I was……fashion challenged.

We used to play chess at lunch, and he would generally smoke me. But those games helped me get through a tough time in my life when I started having panic attacks again. I also remember one time where I showed up in his office having a panic attack, and I was just like “talk to me”. That really helped. Thank you.

When I was a post-doc I wrote a blog post critical of a Grantland article. I didn’t really know what to do with it, but in talking with Nick he encouraged me to submit it to Deadspin. At the time Tommy Craggs was at Deadspin and Nick had worked with him in the past, and Nick wrote an email supporting my post getting put on Deadspin. I really appreciate that. Thanks for that.

The advice I’ll always remember him giving me was “Don’t let the perfect be the enemy of the good.” This is so true.

Nick is also the first person to buy my art. Guy has good taste.

**Ben Baumer:** I met Ben at a talk he gave at UMass. He had started as a professor at Smith the same year I started as a post-doc. I went up to him after the talk, and asked if he wanted to work on a project with me on baseball. I was thinking that we would do something small. Ben was like “Let’s re-do WAR”. That’s a big project. Well, we developed openWAR (with Shane Jensen), and it won a SABR award. The moral here is go big.

**Tim O’Brien:** Every time I have a question about something at Loyola, I go ask Tim. He’s been incredibly helpful, and it’s an absolute pleasure working with him.

**Peter Tingley and Emily Peters:** Peter and Emily are a married couple who work in the Math Department with me at Loyola. When I first moved out to Chicago, they invited my wife and me over for dinner. Which was nice because we knew basically no one in the Chicago area. Peter and Emily are two of the big reasons why I’ll never leave Loyola. They are fantastic colleagues and they are both pretty good at math #understatement.

**Everyone in Loyola’s Math Department:** I could just list everyone here, but that’s a lot of people. I’ve never met even a single person in this department who isn’t fantastic. I really like this department. Special shout out to Agnes, who is the best.

**Harry Pavlidis:** I met Harry almost right away when I moved to Chicago at a conference at Northwestern. He’s one of the head honchos over at Baseball Prospectus. The community that he has created and the people I have been able to meet through him have been absolutely fantastic. And he’s an all around great guy.

**David Montgomery:** I did the Second City Improv program from level A-E and my group wanted to keep going. So we kept taking classes with David. Our group has been together for 4 years, and it’s been one of the great experiences of my life. And David has been

**The Emeralds of Saigon: Denny O’Malley, John Retterer-Moore, Jackie Hilmes, Michaela Choy (and Sweet Sarah and Michael Cho):** I’ve been doing improv with these jokers for almost exactly 4 years. I hope I do improv with them for another 40 years. Also, I know that I say that we aren’t friends all the time. But secretly I consider myself friends with you.

**Brian Seguin:** Brian got hired a year after I did at Loyola. I’ve learned more at our weekly lunches at IDOF than I did in 4 years of college. Brian is also largely influential in me learning about functional data analysis and shape analysis and is always very helpful with teaching me the right notation. He’s also a pro at deck drinking in the summer.

Cheers.

## Some initial thoughts on my “radical” redesign of intro stats

- I need a new example of Simpson’s Paradox. Anyone got any ideas?
- I think we need to talk much more about sampling methods in intro stats courses. I don’t know how much other people talk about this, but I usually mention it for like 15 minutes in one class and then we never talk about it again.
- I want to employ the “theory, simulation, example” approach for all the topics that I am going to cover. So for instance, for a one sample t-test we would talk about what the form of the t-test is (in more advanced classes we could derive it using the idea of ancillary statistics), then use R to simulate the test statistic over and over again to see the actual distribution of the test statistic, then get an actual set of data and do an example. Theory. Simulation. Example.
- I think ethics should be included in every intro stats course. I’ve never formally done it before, I want to include it in this, and I have no actual idea how to include it. Anyone have any thoughts?
- I’m heavily relying on the GAISE report for ideas about what to include in my course. Instead of traditional chapters from a book, I think I’m going to use their nine goals as 9 modules in my course. And then I’ll put the appropriate techniques into each of those modules. Only issue is that I want software to appear in all of the modules so I’ll need to re-order that goal from 8 to like 2.
- I really like the distinction that the GAISE report makes between CONSUMERS and PRODUCERS of statistics. I’m teaching STAT 335, which is Introduction to Biostatistics. These students will be about 50% consumers and 50% producers. The course should be tailored to that. In our STAT 103 course, the students are going to be 95% consumers of statistics. We need to revamp that course to account for that. I’ve never really thought about that distinction before in terms of developing course material for an intro stat course.
- At Loyola we have essentially 3 intro stats courses: 103, 203, and 335. Right now they are largely the same intro stats course with different students and slightly different math requirements. My goal long term is to totally re-design these so that 103 is geared towards consumers, 203 is geared towards producers, and 335 is right in the middle for consumers and producers.
- The GAISE report is incredible.
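As a rough illustration of the “theory, simulation, example” idea for a one sample t-test, here is a minimal sketch. It is written in Python rather than R, purely for illustration, and the data and function names are made up. It relies on the fact that, for normal data, the null distribution of the t statistic does not depend on the true mean or standard deviation, so simulating from a standard normal is enough.

```python
import math
import random
import statistics


def t_statistic(sample, mu0):
    """Theory: the one-sample t statistic, (mean - mu0) / (s / sqrt(n))."""
    n = len(sample)
    return (statistics.mean(sample) - mu0) / (statistics.stdev(sample) / math.sqrt(n))


def simulate_null_t(n, n_sims=10000, seed=1):
    """Simulation: t statistics from normal samples for which the null is true."""
    rng = random.Random(seed)
    return [
        t_statistic([rng.gauss(0.0, 1.0) for _ in range(n)], mu0=0.0)
        for _ in range(n_sims)
    ]


def simulated_p_value(observed_t, null_stats):
    """Two-sided p-value: share of simulated statistics at least as extreme."""
    return sum(abs(t) >= abs(observed_t) for t in null_stats) / len(null_stats)


# Example: made-up measurements (not real data), testing H0: mean = 5.
toy_data = [5.1, 4.8, 5.6, 5.3, 4.9, 5.7, 5.2, 5.4, 5.0, 5.5]
null_stats = simulate_null_t(n=len(toy_data))
observed = t_statistic(toy_data, mu0=5.0)
print(round(observed, 2), simulated_p_value(observed, null_stats))
```

A histogram of `null_stats` shows the simulated null distribution of the test statistic directly, and the p-value is read off it as a proportion, with no table involved.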