# The Cubs are good #hottake

This is the blog post that I would have written after the Cubs’ 30th game, when they were 24-6, if I hadn’t been in bed 20 hours a day for the last week with “the sickness”. Anyway, I scraped Baseball Reference to get the game results of all teams going back as far as the data goes. First I looked at how many teams had started 24-6 or better in their first 30 games. The list is here:

^ – Lost World Series

* – Won World Series

### 26-4

Detroit 1984*

### 25-5

Detroit 1911

Pittsburgh 1902 (No World Series in 1902)

### 24-6

Chicago Cubs 1907*

Pittsburgh 1921

New York Yankees 1928*

New York Yankees 1939*

Boston 1946^

New York Yankees 1958*

LA Dodgers 1977^

Oakland 1981

Chicago Cubs 2016 ?

(Note: Chicago White Sox 1912 (23-6-1))

Prior to the 2016 Cubs, 11 teams since 1902 have started 24-6 or better. In one of those years (1902) there was no World Series, which leaves 10 teams that started 24-6 or better in a year with a World Series. Of those 10, 7 made it to the World Series and 5 of those 7 won it. So professionally I’m not saying that the Cubs will make the World Series, but personally I am guaranteeing it.
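For anyone who wants to check the tally, here is a small sketch that recounts it from the list above. The outcome labels (`won`, `lost`, `missed`, `no_series`) are just my encoding of the `*` and `^` markers, with `missed` meaning the team has no marker in the list.

```python
# Outcomes for the pre-2016 teams in the list above, encoded from the markers:
# "won" = *, "lost" = ^, "missed" = no marker, "no_series" = 1902 (no World Series).
starts = {
    ("Detroit", 1984): "won",
    ("Detroit", 1911): "missed",
    ("Pittsburgh", 1902): "no_series",
    ("Chicago Cubs", 1907): "won",
    ("Pittsburgh", 1921): "missed",
    ("New York Yankees", 1928): "won",
    ("New York Yankees", 1939): "won",
    ("Boston", 1946): "lost",
    ("New York Yankees", 1958): "won",
    ("LA Dodgers", 1977): "lost",
    ("Oakland", 1981): "missed",
}

# Teams that played in a year with a World Series.
eligible = [o for o in starts.values() if o != "no_series"]
# Teams that reached the World Series, win or lose.
reached = [o for o in eligible if o in ("won", "lost")]
# Teams that won it.
champions = [o for o in reached if o == "won"]

print(len(starts), len(eligible), len(reached), len(champions))  # 11 10 7 5
```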

Next I wanted to look at the relationship between a team’s winning percentage after 30 games and its winning percentage at the end of the season. The plot below shows a scatter plot of this.

No team has ever finished the season with a winning percentage over .800. The highest ever belongs to the 1906 Cubs, with a winning percentage of 0.758 (116-36-3). More recently, the 2001 Mariners finished with a winning percentage of 0.716 (116-46).

I also fit a simple linear regression line through the data (the red line on the scatter plot). The estimated coefficients are:

$\hat{\beta}_0 = 0.2757, \quad \hat{\beta}_1 = 0.4533$
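To make the fitted line concrete, here is a minimal sketch that plugs a 30-game record into it. It assumes the model is final winning percentage = intercept + slope × 30-game winning percentage, and that the predicted percentage is scaled to a 162-game season. The function name `predicted_wins` is mine, not part of the original analysis.

```python
# Coefficients as reported above.
BETA_0 = 0.2757  # intercept
BETA_1 = 0.4533  # slope on the 30-game winning percentage
SEASON_GAMES = 162  # assumption: modern 162-game schedule


def predicted_wins(wins, losses):
    """Predicted season win total from an early-season record."""
    early_pct = wins / (wins + losses)
    final_pct = BETA_0 + BETA_1 * early_pct
    return final_pct * SEASON_GAMES


print(round(predicted_wins(24, 6), 2))  # 103.41
```

Because the slope is well below 1, the line pulls hot starts back toward .500: a .800 start maps to a predicted final winning percentage of about .638.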

This model predicts that the Cubs will win, on average, 103.41 games this year based on their first 30 games, with a 95% prediction interval of (84.58, 122.24). That interval is too wide to be very interesting, so I also looked at a 50% prediction interval, which ended up being (96.93, 109.89). This means there is about a 50% chance that the Cubs end up with a win total in this interval. Further, it also means that there is about a 1 in 4 chance the Cubs end up with fewer than 97 wins, but there is also about a 1 in 4 chance that the Cubs win MORE THAN 110 games.
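As a sanity check on those two intervals, here is a rough sketch that backs out the prediction standard deviation each one implies. It assumes the prediction error is approximately normal, which is only an approximation: a regression prediction interval is really based on a t distribution, but with this many team-seasons the difference should be small. The helper `implied_sd` is a name I made up for illustration.

```python
from statistics import NormalDist

CENTER = 103.41                  # predicted wins reported above
INTERVAL_95 = (84.58, 122.24)    # 95% prediction interval reported above
INTERVAL_50 = (96.93, 109.89)    # 50% prediction interval reported above


def implied_sd(interval, level):
    """Standard deviation implied by a symmetric normal interval."""
    half_width = (interval[1] - interval[0]) / 2
    z = NormalDist().inv_cdf(0.5 + level / 2)  # e.g. about 1.96 for 95%
    return half_width / z


sd_95 = implied_sd(INTERVAL_95, 0.95)
sd_50 = implied_sd(INTERVAL_50, 0.50)
print(round(sd_95, 2), round(sd_50, 2))  # 9.61 9.61

# Under the normal approximation, the chance of landing above the
# 50% interval's upper end (roughly 110 wins).
wins = NormalDist(mu=CENTER, sigma=sd_95)
print(round(1 - wins.cdf(INTERVAL_50[1]), 2))  # 0.25
```

Both intervals imply the same spread of about 9.6 wins, so they are consistent with each other and with the 1 in 4 tail chances quoted above.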

Cheers.