Mere Mortals: Some Interesting Tweets

On Monday, I posted a response to Bill Barnwell’s Grantland article “Mere Mortals,” in which he claims that regular NFL players who played between 1959 and 1988 have a statistically significantly lower mortality rate than regular baseball players of the same era.  I suspect all that has been demonstrated is that older people die more often than younger people: the groups are not directly comparable, since age was not controlled for in the comparison.  What I believe we are dealing with here is correlation and not causation.  Baseball, almost surely, is not killing people faster than football.  And even if it were, it has not been demonstrated.  Not even close.

So, I’ve been reading some other stuff by Barnwell lately (which I really enjoy), including his Twitter feed, and two particular tweets interested me greatly.  The first one:

Appreciate the kind words about the study. For those who asked: Average age of MLB player at time of passing was 60.9; for NFL, it was 58.8.

And the second one:

Went back and looked at age for all players in my study by request; MLB players in sample were on average 24 months older than NFL players.

The first tweet should make you pause: how can baseball players die at an older average age and, at the same time, have a higher mortality rate?  I suspected, originally, and still do, that it was because one group was simply older than the other group.  Which…..is exactly what was tweeted in the second tweet.  MLB players in the sample were TWO YEARS older than NFL players in the sample.  Can you name something that 62-year-olds do more often than 60-year-olds?  I can.  They die more often.

So, it seems to me like all that the study has actually demonstrated is that older people die more often.  So, maybe Barnwell will dial back his “stunning” claims?  Maybe not.  This is the tweet directly before tweet number 2:

ICYMI: Wrote about the stunning respective mortality rates of MLB and NFL players from 1959-88 on @grantland33 http://ow.ly/d1QGH

Cheers.

 

MLB rankings – 8/20/2012

StatsInTheWild MLB rankings as of August 20, 2012 at 12:18pm.  SOS=strength of schedule

Team Rank Change Record ESPN TeamRankings.com SOS Run Diff
NYY 1 75-50 4 1 4 +102
Texas 2 71-50 5 3 13 +89
Tampa Bay 3 ↑6 68-54 7 4 5 +69
Washington 4 76-46 1 2 23 +109
Oakland 5 65-56 13 6 8 +32
Atlanta 6 ↑4 70-52 3 5 21 +84
Chi WSox 7 ↓2 66-55 9 10 14 +67
Cincinnati 8 ↓2 74-49 2 7 30 +73
Detroit 9 ↓1 64-57 12 9 12 +24
LA Angels 10 ↓7 62-60 15 11 7 +21
Boston 11 59-63 17 12 3 +34
Baltimore 12 ↑2 66-56 14 8 2 -47
St. Louis 13 65-56 11 17 29 +106
Toronto 14 ↓2 56-65 19 18 1 -25
Seattle 15 ↑1 59-64 20 13 6 0
LA Dodgers 16 ↑3 67-56 10 16 25 +38
Arizona 17 62-60 16 19 24 +40
SF 18 67-55 8 14 26 +30
Pittsburgh 19 ↓4 67-55 6 15 28 +19
Kansas City 20 ↑2 54-67 25 20 11 -47
NY Mets 21 ↓1 57-65 18 21 15 -33
Philadelphia 22 ↑3 57-65 22 22 18 -30
Cleveland 23 54-68 23 24 9 -125
Milwaukee 24 55-66 24 27 27 -11
Minnesota 25 ↓4 51-70 26 23 10 -86
Miami 26 56-67 21 25 16 -84
San Diego 27 ↑1 54-70 27 26 22 -70
Chi Cubs 28 ↓1 47-75 28 29 20 -98
Colorado 29 47-73 29 28 19 -112
Houston 30 39-83 30 30 17 -169

Past Rankings:

8/14/2012

8/6/2012

7/23/2012

7/9/2012

7/2/2012

6/25/2012

6/19/2012

6/9/2012

5/28/2012

5/23/2012

5/14/2012

5/7/2012

4/30/2012

4/23/2012

4/16/2012

4/13/2012

Cheers.

Grantland Newsflash: The old die more often than the young

Grantland recently published this article, Mere Mortals, which claims that:

Baseball players who accrued at least five qualifying seasons from 1959 through 1988 died at a higher rate than similarly experienced football players from the same time frame.  The difference between the two is statistically significant and allows us to reject the null hypothesis; there is a meaningful difference between the mortality rates of baseball players and football players with careers that emulated the [National Institute for Occupational Safety and Health] NIOSH criteria.

The authors then go on to collect data on football and baseball players who played at least 5 years between 1959 and 1988, and their results are below:

                      Baseball        Football
Qualifying Players    1,494           3,088
Alive                 1,256           2,694
Deceased              238             394
Mortality Rate        15.9 percent    12.8 percent

From this table, to their credit, they calculated confidence intervals for the mortality rates and performed a Fisher exact test for independence between the rows (dead or alive) and columns (baseball and football).  For football players, the 95% confidence interval for the mortality rate was (11.6, 13.9), and, for baseball players, the 95% confidence interval was (14.1, 17.8).  The Fisher exact test gives a p-value of about 0.004 and from this they conclude, correctly, that the mortality rate is significantly different between the groups at the 0.01 level.
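If you want to check those numbers yourself, here is a minimal sketch in Python using only the standard library.  The article doesn't say which interval method was used, so the normal-approximation (Wald) interval below is my assumption; it does seem to reproduce the quoted intervals after rounding.  The Fisher test is a hand-rolled two-sided version (sum the probabilities of every table no more likely than the observed one), and the function names are mine.

```python
import math

def wald_ci(deaths, n, z=1.96):
    """Normal-approximation (Wald) interval for a proportion (assumed method)."""
    p = deaths / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width

def log_comb(n, k):
    """log of the binomial coefficient C(n, k), via log-gamma to avoid overflow."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def log_prob(x):
        # Hypergeometric log-probability of x in the top-left cell.
        return log_comb(row1, x) + log_comb(row2, col1 - x) - log_comb(n, col1)

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    observed = log_prob(a)
    # Add up every table that is no more likely than the observed one.
    return sum(
        math.exp(lp)
        for lp in (log_prob(x) for x in range(lowest, highest + 1))
        if lp <= observed + 1e-7
    )

# Counts from the table above: (deceased, alive).
baseball = (238, 1256)
football = (394, 2694)

for name, (dead, alive) in (("baseball", baseball), ("football", football)):
    low, high = wald_ci(dead, dead + alive)
    print(f"{name}: {100 * low:.1f} to {100 * high:.1f} percent")

p_value = fisher_exact_two_sided(*baseball, *football)
print(f"Fisher exact p-value: {p_value:.4f}")
```

Note that none of this says anything about why the rates differ; it only confirms that the raw difference is unlikely to be chance.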

So, the big question is, as they pose it:

Why is it that baseball players from the ’60s, ’70s, and ’80s are dying more frequently than football players from the same era? Truthfully, as a layman, I can’t say with any certainty, and I don’t think it’s appropriate to speculate. A deeper study into the mortality rates of baseball players that emulated the NIOSH focus on specific causes of death versus the general population might prove valuable.

Well, I’ll “field” (pun intended) this one.  Baseball players are dying more often because they are older than football players.  The authors, as far as I can tell, never controlled for the age of the players, or any other risk factors for that matter.  In 1959, there were, as far as I can tell, 12 NFL teams each with 40 players.  That’s 480 players.  In 1988, there were 28 teams with 59 players each, for a total of 1652.  In baseball, in 1959 there were 16 teams with, let’s use the largest number, 40-man rosters, for a total of 640 players.  That number in 1988 was 1040 (26 teams with 40 players).  So there were almost three and a half times as many players in the NFL in 1988 as there were in 1959.  The number of baseball players only increased about 1.6 times over this same period.

These numbers aren’t exact, but the point still stands:  The group of football players that has been collected here has a greater proportion of younger people in it than the baseball group.  So it’s not exactly apples to apples.  In fact, it’s not even close.  You’d expect, just based on the ages of the players in these groups, for baseball players to have higher rates of mortality than the football players.  So basically they have demonstrated that the old die more often than the young.
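To make the composition argument concrete, here is a rough sketch in Python.  The roster sizes are the approximate ones from above.  The two death risks are completely made up for illustration, and treating the 1959 and 1988 rosters as just two cohorts is a crude simplification.  The key assumption is that both sports get exactly the same risk at each age, so any gap in the overall rate comes purely from the mix of old and young players.

```python
# Approximate roster sizes quoted in the post (teams x players per team).
rosters = {
    "football": {"1959": 12 * 40, "1988": 28 * 59},
    "baseball": {"1959": 16 * 40, "1988": 26 * 40},
}

# HYPOTHETICAL risk of having died by now, identical for both sports.
# Players from the 1959 rosters are older today, so their risk is higher.
death_risk = {"1959": 0.30, "1988": 0.05}

def growth(sport):
    """How many times larger the 1988 player pool is than the 1959 pool."""
    return rosters[sport]["1988"] / rosters[sport]["1959"]

def crude_mortality(sport):
    """Overall death rate when the two eras are pooled together."""
    sizes = rosters[sport]
    total = sum(sizes.values())
    return sum(sizes[era] * death_risk[era] for era in sizes) / total

for sport in rosters:
    print(
        f"{sport}: pool grew {growth(sport):.2f}x, "
        f"pooled mortality {100 * crude_mortality(sport):.1f} percent"
    )
```

With these invented risks, baseball comes out with the higher pooled rate even though nobody's sport is doing anything to them.  That is the whole problem with comparing the raw rates.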

Cheers.

P.S. My first boss once gave me this example.  Remember the ad where it was claimed that 90% of all trucks sold in the last ten years were still on the road?  You’re counting trucks that are ten years old in the same group as trucks that are less than a year old.  Not exactly apples to apples.

Idiots, Liars, and Unicorns: Presidential Politics and Search Engines

In the past I’ve posted search engine auto-completes for some of the presidential candidates.  For instance, here are Romney and Obama’s results from 5/30/2012, here are Romney and Obama’s results from 4/16/2012, and here are the Republican primary candidates from 12/29/2011.  Below you will find the auto-completes for the two presidential candidates from 8/16/2012.  I’m also including word clouds now.

I’m using three search engines (Google, Bing, and Yahoo!) and two search terms for each candidate (Mitt Romney, Mitt Romney is, Barack Obama, Barack Obama is).  I’m then weighting the terms from 10 to 1 for Google and Yahoo and 8 to 1 for Bing (as it only returns 8 search terms), based on the order they appear in the auto-completes.  For the first two word clouds, I’m additionally weighting the search engines, with Google getting weight 11.7, Bing 2.7, and Yahoo 2.4.  (These numbers are approximately the number, in billions, of searches performed on each site respectively in February 2012.)
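For anyone who wants to try this at home, here is a minimal sketch of that weighting in Python.  The rank weights and engine weights are the ones described above.  Multiplying the two together and summing across engines is my assumption about how they combine, and the suggestion lists are placeholders, not the real auto-completes.

```python
from collections import Counter

# Engine weights: approximate billions of searches in February 2012.
ENGINE_WEIGHT = {"google": 11.7, "bing": 2.7, "yahoo": 2.4}
# Top rank weight per engine: Bing only returns 8 suggestions.
MAX_RANK_WEIGHT = {"google": 10, "yahoo": 10, "bing": 8}

def weighted_terms(suggestions_by_engine, use_engine_weight=True):
    """Score each term by its position, optionally scaled by engine size."""
    totals = Counter()
    for engine, suggestions in suggestions_by_engine.items():
        top = MAX_RANK_WEIGHT[engine]
        for position, term in enumerate(suggestions[:top]):
            rank_weight = top - position  # first suggestion gets the top weight
            engine_weight = ENGINE_WEIGHT[engine] if use_engine_weight else 1.0
            totals[term] += rank_weight * engine_weight
    return totals

# Placeholder suggestions in the order each engine returned them.
example = {
    "google": ["alpha", "beta"],
    "bing": ["beta", "alpha"],
    "yahoo": ["alpha"],
}

for term, score in weighted_terms(example).most_common():
    print(f"{term}: {score:.1f}")
```

Passing use_engine_weight=False gives scores weighted by order only, which is the scheme used for the per-engine word cloud.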

The first word cloud represents all of the words with weighting for both presidential candidates.  Kind of makes you think a little bit about the political discourse in this country when some of the top words for presidential candidates are idiot, liar, and antichrist.  (For those of you new to the internet, here is the explanation for “your new bicycle”.)

This next word cloud is the same as the previous one, except it is separated by candidate.  The blue and red words are Obama and Romney, respectively.  If you’re wondering about the “Unicorn” on the Romney side of the word cloud, you may be interested in this Facebook page.  According to them, “There has never been a conclusive DNA test proving that Mitt Romney is not a unicorn. We have never seen him without his hair — hair that could be covering up a horn. No, we cannot prove it. But we cannot prove that it is not the case.”  Truer words have never been spoken….

The final word cloud of the trio breaks down the auto-complete terms by search engine.  Note that the words in this word cloud are not weighted by search engine, but they are weighted by order within each search engine.  I think it’s kind of interesting that, for Yahoo, the big words are religions: Muslim and Mormon.  This makes me wonder if the choice of search engine might predict political affiliation in some way, and, apparently, I’m not the only one who’s thought about this.  Looks like a group called Engage has already looked into this and their results are summarized nicely in this graphic.  According to them, Googlers tend to be more Democratic and Bingers (?) tend to be more Republican.  (I don’t see Yahoo on their graphic, which I find odd.)  Also, according to Alexa.com, Bing users tend to be older than the average internet user, slightly more likely to have “some college” education, and slightly less likely to have a graduate degree.  Google and Yahoo users look very much like the average internet user, with the exception that they are much less likely to be over the age of 65.

Below here, you’ll find screen shots of the Google, Yahoo, and Bing auto-completes if you’re interested in the raw data that I used.

GOOGLE






YAHOO!

 


BING

 


Cheers.

blog maverick

When it comes to getting a job, the USA has bifurcated into two employment worlds, the digital world and the brick and mortar world.

The brick and mortar world is everything you physically touch. It’s manufacturing. It’s retail sales. It’s distribution. It’s construction. Etc.

The digital world is everything defined by what you find on computing devices. It can be on your desk, in your hand or in the cloud.

What has happened is that the brick and mortar world has had every bit of intelligence that can be sucked out of it completely removed.  Any information that can be created, identified or recognized is being captured in as automated a process as possible and delivered to “big data” or even small data databases in the cloud. What used to require some intelligence at the brick and mortar work place has been seeded and ceded into the cloud.

Every smart…


Low hit, no hit, and perfect games: The King Felix Edition

 

 

Felix Hernandez of the Seattle Mariners just threw the third perfect game of the season, so I figured it was a good time to update my low hit games graphs that I posted in June.  So, here they are:

Cheers.

 

MLB Rankings – 8/14/2012

StatsInTheWild MLB rankings as of August 6, 2012 at 8:17pm.  SOS=strength of schedule

Team Rank Change Record ESPN TeamRankings.com SOS Run Diff
NYY 1 63-44 3 1 5 +92
Texas 2 63-44 4 2 13 +83
LA Angels 3 58-51 9 6 7 +49
Washington 4 ↑2 65-43 2 4 23 +82
ChiSox 5 ↑4 59-48 7 5 14 +64
Cincinnati 6 ↑5 66-42 1 3 29 +72
Oakland 7 ↑1 58-50 8 7 8 +28
Detroit 8 ↓1 58-50 13 9 12 +24
Tampa Bay 9 ↑1 56-52 14 12 3 +19
Atlanta 10 ↑5 62-46 5 8 21 +62
Boston 11 ↓6 54-55 17 13 6 +29
Toronto 12 ↓8 53-55 18 14 2 +9
St. Louis 13 ↑1 59-49 10 15 30 +110
Baltimore 14 ↓2 57-51 16 11 1 -57
Pittsburgh 15 ↓2 61-46 6 10 28 +36
Seattle 16 ↑1 51-59 20 17 4 -3
Arizona 17 ↑4 55-53 15 19 26 +42
SF 18 ↓2 59-49 11 16 27 +19
LA Dodgers 19 ↓1 59-50 12 18 25 +15
NY Mets 20 53-56 19 20 17 -5
Minnesota 21 ↑3 47-61 25 22 11 -79
Kansas City 22 45-62 26 23 10 -60
Cleveland 23 ↓4 50-58 21 21 9 -90
Milwaukee 24 ↓1 48-59 22 26 19 -13
Philadelphia 25 49-59 24 24 19 -29
Miami 26 49-60 23 25 15 -100
Chi Cubs 27 ↑1 43-63 28 27 18 -79
San Diego 28 ↓1 46-64 27 28 22 -61
Colorado 29 38-68 29 29 20 -117
Houston 30 36-73 30 30 16 -142

Past Rankings:

7/23/2012

7/9/2012

7/2/2012

6/25/2012

6/19/2012

6/9/2012

5/28/2012

5/23/2012

5/14/2012

5/7/2012

4/30/2012

4/23/2012

4/16/2012

4/13/2012

Cheers.

Almost the best

Canada managed to win 18 total medals in the 2012 Olympics, while only tallying one gold medal.  Previously, I posed the question: has anyone ever won more medals than this with fewer golds?  The answer, shockingly, is yes.  In 1952, Germany managed to win 24 total medals and exactly ZERO golds (7 Silver and 17 Bronze).  Incredible.  Really nice work.

Other countries that have come close include the United Kingdom in 1960 (2G, 6S, 12B), Sweden in 1984 (2G, 11S, 6B), and Cuba in 2008 (2G, 11S, 12B).

Another fun fact: Only 5 times has a team won more than 50 total medals while winning more bronze than silver and more silver than gold.  They are Sweden in 1920 (19G, 20S, 24B), the Soviet Union in 1964 (30G, 31S, 35B), West Germany in 1984 (17G, 19S, 23B), Germany in 2000 (13G, 17S, 26B), and, most recently, Russia in 2012 (24G, 26S, 32B).
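For the curious, the rule behind that fun fact is easy to write down.  Here is a small Python sketch that applies it to the tallies mentioned in this post; the function name is mine, and the data is only what is listed here, not a full Olympic history.

```python
def bronze_heavy(gold, silver, bronze, min_total=50):
    """More than min_total medals, with bronze > silver > gold."""
    return gold + silver + bronze > min_total and bronze > silver > gold

# (gold, silver, bronze) tallies quoted in this post.
tallies = {
    ("Sweden", 1920): (19, 20, 24),
    ("Soviet Union", 1964): (30, 31, 35),
    ("West Germany", 1984): (17, 19, 23),
    ("Germany", 2000): (13, 17, 26),
    ("Russia", 2012): (24, 26, 32),
    ("Germany", 1952): (0, 7, 17),  # plenty of medals, but only 24 in total
}

for (team, year), medals in tallies.items():
    verdict = "qualifies" if bronze_heavy(*medals) else "does not qualify"
    print(f"{team} {year}: {sum(medals)} medals, {verdict}")
```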

See you in Brazil in 2016!

Cheers.

2012 Olympics – Final Medal Count

Canada is incredible.  They somehow managed to win 18 total medals and only 1 gold.  Amazing.  How is it possible to be so consistently nearly the best?  Has anyone ever won more medals with fewer golds?

Cheers.

All-Star Plots?

Here are some star plots for major league baseball batters, pitchers, and ball parks.  The star plots represent the outcomes of a particular at bat for a hitter, a pitcher, or at a given ball park.  For each plot, one of batter, pitcher, and ball park was varied, while the other two parameters were filled in with the average value.  For instance, all batters’ outcomes are calculated as if they were facing J. Kinney at Wrigley Field; pitchers’ data were calculated as if they were facing K. Medlen at Wrigley; and park factors were calculated as the outcome of J. Kinney vs K. Medlen at different ball parks.  The data used to calculate these were downloaded from baseball-reference.com and include the result of every single plate appearance so far this season (about 125,000 so far) and where the game was played.  Six outcomes to an at bat were considered: out, walk, single, double, triple, and home run.  The probability of each of these events was estimated, creating a vector of probabilities with six elements corresponding to the six outcomes considered.

I’ve chosen to display this data using the star plots below.  The key to the star plot can be found in the lower left corner of each plot and displays the probabilities of each outcome relative to other batters.  For instance, a large blue pie piece on the left indicates that batter’s plate appearance ends with a HR more often relative to other players.  Likewise, a large red pie piece on the right indicates that the batter’s plate appearance ends in an out more often than other players.
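To give a feel for what goes into each star, here is a simplified Python sketch.  It is not the model used for these plots: it skips the batter/pitcher/park adjustment entirely and just turns raw outcome counts into a six-element probability vector, then rescales each outcome to run from 0 to 1 across batters, which is how I am assuming the "relative to other batters" piece sizes work.  The batters and counts are invented.

```python
OUTCOMES = ("out", "walk", "single", "double", "triple", "home_run")

def outcome_probabilities(counts):
    """Turn raw outcome counts into a probability vector over OUTCOMES."""
    total = sum(counts[outcome] for outcome in OUTCOMES)
    return [counts[outcome] / total for outcome in OUTCOMES]

def scale_across_batters(prob_vectors):
    """Min-max scale each outcome to [0, 1] across batters (pie-piece sizes)."""
    names = list(prob_vectors)
    scaled = {name: [] for name in names}
    for i in range(len(OUTCOMES)):
        column = [prob_vectors[name][i] for name in names]
        low, high = min(column), max(column)
        for name in names:
            value = prob_vectors[name][i]
            scaled[name].append((value - low) / (high - low) if high > low else 0.0)
    return scaled

# HYPOTHETICAL plate-appearance counts for three made-up batters.
counts = {
    "Batter A": {"out": 300, "walk": 60, "single": 80, "double": 25, "triple": 2, "home_run": 33},
    "Batter B": {"out": 340, "walk": 25, "single": 110, "double": 18, "triple": 5, "home_run": 2},
    "Batter C": {"out": 360, "walk": 30, "single": 70, "double": 20, "triple": 1, "home_run": 19},
}

probabilities = {name: outcome_probabilities(c) for name, c in counts.items()}
for name, pieces in scale_across_batters(probabilities).items():
    print(name, [round(piece, 2) for piece in pieces])
```

One consequence of scaling like this is that a piece of size 1 only means "highest among the batters shown," not a high probability in absolute terms.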

I’ve chosen 100 batters based on their wide range of hitting styles.  In the first row, you’ll see players who make outs at the lowest rates relative to other players.  These include players like Joey Votto, Andrew McCutchen, David Wright, and Mike Trout.  Further down, you’ll start to see players who you might describe as singles hitters.  These include players like Ruben Tejada, Derek Jeter, B. Revere, and Juan Pierre.  Finally, towards the bottom row, you’ll see the players who are primarily power hitters, like Adam Dunn and Jose Bautista, with large blue (for home runs) and orange (for walks) pie pieces, along with significant red for outs.  Other players on this row like Saltalamacchia, Plouffe, and Rosario have the large blue and significant red pieces, but they lack walks.

The star plot for pitchers is below.  The first thing we need to say here is that Justin Verlander is very, very good at pitching a baseball.  Some other interesting pitchers here are Yu Darvish, Edinson Volquez, and Carlos Zambrano.  They seem to give up relatively few hits, but they give up many more walks than the average pitcher.

These plots are ordered from highest to lowest probability that an out will be made in a given plate appearance.  Pittsburgh, Seattle, and San Francisco lead the way in pitcher friendly parks.  These are the same as the bottom three according to ESPN’s measure of park factor.  The most hitter friendly park is, no surprise, Coors Field in Colorado.  Other hitter friendly parks include Target Field and Chase Field in Minnesota and Arizona, respectively.  Arizona is expected here, but Minnesota is a little bit surprising.  It looks like, while it is rare to make an out there, most hits are only singles, which don’t generate as many runs as their extra base counterparts.  Home run friendly parks include Coors, Chase, Camden Yards, Miller, Comiskey, and Yankee Stadium.  Fenway Park is solidly in the hitter category, but it gets that way not by giving up many home runs but by yielding a greater percentage of doubles than any other park.

 

Cheers.