The Value of Statistics (in the wild)
From the Freakonomics blog: “The Value of Statistics”.
I like the last line of this post: “It goes to show that thinking up the right regression to run can be worth millions.”
Cheers.
Global Warming and Faith (in the wild)
Earlier this year the Pew research group released some interesting findings on “Religious Groups’ Views on Earth Warming Evidence”.
If you want, you can download the actual data set, as well as many other interesting data sets, directly from their site.
Cheers.
Chernoff faces (in the wild)
I took a multivariate class several years ago, and towards the end of the semester the professor showed us Chernoff faces. I was thinking about them for some reason tonight, so I figured I’d do a search for Chernoff faces on the internet. Here is an interesting application of Chernoff faces to MLB managers. This got me excited, so I did a Google search for “R faces” hoping to find an R package for Chernoff faces. However, this search yielded this web site. The headline on that web site says: “French rapper Monsieur R faces up to three years in prison and a 75,000-euro fine for referring to France as a ‘slut’ and a ‘bitch’ and saying ‘I piss on Napoleon and General de Gaulle’ on his latest album.” Not quite what I was looking for, but completely fantastic. God bless the internet.
After another quick search, I downloaded this R Package. And now I’ve spent all night “Chern”-ing out Chernoff Faces.
Here is one for a few selected MLB hitters:

Here is one for a few selected MLB pitchers:

And here is my favorite about the economy (Data is from here):

I especially like how the Chernoff faces get smaller and actually appear to get sadder as the economy worsens. I guess it’s so bad that even Chernoff’s faces are feeling the recession.
Cheers.
Risk in the wild
Ever play Risk and notice the first person has a huge advantage? Apparently, Andrew Gelman has too, and he offers a simple solution to the problem here.
Cheers.
Cocaine’s a hell of a drug (in the wild)
Here is a nice little post about the price and purity of cocaine over time. The second graph released by the Washington Office on Latin America shows a huge decline in the price of cocaine in the eighties and a much more gradual decline in price from the early nineties through 2008.
They conclude: “In a report released this week, WOLA points out that there has been a general downward trend in cocaine prices in recent decades, despite the occasional spikes, indicating that crackdowns on cocaine trafficking are not working.”
I’m not an economist, but couldn’t this also be a result of less demand for cocaine? If anyone actually knows what they are talking about, I would love to hear from you.
Also as a note, the SITW blog officially does not endorse the use of cocaine. As Kurt Cobain once said, “Drugs are bad for you, They will fuck you up.” And you better believe we’re not going to censor that like the local Aberdeen authorities. Not on SITW.
Cheers.
Lawyers’ salaries (in the wild)
I got into an argument with one of my friends who is in law school about lawyers’ salaries. So I searched around the internet, and I found this fascinating graph of 2008 starting lawyers’ salaries from this blog entry on elsblog.org.
The blog entry goes on to say:
“Of the 22,305 law school graduates in NALP’s sample (over half of all 2008 graduates), a remarkable 23% (5,130 ’08 grads) reported an entry-level salary of $160,000. In contrast, 42% of entry level lawyers reported salaries in the $40,000 to $65,000 range. Once again, the central tendencies are a poor guide to the distribution as a whole: whereas the mean salary is a $92,000, the median salary was $72,000. Further, the two modes ($50,000 and $160,000) are separated by $110,000.”
Some comments:
1.) They sample over 22,000 law school graduates, which they claim is over half of all graduates. I wonder, however, if there is any systematic bias in this sample (I have no evidence there is or is not). For instance, people making very low wages may choose not to respond. This could further inflate (or deflate, if the opposite were occurring) starting salary statistics.
2.) I’m sure if you ask law schools about job prospects after you graduate, they would be happy to cite the average starting salary of $92,000. I’d also bet if you ask law students how much they expect to make, they would quote the $92,000 average starting salary. Judging by this graph, though, I imagine there are a lot of jaded first-year lawyers pulling in $60,000 a year, which is by no means a bad living, except…
3.) Debt. It would be interesting to see this same graph of 2008 first-year salaries, but minus the loan payments. The standard repayment period according to this is 10 years. According to this, the tuition at Harvard Law School in 2009 is $41,500.
Say you finance all of that for three years and nothing else. You owe $124,500. At 6% over ten years you owe $1,382.21 a month. At $160,000 you are making over $13,000/month. Not a problem at all. But if you’re making $60,000 a year like a lot of first-year lawyers, you are effectively only making $43,413.48 = $60,000 - $16,586.52 (salary minus a year of loan payments). I think that would be an interesting graph. Anyone have that data?
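Here is a minimal sketch for checking the loan arithmetic above. It assumes the standard fixed-rate amortization formula, which may not be how the monthly figure was originally computed, and it ignores taxes. Under that formula, $1,382.21 a month on $124,500 over ten years corresponds to a 6% annual rate, while 5% gives about $1,320.52. The function names are made up for illustration.

```python
def monthly_payment(principal, annual_rate, years):
    """Fixed monthly payment on a fully amortizing loan (standard formula)."""
    r = annual_rate / 12          # monthly interest rate
    n = years * 12                # number of monthly payments
    if r == 0:
        return principal / n      # no interest: just split the principal
    return principal * r / (1 - (1 + r) ** -n)


def salary_after_loan(salary, principal, annual_rate, years):
    """Annual salary minus twelve monthly loan payments (taxes ignored)."""
    return salary - 12 * monthly_payment(principal, annual_rate, years)


if __name__ == "__main__":
    principal = 3 * 41_500  # three years of tuition at $41,500 = $124,500
    for rate in (0.05, 0.06):
        pay = monthly_payment(principal, rate, 10)
        print(f"{rate:.0%}: ${pay:,.2f}/month")
        for salary in (60_000, 160_000):
            net = salary_after_loan(salary, principal, rate, 10)
            print(f"  ${salary:,} salary -> ${net:,.2f} after payments")
```

Applying `salary_after_loan` to every salary in the NALP sample would give the "salary minus loan payments" graph, if someone has the data.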
Cheers.
Lady Tasting Tea (in the wild)
Here is an interesting excerpt from a review of the book The Lady Tasting Tea: How Statistics Revolutionized Science in the Twentieth Century: “Salsburg believes that the public is not fully aware of the degree to which recent developments in statistics impact the way we perceive the world. He correctly points out that the twentieth century saw the fading of a deterministic outlook and the rise of a statistical/probabilistic way of looking at the world. This ongoing revolution is not only in the physical sciences, it also touches the social sciences and even the humanities. Though profound, it is a quiet revolution that has been unnoticed by many.” –Marc H. Mehlman
SITW blog officially endorses the book The Lady Tasting Tea: How Statistics Revolutionized Science in the Twentieth Century by David Salsburg. A good review of the book can be found here.
Interesting fact: David Salsburg received the first Statistics Ph. D. granted by the University of Connecticut in 1966.
Cheers.
Correlation of the Week (in the wild)
The New York Times printed an article today about how important statistics is in the real world, which I mentioned in a previous blog post. One of the last paragraphs offers a great example of correlation versus causation, which has now won the SITW Correlation of the Week! Congratulations, 1940s public health experts.
“The rich lode of Web data, experts warn, has its perils. Its sheer volume can easily overwhelm statistical models. Statisticians also caution that strong correlations of data do not necessarily prove a cause-and-effect link.
For example, in the late 1940s, before there was a polio vaccine, public health experts in America noted that polio cases increased in step with the consumption of ice cream and soft drinks, according to David Alan Grier, a historian and statistician at George Washington University. Eliminating such treats was even recommended as part of an anti-polio diet. It turned out that polio outbreaks were most common in the hot months of summer, when people naturally ate more ice cream, showing only an association, Mr. Grier said.” – David Alan Grier, NY Times Article
Some notes: 1.) Whoops. 2.) If your name is David Alan Grier, and you’re not this David Alan Grier, go by David Grier. Just drop the Alan.
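The ice cream and polio story can be mimicked with a toy simulation. This is only a sketch on made-up data, not the 1940s numbers: a hypothetical "heat" variable drives both ice cream consumption and polio cases, and neither one affects the other. All variable names, units, and noise levels are assumptions chosen for illustration. The raw correlation comes out strong, but it should mostly vanish once heat is controlled for with a partial correlation.

```python
import math
import random


def pearson(xs, ys):
    """Sample Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(xs, ys, zs):
    """Correlation of x and y after linearly controlling for z."""
    rxy, rxz, ryz = pearson(xs, ys), pearson(xs, zs), pearson(ys, zs)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


def simulate(n=5000, seed=0):
    """Made-up data: heat drives both series; neither drives the other."""
    rng = random.Random(seed)
    heat = [rng.gauss(0, 1) for _ in range(n)]
    ice_cream = [h + rng.gauss(0, 0.5) for h in heat]
    polio = [h + rng.gauss(0, 0.5) for h in heat]
    return heat, ice_cream, polio


if __name__ == "__main__":
    heat, ice_cream, polio = simulate()
    print("raw correlation:", round(pearson(ice_cream, polio), 3))
    print("controlling for heat:", round(partial_corr(ice_cream, polio, heat), 3))
```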
Cheers.
Bringing Sexy Back (in the wild)
I was just checking out the bernoulli trial blog and Stan has posted a great quote from Google’s chief economist about how sexy I am. Well, it’s about how sexy statisticians are. The job anyway. (My friend who is doing a Ph.D. in English also sent me this highly relevant article from the New York Times: “For Today’s Graduate, Just One Word: Statistics”.)
“I keep saying the sexy job in the next ten years will be statisticians. People think I’m joking, but who would’ve guessed that computer engineers would’ve been the sexy job of the 1990s? The ability to take data—to be able to understand it, to process it, to extract value from it, to visualize it, to communicate it—that’s going to be a hugely important skill in the next decades, not only at the professional level but even at the educational level for elementary school kids, for high school kids, for college kids. Because now we really do have essentially free and ubiquitous data. So the complimentary scarce factor is the ability to understand that data and extract value from it.
I think statisticians are part of it, but it’s just a part. You also want to be able to visualize the data, communicate the data, and utilize it effectively. But I do think those skills—of being able to access, understand, and communicate the insights you get from data analysis—are going to be extremely important. Managers need to be able to access and understand the data themselves.
You always have this problem of being surrounded by “yes men” and people who want to predigest everything for you. In the old organization, you had to have this whole army of people digesting information to be able to feed it to the decision maker at the top. But that’s not the way it works anymore: the information can be available across the ranks, to everyone in the organization. And what you need to ensure is that people have access to the data they need to make their day-to-day decisions. And this can be done much more easily than it could be done in the past. And it really empowers the knowledge workers to work more effectively.”
– Hal Varian, Google’s chief economist in The McKinsey Quarterly, January 2009
Cheers.
Misused statistics in the wild
Here is a blog post (from Andrew Gelman’s blog) about an article (also by Andrew Gelman) about the misuse of statistics, if you will, in the wild.
From his blog: “The article begins as follows:
In the past few years, Satoshi Kanazawa, a reader in management and research methodology at the London School of Economics, published a series of papers in the Journal of Theoretical Biology with titles such as “Big and Tall Parents Have More Sons” (2005), “Violent Men Have More Sons” (2006), “Engineers Have More Sons, Nurses Have More Daughters” (2005), and “Beautiful Parents Have More Daughters” (2007). More recently, he has publicized some of these claims in an article, “10 Politically Incorrect Truths About Human Nature,” for Psychology Today and in a book written with Alan S. Miller, Why Beautiful People Have More Daughters.
However, the statistical analysis underlying Kanazawa’s claims has been shown to have basic flaws, with some of his analyses making the error of controlling for an intermediate outcome in estimating a causal effect, and another analysis being subject to multiple-comparisons problems. These are technical errors (about which more later) that produce misleading results. In short, Kanazawa’s findings are not statistically significant, and the patterns he analyzed could well have occurred by chance. Had the lack of statistical significance been noticed in the review process, these articles would almost certainly not have been published in the journal. The fact of their appearance (and their prominence in the media and a popular book) leads to an interesting statistical question: How should we think about research findings that are intriguing but not statistically significant? . . .”
Note: Have you ever heard anyone say something like, “You know, ‘they’ say beautiful parents have more daughters”? Ever wondered who the “they” is that they were talking about? In this case, it’s Kanazawa. And he’s wrong. So, sometimes even “they” are wrong. Pretty scary, because most people trust the “they”.
Cheers.