Stats in the Wild

NCAA Basketball Rankings – 2/26/2013

Updated 2-26-2013 at 12:19am

Resume ranks the teams based on what they have actually accomplished this season. Predictor ranks teams based on how well they are expected to perform in the future. Seed gives the projected tournament seed for a team.

Teams	W	L	Conf	Resume	Predictor	Seed
indiana	24	3	big-ten	1	1	1
michigan	23	4	big-ten	2	5	1
duke	24	3	acc	3	6	1
gonzaga	27	2	wcc	4	4	1
arizona	23	4	pac-12	5	10	2
florida	22	4	sec	6	2	2
kansas	23	4	big-12	7	11	2
louisville	22	5	big-east	8	3	2
miami fl	22	4	acc	9	20	3
kansas state	22	5	big-12	10	41	3
georgetown	21	4	big-east	11	28	3
michigan state	22	6	big-ten	12	21	3
pittsburgh	21	7	big-east	13	8	4
syracuse	22	5	big-east	14	7	4
ohio state	20	7	big-ten	15	12	4
wisconsin	19	8	big-ten	16	9	4
new mexico	23	4	mwc	17	46	5
cincinnati	19	9	big-east	18	14	5
marquette	19	7	big-east	19	43	5
san diego state	20	7	mwc	20	36	5
oklahoma state	20	6	big-12	21	19	6
oklahoma	18	8	big-12	22	49	6
memphis	24	3	cusa	23	32	6
butler	22	6	atlantic-10	24	45	6
saint louis	21	5	atlantic-10	25	35	7

Full Rankings

Posted in Uncategorized

Feb 18

Posted by statsinthewild

Normal Deviate

STATISTICS DECLARES WAR ON MACHINE LEARNING!

Well I hope the dramatic title caught your attention. Now I can get to the real topic of the post, which is: finite sample bounds versus asymptotic approximations.

In my last post I discussed Normal limiting approximations. One commenter, Csaba Szepesvari, wrote the following interesting comment:

What still surprises me about statistics or the way statisticians do their business is the following: The Berry-Esseen theorem says that a confidence interval chosen based on the CLT is possibly shorter by a good amount of $c/\sqrt{n}$. Despite this, statisticians keep telling me that they prefer their “shorter” CLT-based confidence intervals to ones derived by using finite-sample tail inequalities that we, “machine learning people prefer” (lies vs. honesty?). I could never understood the logic behind this reasoning and I am wondering if I am missing something. One possible answer is that the Berry-Esseen result could be…

View original post 556 more words
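
To make the comparison in that comment concrete, here is a minimal sketch (not taken from the original post) of the two interval half-widths for a sample mean. It assumes observations bounded in [0, 1], a known standard deviation `sigma` for the CLT interval, and Hoeffding's inequality as the finite-sample bound; the function names are illustrative only.

```python
import math
from statistics import NormalDist


def clt_half_width(sigma: float, n: int, alpha: float = 0.05) -> float:
    """Normal-approximation half-width: z_{1 - alpha/2} * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z * sigma / math.sqrt(n)


def hoeffding_half_width(n: int, alpha: float = 0.05) -> float:
    """Hoeffding half-width for observations in [0, 1]: sqrt(log(2/alpha) / (2n))."""
    return math.sqrt(math.log(2 / alpha) / (2 * n))


for n in (25, 100, 400, 1600):
    # sigma = 0.5 is the largest standard deviation possible on [0, 1],
    # so this is the case most favorable to the Hoeffding interval.
    clt = clt_half_width(0.5, n)
    hoeff = hoeffding_half_width(n)
    print(n, round(clt, 4), round(hoeff, 4), round(hoeff / clt, 3))
```

Both half-widths shrink at the rate $1/\sqrt{n}$, so under these assumptions their ratio does not depend on $n$; the finite-sample interval is simply wider by a constant factor, and wider still when `sigma` is below 0.5.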

March Madness Preview

Feb 17

Posted by statsinthewild

Using data from the 2012-2013 NCAA basketball season, I’ve ranked all of the Division I teams in two ways. First, I have built a retrospective model for the season, ranking all of the teams based on what they have actually accomplished so far this season. These rankings weight individual games by margin of victory and weight strength of schedule a little more heavily than most models. I used these rankings to create a tournament bracket by taking the highest rated team from each conference plus the next 37 highest rated teams as at-large bids. Once this bracket was created, I used my prospective rankings to predict the games. The results are here.
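
The field-selection rule described above can be sketched roughly as follows. This is only an illustration, not the code behind the bracket: the `Team` type and `select_field` function are hypothetical names, and the toy input reuses a few scores from the 2/9/2013 rankings table purely as example ratings.

```python
from typing import NamedTuple


class Team(NamedTuple):
    name: str
    conf: str
    rating: float  # retrospective rating; higher is better


def select_field(teams: list[Team], n_at_large: int = 37) -> list[Team]:
    """Top-rated team per conference, plus the next n_at_large teams overall."""
    ranked = sorted(teams, key=lambda t: t.rating, reverse=True)

    # Walking down the ranking, the first team seen from a conference
    # is that conference's highest rated team (its automatic bid).
    auto_bids: dict[str, Team] = {}
    for team in ranked:
        auto_bids.setdefault(team.conf, team)

    auto = set(auto_bids.values())
    at_large = [t for t in ranked if t not in auto][:n_at_large]

    # Return the whole field in rating order.
    return sorted(list(auto) + at_large, key=lambda t: t.rating, reverse=True)


# Toy input: six teams and their scores from the 2/9/2013 table.
teams = [
    Team("MICHIGAN", "big10", 86.86),
    Team("MIAMI-FLORIDA", "acc", 86.35),
    Team("INDIANA", "big10", 86.12),
    Team("DUKE", "acc", 85.96),
    Team("FLORIDA", "sec", 84.93),
    Team("GONZAGA", "wcc", 81.96),
]

for team in select_field(teams, n_at_large=1):
    print(team.name, team.conf, team.rating)
```

With one at-large bid in this toy example, Indiana gets in behind the four conference leaders and Duke is left out.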

I’m sure everyone out there will let me know what I got wrong.

Cheers.

NCAA Basketball Rankings – 2/9/2013

Feb 9

Posted by statsinthewild

Updated 2-9-2013 at 12:07am

Rank	Team	Conf	Record	Score
1	MICHIGAN	big10	21-2	86.86
2	MIAMI-FLORIDA	acc	18-3	86.35
3	INDIANA	big10	20-3	86.12
4	DUKE	acc	20-2	85.96
5	FLORIDA	sec	18-3	84.93
6	ARIZONA	pac10	20-2	84.4
7	KANSAS	big12	19-3	84.16
8	LOUISVILLE	bigeast	19-4	84.13
9	SYRACUSE	bigeast	19-3	83.24
10	PITTSBURGH	bigeast	19-5	82.62
11	GONZAGA	wcc	22-2	81.96
12	MINNESOTA	big10	17-6	81.7
13	MICHIGAN STATE	big10	19-4	81.69
14	GEORGETOWN	bigeast	16-4	80.56
15	OHIO STATE	big10	17-5	80.19
16	CINCINNATI	bigeast	18-5	80.15
17	NEW MEXICO	mountwest	20-3	79.54
18	MARQUETTE	bigeast	16-5	79.35
19	CREIGHTON	mvc	20-4	78.92
20	COLORADO STATE	mountwest	19-4	78.92
21	NOTRE DAME	bigeast	18-5	78.56
22	OKLAHOMA STATE	big12	16-5	77.95
23	UCLA	pac10	17-6	77.83
24	NC STATE	acc	16-7	77.43
25	WISCONSIN	big10	16-7	77.15

Full Rankings

NCAA Basketball

Feb 3

Posted by statsinthewild

Index	Home	Away	HomePred	AwayPred	Home Win
1	army	lehigh	68	79	0.05
2	connecticut	south florida	63	60	0.65
3	georgia tech	virginia	52	58	0.16
4	illinois	wisconsin	59	66	0.15
5	louisville	marquette	72	64	0.89
6	manhattan	saint peters	62	57	0.77
7	marist	rider	67	71	0.27
8	mcneese state	northwestern state	70	74	0.29
9	minnesota	iowa	76	69	0.83
10	stanford	oregon state	75	70	0.76
11	villanova	providence	70	70	0.51

NCAA Basketball Rankings – 2/3/2013

Feb 3

Posted by statsinthewild

Updated 2/3/2013 at 12:34pm

Indiana reclaims the top ranking after defeating Michigan last night, while previous number 2 Kansas falls three spots to number 5 after a loss to Oklahoma State.

Oregon, Wichita State, and Colorado State all fell out of the top 25. Oregon and Wichita State are both on two-game losing streaks.

Oklahoma State jumps into the top 25 after beating Kansas (at Kansas!). UNLV and New Mexico also move into the top 25.

Rank	Team	Conf	Record	Score
1	INDIANA	big10	20-2	86.7
2	MICHIGAN	big10	20-2	86.49
3	FLORIDA	sec	18-2	86.05
4	MIAMI-FLORIDA	acc	17-3	85.61
5	KANSAS	big12	19-2	85.27
6	DUKE	acc	19-2	85.22
7	ARIZONA	pac10	19-2	84.22
8	LOUISVILLE	bigeast	17-4	82.27
9	PITTSBURGH	bigeast	18-5	82.25
10	SYRACUSE	bigeast	18-3	82.23
11	MINNESOTA	big10	16-5	81.94
12	CINCINNATI	bigeast	18-4	81.08
13	OHIO STATE	big10	17-4	80.93
14	CREIGHTON	mvc	20-3	80.73
15	MICHIGAN STATE	big10	18-4	80.65
16	GEORGETOWN	bigeast	16-4	80.61
17	GONZAGA	wcc	21-2	80.43
18	MARQUETTE	bigeast	15-4	79.79
19	NOTRE DAME	bigeast	18-4	79.75
20	COLORADO STATE	mountwest	18-4	78.82
21	NEW MEXICO	mountwest	19-3	78.54
22	NC STATE	acc	16-6	78.19
23	OKLAHOMA STATE	big12	15-5	77.55
24	UCLA	pac10	16-6	77.4
25	UNLV	mountwest	17-5	76.94

Full Rankings

What time does the Superbowl start? (and predictions)

Feb 2

Posted by statsinthewild

6:30.

I’ve previously released my Super Bowl pick here, but I’ve also decided to release my forecast for the box score of the game and some visualizations of the distributions of team scoring, totals, and margin of victory as a preview for what I’m going to try to do in the 2013 season.

So, here is my predicted box score of the game:

Team	Score	First Downs	Rushing Yards	Passing Yards	Total Yards	Turnovers
49ers	23.3	19.5	149.2	201.8	351.0	1.48
Ravens	20.2	18.4	108.1	223.3	331.4	1.59

Some selected probabilities:

Team	Win	Cover (4.5)	Cover (3.5)	Win 10 or more	Overtime	Over/Under (47)
49ers	63.2%	43.5%	48.3%	24.5%	4.9%	O 29.5%
Ravens	36.8%	56.5%	51.7%	8.5%	4.9%	U 66.2%
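
A few further probabilities are implied by the table above and can be backed out with simple subtraction. The sketch below assumes the spreads are quoted with the 49ers as the favorite (so "cover" means winning by more than the spread), that margins and totals are whole numbers, and that the probability not assigned to the over or the under is a push at exactly 47; the variable names are illustrative.

```python
# 49ers probabilities from the table above, in percent.
win = 63.2
cover_4_5 = 43.5  # win by 5 or more
cover_3_5 = 48.3  # win by 4 or more
over_47 = 29.5
under_47 = 66.2

# Win, but by 3 or fewer (fails to cover 3.5).
win_by_1_to_3 = round(win - cover_3_5, 1)

# Covers 3.5 but not 4.5, i.e. wins by exactly 4.
win_by_exactly_4 = round(cover_3_5 - cover_4_5, 1)

# Probability left over between over and under: total of exactly 47.
push_on_47 = round(100 - over_47 - under_47, 1)

print(win_by_1_to_3, win_by_exactly_4, push_on_47)
```

Under those assumptions, the 49ers win by one to three points about 14.9% of the time, win by exactly four about 4.8% of the time, and the total lands on exactly 47 about 4.3% of the time.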

Cheers.

Feb 1

Posted by statsinthewild

I didn’t know this either!

Rmazing

I have been working with R for some time now, but once in a while, basic functions catch my eye that I was not aware of…
For some project I wanted to transform a correlation matrix into a covariance matrix. Now, since cor2cov does not exist, I thought about “reversing” the cov2cor function (stats:::cov2cor).
Inside the code of this function, a specific line jumped into my retina:

What’s this [ ]?

Well, it stands for every element $E_{ij}$ of matrix $E$. Consider this:

> mat
     [,1] [,2] [,3] [,4] [,5]
[1,]   NA   NA   NA   NA   NA
[2,]   NA   NA   NA   NA   NA
[3,]   NA   NA   NA   NA   NA
[4,]   NA   NA   NA   NA   NA
[5,]   NA   NA   NA   NA   NA

With the empty bracket, we can now substitute ALL values by a new value:

> mat [,1] [,2] [,3] [,4] [,5] [1,] 1 1 1 1…

View original post 55 more words

Feb 1

Posted by statsinthewild

Hilary: the most poisoned baby name in US history

Hilary Parker

I’ve always had a special fondness for my name, which — according to Ryan Gosling in “Lars and the Real Girl” — is a scientific fact for most people (Ryan Gosling constitutes scientific proof in my book). Plus, the root word for Hilary is the Latin word “hilarius” meaning cheerful and merry, which is the same root word for “hilarious” and “exhilarating.” It’s a great name.

Several years ago I came across this blog post, which provides a cursory analysis for why “Hillary” is the most poisoned name of all time. The author is careful not to comment on the details of why “Hillary” may have been poisoned right around 1992, but I’ll go ahead and make the bold causal conclusion that it’s because that was the year that Bill Clinton was elected, and thus the year Hillary Clinton entered the public sphere and was generally reviled for not wanting to…

View original post 1,430 more words

The Academic Journal System

Jan 30

Posted by statsinthewild

So, as I slowly make my way through the academic world, I’m learning about the whole journal process and the business of journals.

This is how it appears to me at this point in my career:

Academics submit articles that are peer-reviewed by other academics who are experts in the field. When a paper is accepted, the author of the article gives away their copyright to the publisher. The publisher then bundles these articles and sells them back to the academic institution where many of the authors of these papers work. So, universities are paying to have academics write these articles and then paying some outside party to have access to these same articles. And if you don’t pay a ton of money to these publishers, you have no access to these articles without stealing them. This is insane.

So, it seems, some people also thought that the public not having access to these articles, some of which are funded with public money, was insane. So open access journals were created. Once an article is published in an open access journal, anyone in the world can view it. Of course, it costs the author up to several thousand dollars to publish in many of these open access journals. That is also insane. It seems to me that the most important part of this whole process is the peer-review process. In fact, it’s really, in my mind, the only essential part of this process.

So, here is what I am proposing and someone please tell me why this wouldn’t work:

A totally free, totally open access journal (do any of these exist?). Authors would write a manuscript, the manuscript would get sent out to reviewers, and the peer-review process would take place. Once an article was accepted, it would be published on, for instance, a WordPress blog, which will host everything for free (or, if you needed more space, you could purchase it very cheaply). Then the whole world could read all of this brilliant scientific work for free. Universities would save money because they wouldn’t have to pay for access to journal articles, and grant money could be spent on useful things for advancing science rather than going to the fees for open access journals. Why is this not a better system than what currently exists? Everything would stay the same; we’d just stop the publishers from skimming millions (billions?) of dollars out of the system. Isn’t the concept of a publisher antiquated at this point anyway? I mean, take what I’ve just written, for instance. No publisher necessary.

So, someone please tell me why this wouldn’t work. Maybe I am totally missing something important.

Cheers.
