Part III: What I wish I had known when I started a graduate program in statistics

Originally posted on StatsbyLopez:

(Note: This is the third in a series about graduate life in statistics, co-written by Mike and Greg. For links to all articles in the series, click here).

1. You’re on your own

Sure, you are going to take classes that are taught by professors, but you are the one responsible for learning the material. If you have a great professor, that’s wonderful, as it will probably be a lot easier to grasp the material and to do well on exams. If the professor is terrible, however, you still need to learn the material. And in college, you could learn that material, take a C, forget it and never think about that stuff again. In grad school, however, you are still responsible for that material, and in many cases it’s going to show up on your qualifying exams and/or general exam. (Shhhh: once you…


Part II: Thriving in a graduate program in statistics

Originally posted on StatsbyLopez:

(Note: This is the second in a series about graduate life in statistics. For links to all articles in the series, click here).

Here are the best pieces of advice that I can give someone currently involved in a biostatistics or statistics graduate program.

1- Know your interests, and exploit them

Did you sit through a martingale theory lecture and talk excitedly to the professor afterwards? Or, instead, did you start to think that you would have been better off as an actuary?

These types of gut feelings are useful when it comes to one of the most stressful periods of a graduate student’s career – picking a research topic. My best advice is to (i) find the type of methods papers in statistics journals that you actually enjoy reading, (ii) find a faculty member capable of leading a thesis or dissertation in this…


Part I: Deciding on a graduate program in statistics

Originally posted on StatsbyLopez:

(Note: This is the first in a three-part series about graduate life in statistics. For links to all articles in the series, click here).

Here are some key points to consider when choosing graduate programs in statistics and biostatistics.

1. What’s the difference between biostatistics & statistics?

When I first applied for master’s programs in statistics, I had little to no idea what biostatistics was. To the untrained eye – in my case, a liberal arts undergraduate student – the subject biostatistics gave off a connotation aligned with phyla and petri dishes, things I had been hoping to avoid since roughly 10th grade.

Ironically, however, biostatistics is not the intersection of statistics and biology; instead, biostatistics is mostly just statistics applied to fields within or related to public health. In four years of a biostatistics program, for example, I didn’t take a single biology course…


My experience with grad school in statistics

In honor of @StatsByLopez’s upcoming series “So you want a graduate degree in statistics”, which I will be contributing to, I’ve written a brief history of my graduate school experience.

When I was almost finished with my undergraduate degree at WPI, I got to do a senior project about ranking athletes and sports teams.  It was my first exposure to logistic regression (I had very little understanding of what was actually going on).  But I absolutely fell in love with the project.  Which led me to fall in love with statistics.

I was on pace to finish my bachelor’s degree in 3 years (not a big deal), but I wanted to do something to postpone the real world for at least another year (as any rational 20-year-old would do). So I applied to graduate school in applied statistics at WPI. I was equal parts really interested in statistics, really interested in not getting a job yet, and really unprepared for graduate school.

I struggled through 2 years of mathematical statistics, Bayesian analysis, linear regression, etc. And I graduated with a less than impressive GPA, but the important part is that I graduated. Towards the end of my time at WPI, I remember having a conversation with my advisor in which I told him that I wanted to go on and do a Ph.D. I assume he thought I was nuts because I hadn’t exactly dominated my way through the program. But he never told me I shouldn’t go. He did tell me that I didn’t need a Ph.D. to work in industry. (Which is solid advice.)

I moved on and worked for 2 years in a direct marketing department of a major catalog company building predictive models. When I first started working there I was really excited to build these predictive models. I thought it was so cool (and I still think it’s cool) that you can take data from the past to help you better predict the future. So I asked where the data was. My boss told me it was here. And there. And over here. And also over there, but you had to modify that before you used it. And a lot of it was missing. I thought to myself “Where is the rectangular file with no missing data? I want to build models.” Ahh, young Gregory, you were so cute. I spent much of my time cleaning and organizing the data, and relatively little actually building the models. But you absolutely need to understand the modeling pieces to do the cleaning and organizing well. Otherwise you don’t really know or understand what data you (might) need.

After about a year I had had enough and wanted to go back to school for a Ph.D. in statistics.  I wanted to teach statistics and have more control over the type of work that I was doing.  I applied to several programs and told myself that I wasn’t going to go unless I got funding.  I got into 2 schools right away, but neither was willing to commit to funding.  I was pretty disappointed.  But at the last moment UConn came through with full funding for me.  I was in.  Go Huskies?

So after two years in the “real world” I went back to the cocoon of academia. I also went back to being broke. Not college broke. But like regular adult broke. (I probably took a 50% pay cut going back to grad school.)

I was 25 when I returned to grad school. And let me tell you, 25 is a lot different than 21. For instance, I never skipped a class in grad school at UConn to go to a fraternity event. School is a totally different experience after you’ve worked a full-time, 40-hour-a-week job. You should treat grad school like this (except it’s probably 60 hours a week). After 3 semesters, I passed my qualifying exam, and I finished all of my exams and classes in 3 years. In total, UConn took four years to finish, since I was doing research from day 1. (Expect more like 5 or 6 years, or even 7 or 8, if you come in without a master’s degree.) I graduated and did a post-doc at UMass in genetics, and just recently hit the academic job market lottery and landed a position at Loyola University Chicago. I can’t wait to start.

While this post has been mostly a bio of my experience, my piece on StatsByLopez will contain more of my thoughts on what grad school was like for me and what advice I would give to someone else in grad school for statistics.


Great Spam Comment

My newest favorite spam comment:

But this other man, Jesus Christ, brought forgiveness to
a lot of through God’s bountiful gift. ‘ Whirlpool:
Poseidon summons a whirlpool at his ground target location that cripples targets, preventing movement abilities, and pulls
targets toward the center dealing magical damage every.
The contestant must choose the word that matches the definition.


Permutation tests in R

Originally posted on statMethods blog:

Permutation tests (also called randomization or re-randomization tests) have been around for a long time, but it took the advent of high-speed computers to make them practically available. They can be particularly useful when your data are sampled from unknown distributions, when sample sizes are small, or when outliers are present.
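To make the idea concrete, here is a minimal sketch of a two-sample permutation test for a difference in means. It is written in plain Python rather than with either R package discussed in this post, so it shows only the general technique, not how coin or lmPerm work internally. The function name `permutation_test`, its parameters, and the two small data sets are all made up for illustration.

```python
import random
from statistics import mean


def permutation_test(x, y, n_perm=5000, seed=0):
    """Two-sided Monte Carlo permutation test for a difference in group means.

    Hypothetical helper for illustration; not part of any package.
    """
    rng = random.Random(seed)
    observed = mean(x) - mean(y)
    pooled = list(x) + list(y)
    n_x = len(x)

    extreme = 0
    for _ in range(n_perm):
        # Under the null hypothesis the group labels are exchangeable,
        # so shuffle the pooled data and re-split it into two groups.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_x]) - mean(pooled[n_x:])
        if abs(diff) >= abs(observed):
            extreme += 1

    # Add-one correction: count the observed labelling as one permutation,
    # so the estimated p-value can never be exactly zero.
    p_value = (extreme + 1) / (n_perm + 1)
    return observed, p_value


# Made-up example data (not from the post).
treated = [12.1, 14.3, 13.8, 15.2, 14.9, 13.5]
control = [10.2, 11.1, 9.8, 10.9, 11.5, 10.4]
obs, p = permutation_test(treated, control)
print(f"observed difference = {obs:.3f}, permutation p-value = {p:.4f}")
```

Because the reference distribution is built by reshuffling the observed data, no distributional form has to be assumed, which is why the approach suits small samples and data with outliers.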

R has two powerful packages for permutation tests – the coin package and the lmPerm package. In this post, we will take a look at the latter.

The lmPerm package provides permutation tests for linear models and is particularly easy to implement. You can use it for all manner of ANOVA/ANCOVA designs, as well as simple, polynomial, and multiple regression. Simply use lmp() and aovp() where you would have used lm() and aov().


Consider the following analysis of covariance scenario. Seventy-five pregnant mice are divided into four groups and each group receives a different drug dosage (0…
