
Paying NCAA players

What if I set up an organization that collected money for college basketball players and then paid it out to them when they left the NCAA?  It would work like this:

  • An account is set up for each player in the NCAA.
  • Fans can donate to a player or an entire team.
  • Once a player leaves the NCAA, I write them a check for the full balance of their account.  They can do whatever they want with the money.

Not that I would, but how could the NCAA stop me from doing this?

Cheers.

Hadoop

From this article:

However, Cukierski thinks that the use of big data has become too trendy. “The whole big data idea is really within a big hype cycle,” mainly driven by a particular software framework for dealing with information, called Hadoop. “It’s not that Hadoop isn’t useful,” Cukierski says, but when companies look to it to solve small problems, “the people who are data scientists and actually statistically literate are kind of laughing, because you don’t need Hadoop to do most problems.”

 

Cheers.

Kaggle Update

 

Well, we’ve narrowed the gap to about 0.003, but I still don’t think we can possibly catch Grimp with one game left to go.  Our best bet for the money at this point is a disqualification for some reason.

Screen Shot 2014-04-06 at 11.54.00 AM

Cheers.

 

Cinderella Plots 2014

I introduced the Cinderella plot a few years ago, but I haven’t updated it for this year’s tournament until now.  (I’ve been busy with the job search, which was successful(!); more to come on that.)  So without further delay, I present to you the Cinderella plot for all of the NCAA tournaments from 2002 through 2014:

CinderellaPlot2001-2014

 

As you can see, two 5 seeds (Indiana and Butler) and even an 8 seed (Butler, again) have made it to the final game in the past 13 years.  But they all played a team seeded 3 or better.  Never in the last 13 years, or ever for that matter, have the seeds for the teams in the finals been this high.  We’ve got a 7 seed versus an 8 seed, though I’d argue that both of these teams are better than their seed.  That’s not to say that they didn’t get the seed that they DESERVED, which is based on what has actually happened in the season.  But that’s a little bit different than projecting what they are capable of in the future.  Based on that, it’s not really THAT surprising that either of these teams is in the final game.  In fact, I had both of them ranked in the top 25 at the end of the season: Kentucky at 13 and UConn at 21.  (Even though I thought Kentucky should have been a 7 seed and UConn an 8.)

Cheers.

Kaggle Exercise

 

 

With eleven games to play in the NCAA basketball tournament, we (me and @statsbylopez) find our team in second place in the March Madness machine learning contest, a mere 0.00365 behind the leader, Grimp Whelkin.  So, we’re 0.00365 away from $15K.

 

Screen Shot 2014-03-28 at 11.37.28 AM

 

What’s really impressive, though, is that both of our entries are doing so well.  Our best entry is currently at 0.47589 and good for second place, but our other entry is at 0.48081, which would STILL BE GOOD FOR SECOND PLACE.  (Maybe we’re on to something here?)  This is good news, as we have TWO realistic ways to win this $15K.  One of our models is big on Virginia and the other is big on Michigan State.  It’s possible that our second submission actually moves ahead of our best one and becomes our scoring submission.

Screen Shot 2014-03-28 at 11.35.18 AM

I’m not sure I can handle finishing second for $15K.  I’d rather have finished 100th just so I don’t have to worry so much about this.  (That’s not true, of course.  I’m gonna brag about this forever.)

 

I do hope Kaggle will release winning scenarios for the top 10 prior to the Final Four so that we can at least use that info to hedge our bets.  Though I’d guess they aren’t going to do this.

Cheers.

 

 

Why do I keep writing about Field Goals?

Twitterer @brentonk alerted me to the following Grantland article written by the only man in the world to block me on Twitter, Bill Barnwell.  The following excerpt is from the article (emphasis added):

It would also have some interesting effects on kicker value. In a way, it might seem like it should make kickers more valuable by virtue of giving them more opportunities to make meaningful kicks. The average team attempts about 40 extra points each year, so the difference between a kicker who hits 90 percent of his extra points from the 25-yard line and a kicker who hits 70 percent on the same attempts would be eight extra points per season. On the other hand, we also know there’s no year-to-year consistency for a kicker’s field goal percentage, and that’s likely to be the case for these 40 additional extra-point attempts each year, too. So while teams might pay more for the security of a reliable kicker, they’ll still be just as unlikely to end up with one.

I’ve written about this before: we shouldn’t expect a kicker’s field goal percentage to be consistent from year to year, because field goals are taken from different distances.  You can see just how much a place kicker’s attempt distances vary from year to year with this sweet shiny app that I made.

Finally, as I have written before, even if you do control for distance, I find no evidence that there is any significant variability within kickers between years.
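To make the distance point concrete, here is a minimal sketch.  Everything in it is made up for illustration: the logistic make curve, its coefficients, and the two lists of attempt distances are hypothetical, not fitted to any real kicker.  The same kicker, with the same underlying ability, posts very different percentages depending only on where his attempts come from.

```python
import math

def make_prob(distance_yards: float) -> float:
    """Hypothetical make curve: logistic in distance, coefficients invented."""
    return 1.0 / (1.0 + math.exp(-(6.0 - 0.1 * distance_yards)))

def expected_pct(distances) -> float:
    """Expected field goal percentage for a given set of attempt distances."""
    return sum(make_prob(d) for d in distances) / len(distances)

# Two hypothetical seasons for the SAME kicker (same make curve),
# differing only in the distances he was asked to kick from.
year_short = [22, 25, 28, 30, 33, 35, 38, 40]
year_long = [35, 40, 44, 47, 50, 52, 54, 56]

pct_short = expected_pct(year_short)
pct_long = expected_pct(year_long)

print(f"Mostly short attempts: {pct_short:.1%}")
print(f"Mostly long attempts:  {pct_long:.1%}")
```

The gap between the two printed percentages comes entirely from the mix of distances, which is why raw field goal percentage bounces around from year to year even when the kicker himself hasn’t changed.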

Cheers.

 

 

The media hates cats and statistical rigor

Here is a paragraph from a recent article in the New York Times entitled “The Evil of the Outdoor Cat” (emphasis added):

And wildlife in this country must share this land with a growing population of about 84 million owned cats, and anywhere from 30 to 80 million feral or stray cats. When all of them do “what’s natural” in a fragmented natural world, it adds up. Using deliberately conservative assumptions, federal researchers recently estimated that free-ranging cats killed about 2.4 billion birds annually in the Lower 48 states, a substantial bite out of the total bird population. Outdoor cats also kill about 12.3 billion small mammals a year — not just the proverbial rats and mice but also chipmunks, rabbits and squirrels — and about 650 million reptiles and amphibians. In some cases, they are pushing endangered species toward extinction.

Those numbers are huge and look very familiar.  In fact, I believe they are from the 2013 article in Nature Communications by Dr. Scott Loss called “The impact of free-ranging domestic cats on wildlife of the United States”, which I reviewed and basically concluded was statistically meaningless.  I don’t know why the media insists on printing and reprinting these absolutely meaningless numbers.  Is there a hidden bird agenda in the mainstream media?

Cheers.

 

Kaggle Round of 32

 

Last night I posted our picks within the distribution of all picks in the Kaggle competition.  I’ve now updated that for the third (née second) round.

If you’re rooting for us, this chart will help you decide who to root for.  If you’re rooting against us, this chart will also help you decide who to root for.  Either way, it’s useful.

kaggle_round_of_32_red

Cheers.

 

Back from Vegas / Kaggle

I got back from Vegas a few hours ago.  If you’ve never been to Vegas during the NCAA tournament, you’re going to want to do that.  There is nothing in the world like a room full of people watching a 15 point game with 90 seconds left screaming at the giant screen for the team that is winning to heave up some threes.  Absolutely incredible.  And watching an upset is even better.  The first two games I saw were Dayton and Harvard winning.  People were going crazy.  I won a few bucks, but nothing incredible.  I did cash twice in two tournaments, hit 8 out of 9 (for $0) on a parlay card, and watched my wife hit 200-1 on Sigma Derby.

But the really exciting news from the last few days is that my Kaggle team (me and @statsbylopez) is in the top 10 out of 254 teams.  The complete standings are here.

Yesterday, William Cukierski posted the distributions of the predictions for the first-round games.  I’ve highlighted in red where our first-round predictions fall.  The most important ones so far are the Duke game, which we were very confident in (.93) and lost, which is a big penalty, and the Dayton game, where we had Dayton at .66 to win their game over Ohio State, which is notably different from what many of the other teams predicted.  Hopefully the Duke game doesn’t hurt us too much, and we can stay in the top ten.

kaggle_round_of_64_COL
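To put a number on that Duke penalty, here is a minimal sketch.  It assumes the contest is scored by average binary log loss; that is my assumption for this example, and the function name is just something I made up.  Under that metric, each game costs the negative log of the probability you gave to what actually happened.

```python
import math

def log_loss_term(p_win: float, won: bool) -> float:
    """Penalty for one game: -ln(probability assigned to the actual outcome).

    Assumes binary log loss scoring (an assumption, not stated in the post).
    """
    p_outcome = p_win if won else 1.0 - p_win
    return -math.log(p_outcome)

duke = log_loss_term(0.93, won=False)      # we had Duke at .93 and Duke lost
dayton = log_loss_term(0.66, won=True)     # we had Dayton at .66 and Dayton won
coin_flip = log_loss_term(0.50, won=True)  # what a 50/50 pick costs either way

print(f"Duke miss:  {duke:.3f}")
print(f"Dayton hit: {dayton:.3f}")
print(f"Coin flip:  {coin_flip:.3f}")
```

The final score would be the average of these terms over all games played, so one confident miss costs several times what a coin-flip pick does, while the Dayton pick comes in cheaper than a coin flip.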

Cheers.

NCAA Predictions for day 1

  • Ohio State over Dayton by 6
  • Wisconsin over American by 15
  • Pittsburgh over Colorado by 6
  • Cincinnati over Harvard by 1
  • Syracuse over W Michigan by 15
  • Oregon over BYU by 3
  • Michigan State over Delaware by 13
  • UConn over St. Joseph’s by 4
  • Michigan over Wofford by 16
  • Oklahoma over N. Dakota State by 1
  • San Diego State over New Mexico State by 7
  • Duke over Mercer by 16
  • Louisville over Manhattan by 13
  • Texas over Arizona State by 1
  • Florida over Albany by 20
  • St. Louis over NC State by 5

Cheers.