There are only a few days left to enter the Civic Data Challenge: entries are due before midnight EST on July 29 to qualify for the $100,000 in prizes. The competition, open to US residents, challenges participants to create applications and visualizations from civic health data.

The RHadoop project continues the Big Data integration of R and Hadoop, with a new update to its rmr package. Version 1.3 of rmr improves the performance of map-reduce jobs for Hadoop written in R.

The video below isn't just a tribute to the beauty of San Francisco coupled with a pretty decent indie-rock song. It also has some incredible photographic work:


The June 2012 issue of the R Journal, the peer-reviewed, open-access journal about R packages and applications of R, is now available. This issue includes articles about:

Growing up in Australia, for me a carbonated drink like Pepsi or Fanta or lemonade was always just a "soft drink". (Also, 'lemonade' in Australia was something different to 'lemonade' in the US; it's something close to 7-Up.) So when I moved to Seattle, it was surprising to me that all such things were called "pop". And then I travelled across the US, and realised it was also "soda" (which, to an Australian, is exclusively club soda), and even sometimes "coke". Not capital-C Coke, but "coke", meaning any generic soft drink. It's all very confusing.

In most data science applications, preparing the data is at least half the job. Finding where the data lives, figuring out how to access it, finding the right records, filtering, cleaning and transforming the data ... all of this has to be done before the statistical analysis can even begin.

John Myles White, self-described "statistics hacker" and co-author of "Machine Learning for Hackers", was interviewed recently by The Setup. In the interview, he describes some of his go-to R packages for data science:

Linear Programming is a mathematical technique used to find the values of some variables (within the bounds of some defined constraints) that maximize a quantity. For example, consider this problem from the FishyOperations blog:
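The FishyOperations problem itself isn't reproduced here, so as a rough illustration of the definition, here is a minimal Python sketch of a toy two-variable problem. The constraints, the objective and the function names are all made up for illustration. It also assumes the feasible region is bounded and non-empty, so that the maximum sits at a corner where two constraint lines cross; a real solver would not rely on brute-force corner checking like this.

```python
from itertools import combinations

# Hypothetical toy problem (all numbers are assumptions for illustration).
# Each constraint is (a, b, c), meaning a*x + b*y <= c.
CONSTRAINTS = [
    (1, 1, 4),    # x + y  <= 4
    (1, 3, 9),    # x + 3y <= 9
    (1, 0, 3),    # x      <= 3
    (-1, 0, 0),   # x >= 0
    (0, -1, 0),   # y >= 0
]
OBJECTIVE = (3, 2)  # maximize 3*x + 2*y


def feasible(x, y, eps=1e-9):
    """Return True if (x, y) satisfies every constraint, within a small tolerance."""
    return all(a * x + b * y <= c + eps for a, b, c in CONSTRAINTS)


def solve_lp():
    """Check every crossing point of two constraint lines and keep the best feasible one."""
    best_point, best_value = None, float("-inf")
    for (a1, b1, c1), (a2, b2, c2) in combinations(CONSTRAINTS, 2):
        det = a1 * b2 - a2 * b1
        if det == 0:
            continue  # parallel lines never cross
        # Cramer's rule for the intersection of the two boundary lines
        x = (c1 * b2 - c2 * b1) / det
        y = (a1 * c2 - a2 * c1) / det
        if feasible(x, y):
            value = OBJECTIVE[0] * x + OBJECTIVE[1] * y
            if value > best_value:
                best_point, best_value = (x, y), value
    return best_point, best_value


if __name__ == "__main__":
    point, value = solve_lp()
    print(point, value)
```

For these made-up numbers the best corner is x = 3, y = 1, where the objective equals 11.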

In 2004, NASA sent two rovers to Mars. Each rover was scheduled for a three-month mission to explore the surface, after safely bouncing onto the surface of Mars in a cushion of airbag-like balloons. In a marvel of engineering and dedication, Spirit lasted six years, and Opportunity is still advancing our scientific knowledge about Mars to this day.

At a talk I saw at the useR!2012 conference last month, Googler Karl Millar estimated that there are at least 200 active R users at Google, plus another 300+ occasional users participating in Google's internal R support list. But what are all these Google employees doing with R? A post from the Google Research team published on Google+ yesterday sheds some light: