Sean Taylor, a PhD candidate in Information Systems at NYU’s Stern School of Business, describes the "Statistics Software Signal" and his observation that some software packages are correlated with bad science.

Ringing in the New Year, Peter Dalgaard announced yesterday on behalf of the entire R Core Team that the R language will graduate to Version 3 around April 1. This is only the third time that R has incremented its primary version number. Version 1.0.0 (released on February 29, 2000) was the first version deemed stable for production use. R moved to version 2.0.0 on October 4, 2004 once some major language features (the S4 object system) and platforms (MacOS) were established.

Today's the Christmas holiday in the US and many other places around the world. Wherever you may be, have a happy and safe holiday season. Best wishes go to our customers, partners and the entire R community from the team at Revolution Analytics. 

Some of the Revolution Analytics team (and significant others) photographed at our holiday party at the Chihuly Garden and Glass Museum in Seattle.

The is.R blog has been on a roll in December with their Advent CalendaR feature: daily tips about R to unwrap each day leading up to Christmas. If you haven't been following it, start with today's post and scroll down. Sadly there isn't a tag to collect all these great posts together, but here are a few highlights:

I love optical illusions (like this and this and these), not just because they're fun, but also because they give us insights into how the brain processes sensory information.

The latest issue of the bi-annual, peer-reviewed journal about R, the R Journal, is now available for download.

Following on from Coursera's popular course introducing the R language, a new course on data analysis with R starts on January 22. The simply titled Data Analysis course will provide practical instruction on how to plan, carry out, and communicate analyses of real data sets with R.

Twitter is emerging as an important medium for determining influence in many fields. For example, social ranking sites like Klout and Traackr include Twitter as a heavily weighted component of their ranking algorithms. Twitter isn't representative of the members of any field, but in areas where the members primarily engage online, it can be a useful proxy.

The R language provides many features for selecting data from data frames: the "[" operator, logical functions, and utility functions like "subset". But if you know SQL (the query language ubiquitous in database systems), none of this is necessary. With the sqldf package, you can just pretend that your data frame is a database table, and use SQL directly.
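
As a rough sketch of the difference, the snippet below selects the same rows three ways: with "[", with subset, and with sqldf. The `sales` data frame and its `region` and `units` columns are made up for illustration, and the code assumes the sqldf package is already installed.

```r
# Assumes sqldf is installed, e.g. via install.packages("sqldf")
library(sqldf)

# Hypothetical data frame with made-up values, for illustration only
sales <- data.frame(
  region = c("East", "West", "East", "North"),
  units  = c(10, 25, 40, 5),
  stringsAsFactors = FALSE
)

# Base R: logical indexing with the "[" operator
bracket_result <- sales[sales$units > 8 & sales$region == "East", ]

# Base R: the subset() utility function
subset_result <- subset(sales, units > 8 & region == "East")

# sqldf: the data frame's name is used as the table name in the query
sql_result <- sqldf(
  "SELECT region, units
     FROM sales
    WHERE units > 8 AND region = 'East'
    ORDER BY units"
)

print(sql_result)
```

All three approaches should return the same two "East" rows. The sqldf version adds an explicit ORDER BY because SQL does not otherwise promise a row order.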