KDnuggets recently posted its annual poll on data mining software, and the R language retains its #1 ranking as the most commonly-used software for data mining:

Revolution Analytics is hosting several live and online courses over the next couple of months that will be of interest to R users looking to hone their skills:

If you haven't yet taken the plunge into making R graphics with Hadley Wickham's ggplot2 package, his "ggplot2 basics" slides (from the recent Introduction to Data Visualization and Analysis course at JSM) are a good place to start.

The Environmental Performance Index (EPI) ranks countries on performance indicators for environmental public health and ecosystem vitality. Yale University hosts the EPI website, which was used to present the 2012 EPI Rankings to world leaders at the 2012 World Economic Forum at Davos.

R user Markus Gesmann used the gold-winning times from the Olympic Men's 100m sprint since 1900 as the basis of the following prediction for the London Games:

Movie trailers today are a strange beast. They seem to be designed to optimize just one thing: getting bums on seats for the opening day of the film. You'd think the smarter long-term strategy would be to set appropriate expectations about the film, and thereby enhance the enjoyment of those who do go, create positive word-of-mouth and generate future patrons. But sadly, that's not the usual case. The result: mystifying, spoiler-laden movie trailers that routinely portray a film that's very different to what you eventually see on-screen.

Revolution Analytics is proud to once again be a gold sponsor and Wi-Fi sponsor of the JSM 2012 conference in San Diego, the largest gathering of statisticians, biostatisticians, analysts, data miners and data scientists in the world. The conference begins on Sunday, and you'll find the Revolution Analytics team in the exhibit hall.

R has been available as a 64-bit application since its earliest days. But the internal representation of R's fundamental data type, the vector, has long been subject to a 32-bit limitation: the maximum number of elements is capped at 2^31 - 1 (just over 2.1 billion). Now, at 8 bytes per element that's 16GB of data, so that wasn't a limitation until machines with massive amounts of RAM came along. And even then, compound objects like data frames and lists can contain multiple vectors (and so exceed the 16GB limit), so not many people noticed the issue.
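
As a rough check on that arithmetic, here is a minimal sketch in Python (not R; the constant and function names are purely illustrative). It assumes the cap is the largest signed 32-bit integer, 2^31 - 1, and that each element is an 8-byte double, ignoring any per-object overhead.

```python
# Assumption: the old cap on vector length is the largest signed 32-bit integer.
MAX_ELEMENTS = 2**31 - 1
BYTES_PER_DOUBLE = 8  # assumption: each element is an 8-byte double


def vector_bytes(n_elements, bytes_per_element=BYTES_PER_DOUBLE):
    """Raw data size in bytes of a vector, ignoring any header overhead."""
    return n_elements * bytes_per_element


max_bytes = vector_bytes(MAX_ELEMENTS)
max_gib = max_bytes / 2**30  # convert bytes to binary gigabytes

print(MAX_ELEMENTS)       # 2147483647
print(round(max_gib, 2))  # 16.0
```

A compound object holding several such vectors, say a data frame with three full-length numeric columns, would be three times that size, which is why the per-vector cap did not stop larger objects from existing.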

The R language gets a brief mention in an article in yesterday's New York Times on automated bond trading:

The traders here are mostly educated in math or physics, often outside the United States, and their desks are piled high with textbooks like the “R Graphs Cookbook,” for working with obscure computer programming languages.

The most recent edition of the Revolution Newsletter is out. The news section is below, and you can read the full July edition (with highlights from this blog and community events) online.