Blogs

It's almost All Hallows' Eve, and it's tradition in many places here in the States to decorate houses with "spooky" decorations. Some homeowners take the process to extremes (to both the delight and chagrin of neighbours), such as with this light show in Leesburg, Virginia, set to the Korean pop hit Gangnam Style:

 

As promised, the source distribution for R 2.15.2 is now available for download from the master CRAN repository.

At the Strata conference in New York today, Steve Yun (Principal Predictive Modeler at Allstate's Research and Planning Center) described the various ways he tackled the problem of fitting a generalized linear model to 150M records of insurance data. He evaluated several approaches:

The O'Reilly Strata conferences are always great fun to attend, and this latest installment in New York City is no exception. This one is super-busy though; the conference has been sold out for weeks -- and not just marketing-sold-out, it's fire-department-sold-out. It's non-stop conversations and presentations, and it's tough to move through the hallways in between.

Nonetheless, I thought I'd pause for a couple of minutes and share some of the highlights for me so far.

On Thursday next week (November 1), I'll be giving a new webinar on the topic of Big Data, Data Science and R. Titled "The Rise of Data Science in the Age of Big Data Analytics: Why Data Distillation and Machine Learning Aren’t Enough", this is a provocative look at why data scientists cannot be replaced by technology, and why R is the ideal environment for building data science applications. Here's the abstract:

There are new local R user groups in eight (!) countries to announce this month:

The population of the world has been over 7 billion for about a year now. But those seven billion aren't distributed equally around the globe. About 1.2 billion people live in India alone (despite it having just 2% of the world's land area). At the other end of the spectrum, the entire continent of Australia houses about 0.3% of the world's population.

A new report from analyst firm Gartner forecasts that IT organizations will spend $232 billion (US) on hardware, software and services related to Big Data through 2016. Some key findings from the report:

The chart below comes by way of the is.R blog and shows the average ideology of the members of the United States House of Representatives within the Republican (red) and Democratic (blue) parties. (Other parties are shown in green.) The chart is shown as a time series, from the first US congress in 1789, to the most recent full congress (the 111th, from 2010). The 80th congress first met in 1947.

In a webinar today previewing Spotfire 5 (scheduled for release this November), TIBCO announced that it will include TERR: The TIBCO Enterprise Runtime for R. TERR is a closed-source reimplementation of the R language engine, and is not based on the GPL-licensed R project from the R Foundation. Here's the relevant slide from the webinar: