

The Washington Post has an interactive graphic showing the rate at which the US presidential candidates Barack Obama and Mitt Romney have visited the various states for campaign rallies and fundraisers. Here's how it looks today:

As the clean-up continues on the eastern seaboard, I wanted to follow up on Monday's post on tracking Hurricane Sandy with Open Data with a couple of other R-based data applications spawned by the storm.

Tim Gasper (Product Manager at Big Data platform Infochimps) has an informative article at TechCrunch that provides an overview of five open-source technologies trending now for Big Data applications. They are:

Hurricane Sandy is shaping up to be a major, and very dangerous, meteorological event for the US East Coast. Naturally, everyone is looking for the latest information and forecasts. Fortunately, the wealth of public meteorological data available on the open web, combined with real-time on-the-ground updates via social media, means that an ecosystem of on-line apps is now available providing all the up-to-date information you need.

It's almost All Hallows' Eve, and it's a tradition in many places here in the States to decorate houses with "spooky" decorations. Some homeowners take the process to extremes, to both the delight and chagrin of neighbours, such as with this light show in Leesburg, Virginia set to the Korean pop hit Gangnam Style:


As promised, the source distribution for R 2.15.2 is now available for download from the master CRAN repository.

At the Strata conference in New York today, Steve Yun (Principal Predictive Modeler at Allstate's Research and Planning Center) described the various ways he tackled the problem of fitting a generalized linear model to 150M records of insurance data. He evaluated several approaches:

The O'Reilly Strata conferences are always great fun to attend, and this latest installment in New York City is no exception. This one is super-busy though; the conference has been sold out for weeks -- and not just marketing-sold-out, it's fire-department-sold-out. It's non-stop conversations and presentations, and it's tough to move through the hallways in between.

Nonetheless, I thought I'd pause for a couple of minutes and share some of the highlights for me so far.

On Thursday next week (November 1), I'll be giving a new webinar on the topic of Big Data, Data Science and R. Titled "The Rise of Data Science in the Age of Big Data Analytics: Why Data Distillation and Machine Learning Aren’t Enough", this is a provocative look at why data scientists cannot be replaced by technology, and why R is the ideal environment for building data science applications. Here's the abstract:

There are new local R user groups in eight (!) countries to announce this month: