

In case you missed them, here are some articles from March of particular interest to R users.

New features in the latest version of ggplot2 include choropleths, violin plots, and improved annotations.

As anyone who's ever played Civilization knows, the advent of sailboats capable of crossing the oceans leads to an explosion of exploration, commerce and social development. And with the visualization below, you can see that explosion in action:


The O'Reilly Radar blog has a lengthy and very interesting interview with the lead and deputy CIOs of the Consumer Financial Protection Bureau, the new US government agency devoted to consumer protections in the financial markets. In that interview, they talk about the many open-source tools used in the agency (and the parent Treasury Department): Linux, WordPress, Splunk, Django, Git and, yes, R for data analysis with Big Data.

Julia is a new open-source language for high-performance technical computing, created by Jeff Bezanson, Stefan Karpinski, Viral Shah and Alan Edelman and first announced in February. Their motivation for creating a new language was, they say, "greed":

Even if you don't speak a foreign language, I'm sure you can hear a French speaker and know, "that sounds like French". The same goes for many other languages. But have you ever wondered how English sounds to a non-English speaker? In 1972, Italian singer-songwriter Adriano Celentano released the song Prisencolinensinainciusol, composed entirely of nonsense words that sound uncannily like English would if you couldn't understand the words.

The competition amongst database vendors to create the fastest, most powerful "data layer" — the hardware and software to provide storage for Big Data with high-performance data processing — is clearly heating up. The Netezza appliance has been so successful that IBM has been racing to keep up with demand. SAP is also seeing success with its HANA in-memory database.

At noon GMT on April 28, data scientists around the world will compete in the first one-day International Data Science Hackathon, organized by Data Science London. Participants will receive a data set at the beginning of the event, and work in teams of 3-5 over the following 24 hours to create the best predictive model from the data.

Most of the time when we're programming in R, we don't think about how R gets from an object name (say, "stdev") to what it represents (a function to calculate standard deviation, perhaps). If you're writing functions, you probably know about R's lexical scoping.

Marketing is one of the pioneering domains when it comes to applications of predictive analytics to Big Data.