The militarization of local police departments here in the US has been much in the news lately, and in June the New York Times published an in-depth article on how materiel from wars has ended up in the hands of US counties.

by Joseph Rickert

One of the most difficult things about R, a problem that is particularly vexing to beginners, is finding things. This is an unintended consequence of R's spectacular, but mostly uncoordinated, organic growth. The R core team does a superb job of maintaining the stability and growth of the R language itself, but the innovation engine for new functionality is largely in the hands of the global R community.

In discussion with several data scientists, Will Stanton (a data scientist with Return Path) learned that a common concern is: what software should I be using? There are many options out there, but what is the best platform to be an effective "data hacker"?

by Matt Sundquist, Plotly Co-founder

It's delightfully smooth to publish R code, plots, and presentations to the web. For example:

You're probably familiar with the classic Travelling Salesman problem: given (say) 20 cities, what is the shortest route you can take that passes through all 20 cities and returns to the starting point? It's a difficult problem to solve, because a brute-force search has to try all possible routes to find the minimum, and there are a LOT of possibilities. For a 20-city tour there are more than 60 quadrillion distinct routes to try (19!/2, fixing the starting city and ignoring direction), and that's a fairly small problem!
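To make the "try all possible routes" idea concrete, here is a minimal sketch of a brute-force search. It is written in Python for illustration and is not taken from the original post; the function names `tour_length` and `brute_force_tsp` and the four example cities are assumptions made up for this example.

```python
from itertools import permutations
from math import dist


def tour_length(order, cities):
    """Length of the closed tour that visits `cities` in the given order."""
    # Pair each city with the next one, wrapping around to the start.
    legs = zip(order, order[1:] + order[:1])
    return sum(dist(cities[a], cities[b]) for a, b in legs)


def brute_force_tsp(cities):
    """Try every tour that starts at city 0; return (best_length, best_order)."""
    best_length, best_order = float("inf"), None
    # Fixing city 0 as the start avoids re-counting rotations of the same tour.
    for rest in permutations(range(1, len(cities))):
        order = (0,) + rest
        length = tour_length(order, cities)
        if length < best_length:
            best_length, best_order = length, order
    return best_length, best_order


# Hypothetical example data: the four corners of a unit square.
cities = [(0, 0), (1, 0), (1, 1), (0, 1)]
length, order = brute_force_tsp(cities)
print(length, order)
```

With n cities the loop examines (n - 1)! orderings, so this approach is only practical for a handful of cities. That rapid growth is exactly what makes the problem hard.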

Rrrr! It's International Talk Like a Pirate Day again, mateys, the day all landlubbers should talk in pirate lingo. (If you're unsure how, R can help.) It's also the day when you can pick up some great O'Reilly R books for half price.

A quick heads up that if you'd like to get a great introduction to doing data science with the R language, Joe Rickert will be giving a free webinar next Thursday, September 25: Data Science with R. Regular readers of the blog will be familiar with Joe's posts on this topic.

by Joseph Rickert

While preparing for the DataWeek R Bootcamp that I conducted this week I came across the following gem. This code, based directly on a Max Kuhn presentation of a couple years back, compares the efficacy of two machine learning models on a training data set.

I'm speaking at the DataWeek conference in San Francisco today. My talk follows Skylar Lyon from Accenture — I'm really looking forward to hearing how he uses Revolution R Enterprise with Teradata Database to run R in-database with 400 million rows of data. Update: Here are Skylar's slides.


The R Foundation for Statistical Computing, the Vienna-based non-profit organization that oversees the R Project, has just added several new "ordinary members".