

In discussions with several data scientists, Will Stanton (a data scientist with Return Path) learned that a common concern is: what software should I be using? There are many options out there, but which platform is best for being an effective "data hacker"?

by Matt Sundquist, Plotly Co-founder

It's delightfully smooth to publish R code, plots, and presentations to the web. For example:

You're probably familiar with the classic Travelling Salesman problem: given (say) 20 cities, what is the shortest route you can take that passes through all 20 cities and returns to the starting point? It's a difficult problem to solve, because a brute-force search has to try every possible route to find the minimum, and there are a LOT of possibilities. For a 20-city tour there are roughly 60 quadrillion distinct routes to try (19!/2, about 6 × 10^16), and that's a fairly small problem!
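To get a feel for how quickly the number of routes grows, here is a minimal Python sketch. It assumes the distance between two cities is the same in both directions, so a tour and its reverse count as one route, and it fixes the starting city. The function name `count_tours` is made up for this illustration.

```python
from math import factorial


def count_tours(n_cities: int) -> int:
    """Count distinct round trips through n_cities.

    Assumes symmetric distances: the starting city is fixed and a tour
    and its reverse are treated as the same route, giving (n - 1)! / 2.
    """
    if n_cities < 1:
        return 0
    if n_cities < 3:
        # With one or two cities there is only a single possible round trip.
        return 1
    return factorial(n_cities - 1) // 2


# Show how the brute-force search space grows with the number of cities.
for n in (5, 10, 20):
    print(f"{n} cities: {count_tours(n):,} distinct round trips")
```

Even at one million routes checked per second, going through every 20-city tour under these assumptions would take on the order of two thousand years, which is why exhaustive search is not practical here.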

Rrrr! It's International Talk Like a Pirate Day again, mateys, the day when all landlubbers should talk in pirate lingo. (If you're unsure how, R can help.) It's also the day when you can pick up some great O'Reilly R books for half price.

A quick heads up that if you'd like to get a great introduction to doing data science with the R language, Joe Rickert will be giving a free webinar next Thursday, September 25: Data Science with R. Regular readers of the blog will be familiar with Joe's posts on this topic.

by Joseph Rickert

While preparing for the DataWeek R Bootcamp that I conducted this week, I came across the following gem. This code, based directly on a Max Kuhn presentation from a couple of years back, compares the efficacy of two machine learning models on a training data set.

I'm speaking at the DataWeek conference in San Francisco today. My talk follows Skylar Lyon from Accenture — I'm really looking forward to hearing how he uses Revolution R Enterprise with Teradata Database to run R in-database with 400 million rows of data. Update: Here are Skylar's slides.

The R Foundation for Statistical Computing, the Vienna-based non-profit organization that oversees the R Project, has just added several new "ordinary members".

Graduate student Clay McLeod decided to find out what makes a post on the social-sharing site Reddit popular. These are the questions he seeks to answer:

There's a new online lifestyle magazine for data scientists with a machine-learning bent: ML Daily. (Thanks to reader SG for the tip.)

Check it out for lots of useful articles, including: