by Andrie de Vries

A few weeks ago I wrote about the Jupyter notebooks project and the R kernel. In the comments, I was asked how to resize the plots in a Jupyter notebook.

by Bob Horton
Microsoft Senior Data Scientist

Learning curves are an elaboration of the idea of validating a model on a test set, and have been widely popularized by Andrew Ng’s Machine Learning course on Coursera. Here I present a simple simulation that illustrates this idea.
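As a rough illustration of the idea (not the simulation from the post itself), the sketch below fits a linear model on progressively larger slices of a training set and records the error on both the training slice and a fixed test set. The simulated data, the `rmse` helper, and the chosen sizes are all assumptions made up for this example.

```r
# Minimal learning-curve sketch on simulated data (all values are illustrative).
set.seed(1)
n <- 1000
x <- runif(n)
y <- 2 * x + rnorm(n, sd = 0.5)
dat <- data.frame(x = x, y = y)

# Fixed split: first 700 rows for training, the rest for testing.
train <- dat[1:700, ]
test  <- dat[701:1000, ]

# Hypothetical helper: root mean squared error.
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

# Fit on the first m training rows and score on that slice and on the test set.
sizes <- seq(10, 700, by = 30)
learning_curve <- t(sapply(sizes, function(m) {
  slice <- train[1:m, ]
  fit <- lm(y ~ x, data = slice)
  c(size  = m,
    train = rmse(slice$y, predict(fit, slice)),
    test  = rmse(test$y, predict(fit, test)))
}))

head(learning_curve)
```

Plotting the `train` and `test` columns against `size` gives the learning curve; with a well-specified model the two errors would typically converge as the training set grows.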

If you've developed a useful function in R (say, a function to make a forecast or prediction from a statistical model), you may want to call that function from an application other than R. For example, you might want to display the forecast (calculated in R) as part of a desktop, web-based or mobile application. One solution is to install R alongside the application and call it directly, but that can be difficult — or impossible, in the case of mobile apps. (You also need to be careful to comply with R's open-source GPL2 license.)

Sure, the Solar System is big, but it's probably a lot bigger than you think, thanks to textbook representations that squeeze all the planets and their orbits into one page. Even at the speed of light, it takes more than 40 minutes to get from the Sun to Jupiter (a journey you can experience in real time here).

The RHadoop packages make it easy to connect R to Hadoop data (rhdfs), and write map-reduce operations in the R language (rmr2) to process that data using the power of the nodes in a Hadoop cluster. But getting the Hadoop cluster configured, with R and all the necessary packages installed on each node, hasn't always been so easy.

by Joseph Rickert

This week, the Infrastructure Steering Committee (ISC) of the R Consortium unanimously elected Hadley Wickham as its chair, thereby also giving Hadley a seat on the R Consortium board of directors. Congratulations, Hadley!

by Andrie de Vries

Every once in a while I try to remember how to do interpolation using R. This is not something I do frequently in my workflow, so I do the usual sequence of finding the appropriate help page:


Help pages:

          stats::approx Interpolation Functions
   stats::NLSstClosestX Inverse Interpolation
          stats::spline Interpolating Splines
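As a minimal sketch of how two of the functions listed above can be called, the example below interpolates at a single point with `stats::approx` (linear) and `stats::spline` (cubic spline). The `x` and `y` values are invented for illustration and are not taken from the post.

```r
# Known points (hypothetical data for illustration).
x <- c(1, 2, 3, 4)
y <- c(10, 20, 15, 30)

# Linear interpolation at x = 2.5; approx() returns a list with $x and $y.
lin <- approx(x, y, xout = 2.5)
lin$y  # 17.5, halfway between y = 20 and y = 15

# Cubic spline interpolation at the same point; same return structure.
spl <- spline(x, y, xout = 2.5)
spl$y

# approxfun() returns a reusable interpolating function instead of values.
f <- approxfun(x, y)
f(c(1.5, 3.5))
```

`approx` and `spline` return values at the requested points, whereas `approxfun` (and its counterpart `splinefun`) may be more convenient when the same interpolant is evaluated repeatedly.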

The Effective Applications of R (EARL) Conference (held last week in London) is well-named. At the event I saw many examples of R being used to solve real-world industry problems with advanced statistics and data visualization. Here are just a few examples:

A couple of years ago I suggested a way of thinking about how the Discrete Fourier Transform works, based on Stuart Riffle's elegant colour-coding of the equation: