by Joseph Rickert

We all "know" that correlation does not imply causation, and that unmeasured and unknown factors can confound a seemingly obvious inference. But who has not been tempted by the seductive quality of strong correlations?

by Hong Ooi
Sr. Data Scientist, Microsoft

by Andrie de Vries

Recently we had a question on the public mailing list for Revolution R Open (RRO), on the topic of "MKL multithreaded library and mclapply do not play well together".

If you're not familiar with these topics, here is a quick primer:

Imagine taking one person's body and face, and then being able to make them speak using someone else's voice -- and their mouth movements and facial expressions, too. That's now possible to do in real time, thanks to a research team at Stanford, using only a consumer-grade PC and what appears to be a Kinect depth-sensing camera:


Thanks to Dr George Esaw, I recently learned that Ross Ihaka, co-creator of R, was featured in a full-page advertisement placed in The Economist by the University of Auckland back in April:

Image credit: The Economist / University of Auckland (via George Esaw)

Click on the image for a larger version, and if you can't read the text I've reproduced it here:

by Andrie de Vries

Note by the editor after publication:

In the original post we neglected to give a shout out to Steve Weston, who continues to be the prime driver of new functionality for foreach, iterators and their backends.

The new progress bar functionality as described in this post is all the work of Steve Weston (StackOverflow profile).

by Hong Ooi
Sr. Data Scientist, Microsoft

The dplyr package is a popular toolkit for data transformation and manipulation. Over the last year and a half, dplyr has become a hot topic in the R community, for the way in which it streamlines and simplifies many common data manipulation tasks.

I'm honoured to be giving the opening keynote at the Effective Applications of the R Language (EARL) Conference in Boston on November 2. My presentation will be on the business economics and opportunity of open source data science, with a focus on applications that are now possible given the convergence of big data platforms, cloud technology, and data science software (especially R), charged by the contributions of the open source community.