
Blogs

In case you missed them, here are some articles from September of particular interest to R users.

Norm Matloff argues that T-tests shouldn't be part of the Statistics curriculum and questions the "star system" for p-values in R.

by Daniel Hanson, with contributions by Steve Su (author of the GLDEX package). Part 1 of a series.

As a computer scientist, RStudio's Joe Cheng has some great insights into the R language and how it compares with other programming languages. In the interview with DataScience.LA below, he notes that while R is often thought of as a domain-specific language (or DSL), the combination of a functional language with deferred evaluation of function arguments actually makes it a great general-purpose language for implementing a statistical DSL.

 

I've always loved David Fincher's films, but until I saw this video about his directing style, I never knew exactly why. His films are often about topics that don't immediately interest me (serial killers, fighting), but for some reason they're always compelling. This video explains how Fincher's direction makes them so:

by Andrie de Vries

One of the reasons that R is so popular is the CRAN archive of useful packages. However, with more than 5,900 packages on CRAN, many organisations need to maintain a private mirror of CRAN with only a subset of packages that are relevant to them.

by Joseph Rickert

Recently, I had the opportunity to present a webinar on R and Data Science. The challenge with attempting this sort of thing is to say something interesting that does justice to the subject while being suitable for an audience that may include both experienced R users and curious beginners. The approach I settled on had three parts. I decided to:

The New York Times published an article of interest to statisticians the other day: "The Odds, Continually Updated". Surprisingly for a general-audience newspaper, this article goes into the distinctions between Bayesian and frequentist statistics, and does so in a very approachable way. Here's an excerpt:

The following post by Norm Matloff originally appeared on his blog, Mad(Data)Scientist, on September 15th. We rarely republish posts that have appeared on other blogs; however, the questions that Norm raises about the teaching of statistics, and his assertion that "R's statistical procedures are centered far too much on significance testing", deserve a second look. Moreover, Norm's post elicited quite a few comments, many of which are at a high level of discourse.

Hadley Wickham's dplyr package is a great toolkit for getting data ready for analysis in R. If you haven't yet taken the plunge and started using dplyr, Kevin Markham has put together a great hands-on video tutorial for his Data School blog, which you can see below.

I think I may be one of the few kids who actually liked the ET: The Extra-Terrestrial game for the Atari 2600. Sure, it was frustrating, but so were most games of the era, and at least it wasn't a disappointing "recreation" of one of my arcade favourites.