To take a spreadsheet beyond what it's designed for — data presentation, summarization and simple calculations — into the world of complex data analysis can be an alluring prospect.

The .Rprofile file is a great way to customize your R session every time you start it up. You can use it to change R's defaults, define handy command-line functions, automatically load your favourite packages — anything you like! The Getting Genetics Blog has a nice example .Rprofile file to give you some inspiration on what to do. One popular setting is options(stringsAsFactors=FALSE), which prevents R from converting character data into factor objects when you import data frames.
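As a rough illustration of the ideas above, here is a minimal sketch of what a personal .Rprofile might contain. It is not the example from the Getting Genetics Blog: the helper function `hh`, the choice of CRAN mirror, and the use of MASS as the "favourite package" are all assumptions made for illustration. Note also that from R 4.0.0 onward, stringsAsFactors = FALSE is already the default, so that line matters mainly on older versions.

```r
# ~/.Rprofile: sourced at the start of each R session.
# A sketch only; every setting below is an illustrative choice.

# Keep character columns as character when creating or importing data frames.
# (Already the default in R >= 4.0.0.)
options(stringsAsFactors = FALSE)

# Set a default CRAN mirror so install.packages() does not prompt for one.
options(repos = c(CRAN = "https://cloud.r-project.org"))

# Hypothetical command-line helper: show the top-left corner of a data frame.
# Uses only base functions, since other packages may not be attached yet
# when .Rprofile is sourced.
hh <- function(d, n = 5) {
  d[seq_len(min(n, nrow(d))), seq_len(min(n, ncol(d))), drop = FALSE]
}

# Load a favourite package, but only in interactive sessions so that
# scripts run with Rscript are not affected.
# MASS is a stand-in here and is assumed to be installed.
if (interactive()) {
  suppressMessages(library(MASS))
}

# .First() is called by R once startup files have been processed.
.First <- function() cat("R session started:", date(), "\n")
```

Restricting package loading to interactive sessions is a deliberate choice in this sketch: scripts that silently depend on packages loaded by .Rprofile tend to break on other people's machines.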

If your econometrics is a bit rusty and you're also looking to learn the R language, you can kill two birds with one stone with Introductory Econometrics using Quandl and R. The first three parts of this seven-part tutorial introduce the basics of regression analysis, while the remaining sections provide R code you can try yourself to reproduce econometric analyses using data provided by the

If you're trying to predict when an event will occur (for example, a consumer buying a product) or trying to infer why events occur (what were the factors that led to a component failing?), time-to-event models are a useful framework. These models are closely related to survival analysis in life sciences, except that the outcome of interest isn't "time to death" but time to some other event (e.g. in marketing, "time to purchase").

As longtime readers of this blog will know, I love optical illusions, and the checkerboard shadow illusion is one of my all-time favourites.

Forbes has published an article today on the integration between Alteryx and Revolution R Enterprise, which gives business analysts the ability to drag and drop to connect data sources to R-based models, such as this one for Market Basket analysis:

Boris Chen, a data scientist for the New York Times, has been running a weekly blog since August with statistical analysis of NFL players, as fodder for Fantasy Football players around the country. Here's how he describes what he does:

A quick heads-up that I'll be participating in a webinar and panel discussion on the "small data" side of data science, and what Big Data practitioners can learn from statistical reasoning and expertise. Gregory Piatetsky (KDNuggets editor) will join me in the discussion. It starts at 8 AM Pacific Time, and the bulk of the time will be devoted to your questions, so it should be a fun interactive session.

To register for this online event hosted by Kalido, follow the link below.