
Sorry about the blog being inaccessible for a few days. Our hosting provider TypePad was the victim of a denial of service attack. (The Revolution Analytics main website and systems weren't affected.) But everything seems to be back to normal now, so it should be business as usual from here on in.

by Thomas Dinsmore

Regular readers of this blog may be familiar with our ongoing effort to benchmark Revolution R Enterprise (RRE) across a range of use cases and on different platforms. We take these benchmarks seriously at Revolution Analytics, and constantly seek to improve the performance of our software.

Norm Matloff points us to a pithy example that sums up Simpson's Paradox perfectly, captured in the title of a medical paper: "Good for Women, Good for Men, Bad for People". He explains how Simpson's Paradox isn't a paradox at all, but just the consequence of including a minor variable in a model ahead of a more significant variable, and illustrates this with an R analysis of the UC Berkeley (UCB) admissions data.
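To make the effect concrete, here is a minimal sketch in Python rather than R, so it is not Matloff's analysis. It assumes the data in question are the counts from R's built-in `UCBAdmissions` dataset, transcribed by hand below; the helper names (`rate`, `pooled_rate`, `departments_favouring_women`) are hypothetical and chosen only for illustration.

```python
# Counts of (admitted, rejected) applicants by department and sex,
# transcribed from R's built-in UCBAdmissions dataset (an assumption:
# the post may slice the data differently).
ucb = {
    "A": {"Male": (512, 313), "Female": (89, 19)},
    "B": {"Male": (353, 207), "Female": (17, 8)},
    "C": {"Male": (120, 205), "Female": (202, 391)},
    "D": {"Male": (138, 279), "Female": (131, 244)},
    "E": {"Male": (53, 138), "Female": (94, 299)},
    "F": {"Male": (22, 351), "Female": (24, 317)},
}


def rate(admitted, rejected):
    """Admission rate for one group."""
    return admitted / (admitted + rejected)


def pooled_rate(sex):
    """Admission rate for one sex, ignoring department entirely."""
    admitted = sum(ucb[dept][sex][0] for dept in ucb)
    rejected = sum(ucb[dept][sex][1] for dept in ucb)
    return rate(admitted, rejected)


def departments_favouring_women():
    """Departments where the female admission rate exceeds the male rate."""
    return [
        dept
        for dept in ucb
        if rate(*ucb[dept]["Female"]) > rate(*ucb[dept]["Male"])
    ]


if __name__ == "__main__":
    # Pooled over departments, men appear to be admitted far more often.
    print(f"Male:   {pooled_rate('Male'):.3f}")    # 0.445
    print(f"Female: {pooled_rate('Female'):.3f}")  # 0.304
    # Within departments, women have the higher rate in four of the six.
    print(departments_favouring_women())           # ['A', 'B', 'D', 'F']
```

The pooled comparison leaves out department, which is the stronger predictor of admission here: women applied disproportionately to the departments with low admission rates for everyone, so the aggregate gap largely disappears once department is taken into account.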

If you missed last week's webinar presented by Revolution Analytics' US Chief Scientist Mario Inchiosa, "Decision Trees built in Hadoop plus more Big Data Analytics with Revolution R Enterprise", the slides and webinar replay are now available for download.

What happens when you offer a dog a treat, but then make it vanish via sleight of hand? This:

[Embedded video]

Like Sullivan, I'm surprised these dogs are fooled at all, and can't tell where the treat is by scent.

That's all for this week. See you on Monday!

The Mountain View Voice is a weekly newspaper serving the Silicon Valley area, and is a familiar sight to anyone wandering the streets of Palo Alto or Menlo Park. Angela Hey writes for "Hey Tech!", a blog of the Voice, and has just published a feature on R and the local Bay Area User Group (BARUG).

A couple of weeks ago, I participated in a panel discussion for DM Radio: "Still Sexy? How's that Data Scientist Gig Working Out?". The title was provocative, but the discussion mostly revolved around the rise of data science and how advanced analytics (often implemented with R) is changing the way many companies do business today.

As a language for statistical computing, R has always had a bias towards linear algebra, and is optimized for operations on whole vectors and matrices. This can be surprising to programmers coming to R from lower-level languages, where iterative programming (looping over the elements of a vector or matrix) is more natural and often more efficient. That's not the case with R, though: Noam Ross explains why vectorized programming in R is a good idea: