Why use R?

There's lots of software available for data analysis today: spreadsheets like Excel; batch-oriented, procedure-based systems like SAS; point-and-click GUI-based systems like SPSS; data mining systems; and so on. What makes R different?

R is free. Because R is an open-source project, you can use it free of charge: no worries about subscription fees, license managers, or user limits. But just as importantly, R is open: you can inspect the code and tinker with it as much as you like (provided you respect the terms of the GNU General Public License version 2 under which it is distributed). Thousands of experts around the world have done just that, and their contributions benefit the millions of people who use R today.

R is a language. In R, you do data analysis by writing functions and scripts, not by pointing and clicking. That may sound daunting, but it's an easy language to learn, and a very natural and expressive one for data analysis. And once you learn the language, there are many benefits. As an interactive language (as opposed to a data-in, data-out black-box procedure), R promotes experimentation and exploration, which improves data analysis and often leads to discoveries that wouldn't be made otherwise. A script documents all your work, from data access to reporting, and can be re-run at any time. (This makes it much easier to update results when the data change.) Scripts also make it easy to automate a sequence of tasks that can be integrated into other processes. Many R users who have used other software report that they can do their data analyses in a fraction of the time.
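To make the idea of a script concrete, here is a minimal sketch of one. It assumes nothing beyond base R and uses the built-in mtcars data set as a stand-in for your own data; the variable names are purely illustrative.

```r
# Illustrative analysis script. mtcars ships with base R and stands in
# here for whatever data you would normally read from a file or database.
cars <- mtcars

# Explore interactively: a quick numeric summary of fuel economy.
print(summary(cars$mpg))

# Aggregate: mean miles per gallon for each cylinder count.
mpg_by_cyl <- aggregate(mpg ~ cyl, data = cars, FUN = mean)

# Report: print the small result table.
print(mpg_by_cyl)
```

Because every step lives in the script, re-running it after the data change reproduces the whole analysis, from access to report.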

Graphics and data visualization. One of the design principles of R was that visualization of data through charts and graphs is an essential part of the data analysis process. As a result, it has excellent tools for creating graphics, from staples like bar charts and scatterplots to multi-panel Lattice charts to brand new graphics of your own devising. R's graphical system is heavily influenced by thought leaders in data visualization like Bill Cleveland and Edward Tufte, and as a result graphics based on R appear regularly in venues like the New York Times, the Economist, and the FlowingData blog.
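As a small illustration, the sketch below draws a scatterplot with base R graphics and overlays a lowess smoother, a technique introduced by Cleveland. The built-in mtcars data set and the axis labels are chosen only for the example.

```r
# Scatterplot of fuel economy against weight, using the built-in mtcars data.
plot(mtcars$wt, mtcars$mpg,
     xlab = "Weight (1000 lbs)",
     ylab = "Miles per gallon",
     main = "Fuel economy vs. weight")

# Overlay a lowess smooth to show the overall trend.
smooth <- lowess(mtcars$wt, mtcars$mpg)
lines(smooth, col = "blue", lwd = 2)
```

Run interactively, this opens a plot window; run as a script, R writes the chart to a file instead.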

A flexible statistical analysis toolkit. All of the standard data analysis tools are built right into the R language: from accessing data in various formats, to data manipulation (transforms, merges, aggregations, etc.), to traditional and modern statistical models (regression, ANOVA, GLM, tree models, etc.). All are included in an object-oriented framework that makes it easy to programmatically extract and combine just the information you need from the results, rather than having to cut-and-paste from a static report.
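Here is a minimal sketch of what extracting information from results can look like. It fits a linear regression to the built-in mtcars data (an arbitrary choice for illustration) and pulls individual numbers out of the fitted model object rather than reading them off a printed report.

```r
# Fit a linear regression: fuel economy as a function of weight and horsepower.
fit <- lm(mpg ~ wt + hp, data = mtcars)

# The fitted model is an object, so its pieces can be extracted directly.
coefs <- coef(fit)                         # named vector of coefficients
r2 <- summary(fit)$r.squared               # R-squared as a plain number
coef_table <- summary(fit)$coefficients    # matrix of estimates and tests
p_wt <- coef_table["wt", "Pr(>|t|)"]       # p-value for the weight term

# Combine just the pieces you need into your own small summary.
result <- c(slope_wt = unname(coefs["wt"]), r_squared = r2, p_value_wt = p_wt)
print(result)
```

The same pattern applies to other model types: the results are objects you can query, so they can feed directly into later steps of a script.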

Access to powerful, cutting-edge analytics. Leading academics and researchers from around the world use R to develop the latest methods in statistics, machine learning, and predictive modeling. There are expansive, cutting-edge extensions to R in finance, genomics, and dozens of other fields. To date, more than 2000 packages extending the R language in every domain are available for free download, with more added every day.

A robust, vibrant community. With thousands of contributors and more than two million users around the world, if you've got a question about R, chances are someone's answered it (or can). There's a wealth of community resources for R available on the Web, for help in just about every domain.

Unlimited possibilities. With R, you're not restricted to choosing from a pre-defined set of routines. You can use code contributed by others in the open-source community, or extend R with your own functions. And R is excellent for "mash-ups" with other applications: combine R with a MySQL database, an Apache web server, and the Google Maps API and you've got yourself a real-time GIS analysis toolkit. That's just one big idea -- what's yours?
