knitr: Elegant, flexible and fast dynamic report generation with R

yihui's picture

The knitr package is an alternative to Sweave with a more flexible design and many more features. CRAN page: http://cran.r-project.org/package=knitr; development repository: https://github.com/yihui/knitr; website (documentation and demos): http://yihui.name/knitr/; manual: https://github.com/downloads/yihui/knitr/knitr-manual.pdf (open it in Adobe Reader to see something surprising there).

For those who are not familiar with dynamic report generation or literate programming with R, the idea is that R code can be mixed into a document, and when the document is compiled, the code is evaluated and its results (numeric output, graphics, and so on) are written into the output document (a minimal introduction). Sweave implemented the basic idea, but there is still a lot of room for improvement for production use. For example, it is unrealistic to restrict each code chunk to one plot, and knitr handles this more naturally. An example of this appears in the knitr manual.

Elegance

The elegance of knitr comes from several aspects: code reformatting (with the formatR package, to make R code better formatted), syntax highlighting (with the highlight package, to make R code more readable), support for tikz graphics (with the tikzDevice package, to produce high-quality R graphics; see the graphics manual for examples), and careful attention to detail. For example, by default there are no prompt characters such as > and + in the R code output, so it is easy for the reader to copy and run the code. The results returned by R are commented out with ##, so the reader can see them without the R source code being mangled (i.e. the output is still valid R code). The default number of digits is also set to be small, so results are not printed with too many digits.
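The point about results being hidden behind ## can be sketched in a few lines of base R. This is only an illustration of the idea, not how knitr itself is implemented; the helper name mask_output is hypothetical.

```r
# Hypothetical helper (not part of knitr): evaluate one expression and
# lay out the listing as described above, with no prompt characters and
# the printed result placed behind a "##" comment marker.
mask_output <- function(code, comment = "##") {
  value <- eval(parse(text = code))
  printed <- capture.output(print(value))
  c(code, paste(comment, printed))
}

shown <- mask_output("1 + 1")
cat(shown, sep = "\n")

# Because the result lines are comments, the whole listing still parses
# as R code and can be pasted straight into a session.
reparsed <- parse(text = shown)
```

In knitr itself this behaviour is controlled by chunk options; prompt and comment are the relevant ones, and their defaults already give the layout shown here.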

In a word, knitr was designed with one belief in mind: beauty should come by default.

Flexibility

Unlike Sweave, which was mainly targeted at LaTeX, knitr was designed without any restrictions on the input or the output format. It can be quickly adapted to HTML or other types of output, since the core components (code extraction and evaluation) are not hard-coded. This is a simple example showing how knitr works with the markdown format:

https://github.com/yihui/knitr/blob/master/inst/examples/knitr-minimal.md
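As a rough sketch of the same idea without opening the linked file, a small Markdown document can be compiled from an R session. The chunk delimiters below follow the current R Markdown convention, which is an assumption and may differ from the syntax used in the linked example; the document content is made up for illustration.

```r
# Build a tiny Markdown document containing one R chunk.
fence <- strrep("`", 3)  # three backticks, built here to avoid nested fences
doc <- c(
  "# A minimal example",
  "",
  paste0(fence, "{r}"),
  "x <- 1 + 1",
  "x",
  fence
)

if (requireNamespace("knitr", quietly = TRUE)) {
  # With text = ..., knit() returns the compiled document as a string
  # instead of writing a file.
  out <- knitr::knit(text = doc, quiet = TRUE)
  cat(out)
}
```

The compiled output is still Markdown, with the chunk replaced by the source code and its commented results.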

There is a whole set of hooks which can be used to customize the output (http://yihui.name/knitr/hooks). For example, you can easily use the listings package to decorate your R code if you do not like the default style.
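A minimal sketch of a custom output hook, assuming knitr is installed: the source hook receives the chunk's code and its options and returns the text to write into the output. The HTML wrapper and the class name r-source are made up for illustration.

```r
if (requireNamespace("knitr", quietly = TRUE)) {
  # Replace the hook that formats source code. x is the code as a
  # character vector; options is the list of chunk options.
  knitr::knit_hooks$set(source = function(x, options) {
    paste0("<pre class=\"r-source\">", paste(x, collapse = "\n"), "</pre>\n")
  })

  # Call the hook directly to see what it would emit for one line of code.
  hook <- knitr::knit_hooks$get("source")
  wrapped <- hook("x <- 1", list())
  cat(wrapped)

  # Put the default hooks back so later documents are unaffected.
  knitr::knit_hooks$restore()
}
```

For the listings case mentioned above, knitr provides render_listings(), which sets a group of hooks in one call, so a hand-written hook like this one is only needed for custom styles.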

There are more than 20 built-in graphical devices, including PDF, PNG, tikz and many devices in the Cairo or cairoDevice package, and it requires little effort to switch between different devices.
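Switching devices is a matter of changing one chunk option. A small sketch, assuming knitr is installed and using "png" purely as an example value:

```r
if (requireNamespace("knitr", quietly = TRUE)) {
  # Make PNG the device for all chunks that follow.
  knitr::opts_chunk$set(dev = "png")
  current <- knitr::opts_chunk$get("dev")
  print(current)

  # Return every chunk option to its default.
  knitr::opts_chunk$restore()
}
```

The same option can also be given in the header of a single chunk, so one document can mix devices.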

The knitr package puts a lot of emphasis on graphics, and there are brand-new features such as direct support for animations in LaTeX documents, as well as quick support for rgl 3D plots; see the PDF manual for examples (use Adobe Reader to view the animations).

Speed

Learning from cacheSweave and pgfSweave, knitr also supports caching, although it is implemented differently. The idea is that a code chunk can be skipped if its results have been cached before and its code has not changed since then, which makes code evaluation much faster. All the objects created in a cached chunk are lazy-loaded, meaning they are not actually loaded into the current R session unless they are used in later chunks. This saves some computing time as well. Note that the complete output of a code chunk is cached, so the printed results and the graphics show up as if they had been created in real time by the chunk (in cacheSweave, printed results and graphics are lost).
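The two ideas in this paragraph, skipping unchanged chunks and lazy loading, can be sketched in base R. This is a toy model under simplifying assumptions, not knitr's implementation: the names cache, run_chunk, work and state are hypothetical, and unlike knitr the toy does not store or restore the chunk's objects and output.

```r
# Toy cache: remember the code of each chunk by label and skip the
# evaluation when the same code is seen again.
cache <- new.env()

run_chunk <- function(label, code, envir) {
  hit <- cache[[label]]
  if (!is.null(hit) && identical(hit$code, code)) {
    return("skipped (cached)")
  }
  eval(parse(text = code), envir = envir)
  cache[[label]] <- list(code = code)
  "evaluated"
}

work <- new.env()
first  <- run_chunk("slow", "x <- 42", work)  # first run: evaluated
second <- run_chunk("slow", "x <- 42", work)  # same code: skipped
third  <- run_chunk("slow", "x <- 43", work)  # code changed: evaluated again

# Lazy loading: delayedAssign() creates a promise, so the value is only
# computed when the object is first used.
state <- new.env()
state$loaded <- FALSE
delayedAssign("big", {
  state$loaded <- TRUE
  1:10
})
before <- state$loaded  # nothing has touched big yet
total <- sum(big)       # first use forces the promise
after <- state$loaded
```

In knitr itself, caching is switched on with the chunk option cache.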

Here is a real-world application to a time-consuming computing job: an analysis of the NRC rankings data via the Bayesian Lasso (Rnw source; PDF output). As we know, MCMC often involves a large number of iterations, and this demo clearly shows the advantage of caching.

Summary

Although knitr is still a new baby, I can envision many potential applications in business, for the following reasons:

  • Dynamic and automatic report generation saves time and human effort; as long as the R code has been set up correctly, knitr can take over the rest of the job; it takes the same amount of effort to compile a report once or ten thousand times;
  • Big data will definitely need the cache, since it is often time-consuming to deal with, and we may not want to redo all the computing over and over again when generating reports automatically;
  • We need professional presentations of data and statistical models in business, and knitr tries to give beautiful output by default; we also need novel presentations, such as animations and sophisticated rgl 3D plots (see how boring and clumsy statistical reports usually are);
  • The report does not have to be restricted to a specific format such as LaTeX, and knitr is fully customizable to meet different types of demand; as a trivial example, knitr can be used as the backend of http://www.inside-r.org/pretty-r/, or as an online data processing tool (think http://opencpu.org/).

Last but not least, for LyX users, I have also added knitr support to LyX, which makes it really easy to use this package without having to deal with the details of LaTeX. A short video is here: http://vimeo.com/32948939

Comments

Rich G's picture

Yihui:

Knitr looks fabulous! It really adds a lot visually to packaging up R analysis. I can't wait until it is included in LyX as a built-in plug-and-play module (like Sweave more-or-less has become since LyX 2.0).

I don't do too many documents in LaTeX but I like very much being able to hook up R to presentations (which I now do via Sweave in LyX) for version control purposes. On a tight deadline, having to repaste dozens of figures into a presentation because you've changed something is a common cause of painful, late nights. Hooking Sweave to Beamer eliminates that but it is hard to make nice-looking. The knitr stuff you put together looks much more "client friendly".

Keep up the good work!
- Rich

PS- Whatever you can do to make sure that knitr plays nicely with Beamer in LyX would be greatly appreciated! (Right now there is too much ERT hacking. The FragileFrame approach Liviu has put together seems promising but I wish it was better integrated in LyX.)

yihui's picture

Thanks, Rich. I have just put up the beamer example in http://yihui.github.com/knitr/demo/beamer/

The LyX "boss" has agreed to add the FragileFrame module to LyX, so let's see what is going to happen in LyX 2.0.3 :)
