%% LyX 2.2.1 created this file. For more info, see http://www.lyx.org/. %% Do not edit unless you really know what you are doing. \documentclass[nohyper,justified]{tufte-handout} \usepackage[T1]{fontenc} \usepackage{url} \usepackage[unicode=true,pdfusetitle, bookmarks=true,bookmarksnumbered=true,bookmarksopen=true,bookmarksopenlevel=2, breaklinks=true,pdfborder={0 0 1},backref=false,colorlinks=false] {hyperref} \hypersetup{ pdfstartview=FitH} \usepackage{breakurl} \makeatletter %%%%%%%%%%%%%%%%%%%%%%%%%%%%%% LyX specific LaTeX commands. \title{knitr Graphics Manual} \author{Yihui Xie} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%% User specified LaTeX commands. \renewcommand{\textfraction}{0.05} \renewcommand{\topfraction}{0.8} \renewcommand{\bottomfraction}{0.8} \renewcommand{\floatpagefraction}{0.75} \usepackage[buttonsize=1em]{animate} \makeatother \begin{document} <>= library(knitr) options(formatR.arrow=TRUE,width=50) opts_chunk$set(fig.path='figure/graphics-', cache.path='cache/graphics-', fig.align='center', dev='tikz', fig.width=5, fig.height=5, fig.show='hold', cache=TRUE, par=TRUE) knit_hooks$set(par=function(before, options, envir){ if (before && options$fig.show!='none') par(mar=c(4,4,.1,.1),cex.lab=.95,cex.axis=.9,mgp=c(2,.7,0),tcl=-.3) }, crop=hook_pdfcrop) @ \maketitle \begin{abstract} This manual shows features of graphics in the \textbf{knitr} package (version \Sexpr{packageVersion('knitr')}) in detail, including the graphical devices, plot recording, plot rearrangement, control of plot sizes, the tikz device, figure captions, animations and other types of plots such as \textbf{rgl} or GGobi plots. \end{abstract} Before reading this specific manual\footnote{\url{http://bit.ly/knitr-graphics-src} (Rnw source)}, you must have finished the main manual\footnote{\url{http://bit.ly/knitr-main}}. \section{Graphical Devices} The \textbf{knitr} package comes with more than 20 built-in graphical devices, and you can specify them through the \texttt{dev} option. This document uses the global option \texttt{dev='tikz'}, i.e., the plots are recorded by the tikz device by default, but we can change the device locally. Since tikz will be used extensively throughout this manual and you will see plenty of tikz graphics later, now we first show a few other devices. <>= with(trees, symbols(Height, Volume, circles = Girth/16, inches = FALSE, bg = 'deeppink', fg = 'gray30')) @ \begin{marginfigure} <>= @ \caption{The default PDF device.\label{mar:pdf-dev}} \end{marginfigure} \begin{marginfigure} <>= @ \caption{The PNG device.\label{mar:png-dev}} \end{marginfigure} Figure \ref{mar:pdf-dev} and \ref{mar:png-dev} show two standard devices in the \textbf{grDevices} package. We can also use devices in the \textbf{Cairo} or \textbf{cairoDevice} package, e.g., the chunk below uses the \emph{CairoPNG()} device in the \textbf{Cairo} package. <>= @ \section{Plot Recording} As mentioned in the main manual, \textbf{knitr} uses the \textbf{evaluate} package to record plots. There are two sources of plots: first, whenever \emph{plot.new()} or \emph{grid.newpage()} is called, \textbf{evaluate} will try to save a snapshot of the current plot\footnote{For technical details, see \texttt{?setHook} and \texttt{?recordPlot}}; second, after each complete expression is evaluated, a snapshot is also saved. To speed up recording, the null graphical device \texttt{pdf(file = NULL)} is used. Figure \ref{fig:two-high} shows two expressions producing two high-level plots. \begin{figure} <>= plot(cars) boxplot(cars$dist,xlab='dist') @ \caption{Two high-level plots are captured. The key to arrange two plots side by side is to specify the \texttt{out.width} option so that each plot takes less than half of the line width. We do not have to use the \texttt{par(mfrow)} trick, and it may not work in some cases (e.g. to put base graphics and \textbf{ggplot2} side by side; recall Figure 1 in the main manual).\label{fig:two-high}} \end{figure} Figure \ref{fig:low-plot-loop} shows another example of two R expressions, but the second expression only involves with low-level plotting changes. By default, low-level plot changes are discarded, but you can retain them with the option \texttt{fig.keep='all'}. \begin{figure} <>= plot(0,0,type='n',ann=FALSE) for(i in seq(0, 2*pi,length=20)) {points(cos(i),sin(i))} @ \caption{Two complete R expressions will produce at most two plots, as long as there are not multiple high-level plotting calls in each expression; option \texttt{fig.keep='all'} here.\label{fig:low-plot-loop}} \end{figure} Together with \texttt{fig.show='asis'}, we can show the process of plotting step by step like Figure \ref{fig:high-with-low}. \begin{figure} <>= plot(cars) abline(lm(dist~speed, data=cars)) # a regression line @ \caption{Low-level plot changes in base graphics can be recorded separately, and plots can be put in the places where they were produced.\label{fig:high-with-low}} \end{figure} A further note on plot recording: \textbf{knitr} examines all recorded plots (as R objects) and compares them sequentially; if the previous plot is a ``subset'' of the next plot (= previous plot + low-level changes), the previous plot will be removed when \texttt{fig.keep='high'}. If two successive plots are identical, the second one will be removed by default, so it might be a little bit surprising that the following chunk will only produce one plot by default\footnote{adapted from \url{https://github.com/yihui/knitr/issues/41}}: <>= m = matrix(1:100, ncol = 10) image(m) image(m*2) # exactly the same as previous plot @ \section{Plot Rearrangement} We can rearrange the plots in chunks in several ways. They can be inserted right after the line(s) of R code which produced them, or accumulated till the end of the chunk. There is an example in the main manual demonstrating \texttt{fig.show='asis'} for two high-level plots, and Figure \ref{fig:high-with-low} in this manual also demonstrates this option for a high-level plot followed by a low-level change. Here is an example demonstrating the option \texttt{fig.keep='last'} (only the last plot is kept): \begin{figure} <>= library(ggplot2) pie <- ggplot(diamonds, aes(x = factor(1), fill = cut)) + xlab('cut') + geom_bar(width = 1) pie + coord_polar(theta = "y") # a pie chart pie + coord_polar() # the bullseye chart @ \caption{Two plots were produced in this chunk, but only the last one is kept. This can be useful when we experiment with many plots, but only want the last result. (Adapted from the \textbf{ggplot2} website)} \end{figure} When multiple plots are produced by a code chunk, we may want to show them as an animation with the option \texttt{fig.show='animate'}. Figure \ref{fig:clock-animation} shows a simple clock animation; you may compare the code to Figure \ref{fig:high-with-low} to understand that high-level plots are always recorded, regardless of where they appeared. \begin{figure} <>= par(mar = rep(3, 4)) for (i in seq(pi/2, -4/3 * pi, length = 12)) { plot(0, 0, pch = 20, ann = FALSE, axes = FALSE) arrows(0, 0, cos(i), sin(i)) axis(1, 0, "VI"); axis(2, 0, "IX") axis(3, 0, "XII"); axis(4, 0, "III"); box() } @ \caption{A clock animation. You have to view it in Adobe Reader: click to play/pause; there are also buttons to speed up or slow down the animation.\label{fig:clock-animation}} \end{figure} We can also set the alignment of plots easily with the \texttt{fig.align} option; this document uses \texttt{fig.align='center'} as a global option, and we can also set plots to be left/right-aligned. Figure \ref{fig:maruko-plot} is an example of a left-aligned plot. \begin{figure} <>= stars(cbind(1:16,10*(16:1)),draw.segments=TRUE) @ \caption{A left-aligned plot adapted from \texttt{?stars} (I call this the ``Maruko'' plot, and it is one of my favorites; see \protect\url{http://en.wikipedia.org/wiki/Chibi_Maruko-chan}).\label{fig:maruko-plot}} \end{figure} \section{Plot Size} We have seen several examples in which two or more plots can be put side by side, and this is because the plots were resized in the output document; with the chunk option \texttt{out.width} less than half of the line width, \LaTeX{} will arrange two plots in one line; if it is less than $1/3$ of the line width, three plots can be put in one line. Of course we can also set it to be an absolute width like \texttt{3in} (3 inches). This option is used extensively in this document to control the size of plots in the output document. \section{The tikz Device} The main advantage of using tikz graphics is the consistency of styles between texts in plots and those in the main document. Since we can use native \LaTeX{} commands in plots, the styles of texts in plots can be very sophisticated (see Figure \ref{fig:math-formula-tt} for an example). \begin{figure} <>= plot(0:1,0:1,type='n', ylab='origin of statistics', xlab='statistical presentation rocks with \\LaTeX{}') text(.5,c(.8,.5,.2), c('\\texttt{lm(y \\textasciitilde{} x)}', '$\\hat{\\beta}=(X^{\\prime}X)^{-1}X^{\\prime}y$', '$(\\Omega,\\mathcal{F},\\mu)$')) @ \caption{A plot created by \textbf{tikzDevice} with math expressions and typewriter fonts. Note the font style consistency \textendash{} we write the same expressions in \protect\LaTeX{} here: $\hat{\beta}=(X^{\prime}X)^{-1}X^{\prime}y$ and $(\Omega,\mathcal{F},\mu)$; also \texttt{lm(y \textasciitilde{} x)}.\label{fig:math-formula-tt}} \end{figure} When using Xe\LaTeX{} instead of PDF\LaTeX{} to compile the document, we need to tell the \textbf{tikzDevice} package by setting the \texttt{tikzDefaultEngine} option before all plot chunks (preferably in the first chunk): <>= options(tikzDefaultEngine='xetex') @ This is useful and often necessary to compile tikz plots which contain (UTF8) multi-byte characters. \section{Figure Caption} If the chunk option \texttt{fig.cap} is not \texttt{NULL} or \texttt{NA}, the plots will be put in a \texttt{figure} environment when the output format is \LaTeX{}, and this option is used to write a caption in this environment using \texttt{\textbackslash{}caption\{\}}. The other two related options are \texttt{fig.scap} and \texttt{fig.lp} which set the short caption and a prefix string for the figure label. The default short caption is extracted from the caption by truncating it at the first period or colon or semi-colon. The label is a combination of \texttt{fig.lp} and the chunk label. Because \texttt{figure} is a float environment, it can float away from the chunk output to other places such as the top or bottom of a page when the \TeX{} document is compiled. \section{Other Features} The \textbf{knitr} package can be extended with hook functions, and here I give a few examples illustrating the flexibility. \subsection{Cropping PDF Graphics} Some R users may have been suffering from the extra margins in R plots, especially in base graphics (\textbf{ggplot2} is much better in this aspect). The default graphical option \texttt{mar} is about \texttt{c(5, 4, 4, 2)} (see \texttt{?par}), which is often too big. Instead of endlessly setting \texttt{par(mar)}, you may consider the program \texttt{pdfcrop}, which can crop the white margin automatically\footnote{\url{http://www.ctan.org/pkg/pdfcrop}}. In \textbf{knitr}, we can set up the hook \emph{hook\_pdfcrop()} to work with a chunk option, say, \texttt{crop}. <>= knit_hooks$set(crop=hook_pdfcrop) @ Now we compare two plots below. The first one is not cropped (Figure \ref{fig:pdf-nocrop}); then the same plot is produced but with a chunk option \texttt{crop=TRUE} which will call the cropping hook (Figure \ref{fig:pdf-crop}). \begin{figure} \begin{kframe} <>= par(mar=c(5,4,4,2),bg='white') # large margin plot(lat~long,data=quakes,pch=20,col=rgb(0,0,0,.2)) @ \end{kframe} \caption{The original plot produced in R, with a large margin.\label{fig:pdf-nocrop}} \end{figure} \begin{figure} \begin{kframe} <>= @ \end{kframe} \caption{The cropped plot; it fits better in the document.\label{fig:pdf-crop}} \end{figure} As we can see, the white margins are gone. If we use \emph{par()}, it might be hard and tedious to figure out a reasonable amount of margin in order that neither is any label cropped nor do we get a too large margin. My experience is that \texttt{pdfcrop} works well with base graphics, but barely works with \textbf{grid} graphics (therefore \textbf{lattice} and \textbf{ggplot2} are ruled out). \subsection{Manually Saved Plots} We have explained how R plots are recorded before. In some cases, it is not possible to capture plots by \emph{recordPlot()} (such as \textbf{rgl} plots), but we can save them using other functions. To insert these plots into the output, we need to set up a hook first like this (see \texttt{?hook\_plot\_custom} for details): <>= knit_hooks$set(custom.plot = hook_plot_custom) @ Then we set the chunk option \texttt{custom.plot=TRUE}, and manually write plot files in the chunk. Here we show an example of capturing GGobi plots using the function \emph{ggobi\_display\_save\_picture()} in the \textbf{rggobi} package: <>= library(rggobi) ggobi(ggobi_find_file('data', 'flea.csv')) Sys.sleep(1) # wait for snapshot ggobi_display_save_picture(path=fig_path('.png')) @ One thing to note here is we have to make sure the plot filename is from \emph{fig\_path()}, which is a convenience function to return the figure path for the current chunk. We can do whatever normal R plots can do with this hook, and we give another example below to show how to work with animations. <>= library(animation) # adapted from demo('rgl_animation') data(pollen) uM = matrix(c(-0.37, -0.51, -0.77, 0, -0.73, 0.67, -0.1, 0, 0.57, 0.53, -0.63, 0, 0, 0, 0, 1), 4, 4) library(rgl) open3d(userMatrix = uM, windowRect = c(0, 0, 400, 400)) plot3d(pollen[, 1:3]) zm = seq(1, 0.05, length = 20) par3d(zoom = 1) # change the zoom factor gradually later for (i in 1:length(zm)) { par3d(zoom = zm[i]); Sys.sleep(.05) rgl.snapshot(fig_path('png', number = i)) } @ \subsection{rgl Plots} With the hook \emph{hook\_rgl()}, we can easily save snapshots from the \textbf{rgl} package. We have shown an example in the main manual, and here we add some details. The rgl hook is a good example of taking care of details by carefully using the \texttt{options} argument in the hook; for example, we cannot directly set the width and height of rgl plots in \emph{rgl.snapshot()} or \emph{rgl.postscript()}, so we make use of the options \texttt{fig.width}, \texttt{fig.height} and \texttt{dpi} to calculate the expected size of the window, then resize the current window by \emph{par3d()}, and finally save the plot. This hook is actually built upon \emph{hook\_plot\_custom()} \textendash{} first it saves the \textbf{rgl} snapshot, then it calls \emph{hook\_plot\_custom()} to write the output code. \appendix \section{How to Compile This Manual} This manual has a long chain of dependencies, so it may not be easy to compile. These packages are required (all of them are free software): \begin{description} \item [{R}] Cairo, ggplot2, tikzDevice, rgl, rggobi, animation (all available on CRAN except tikzDevice which is on R-Forge for the time being) \item [{\LaTeX{}}] animate, hyperref and the tufte-handout class \item [{Other}] GGobi, pdfcrop \end{description} \end{document}