2025-01-12 00:52:51 +08:00

404 lines
16 KiB
Plaintext

%% LyX 2.2.1 created this file. For more info, see http://www.lyx.org/.
%% Do not edit unless you really know what you are doing.
\documentclass[nohyper,justified]{tufte-handout}
\usepackage[T1]{fontenc}
\usepackage{url}
\usepackage[unicode=true,pdfusetitle,
bookmarks=true,bookmarksnumbered=true,bookmarksopen=true,bookmarksopenlevel=2,
breaklinks=true,pdfborder={0 0 1},backref=false,colorlinks=false]
{hyperref}
\hypersetup{
pdfstartview=FitH}
\usepackage{breakurl}
\makeatletter
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% LyX specific LaTeX commands.
\title{knitr Graphics Manual}
\author{Yihui Xie}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% User specified LaTeX commands.
\renewcommand{\textfraction}{0.05}
\renewcommand{\topfraction}{0.8}
\renewcommand{\bottomfraction}{0.8}
\renewcommand{\floatpagefraction}{0.75}
\usepackage[buttonsize=1em]{animate}
\makeatother
\begin{document}
<<setup, include=FALSE, cache=FALSE>>=
library(knitr)
options(formatR.arrow=TRUE,width=50)
opts_chunk$set(fig.path='figure/graphics-', cache.path='cache/graphics-', fig.align='center', dev='tikz', fig.width=5, fig.height=5, fig.show='hold', cache=TRUE, par=TRUE)
knit_hooks$set(par=function(before, options, envir){
if (before && options$fig.show!='none') par(mar=c(4,4,.1,.1),cex.lab=.95,cex.axis=.9,mgp=c(2,.7,0),tcl=-.3)
}, crop=hook_pdfcrop)
@
\maketitle
\begin{abstract}
This manual shows features of graphics in the \textbf{knitr} package
(version \Sexpr{packageVersion('knitr')}) in detail, including the
graphical devices, plot recording, plot rearrangement, control of
plot sizes, the tikz device, figure captions, animations and other
types of plots such as \textbf{rgl} or GGobi plots.
\end{abstract}
Before reading this specific manual\footnote{\url{http://bit.ly/knitr-graphics-src} (Rnw source)},
you must have finished the main manual\footnote{\url{http://bit.ly/knitr-main}}.
\section{Graphical Devices}
The \textbf{knitr} package comes with more than 20 built-in graphical
devices, and you can specify them through the \texttt{dev} option.
This document uses the global option \texttt{dev='tikz'}, i.e., the
plots are recorded by the tikz device by default, but we can change
the device locally. Since tikz will be used extensively throughout
this manual and you will see plenty of tikz graphics later, now we
first show a few other devices.
<<test-plot, eval=FALSE>>=
with(trees, symbols(Height, Volume, circles = Girth/16, inches = FALSE, bg = 'deeppink', fg = 'gray30'))
@
\begin{marginfigure}
<<pdf-dev, ref.label='test-plot', dev='pdf', out.width='\\linewidth', echo=FALSE>>=
@
\caption{The default PDF device.\label{mar:pdf-dev}}
\end{marginfigure}
\begin{marginfigure}
<<png-dev, ref.label='test-plot', dev='png', out.width='\\linewidth', echo=FALSE>>=
@
\caption{The PNG device.\label{mar:png-dev}}
\end{marginfigure}
Figure \ref{mar:pdf-dev} and \ref{mar:png-dev} show two standard
devices in the \textbf{grDevices} package. We can also use devices
in the \textbf{Cairo} or \textbf{cairoDevice} package, e.g., the chunk
below uses the \emph{CairoPNG()} device in the \textbf{Cairo} package.
<<cairo-png-dev, ref.label='test-plot', dev='CairoPNG', out.width='.5\\linewidth', echo=FALSE>>=
@
\section{Plot Recording}
As mentioned in the main manual, \textbf{knitr} uses the \textbf{evaluate}
package to record plots. There are two sources of plots: first, whenever
\emph{plot.new()} or \emph{grid.newpage()} is called, \textbf{evaluate}
will try to save a snapshot of the current plot\footnote{For technical details, see \texttt{?setHook} and \texttt{?recordPlot}};
second, after each complete expression is evaluated, a snapshot is
also saved. To speed up recording, the null graphical device \texttt{pdf(file
= NULL)} is used. Figure \ref{fig:two-high} shows two expressions
producing two high-level plots.
\begin{figure}
<<two-high, fig.width=3, fig.height=2.5, out.width='.49\\linewidth'>>=
plot(cars)
boxplot(cars$dist,xlab='dist')
@
\caption{Two high-level plots are captured. The key to arrange two plots side
by side is to specify the \texttt{out.width} option so that each plot
takes less than half of the line width. We do not have to use the
\texttt{par(mfrow)} trick, and it may not work in some cases (e.g.
to put base graphics and \textbf{ggplot2} side by side; recall Figure
1 in the main manual).\label{fig:two-high}}
\end{figure}
Figure \ref{fig:low-plot-loop} shows another example of two R expressions,
but the second expression only involves with low-level plotting changes.
By default, low-level plot changes are discarded, but you can retain
them with the option \texttt{fig.keep='all'}.
\begin{figure}
<<low-plot-loop, fig.width=3, fig.height=2.5, out.width='.49\\linewidth', fig.keep='all'>>=
plot(0,0,type='n',ann=FALSE)
for(i in seq(0, 2*pi,length=20)) {points(cos(i),sin(i))}
@
\caption{Two complete R expressions will produce at most two plots, as long
as there are not multiple high-level plotting calls in each expression;
option \texttt{fig.keep='all'} here.\label{fig:low-plot-loop}}
\end{figure}
Together with \texttt{fig.show='asis'}, we can show the process of
plotting step by step like Figure \ref{fig:high-with-low}.
\begin{figure}
<<high-with-low, fig.width=3, fig.height=2.5, out.width='.5\\linewidth', fig.keep='all', fig.show='asis'>>=
plot(cars)
abline(lm(dist~speed, data=cars)) # a regression line
@
\caption{Low-level plot changes in base graphics can be recorded separately,
and plots can be put in the places where they were produced.\label{fig:high-with-low}}
\end{figure}
A further note on plot recording: \textbf{knitr} examines all recorded
plots (as R objects) and compares them sequentially; if the previous
plot is a ``subset'' of the next plot (= previous plot + low-level
changes), the previous plot will be removed when \texttt{fig.keep='high'}.
If two successive plots are identical, the second one will be removed
by default, so it might be a little bit surprising that the following
chunk will only produce one plot by default\footnote{adapted from \url{https://github.com/yihui/knitr/issues/41}}:
<<identical-plots,eval=FALSE>>=
m = matrix(1:100, ncol = 10)
image(m)
image(m*2) # exactly the same as previous plot
@
\section{Plot Rearrangement}
We can rearrange the plots in chunks in several ways. They can be
inserted right after the line(s) of R code which produced them, or
accumulated till the end of the chunk. There is an example in the
main manual demonstrating \texttt{fig.show='asis'} for two high-level
plots, and Figure \ref{fig:high-with-low} in this manual also demonstrates
this option for a high-level plot followed by a low-level change.
Here is an example demonstrating the option \texttt{fig.keep='last'}
(only the last plot is kept):
\begin{figure}
<<fig-last, fig.width=5.2, fig.height=3, out.width='.7\\linewidth', fig.keep='last', message=FALSE>>=
library(ggplot2)
pie <- ggplot(diamonds, aes(x = factor(1), fill = cut)) + xlab('cut') + geom_bar(width = 1)
pie + coord_polar(theta = "y") # a pie chart
pie + coord_polar() # the bullseye chart
@
\caption{Two plots were produced in this chunk, but only the last one is kept.
This can be useful when we experiment with many plots, but only want
the last result. (Adapted from the \textbf{ggplot2} website)}
\end{figure}
When multiple plots are produced by a code chunk, we may want to show
them as an animation with the option \texttt{fig.show='animate'}.
Figure \ref{fig:clock-animation} shows a simple clock animation;
you may compare the code to Figure \ref{fig:high-with-low} to understand
that high-level plots are always recorded, regardless of where they
appeared.
\begin{figure}
<<clock-animation, fig.width=3, fig.height=3, out.width='.4\\linewidth', fig.show='animate', tidy=FALSE, crop=TRUE>>=
par(mar = rep(3, 4))
for (i in seq(pi/2, -4/3 * pi, length = 12)) {
plot(0, 0, pch = 20, ann = FALSE, axes = FALSE)
arrows(0, 0, cos(i), sin(i))
axis(1, 0, "VI"); axis(2, 0, "IX")
axis(3, 0, "XII"); axis(4, 0, "III"); box()
}
@
\caption{A clock animation. You have to view it in Adobe Reader: click to play/pause;
there are also buttons to speed up or slow down the animation.\label{fig:clock-animation}}
\end{figure}
We can also set the alignment of plots easily with the \texttt{fig.align}
option; this document uses \texttt{fig.align='center'} as a global
option, and we can also set plots to be left/right-aligned. Figure
\ref{fig:maruko-plot} is an example of a left-aligned plot.
\begin{figure}
<<maruko-plot, fig.width=7, fig.height=4, out.width='.45\\linewidth', fig.align='left', crop=TRUE>>=
stars(cbind(1:16,10*(16:1)),draw.segments=TRUE)
@
\caption{A left-aligned plot adapted from \texttt{?stars} (I call this the
``Maruko'' plot, and it is one of my favorites; see \protect\url{http://en.wikipedia.org/wiki/Chibi_Maruko-chan}).\label{fig:maruko-plot}}
\end{figure}
\section{Plot Size}
We have seen several examples in which two or more plots can be put
side by side, and this is because the plots were resized in the output
document; with the chunk option \texttt{out.width} less than half
of the line width, \LaTeX{} will arrange two plots in one line; if
it is less than $1/3$ of the line width, three plots can be put in
one line. Of course we can also set it to be an absolute width like
\texttt{3in} (3 inches). This option is used extensively in this document
to control the size of plots in the output document.
\section{The tikz Device}
The main advantage of using tikz graphics is the consistency of styles
between texts in plots and those in the main document. Since we can
use native \LaTeX{} commands in plots, the styles of texts in plots
can be very sophisticated (see Figure \ref{fig:math-formula-tt} for
an example).
\begin{figure}
<<math-formula-tt, fig.width=6, fig.height=2, out.width='\\linewidth'>>=
plot(0:1,0:1,type='n', ylab='origin of statistics', xlab='statistical presentation rocks with \\LaTeX{}')
text(.5,c(.8,.5,.2), c('\\texttt{lm(y \\textasciitilde{} x)}', '$\\hat{\\beta}=(X^{\\prime}X)^{-1}X^{\\prime}y$', '$(\\Omega,\\mathcal{F},\\mu)$'))
@
\caption{A plot created by \textbf{tikzDevice} with math expressions and typewriter
fonts. Note the font style consistency \textendash{} we write the
same expressions in \protect\LaTeX{} here: $\hat{\beta}=(X^{\prime}X)^{-1}X^{\prime}y$
and $(\Omega,\mathcal{F},\mu)$; also \texttt{lm(y \textasciitilde{}
x)}.\label{fig:math-formula-tt}}
\end{figure}
When using Xe\LaTeX{} instead of PDF\LaTeX{} to compile the document,
we need to tell the \textbf{tikzDevice} package by setting the \texttt{tikzDefaultEngine}
option before all plot chunks (preferably in the first chunk):
<<xetex-tikz, eval=FALSE>>=
options(tikzDefaultEngine='xetex')
@
This is useful and often necessary to compile tikz plots which contain
(UTF8) multi-byte characters.
\section{Figure Caption}
If the chunk option \texttt{fig.cap} is not \texttt{NULL} or \texttt{NA},
the plots will be put in a \texttt{figure} environment when the output
format is \LaTeX{}, and this option is used to write a caption in
this environment using \texttt{\textbackslash{}caption\{\}}. The other
two related options are \texttt{fig.scap} and \texttt{fig.lp} which
set the short caption and a prefix string for the figure label. The
default short caption is extracted from the caption by truncating
it at the first period or colon or semi-colon. The label is a combination
of \texttt{fig.lp} and the chunk label. Because \texttt{figure} is
a float environment, it can float away from the chunk output to other
places such as the top or bottom of a page when the \TeX{} document
is compiled.
\section{Other Features}
The \textbf{knitr} package can be extended with hook functions, and
here I give a few examples illustrating the flexibility.
\subsection{Cropping PDF Graphics}
Some R users may have been suffering from the extra margins in R plots,
especially in base graphics (\textbf{ggplot2} is much better in this
aspect). The default graphical option \texttt{mar} is about \texttt{c(5,
4, 4, 2)} (see \texttt{?par}), which is often too big. Instead of
endlessly setting \texttt{par(mar)}, you may consider the program
\texttt{pdfcrop}, which can crop the white margin automatically\footnote{\url{http://www.ctan.org/pkg/pdfcrop}}.
In \textbf{knitr}, we can set up the hook \emph{hook\_pdfcrop()} to
work with a chunk option, say, \texttt{crop}.
<<crop-hook, cache=FALSE>>=
knit_hooks$set(crop=hook_pdfcrop)
@
Now we compare two plots below. The first one is not cropped (Figure
\ref{fig:pdf-nocrop}); then the same plot is produced but with a
chunk option \texttt{crop=TRUE} which will call the cropping hook
(Figure \ref{fig:pdf-crop}).
\begin{figure}
\begin{kframe}
<<pdf-nocrop, fig.width=3.2, fig.height=3.2, out.width='.5\\linewidth'>>=
par(mar=c(5,4,4,2),bg='white') # large margin
plot(lat~long,data=quakes,pch=20,col=rgb(0,0,0,.2))
@
\end{kframe}
\caption{The original plot produced in R, with a large margin.\label{fig:pdf-nocrop}}
\end{figure}
\begin{figure}
\begin{kframe}
<<pdf-crop, ref.label='pdf-nocrop', echo=FALSE, fig.width=3.2, fig.height=3.2, out.width='.5\\linewidth', crop=TRUE>>=
@
\end{kframe}
\caption{The cropped plot; it fits better in the document.\label{fig:pdf-crop}}
\end{figure}
As we can see, the white margins are gone. If we use \emph{par()},
it might be hard and tedious to figure out a reasonable amount of
margin in order that neither is any label cropped nor do we get a
too large margin. My experience is that \texttt{pdfcrop} works well
with base graphics, but barely works with \textbf{grid} graphics (therefore
\textbf{lattice} and \textbf{ggplot2} are ruled out).
\subsection{Manually Saved Plots}
We have explained how R plots are recorded before. In some cases,
it is not possible to capture plots by \emph{recordPlot()} (such as
\textbf{rgl} plots), but we can save them using other functions. To
insert these plots into the output, we need to set up a hook first
like this (see \texttt{?hook\_plot\_custom} for details):
<<custom-plot-hook, cache=FALSE>>=
knit_hooks$set(custom.plot = hook_plot_custom)
@
Then we set the chunk option \texttt{custom.plot=TRUE}, and manually
write plot files in the chunk. Here we show an example of capturing
GGobi plots using the function \emph{ggobi\_display\_save\_picture()}
in the \textbf{rggobi} package:
<<ggobi-plot, custom.plot=TRUE, fig.ext='png', out.width='2.5in', results='hide', warning=FALSE>>=
library(rggobi)
ggobi(ggobi_find_file('data', 'flea.csv'))
Sys.sleep(1) # wait for snapshot
ggobi_display_save_picture(path=fig_path('.png'))
@
One thing to note here is we have to make sure the plot filename is
from \emph{fig\_path()}, which is a convenience function to return
the figure path for the current chunk.
We can do whatever normal R plots can do with this hook, and we give
another example below to show how to work with animations.
<<rgl-animation, custom.plot=TRUE, fig.ext='png', out.width='2.5in', fig.show='animate', fig.num=20, interval=.4, aniopts='controls,autoplay', results='hide', warning=FALSE>>=
library(animation) # adapted from demo('rgl_animation')
data(pollen)
uM = matrix(c(-0.37, -0.51, -0.77, 0, -0.73, 0.67, -0.1, 0, 0.57, 0.53, -0.63, 0, 0, 0, 0, 1), 4, 4)
library(rgl)
open3d(userMatrix = uM, windowRect = c(0, 0, 400, 400))
plot3d(pollen[, 1:3])
zm = seq(1, 0.05, length = 20)
par3d(zoom = 1) # change the zoom factor gradually later
for (i in 1:length(zm)) {
par3d(zoom = zm[i]); Sys.sleep(.05)
rgl.snapshot(fig_path('png', number = i))
}
@
\subsection{rgl Plots}
With the hook \emph{hook\_rgl()}, we can easily save snapshots from
the \textbf{rgl} package. We have shown an example in the main manual,
and here we add some details. The rgl hook is a good example of taking
care of details by carefully using the \texttt{options} argument in
the hook; for example, we cannot directly set the width and height
of rgl plots in \emph{rgl.snapshot()} or \emph{rgl.postscript()},
so we make use of the options \texttt{fig.width}, \texttt{fig.height}
and \texttt{dpi} to calculate the expected size of the window, then
resize the current window by \emph{par3d()}, and finally save the
plot.
This hook is actually built upon \emph{hook\_plot\_custom()} \textendash{}
first it saves the \textbf{rgl} snapshot, then it calls \emph{hook\_plot\_custom()}
to write the output code.
\appendix
\section{How to Compile This Manual}
This manual has a long chain of dependencies, so it may not be easy
to compile. These packages are required (all of them are free software):
\begin{description}
\item [{R}] Cairo, ggplot2, tikzDevice, rgl, rggobi, animation (all available
on CRAN except tikzDevice which is on R-Forge for the time being)
\item [{\LaTeX{}}] animate, hyperref and the tufte-handout class
\item [{Other}] GGobi, pdfcrop
\end{description}
\end{document}