%% LyX 2.2.1 created this file.  For more info, see http://www.lyx.org/.
%% Do not edit unless you really know what you are doing.
\documentclass{article}
\usepackage[sc]{mathpazo}
\renewcommand{\sfdefault}{lmss}
\renewcommand{\ttdefault}{lmtt}
\usepackage[T1]{fontenc}
\usepackage{geometry}
\geometry{verbose,tmargin=2.5cm,bmargin=2.5cm,lmargin=2.5cm,rmargin=2.5cm}
\setcounter{secnumdepth}{2}
\setcounter{tocdepth}{2}
\usepackage{url}
\usepackage[authoryear]{natbib}
\usepackage[unicode=true,pdfusetitle,
 bookmarks=true,bookmarksnumbered=true,bookmarksopen=true,bookmarksopenlevel=2,
 breaklinks=false,pdfborder={0 0 1},backref=false,colorlinks=false]
 {hyperref}
\hypersetup{
 pdfstartview={XYZ null null 1}}
\usepackage{breakurl}

\makeatletter

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% LyX specific LaTeX commands.
\providecommand{\LyX}{\texorpdfstring%
  {L\kern-.1667em\lower.25em\hbox{Y}\kern-.125emX\@}
  {LyX}}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% User specified LaTeX commands.
\renewcommand{\textfraction}{0.05}
\renewcommand{\topfraction}{0.8}
\renewcommand{\bottomfraction}{0.8}
\renewcommand{\floatpagefraction}{0.75}

\usepackage[buttonsize=1em]{animate}

\makeatother

\begin{document}
<<>>=
library(knitr)
## set global chunk options
opts_chunk$set(fig.path='figure/manual-', cache.path='cache/manual-', fig.align='center', fig.show='hold', par=TRUE)
## I use = but I can replace it with <-; set code/output width to be 68
options(formatR.arrow=TRUE, width=68, digits=4)
## tune details of base graphics (https://yihui.org/knitr/hooks/)
knit_hooks$set(par=function(before, options, envir){
if (before && options$fig.show!='none') par(mar=c(4,4,.1,.1),cex.lab=.95,cex.axis=.9,mgp=c(2,.7,0),tcl=-.3)
})
@

\title{knitr: A General-Purpose Tool for Dynamic Report Generation in R}

\author{Yihui Xie}

\maketitle
The original paradigm of literate programming was brought forward
mainly for software development, or specifically, to mix source code
(for computers) and documentation (for humans) together.
Early systems include \href{http://www.literateprogramming.com/web.pdf}{WEB}
and \href{http://www.cs.tufts.edu/~nr/noweb/}{Noweb}; Sweave \citep{leisch2002}
was derived from the latter, but it is less focused on documenting
software; instead, it is mainly used for reproducible data analysis and
generating statistical reports. The \textbf{knitr} package \citep{R-knitr}
follows in the footsteps of Sweave. For this manual, I assume readers have
some background knowledge of Sweave to understand the technical details;
for a reference of available options, hooks and demos, see the package
homepage \url{https://yihui.org/knitr/}.

\section{Hello World}

A natural question is why reinvent the wheel. The short answer is
that extending Sweave by hacking \textsf{SweaveDrivers.R} in the \textbf{utils}
package is a difficult job for me. Many features in \textbf{knitr} come
naturally as users would have expected. Figure \ref{fig:cars-demo}
is a simple demo of some features of \textbf{knitr}.

\begin{figure}
<<dev='tikz', fig.width=4, fig.height=2.8, out.width='.45\\textwidth'>>=
fit=lm(dist~speed,data=cars) # linear regression
par(mar=c(4, 4, 1, .1), mgp=c(2,1,0))
with(cars,plot(speed,dist,panel.last=abline(fit)))
text(10,100,'$Y = \\beta_0 + \\beta_1x + \\epsilon$')
library(ggplot2)
qplot(speed, dist, data=cars)+geom_smooth()
@

\caption{\label{fig:cars-demo}A simple demo of possible output in \textbf{knitr}:
(1) multiple plots per chunk; (2) no need to \emph{print()} objects
in \textbf{ggplot2}; (3) device size is $4\times2.8$ (inches) but
output size is adjusted to \texttt{.45\textbackslash{}textwidth} in
chunk options; (4) base graphics and \textbf{ggplot2} can sit side
by side; (5) use the \emph{tikz()} device in \textbf{tikzDevice} by
setting chunk option \texttt{dev='tikz'} (hence can write native \protect\LaTeX{}
expressions in R plots); (6) code highlighting.}
\end{figure}

I would have chosen to hide the R code if this were a real report,
but here I show the code just for the sake of demonstration.
If we type \emph{qplot()} in R, we get a plot, and the same thing happens
in \textbf{knitr}. If we draw two plots in the code, \textbf{knitr}
will show two plots, and we do not need to tell it in advance how many
plots there are in the code. If we set \texttt{out.width='.49\textbackslash{}\textbackslash{}textwidth'}
in chunk options, we get it in the final output document. If we say
\texttt{fig.align='center'}, the plots are centered. That's it. Many
enhancements and new features will be introduced later. If you come
from the Sweave land, you can take a look at the page of transition
first: \url{https://yihui.org/knitr/demo/sweave/}.

\section{Design}

The flow of processing an input file is similar to Sweave; the two
major differences are that \textbf{knitr} gives users more flexibility
to customize the processing, and that it has many built-in options such
as support for a wide range of graphics devices and cache. Below is a
brief description of the process:
\begin{enumerate}
\item \textbf{knitr} takes an input file and automatically determines an
appropriate set of \href{https://yihui.org/knitr/patterns/}{patterns}
to use if they are not provided in advance (e.g. \textsf{file.Rnw}
will use \texttt{knit\_patterns\$get('rnw')});
\item a set of output \href{https://yihui.org/knitr/hooks/}{hooks}
will also be set up automatically according to the filename extension
(e.g. use \LaTeX{} environments or HTML elements to wrap up R results);
\item the input file is read in and split into pieces consisting of R code
chunks and normal texts; the former will be executed one after the
other, and the latter may contain global chunk options or inline R
code;
\item for each chunk, the code is evaluated using the \textbf{evaluate}
package \citep{R-evaluate}, and the results may be filtered according
to chunk options (e.g.
\texttt{echo=FALSE} will remove the R source code)
\begin{enumerate}
\item if \texttt{cache=TRUE} for this chunk, \textbf{knitr} will first check
if there are previously cached results under the cache directory before
really evaluating the chunk; if cached results exist and this code
chunk has not been changed since the last run (verified by its MD5 sum),
the cached results will be (lazy-) loaded, otherwise a new cache will
be built; if a cached chunk depends on other chunks (see the \texttt{dependson}
\href{https://yihui.org/knitr/options/}{option}) and any one
of these chunks has changed, this chunk must be forcibly updated (the
old cache will be purged)
\item there are six types of possible output from \textbf{evaluate}, and
their classes are \texttt{character} (normal text output), \texttt{source}
(source code), \texttt{warning}, \texttt{message}, \texttt{error}
and \texttt{recordedplot}; an internal S3 generic function \emph{wrap()}
is used to deal with different types of output, using output hooks
defined in the object \texttt{knit\_hooks}
\item note that plots are recorded as R objects before they are really saved
to files, so graphics devices will not be opened unless plots have
really been produced in a chunk
\item a code chunk is evaluated in a separate empty environment with the
global environment as its parent, and all the objects in this environment
after the evaluation will be saved if \texttt{cache=TRUE}
\item chunk hooks can be run before and/or after a chunk
\end{enumerate}
\item for normal texts, \textbf{knitr} will find inline R code (e.g. in
\texttt{\textbackslash{}Sexpr\{\}}) and evaluate it; the output is
wrapped by the \texttt{inline} hook;
\end{enumerate}
The hooks play important roles in \textbf{knitr}: this package makes
almost everything accessible to the users.
Consider the following extremely simple example which may demonstrate
this freedom:

<<>>=
1+1
@

There are two parts in the final output: the source code \texttt{1
+ 1} and the output \texttt{{[}1{]} 2}; the comment characters \texttt{\#\#}
are from the default chunk option \texttt{comment}. Users may define
a hook function for the source code like this to use the \texttt{lstlisting}
environment:

<<>>=
knit_hooks$set(source = function(x, options) {
  paste('\\begin{lstlisting}\n', x, '\\end{lstlisting}\n', sep = '')
})
@

Similarly we can put other types of output into other environments.
There is no need to hack at \textsf{Sweave.sty} for \textbf{knitr},
and you can put the output in any environment. What is more, the
output hooks make \textbf{knitr} ready for other types of output,
and a typical one is HTML (there are built-in hooks). The website
provides many examples demonstrating the flexibility of the output.

\section{Features}

The \textbf{knitr} package borrowed features such as tikz graphics
and cache from \textbf{pgfSweave} and \textbf{cacheSweave} respectively,
but the implementations are different. New features like code reference
from an external R script as well as output customization are also
introduced. The feature of hook functions in Sweave is re-implemented
and hooks have new usage now. There are several other small features
which are motivated by my everyday use of Sweave. For example, a
progress bar is provided when knitting a file so we roughly know how
long we still need to wait; output from inline R code (e.g. \texttt{\textbackslash{}Sexpr\{x{[}1{]}\}})
is automatically formatted in \TeX{} math notation (like \Sexpr{123456789})
if the result is numeric. You may check out a number of manuals dedicated
to specific features such as graphics on the website:
\url{https://yihui.org/knitr/demo/}.
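
To make the inline formatting mentioned above more concrete, here is
a rough sketch of how a number could be turned into \TeX{} math notation
with plain character string operations. The helper \emph{sci\_tex()}
and its \texttt{digits} argument are hypothetical names made up for
this illustration; this is not the code \textbf{knitr} actually uses
for its \texttt{inline} hook, only an approximation of the idea.

<<>>=
# hypothetical helper, not part of knitr: format a single number
# in TeX scientific notation using base R only
sci_tex = function(x, digits = 4) {
  # leave anything that is not a single number untouched
  if (!is.numeric(x) || length(x) != 1) return(as.character(x))
  s = format(x, digits = digits, scientific = TRUE)
  parts = strsplit(s, 'e', fixed = TRUE)[[1]]  # mantissa and exponent
  sprintf('$%s\\times 10^{%d}$', parts[1], as.integer(parts[2]))
}
sci_tex(123456789)
@

A function of this shape could in principle be registered through
\texttt{knit\_hooks\$set(inline = ...)}, although the built-in hook
already takes care of numeric results.
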
\subsection{Code Decoration}

The \textbf{highr} package \citep{R-highr} is used to highlight R
code, and the \textbf{formatR} package \citep{R-formatR} is used
to reformat R code (like \texttt{keep.source=FALSE} in Sweave but
will also try to retain comments). For \LaTeX{} output, the \textbf{framed}
package is used to decorate code chunks with a light gray background.
If this \LaTeX{} package is not found in the system, a version will
be copied directly from \textbf{knitr}. The prompt characters are
removed by default because they mangle the R source code in the output
and make it difficult to copy R code. The R output is masked in comments
by default based on the same rationale. It is easy to revert to the
output with prompts (set option \texttt{prompt=TRUE}), and you will
quickly realize the inconvenience to the readers if they want to copy
and run the code in the output document:

<<prompt=TRUE>>=
x=rnorm(5)
x
var(x)
@

The example below shows the effect of \texttt{tidy=TRUE/FALSE}:

<<tidy=FALSE>>=
## option tidy=FALSE
for(k in 1:10){j=cos(sin(k)*k^2)+3;print(j-5)}
@

<<tidy=TRUE>>=
## option tidy=TRUE
for(k in 1:10){j=cos(sin(k)*k^2)+3;print(j-5)}
@

Note that \texttt{=} is replaced by \texttt{<-} because \texttt{options('formatR.arrow')}
was set to be \texttt{TRUE} in this document; see the documentation
of \emph{tidy.source()} in \textbf{formatR} for details.

Many highlighting themes can be used in \textbf{knitr}, which are
borrowed from the \textbf{highlight} package by \href{http://www.andre-simon.de/}{Andre Simon}\footnote{not the R package mentioned before; for a preview of these themes,
see \url{http://www.andre-simon.de/dokuwiki/doku.php?id=theme_examples}}; it is also possible to use themes from \url{http://www.eclipsecolorthemes.org/}
by providing a theme id to \textbf{knitr}\footnote{many thanks to \href{https://github.com/ramnathv}{Ramnath Vaidyanathan}
for the work on themes}. See \texttt{?knit\_theme} for details.
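
As a brief illustration of the theme interface, the sketch below lists
the available theme names and activates one of them. It assumes that
\emph{knit\_theme\$get()} without arguments returns the names of the
themes and that \emph{knit\_theme\$set()} accepts one of these names;
the variable \texttt{themes} is only introduced for this example, and
\texttt{?knit\_theme} remains the authoritative reference. The chunk
is not evaluated here, so that the appearance of this document is not
changed.

<<eval=FALSE>>=
library(knitr)
themes = knit_theme$get()  # assumed to return all available theme names
head(themes)
knit_theme$set(themes[1])  # use the first theme for subsequent chunks
@
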
\subsection{Graphics}

Graphics is an important part of reports, and several enhancements
have been made in \textbf{knitr}. For example, grid graphics may not
need to be explicitly printed as long as the same code can produce
plots in R (in some cases, however, they have to be printed, e.g.
in a loop, because you have to do so in an R terminal).

\subsubsection{Graphical Devices}

For a long time, a frequently requested feature for Sweave was
support for other graphics devices, which has been implemented
since R 2.13.0. Instead of using logical options like \texttt{png}
or \texttt{jpeg} (this list can go on and on), \textbf{knitr} uses
a single option \texttt{dev} (like \texttt{grdevice} in Sweave) which
has support for more than 20 devices. For instance, \texttt{dev='png'}
will use the \emph{png()} device, and \texttt{dev='CairoJPEG'} uses
the \emph{CairoJPEG()} device in the \textbf{Cairo} package (it has
to be installed first, of course). If none of these devices is satisfactory,
you can provide the name of a customized device function, which must
have been defined before it is called.

\subsubsection{Plot Recording}

As mentioned before, all the plots in a code chunk are first recorded
as R objects and then ``replayed'' inside a graphical device to
generate plot files. The \textbf{evaluate} package records plots
on a per-\emph{expression} basis; in other words, the source code is
split into individual complete expressions and \textbf{evaluate} will
examine possible plot changes in snapshots after each single expression
has been evaluated. For example, the code below consists of three
expressions, out of which two are related to drawing plots, therefore
\textbf{evaluate} will produce two plots by default:

<<>>=
par(mar=c(3,3,.1,.1))
plot(1:10, ann=FALSE,las=1)
text(5,9,'mass $\\rightarrow$ energy\n$E=mc^2$')
@

This is a significant difference from traditional tools in R
for dynamic report generation, since low-level plotting changes can
also be recorded.
The option \texttt{fig.keep} controls which plots
to keep in the output; \texttt{fig.keep='all'} will keep low-level
changes as separate plots; by default (\texttt{fig.keep='high'}),
\textbf{knitr} will merge low-level plot changes into the previous
high-level plot, like most graphics devices do. This feature may be
useful for teaching R graphics step by step. Note, however, that low-level
plotting commands in a single expression (a typical case is a loop)
will not be recorded cumulatively, but high-level plotting commands,
regardless of where they are, will always be recorded. For example,
this chunk will only produce 2 plots instead of 21 plots because there
are 2 complete expressions:

<<>>=
plot(0,0,type='n',ann=FALSE)
for(i in seq(0, 2*pi,length=20)) points(cos(i),sin(i))
@

But this will produce 20 plots as expected:

<<>>=
for(i in seq(0, 2*pi,length=20)) {plot(cos(i),sin(i),xlim=c(-1,1),ylim=c(-1,1))}
@

As I showed in the beginning of this manual, it is straightforward
to let \textbf{knitr} keep all the plots in a chunk and insert them
into the output document, so we no longer need the \texttt{cat('\textbackslash{}\textbackslash{}includegraphics\{\}')}
trick.

We can discard all previous plots and keep the last one only by \texttt{fig.keep='last'},
or keep only the first plot by \texttt{fig.keep='first'}, or discard
all plots by \texttt{fig.keep='none'}.

\subsubsection{Plot Rearrangement}

The option \texttt{fig.show} decides whether to hold all plots
while evaluating the code and ``flush'' all of them to the end
of a chunk (\texttt{fig.show='hold'}), or just insert them at the
place where they were created (by default \texttt{fig.show='asis'}).
Here is an example of \texttt{fig.show='asis'}:

<<fig.show='asis'>>=
contour(volcano) # contour lines
filled.contour(volcano) # fill contour plot with colors
@

Besides \texttt{hold} and \texttt{asis}, the option \texttt{fig.show}
can take a third value: \texttt{animate}, which makes it possible
to insert animations into the output document.
In \LaTeX{}, the package
\textbf{animate} is used to put together image frames as an animation.
For animations to work, there must be more than one plot produced
in a chunk. The option \texttt{interval} controls the time interval
between animation frames; by default it is 1 second. Note that you have
to add \texttt{\textbackslash{}usepackage\{animate\}} in the \LaTeX{}
preamble, because \textbf{knitr} will not add it automatically. Animations
in the PDF output can only be viewed in Adobe Reader.

As a simple demonstration, here is a \href{http://en.wikipedia.org/wiki/Mandelbrot_set}{Mandelbrot animation}
taken from the \textbf{animation} package \citep{R-animation}; note
that the PNG device is used because PDF files are too large. You should
be able to see the animation immediately with Acrobat Reader since
it was set to play automatically:

<<fig.show='animate', dev='png'>>=
library(animation)
demo('Mandelbrot', echo = FALSE, package = 'animation')
@

\subsubsection{Plot Size}

The \texttt{fig.width} and \texttt{fig.height} options specify the
size of plots in the graphics device, and the real size in the output
document can be different (see \texttt{out.width} and \texttt{out.height}).
When there are multiple plots per chunk, it is possible to arrange
more than one plot per line in \LaTeX{} \textendash{} just specify
\texttt{out.width} to be less than half of the current line width,
e.g. \texttt{out.width='.49\textbackslash{}\textbackslash{}linewidth'}.

\subsubsection{The tikz Device}

Besides PDF, PNG and other traditional R graphical devices, \textbf{knitr}
has special support for tikz graphics via the \textbf{tikzDevice} package
\citep{R-tikzDevice}, which is similar to \textbf{pgfSweave}. If
we set the chunk option \texttt{dev='tikz'}, the \emph{tikz()} device
in \textbf{tikzDevice} will be used to save plots. Options \texttt{sanitize}
and \texttt{external} are related to the tikz device: see the documentation
of \emph{tikz()} for details.
Note that \texttt{external=TRUE} in \textbf{knitr}
has a different meaning from that in \textbf{pgfSweave} \textendash{} it means
\texttt{standAlone=TRUE} in \emph{tikz()}, and the tikz graphics output
will be compiled to PDF \emph{immediately} after it is created, so
the ``externalization'' does not depend on the \textbf{tikz} package;
to maintain consistency in (font) styles, \textbf{knitr} will read
the preamble of the input document and use it in the tikz device.
At the moment, I'm not sure if this is a faithful way to externalize
tikz graphics, but I have not seen any problems so far. The assumption
to make, however, is that you declare all the styles in the preamble;
\textbf{knitr} is unaware of \emph{local} style changes in the body
of the document.

Below is an example taken from StackOverflow\footnote{\url{http://stackoverflow.com/q/8190087/559676}};
we usually have to write R code like this to obtain a math expression
$\mathrm{d}\mathbf{x}_{t}=\alpha[(\theta-\mathbf{x}_{t})\mathrm{d}t+4]\mathrm{d}B_{t}$
in R graphics:

<<>>=
qplot(1:10, 1:10) + labs(title = substitute(paste(d *
  bolditalic(x)[italic(t)] == alpha * (theta - bolditalic(x)[italic(t)]) *
  d * italic(t) + lambda * d * italic(B)[italic(t)]), list(lambda = 4)))
@

With the tikz device, it is both straightforward and more beautiful:

<<dev='tikz'>>=
library(ggplot2)
qplot(1:10, 1:10) +
  labs(title = sprintf('$\\mathrm{d}\\mathbf{x}_{t} = \\alpha[(\\theta - \\mathbf{x}_{t})\\mathrm{d}t + %d]\\mathrm{d}B_{t}$', 4))
@

The advantage of tikz graphics is the consistency of styles\footnote{Users are encouraged to read the vignette of \textbf{tikzDevice},
which is the most beautiful vignette I have ever seen in R packages:
\url{http://cran.r-project.org/web/packages/tikzDevice/vignettes/tikzDevice.pdf}},
and one disadvantage is that \LaTeX{} may not be able to handle very
large tikz files (it can run out of memory).
For example, an R
plot with tens of thousands of graphical elements may fail to compile
in \LaTeX{} if we use the tikz device. In such cases, we can switch
to the PDF or PNG device, or reconsider our decision on the type of
plots, e.g., a scatter plot with millions of points is usually difficult
to read, and a contour plot or a hexagon plot showing the 2D density
can be a better alternative (they are smaller in size).

The graphics manual contains more detailed information and you can
check it out on the \href{https://yihui.org/knitr/demo/graphics/}{website}.

\subsection{Cache}

The feature of cache is not a new idea \textendash{} both \textbf{cacheSweave}
and \textbf{weaver} have implemented it based on Sweave, with the
former using \textbf{filehash} and the latter using \textsf{.RData}
images; \textbf{cacheSweave} also supports lazy-loading of objects
based on \textbf{filehash}. The \textbf{knitr} package directly uses
internal base R functions to save (\emph{tools:::makeLazyLoadDB()})
and lazy-load objects (\emph{lazyLoad()}). These functions are either
undocumented or marked as internal, but as far as I understand, they
are the tools to implement lazy-loading for packages. The \textbf{cacheSweave}
vignette has clearly explained lazy-loading; roughly speaking,
lazy-loading means an object will not be really loaded into memory
unless it is really used somewhere. This is very useful for cache;
sometimes we read a large object and cache it, then take a subset
for analysis and this subset is also cached; in the future, the initial
large object will not be loaded into R if our computation is only
based on the subset.

The paths of cache files are determined by the chunk option \texttt{cache.path};
by default all cache files are created under a directory \textsf{cache}
relative to the current working directory, and if the option value
contains a directory (e.g.
\texttt{cache.path='cache/abc-'}), cache
files will be stored under that directory (automatically created if
it does not exist). The cache is invalidated and purged on any changes
to the code chunk, including both the R code and chunk options\footnote{One exception is the \texttt{include} option, which is not cached
because \texttt{include=TRUE/FALSE} does not affect code evaluation;
meanwhile, the value \texttt{getOption('width')} is also cached, so
if you change this option, the cache will also be invalidated (this
option affects the width of text output)}; this means previous cache files of this chunk are removed (filenames
are identified by the chunk label). Unlike \textbf{pgfSweave}, cache
files will never accumulate since old cache files will always be removed
in \textbf{knitr}. Unlike \textbf{weaver} or \textbf{cacheSweave},
\textbf{knitr} will try to preserve these side-effects:
\begin{enumerate}
\item printed results: meaning that any output of a code chunk will be loaded
into the output document for a cached chunk, although it is not really
evaluated. The reason is that \textbf{knitr} also caches the output of
a chunk as a character string. Note this means graphics output is also
cached since it is part of the output. It has been a pain for me for
a long time to have to lose output to gain cache;
\item loaded packages: after the evaluation of each cached chunk, the list
of packages used in the current R session is written to a file under
the cache path named \textsf{\_\_packages}; next time if a cached
chunk needs to be rebuilt, these packages will be loaded first. The
reasons for caching package names are that it can be slow to load some
packages, and that a package might be loaded in a previous cached chunk
which is not available to the next cached chunk when only the latter
needs to be rebuilt.
Note this only applies to cached chunks, and for
uncached chunks, you must always use \emph{library()} to load packages
explicitly;
\end{enumerate}
Although \textbf{knitr} tries to keep some side-effects, there are
still other types of side-effects like setting \emph{par()} or \emph{options()}
which are not cached. Users should be aware of these special cases,
and make sure to move the code which is not meant to be cached
into other chunks which are not cached, e.g., set all global options
in the first chunk of a document and do not cache that chunk.

Sometimes a cached chunk may need to use objects from other cached
chunks, which can bring a serious problem \textendash{} if objects
in previous chunks have changed, this chunk will not be aware of the
changes and will still use old cached results, unless there is a way
to detect such changes from other chunks. There is an option called
\texttt{dependson} in \textbf{cacheSweave} which does this job. We
can explicitly specify which other chunks this chunk depends on by
setting an option like \texttt{dependson='chunkA;chunkB'} or equivalently
\texttt{dependson=c('chunkA', 'chunkB')}. Each time the cache of a
chunk is rebuilt, all other chunks which depend on this chunk will
lose cache, hence their cache will be rebuilt as well.

Another way to specify the dependencies among chunks is to use the
chunk option \texttt{autodep} and the function \emph{dep\_auto()}.
This is an experimental feature borrowed from \textbf{weaver} which
frees us from setting chunk dependencies manually. The basic idea
is, if a later chunk uses any objects created from a previous chunk,
the later chunk is said to depend on the previous one. The function
\emph{findGlobals()} in the \textbf{codetools} package is used to
find out all global objects in a chunk, and according to its documentation,
the result is an approximation. Global objects roughly mean the ones
which are not created locally, e.g.
in the expression \texttt{function()
\{y <- x\}}, \texttt{x} should be a global object, whereas \texttt{y}
is local. Meanwhile, we also need to save the list of objects created
in each cached chunk, so that we can compare them to the global objects
in later chunks. For example, if chunk A created an object \texttt{x}
and chunk B uses this object, chunk B must depend on A, i.e. whenever
A changes, B must also be updated. When \texttt{autodep=TRUE}, \textbf{knitr}
will write out the names of objects created in a cached chunk as well
as those global objects in two files named \textsf{\_\_objects} and
\textsf{\_\_globals} respectively; later we can use the function
\emph{dep\_auto()} to analyze the object names to figure out the dependencies
automatically. See \url{https://yihui.org/knitr/demo/cache/}
for examples.

Yet another way to specify dependencies is \emph{dep\_prev()}: this
is a conservative approach which sets the dependencies so that a cached
chunk will depend on all its previous chunks, i.e. whenever a previous
chunk is updated, all later chunks will be updated accordingly.

\subsection{Code Externalization}

It can be more convenient to write R code in a separate file, rather
than mixing it into a \LaTeX{} document; for example, we can run R
code successively in a pure R script from one chunk to the other without
jumping through other texts. Since I prefer using \LyX{} to write
reports, Sweave is even more inconvenient because I have to recompile
the whole document each time, even if I only want to know the results
of a single chunk. Therefore \textbf{knitr} introduced the feature
of code externalization to a separate R script. Currently the setting
is like this: the R script also uses chunk labels (marked in the form
\texttt{\#\# -{}-{}-{}- chunk-label} by default); if the code chunk
in the input document is empty, \textbf{knitr} will match its label
with the label in the R script to input external R code.
For example,
suppose this is a code chunk labelled as \texttt{Q1} in an R script
named \textsf{homework1-xie.R} which is under the same directory as
the Rnw document:

<<>>=
## ---- Q1 ---------------------
gcd = function(m, n) {
  while ((r <- m %% n) != 0) {
    m = n; n = r
  }
  n
}
@

In the Rnw document, we can first read the script using the function
\emph{read\_chunk()}:

<<>>=
read_chunk('homework1-xie.R')
@

This is usually done in an early chunk, and we can use the chunk \texttt{Q1}
later in the Rnw document:

<<>>=
cat('<<Q1>>=','@',sep='\n')
@

Different documents can read the same R script, so the R code can
be reusable across different input documents.

\subsection{Evaluation of Chunk Options\label{subsec:conditional}}

By default \textbf{knitr} uses a new syntax to parse chunk options:
it treats them as function arguments instead of a text string to be
split to obtain option values. This gives the user much more power
than the old syntax; we can pass arbitrary R objects to chunk options
besides simple ones like \texttt{TRUE}/\texttt{FALSE}, numbers and
character strings. The page \url{https://yihui.org/knitr/demo/sweave/}
has given two examples to show the advantages of the new syntax. Here
we show yet another useful application.

Before \textbf{knitr} 0.3, there was a feature named ``conditional
evaluation''\footnote{request from \url{https://plus.google.com/u/0/116405544829727492615/posts/43WrRUffjzK}}.
The idea is, instead of setting chunk options \texttt{eval} and \texttt{echo}
to be \texttt{TRUE} or \texttt{FALSE} (constants), their values can
be controlled by global variables in the current R session. This enables
\textbf{knitr} to conditionally evaluate code chunks according to
variables. For example, here we assign \texttt{TRUE} to a variable
\texttt{dothis}:

<<>>=
dothis=TRUE
@

In the next chunk, we set chunk options \texttt{eval=dothis} and \texttt{echo=!dothis};
both are valid R expressions since the variable \texttt{dothis} exists.
As we can see, the source code is hidden, but it was indeed evaluated:

<<eval=dothis, echo=!dothis>>=
print('you cannot see my source because !dothis is FALSE')
@

Then we set \texttt{eval=dothis} and \texttt{echo=dothis} for another
chunk:

<<eval=dothis, echo=dothis>>=
dothis
@

If we change the value of \texttt{dothis} to \texttt{FALSE}, neither
of the above chunks will be evaluated any more. Therefore we can control
many chunks with a single variable, and present results selectively.

This old feature required \textbf{knitr} to treat \texttt{eval} and
\texttt{echo} specially, and we can easily see that it is no longer
necessary with the new syntax: \texttt{eval=dothis} will tell R to
find the variable \texttt{dothis} automatically just like when we call
a function \texttt{foobar(eval = dothis)}. What is more, all options
will be evaluated as R expressions unless they are already constants
which do not need to be evaluated, so this old feature has been generalized
to all other options naturally.

\subsection{Customization}

The \textbf{knitr} package is ready for customization. Both the patterns
and hooks can be customized; see the package website for details.
Here I show an example on how to save \textbf{rgl} plots \citep{R-rgl}
using a customized hook function. First we define a hook named \texttt{rgl}
using the function \emph{hook\_rgl()} in \textbf{rgl}:

<<>>=
library(rgl)
knit_hooks$set(rgl = hook_rgl)
head(hook_rgl) # the hook function is defined as this
@

Then we only have to set the chunk option \texttt{rgl=TRUE}:

<<rgl=TRUE>>=
library(rgl)
demo('bivar', package='rgl', echo=FALSE)
par3d(zoom=.7)
@

Due to the flexibility of output hooks, \textbf{knitr} supports several
different output formats. The implementation is fairly easy, e.g.,
for \LaTeX{} we put R output in \texttt{verbatim} environments, and
in HTML, it is only a matter of putting output in \texttt{div} layers.
These are simply character string operations. Many demos in \url{https://yihui.org/knitr/demo/}
show this idea clearly.
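
To illustrate that an output hook is indeed little more than a character
string operation, here is a minimal sketch of a hook which wraps text
output in an HTML \texttt{div} layer. The function name \emph{hook\_div()}
and the CSS class \texttt{output} are hypothetical and only chosen
for this example; the built-in HTML hooks of \textbf{knitr} are implemented
differently.

<<>>=
# hypothetical output hook: escape HTML special characters and
# wrap the text output of a chunk in a <div> layer
hook_div = function(x, options) {
  x = gsub('&', '&amp;', x, fixed = TRUE)
  x = gsub('<', '&lt;', x, fixed = TRUE)
  x = gsub('>', '&gt;', x, fixed = TRUE)
  paste0('<div class="output"><pre>', x, '</pre></div>\n')
}
cat(hook_div('## [1] 2', list()))
@

Such a function could be registered with \texttt{knit\_hooks\$set(output
= hook\_div)} when the output format is HTML; the argument \texttt{options}
is not used here, but it gives a hook access to the chunk options.
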
This manual does not cover all the features of \textbf{knitr}, and
users are encouraged to browse the website to learn about more
features.

\section{Editors}

You can use any text editor to write the source documents, but some
have built-in support for \textbf{knitr}. Both RStudio (\url{http://www.rstudio.org})
and \LyX{} (\url{http://www.lyx.org}) have full support for \textbf{knitr},
and you can compile the document to PDF with just one click. See \url{https://yihui.org/knitr/demo/rstudio/}
and \url{https://yihui.org/knitr/demo/lyx/} respectively. It
is also possible to support other editors like \href{https://yihui.org/knitr/demo/eclipse/}{Eclipse},
\href{https://yihui.org/knitr/demo/editors/}{Texmaker and WinEdt};
see the demo list on the website for configuration instructions.

\section*{About This Document}

This manual was written in \LyX{} and compiled with \textbf{knitr}
(version \Sexpr{packageVersion('knitr')}). The \LyX{} source and
the Rnw document exported from \LyX{} can be found under these directories:

<<>>=
system.file('examples', 'knitr-manual.lyx', package='knitr')  # lyx source
system.file('examples', 'knitr-manual.Rnw', package='knitr')  # Rnw source
@

You can use the function \emph{knit()} to knit the Rnw document (remember
to put the two \textsf{.bib} files under the same directory), and
you need to make sure all the R packages used in this document are
installed:

<<>>=
install.packages(c('animation', 'rgl', 'tikzDevice', 'ggplot2'))
@

Feedback and comments on this manual and the package are always welcome.
Bug reports and feature requests can be sent to \url{https://github.com/yihui/knitr/issues},
and questions can be sent to the \href{mailto:knitr@googlegroups.com}{mailing list}
\url{https://groups.google.com/group/knitr}.

% when knitr is updated, this chunk will be updated; why?
<<>>=
# write all packages in the current session to a bib file
write_bib(c(.packages(), 'evaluate', 'formatR', 'highr'), file = 'knitr-packages.bib')
@

\bibliographystyle{jss}
\bibliography{knitr-manual,knitr-packages}

\end{document}