345 lines
13 KiB
Plaintext
345 lines
13 KiB
Plaintext
|
% \VignetteIndexEntry{Getting Started with doParallel and foreach}
|
||
|
% \VignetteDepends{doParallel}
|
||
|
% \VignetteDepends{foreach}
|
||
|
% \VignettePackage{doParallel}
|
||
|
\documentclass[12pt]{article}
|
||
|
\usepackage{amsmath}
|
||
|
\usepackage[pdftex]{graphicx}
|
||
|
\usepackage{color}
|
||
|
\usepackage{xspace}
|
||
|
\usepackage{url}
|
||
|
\usepackage{fancyvrb}
|
||
|
\usepackage{fancyhdr}
|
||
|
\usepackage[
|
||
|
colorlinks=true,
|
||
|
linkcolor=blue,
|
||
|
citecolor=blue,
|
||
|
urlcolor=blue]
|
||
|
{hyperref}
|
||
|
\usepackage{lscape}
|
||
|
|
||
|
\usepackage{Sweave}
|
||
|
|
||
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||
|
|
||
|
% define new colors for use
|
||
|
\definecolor{darkgreen}{rgb}{0,0.6,0}
|
||
|
\definecolor{darkred}{rgb}{0.6,0.0,0}
|
||
|
\definecolor{lightbrown}{rgb}{1,0.9,0.8}
|
||
|
\definecolor{brown}{rgb}{0.6,0.3,0.3}
|
||
|
\definecolor{darkblue}{rgb}{0,0,0.8}
|
||
|
\definecolor{darkmagenta}{rgb}{0.5,0,0.5}
|
||
|
|
||
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||
|
|
||
|
\newcommand{\bld}[1]{\mbox{\boldmath $#1$}}
|
||
|
\newcommand{\shell}[1]{\mbox{$#1$}}
|
||
|
\renewcommand{\vec}[1]{\mbox{\bf {#1}}}
|
||
|
|
||
|
\newcommand{\ReallySmallSpacing}{\renewcommand{\baselinestretch}{.6}\Large\normalsize}
|
||
|
\newcommand{\SmallSpacing}{\renewcommand{\baselinestretch}{1.1}\Large\normalsize}
|
||
|
|
||
|
\newcommand{\halfs}{\frac{1}{2}}
|
||
|
|
||
|
\setlength{\oddsidemargin}{-.25 truein}
|
||
|
\setlength{\evensidemargin}{0truein}
|
||
|
\setlength{\topmargin}{-0.2truein}
|
||
|
\setlength{\textwidth}{7 truein}
|
||
|
\setlength{\textheight}{8.5 truein}
|
||
|
\setlength{\parindent}{0.20truein}
|
||
|
\setlength{\parskip}{0.10truein}
|
||
|
|
||
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||
|
\pagestyle{fancy}
|
||
|
\lhead{}
|
||
|
\chead{Getting Started with doParallel and foreach}
|
||
|
\rhead{}
|
||
|
\lfoot{}
|
||
|
\cfoot{}
|
||
|
\rfoot{\thepage}
|
||
|
\renewcommand{\headrulewidth}{1pt}
|
||
|
\renewcommand{\footrulewidth}{1pt}
|
||
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||
|
|
||
|
\title{Getting Started with doParallel and foreach}
|
||
|
\author{Steve Weston\footnote{Steve Weston wrote the original version of this vignette for the doMC package. Rich Calaway
|
||
|
adapted the vignette for doParallel.} and Rich Calaway}
|
||
|
|
||
|
|
||
|
\begin{document}
|
||
|
|
||
|
\maketitle
|
||
|
|
||
|
\thispagestyle{empty}
|
||
|
|
||
|
\section{Introduction}
|
||
|
|
||
|
The \texttt{doParallel} package is a ``parallel backend'' for the
|
||
|
\texttt{foreach} package. It provides a mechanism needed to execute
|
||
|
\texttt{foreach} loops in parallel. The \texttt{foreach} package must
|
||
|
be used in conjunction with a package such as \texttt{doParallel} in order to
|
||
|
execute code in parallel. The user must register a parallel backend to
|
||
|
use, otherwise \texttt{foreach} will execute tasks sequentially, even
|
||
|
when the \%dopar\% operator is used.\footnote{\texttt{foreach} will
|
||
|
issue a warning that it is running sequentially if no parallel backend
|
||
|
has been registered. It will only issue this warning once, however.}
|
||
|
|
||
|
The \texttt{doParallel} package acts as an interface between \texttt{foreach}
|
||
|
and the \texttt{parallel} package of R 2.14.0 and later. The \texttt{parallel}
|
||
|
package is essentially a merger of the \texttt{multicore} package, which was
|
||
|
written by Simon Urbanek, and the \texttt{snow} package, which was written
|
||
|
by Luke Tierney and others. The \texttt{multicore} functionality supports
|
||
|
multiple workers only on those operating systems that
|
||
|
support the \texttt{fork} system call; this excludes Windows. By default,
|
||
|
\texttt{doParallel} uses \texttt{multicore} functionality on Unix-like
|
||
|
systems and \texttt{snow} functionality on Windows. Note that
|
||
|
the \texttt{multicore} functionality only runs tasks on a single
|
||
|
computer, not a cluster of computers. However, you can use the
|
||
|
\texttt{snow} functionality to execute on a cluster, using Unix-like
|
||
|
operating systems, Windows, or even a combination.
|
||
|
It is pointless to use \texttt{doParallel} and \texttt{parallel}
|
||
|
on a machine with only one processor with a single core. To get a speed
|
||
|
improvement, it must run on a machine with multiple processors, multiple
|
||
|
cores, or both.
|
||
|
|
||
|
\section{A word of caution}
|
||
|
|
||
|
Because the \texttt{parallel} package in \texttt{multicore} mode
|
||
|
starts its workers using
|
||
|
\texttt{fork} without doing a subsequent \texttt{exec}, it has some
|
||
|
limitations. Some operations cannot be performed properly by forked
|
||
|
processes. For example, connection objects very likely won't work.
|
||
|
In some cases, this could cause an object to become corrupted, and
|
||
|
the R session to crash.
|
||
|
|
||
|
\section{Registering the \texttt{doParallel} parallel backend}
|
||
|
|
||
|
To register \texttt{doParallel} to be used with \texttt{foreach}, you must
|
||
|
call the \texttt{registerDoParallel} function. If you call this with no
|
||
|
arguments, on Windows you will get three workers and on Unix-like
|
||
|
systems you will get a number of workers equal to approximately half the
|
||
|
number of cores on your system. You can also specify a cluster
|
||
|
(as created by the \texttt{makeCluster} function) or a number of cores.
|
||
|
The \texttt{cores} argument specifies the number of worker
|
||
|
processes that \texttt{doParallel} will use to execute tasks, which will
|
||
|
by default be
|
||
|
equal to one-half the total number of cores on the machine. You don't need to
|
||
|
specify a value for it, however. By default, \texttt{doParallel} will use the
|
||
|
value of the ``cores'' option, as specified with
|
||
|
the standard ``options'' function. If that isn't set, then
|
||
|
\texttt{doParallel} will try to detect the number of cores, and use one-half
|
||
|
that many workers.
|
||
|
|
||
|
Remember: unless \texttt{registerDoMC} is called, \texttt{foreach} will
|
||
|
{\em not} run in parallel. Simply loading the \texttt{doParallel} package is
|
||
|
not enough.
|
||
|
|
||
|
\section{An example \texttt{doParallel} session}
|
||
|
|
||
|
Before we go any further, let's load \texttt{doParallel}, register it, and use
|
||
|
it with \texttt{foreach}. We will use \texttt{snow}-like functionality in this
|
||
|
vignette, so we start by loading the package and starting a cluster:
|
||
|
|
||
|
<<loadLibs>>=
|
||
|
library(doParallel)
|
||
|
cl <- makeCluster(2)
|
||
|
registerDoParallel(cl)
|
||
|
foreach(i=1:3) %dopar% sqrt(i)
|
||
|
@
|
||
|
<<echo=FALSE>>=
|
||
|
stopCluster(cl)
|
||
|
@
|
||
|
|
||
|
To use \texttt{multicore}-like functionality, we would specify the number
|
||
|
of cores to use instead (but note that on Windows, attempting to use more
|
||
|
than one core with \texttt{parallel} results in an error):
|
||
|
\begin{verbatim}
|
||
|
library(doParallel)
|
||
|
registerDoParallel(cores=2)
|
||
|
foreach(i=1:3) %dopar% sqrt(i)
|
||
|
\end{verbatim}
|
||
|
|
||
|
\begin{quote}
|
||
|
Note well that this is {\em not} a practical use of \texttt{doParallel}. This
|
||
|
is our ``Hello, world'' program for parallel computing. It tests that
|
||
|
everything is installed and set up properly, but don't expect it to run
|
||
|
faster than a sequential \texttt{for} loop, because it won't!
|
||
|
\texttt{sqrt} executes far too quickly to be worth executing in
|
||
|
parallel, even with a large number of iterations. With small tasks, the
|
||
|
overhead of scheduling the task and returning the result can be greater
|
||
|
than the time to execute the task itself, resulting in poor performance.
|
||
|
In addition, this example doesn't make use of the vector capabilities of
|
||
|
\texttt{sqrt}, which it must to get decent performance. This is just a
|
||
|
test and a pedagogical example, {\em not} a benchmark.
|
||
|
\end{quote}
|
||
|
|
||
|
But returning to the point of this example, you can see that it is very
|
||
|
simple to load \texttt{doParallel} with all of its dependencies
|
||
|
(\texttt{foreach}, \texttt{iterators}, \texttt{parallel}, etc), and to
|
||
|
register it. For the rest of the R session, whenever you execute
|
||
|
\texttt{foreach} with \texttt{\%dopar\%}, the tasks will be executed
|
||
|
using \texttt{doParallel} and \texttt{parallel}. Note that you can register
|
||
|
a different parallel backend later, or deregister \texttt{doParallel} by
|
||
|
registering the sequential backend by calling the \texttt{registerDoSEQ}
|
||
|
function.
|
||
|
|
||
|
\section{A more serious example}
|
||
|
|
||
|
Now that we've gotten our feet wet, let's do something a bit less
|
||
|
trivial. One good example is bootstrapping. Let's see how long it
|
||
|
takes to run 10,000 bootstrap iterations in parallel on
|
||
|
\Sexpr{getDoParWorkers()} cores:
|
||
|
|
||
|
<<echo=FALSE>>=
|
||
|
library(doParallel)
|
||
|
cl <- makeCluster(2)
|
||
|
registerDoParallel(cl)
|
||
|
@
|
||
|
<<bootpar>>=
|
||
|
x <- iris[which(iris[,5] != "setosa"), c(1,5)]
|
||
|
trials <- 10000
|
||
|
|
||
|
ptime <- system.time({
|
||
|
r <- foreach(icount(trials), .combine=cbind) %dopar% {
|
||
|
ind <- sample(100, 100, replace=TRUE)
|
||
|
result1 <- glm(x[ind,2]~x[ind,1], family=binomial(logit))
|
||
|
coefficients(result1)
|
||
|
}
|
||
|
})[3]
|
||
|
ptime
|
||
|
@
|
||
|
|
||
|
Using \texttt{doParallel} and \texttt{parallel} we were able to perform
|
||
|
10,000 bootstrap iterations in \Sexpr{ptime} seconds on
|
||
|
\Sexpr{getDoParWorkers()} cores. By changing the \texttt{\%dopar\%} to
|
||
|
\texttt{\%do\%}, we can run the same code sequentially to determine the
|
||
|
performance improvement:
|
||
|
|
||
|
<<bootseq>>=
|
||
|
stime <- system.time({
|
||
|
r <- foreach(icount(trials), .combine=cbind) %do% {
|
||
|
ind <- sample(100, 100, replace=TRUE)
|
||
|
result1 <- glm(x[ind,2]~x[ind,1], family=binomial(logit))
|
||
|
coefficients(result1)
|
||
|
}
|
||
|
})[3]
|
||
|
stime
|
||
|
@
|
||
|
|
||
|
|
||
|
The sequential version ran in \Sexpr{stime} seconds, which means the
|
||
|
speed up is about \Sexpr{round(stime / ptime, digits=1)} on
|
||
|
\Sexpr{getDoParWorkers()} workers.\footnote{If you build this vignette
|
||
|
yourself, you can see how well this problem runs on your hardware. None
|
||
|
of the times are hardcoded in this document. You can also run the same
|
||
|
example which is in the examples directory of the \texttt{doParallel}
|
||
|
distribution.} Ideally, the speed up would be \Sexpr{getDoParWorkers()},
|
||
|
but no multicore CPUs are ideal, and neither are the operating systems
|
||
|
and software that run on them.
|
||
|
|
||
|
At any rate, this is a more realistic example that is worth executing in
|
||
|
parallel. We do not explain what it's doing or how it works
|
||
|
here. We just want to give you something more substantial than the
|
||
|
\texttt{sqrt} example in case you want to run some benchmarks yourself.
|
||
|
You can also run this example on a cluster by simply reregistering
|
||
|
with a cluster object that specifies the nodes to use. (See the
|
||
|
\texttt{makeCluster} help file for more details.)
|
||
|
|
||
|
\section{Getting information about the parallel backend}
|
||
|
|
||
|
To find out how many workers \texttt{foreach} is going to use, you can
|
||
|
use the \texttt{getDoParWorkers} function:
|
||
|
|
||
|
<<getDoParWorkers>>=
|
||
|
getDoParWorkers()
|
||
|
@
|
||
|
|
||
|
This is a useful sanity check that you're actually running in parallel.
|
||
|
If you haven't registered a parallel backend, or if your machine only
|
||
|
has one core, \texttt{getDoParWorkers} will return one. In either case,
|
||
|
don't expect a speed improvement. \texttt{foreach} is clever, but it
|
||
|
isn't magic.
|
||
|
|
||
|
The \texttt{getDoParWorkers} function is also useful when you want the
|
||
|
number of tasks to be equal to the number of workers. You may want to
|
||
|
pass this value to an iterator constructor, for example.
|
||
|
|
||
|
You can also get the name and version of the currently registered
|
||
|
backend:
|
||
|
|
||
|
<<getDoParName>>=
|
||
|
getDoParName()
|
||
|
getDoParVersion()
|
||
|
@
|
||
|
<<echo=FALSE>>=
|
||
|
stopCluster(cl)
|
||
|
@
|
||
|
This is mostly useful for documentation purposes, or for checking that
|
||
|
you have the most recent version of \texttt{doParallel}.
|
||
|
|
||
|
\section{Specifying multicore options}
|
||
|
|
||
|
When using \texttt{multicore}-like functionality, the \texttt{doParallel} package allows
|
||
|
you to specify various options when
|
||
|
running \texttt{foreach} that are supported by the underlying
|
||
|
\texttt{mclapply} function: ``preschedule'', ``set.seed'', ``silent'',
|
||
|
and ``cores''. You can learn about these options from the
|
||
|
\texttt{mclapply} man page. They are set using the \texttt{foreach}
|
||
|
\texttt{.options.multicore} argument. Here's an example of how to do
|
||
|
that:
|
||
|
|
||
|
\begin{verbatim}
|
||
|
mcoptions <- list(preschedule=FALSE, set.seed=FALSE)
|
||
|
foreach(i=1:3, .options.multicore=mcoptions) %dopar% sqrt(i)
|
||
|
\end{verbatim}
|
||
|
|
||
|
The ``cores'' options allows you to temporarily override the number of
|
||
|
workers to use for a single \texttt{foreach} operation. This is more
|
||
|
convenient than having to re-register \texttt{doParallel}. Although if no
|
||
|
value of ``cores'' was specified when \texttt{doParallel} was registered, you
|
||
|
can also change this value dynamically using the \texttt{options}
|
||
|
function:
|
||
|
|
||
|
\begin{verbatim}
|
||
|
options(cores=2)
|
||
|
getDoParWorkers()
|
||
|
options(cores=3)
|
||
|
getDoParWorkers()
|
||
|
\end{verbatim}
|
||
|
|
||
|
If you did specify the number of cores when registering \texttt{doParallel},
|
||
|
the ``cores'' option is ignored:
|
||
|
|
||
|
\begin{verbatim}
|
||
|
registerDoParallel(4)
|
||
|
options(cores=2)
|
||
|
getDoParWorkers()
|
||
|
\end{verbatim}
|
||
|
|
||
|
As you can see, there are a number of options for controlling the number
|
||
|
of workers to use with \texttt{parallel}, but the default behaviour
|
||
|
usually does what you want.
|
||
|
|
||
|
\section{Stopping your cluster}
|
||
|
|
||
|
If you are using \texttt{snow}-like functionality, you will want to stop your
|
||
|
cluster when you are done using it. The \texttt{doParallel} package's
|
||
|
\texttt{.onUnload} function will do this automatically if the cluster was created
|
||
|
automatically by \texttt{registerDoParallel}, but if you created the cluster manually
|
||
|
you should stop it using the \texttt{stopCluster} function:
|
||
|
|
||
|
\begin{verbatim}
|
||
|
stopCluster(cl)
|
||
|
\end{verbatim}
|
||
|
|
||
|
\section{Conclusion}
|
||
|
|
||
|
The \texttt{doParallel} and \texttt{parallel} packages provide a nice,
|
||
|
efficient parallel programming platform for multiprocessor/multicore
|
||
|
computers running operating systems such as Linux and Mac OS X. It is
|
||
|
very easy to install, and very easy to use. In short order, an average
|
||
|
R programmer can start executing parallel programs, without any previous
|
||
|
experience in parallel computing.
|
||
|
|
||
|
\end{document}
|