338 lines
14 KiB
Plaintext
338 lines
14 KiB
Plaintext
---
|
||
title: "An informal introduction to async programming"
|
||
output: rmarkdown::html_vignette
|
||
vignette: >
|
||
%\VignetteEngine{knitr::rmarkdown}
|
||
%\VignetteIndexEntry{An informal introduction to async programming}
|
||
%\VignetteEncoding{UTF-8}
|
||
---
|
||
|
||
Hello, R and/or Shiny user! Let’s talk about async programming!
|
||
|
||
**Async programming? Sounds complicated.**
|
||
|
||
It is, very! You may want to grab some coffee.
|
||
|
||
**Ugh. Tell me why I even need to know this?**
|
||
|
||
Async programming is a major new addition to Shiny that can make certain classes
|
||
of apps dramatically more responsive under load.
|
||
|
||
Because R is single threaded (i.e. it can only do one thing at a time), a given
|
||
Shiny app process can also only do one thing at a time: if it is fitting a
|
||
linear model for one client, it can’t simultaneously serve up a CSV download for
|
||
another client.
|
||
|
||
For many Shiny apps, this isn’t a big problem; if no one processing step takes
|
||
very long, then no client has to wait an undue amount of time before they start
|
||
seeing results. But for apps that perform long-running operations — either
|
||
expensive computations that take a while to complete, or waiting on slow network
|
||
operations like database or web API queries — your users’ experience can suffer
|
||
dramatically as traffic ramps up. Operations that normally are lightning quick,
|
||
like downloading a small JavaScript file, can get stuck in traffic behind
|
||
something slow.
|
||
|
||
**Oh, OK—more responsiveness is always good. But you said this’ll only help for
|
||
certain classes of Shiny apps?**
|
||
|
||
It’s mostly helpful for apps that have a few specific operations that take a
|
||
long time, rather than lots of little operations that are all a bit slow on
|
||
their own and add up to one big slow mess. We’re looking for watermelons, not
|
||
blueberries.
|
||
|
||
**Watermelons… sure. So then, how does this all work?**
|
||
|
||
It all starts with *async functions*. An async function is one that performs an
|
||
operation that takes a long time, yet returns control to you immediately.
|
||
Whereas a normal function like `read.csv` will not return until its work is done
|
||
and it has the value you requested, an asynchronous `read.csv.async` function
|
||
would kick off the CSV reading operation, but then return immediately, long
|
||
before the real work has actually completed.
|
||
|
||
```r
|
||
library(future)
|
||
plan(multisession)
|
||
|
||
read.csv.async <- function(file, header = TRUE, stringsAsFactors = FALSE) {
|
||
future_promise({
|
||
read.csv(file, header = header, stringsAsFactors = stringsAsFactors)
|
||
})
|
||
}
|
||
```
|
||
|
||
(Don't worry about what this definition means for now. You'll learn more about defining async functions in [Launching tasks](promises_04_futures.html) and [Advanced `future` and `promises` usage](promises_05_future_promise.html).)
|
||
|
||
**So instead of “read this CSV file” it’s more like “begin reading this CSV
|
||
file”?**
|
||
|
||
Yes! That’s what async functions do: they start things, and give you back a
|
||
special object called a *promise*. If it doesn’t return a promise, it’s not an
|
||
async function.
|
||
|
||
**Oh, I’ve heard of promises in R! From [the NSE
|
||
chapter](http://adv-r.had.co.nz/Computing-on-the-language.html) in Hadley’s
|
||
Advanced R book!**
|
||
|
||
Ah... this is awkward, but no. I’m using the word “promise”, but I’m not referring
|
||
to *that* kind of promise. For the purposes of async programming, try to forget
|
||
that you’ve ever heard of that kind of promise, OK?
|
||
|
||
I know it seems needlessly confusing, but the promises we’re talking about here
|
||
are ~~shamelessly copied from~~ directly inspired by a central abstraction in modern
|
||
JavaScript, and the JS folks named them “promises”.
|
||
|
||
**Fine, whatever. So what are these promises?**
|
||
|
||
Conceptually, they’re a stand-in for the *eventual result* of the operation. For
|
||
example, in the case of our `read.csv.async` function, the
|
||
promise is a stand-in for a data frame. At some point, the operation is going to
|
||
finish, and a data frame is going to become available. The promise gives us a
|
||
way to get at that value.
|
||
|
||
**Let me guess: it’s an object that has `has_completed()` and
|
||
`get_value()` methods?**
|
||
|
||
Good guess, but no. Promises are *not* a way to directly inquire about the
|
||
status of an operation, nor to directly retrieve the result value. That is
|
||
probably the simplest and most obvious way to build an async framework, but in
|
||
practice it’s very difficult to build deeply async programs with an API like
|
||
that.
|
||
|
||
Instead, a promise lets you *chain together operations* that should be performed
|
||
whenever the operation completes. These operations might have side effects (like
|
||
plotting, or writing to disk, or printing to the console) or they might
|
||
transform the result values somehow.
|
||
|
||
**Chain together operations? Using the `%>%` operator?**
|
||
|
||
A lot like that! You can’t use the `%>%` operator itself, but we provide a
|
||
promise-compatible version of it: `%...>%`. So whereas you might do this to a
|
||
regular data frame:
|
||
|
||
```r
|
||
library(dplyr)
|
||
read.csv("https://rstudio.github.io/promises/data.csv") %>%
|
||
filter(state == "NY") %>%
|
||
View()
|
||
```
|
||
|
||
The async version would look like:
|
||
|
||
```r
|
||
library(dplyr)
|
||
read.csv.async("https://rstudio.github.io/promises/data.csv") %...>%
|
||
filter(state == "NY") %...>%
|
||
View()
|
||
```
|
||
|
||
The `%...>%` operator here is the secret sauce. It’s called the *promise pipe*;
|
||
the `...` stands for promise, and `>` mimics the standard pipe operator.
|
||
|
||
**What a strange looking operator. Does it work just like a regular pipe?**
|
||
|
||
In many ways `%...>%` does work like a regular pipe: it rewrites each stage’s
|
||
function call to take the previous stage’s output as the first argument. (All
|
||
the [standard magrittr
|
||
tricks](https://CRAN.R-project.org/package=magrittr/vignettes/magrittr.html)
|
||
apply here: `.`, `{`, parenthesized lambdas, etc.) But the differences, while
|
||
subtle, are profound.
|
||
|
||
The first and most important difference is that `%...>%` *must* take a promise
|
||
as input; that is, the left-hand side of the operator must be an expression that
|
||
yields a promise. The `%...>%` will do the work of “extracting” the result value
|
||
from the promise, and passing that (unwrapped) result to the function call on
|
||
the right-hand side.
|
||
|
||
This last fact—that `%...>%` passes an unwrapped, plain old, not-a-promise value
|
||
to the right-hand side—is critically important. It means we can use promise
|
||
objects with non-promise-aware functions, with `%...>%` serving as the bridge
|
||
between asynchronous and synchronous code.
|
||
|
||
**So the left-hand side of `%...>%` needs to be one of these special promise
|
||
objects, but the right-hand side can be regular R base functions?**
|
||
|
||
Yes! R base functions, dplyr, ggplot2, or whatever.
|
||
|
||
However, that work often can’t be done in the present, since the whole point of
|
||
a promise is that it represents work that hasn’t completed yet. So `%...>%` does
|
||
the work of extracting and piping not at the time that it’s called, but rather,
|
||
sometime in the future.
|
||
|
||
**You lost me.**
|
||
|
||
OK, let’s slow down and take this step by step. We’ll generate a promise by
|
||
calling an async function:
|
||
|
||
```r
|
||
df_promise <- read.csv.async("https://rstudio.github.io/promises/data.csv")
|
||
```
|
||
|
||
Even if `data.csv` is many gigabytes, `read.csv.async` returns immediately with
|
||
a new promise. We store it as `df_promise`. Eventually, when the CSV reading
|
||
operation successfully completes, the promise will contain a data frame, but for
|
||
now it’s just an empty placeholder.
|
||
|
||
One thing we definitely *can’t* do is treat `df_promise` as if it’s simply a
|
||
data frame:
|
||
|
||
```r
|
||
# Doesn't work!
|
||
dplyr::filter(df_promise, state == "NY")
|
||
```
|
||
|
||
Try this and you’ll get an error like `no applicable method for 'filter_'
|
||
applied to an object of class "promise"`. And the pipe won’t help you either;
|
||
`df_promise %>% filter(state == "NY")` will give you the same error.
|
||
|
||
**Right, that makes sense. `filter` is designed to work on data frames, and
|
||
`df_promise` isn’t a data frame.**
|
||
|
||
Exactly. Now let’s try something that actually works:
|
||
|
||
```r
|
||
df_promise %...>% filter(state == "NY")
|
||
```
|
||
|
||
At the moment it’s called, this code won’t appear to do much of anything,
|
||
really. But whenever the `df_promise` operation actually completes successfully,
|
||
then the result of that operation—the plain old data frame—will be passed to
|
||
`filter(., state = "NY")`.
|
||
|
||
**OK, so that’s good. I see what you mean about `%...>%` letting you use
|
||
non-promise functions with promises. But the whole point of using the
|
||
`filter` function is to get a data frame back. If `filter` isn’t even
|
||
going to be called until some random time in the future, how do we get its value
|
||
back?**
|
||
|
||
I’ll tell you the answer, but it’s not going to be satisfying at first.
|
||
|
||
When you use a regular `%>%`, the result you get back is the return value from
|
||
the right-hand side:
|
||
|
||
```r
|
||
df_filtered <- df %>% filter(state == "NY")
|
||
```
|
||
|
||
When you use `%...>%`, the result you get back is a promise, whose *eventual*
|
||
result will be the return value from the right-hand side:
|
||
|
||
```r
|
||
df_filtered_promise <- df_promise %...>% filter(state == "NY")
|
||
```
|
||
|
||
**Wait, what? If I have a promise, I can do stuff to it using `%...>%`, but
|
||
then I just end up with another promise? Why not just have `%...>%` return a
|
||
regular value instead of a promise?**
|
||
|
||
Remember, the whole point of a promise is that we don’t know its value yet! So
|
||
to write a function that uses a promise as input and returns some non-promise
|
||
value as output, you’d need to either be a time traveler or an oracle.
|
||
|
||
To summarize, once you start working with a promise, any calculations and
|
||
actions that are “downstream” of that promise will need to become
|
||
promise-oriented. Generally, this means once you have a promise, you need to use
|
||
`%...>%` and keep using it until your pipeline terminates.
|
||
|
||
**I guess that makes sense. Still, if the only thing you can do with promises is
|
||
make more promises, that limits their usefulness, doesn’t it?**
|
||
|
||
It’s a different way of thinking about things, to be sure, but it turns out
|
||
there’s not much limit in usefulness—especially in the context of a Shiny app.
|
||
|
||
First, you can use promises with Shiny outputs. If you’re using an
|
||
async-compatible version of Shiny (version >=1.1), all of the
|
||
built-in `renderXXX` functions can deal with either regular values or promises.
|
||
An example of the latter:
|
||
|
||
```r
|
||
output$table <- renderTable({
|
||
read.csv.async("https://rstudio.github.io/promises/data.csv") %...>%
|
||
filter(state == "NY")
|
||
})
|
||
```
|
||
|
||
When `output$table` executes the `renderTable` code block, it will notice that
|
||
the result is a promise, and wait for it to complete before continuing with the
|
||
table rendering. While it’s waiting, the R process can move on to do other
|
||
things.
|
||
|
||
Second, you can use promises with reactive expressions. Reactive expressions
|
||
treat promises about the same as they treat other values, actually. But this
|
||
works perfectly fine:
|
||
|
||
```r
|
||
# A reactive expression that returns a promise
|
||
filtered_df <- reactive({
|
||
read.csv.async("https://rstudio.github.io/promises/data.csv") %...>%
|
||
filter(state == "NY") %...>%
|
||
arrange(median_income)
|
||
})
|
||
|
||
# A reactive expression that reads the previous
|
||
# (promise-returning) reactive, and returns a
|
||
# new promise
|
||
top_n_by_income <- reactive({
|
||
filtered_df() %...>%
|
||
head(input$n)
|
||
})
|
||
|
||
output$table <- renderTable({
|
||
top_n_by_income()
|
||
})
|
||
```
|
||
|
||
Third, you can use promises in reactive observers. Use them to perform
|
||
asynchronous tasks in response to reactivity.
|
||
|
||
```r
|
||
observeEvent(input$save, {
|
||
filtered_df() %...>%
|
||
write.csv("ny_data.csv")
|
||
})
|
||
```
|
||
|
||
**Alright, I think I see what you mean. You can’t escape from promise-land, but
|
||
there’s no need to, because Shiny knows what to do with them.**
|
||
|
||
Yes, that’s basically right. You just need to keep track of which functions and
|
||
reactive expressions return promises instead of regular values, and be sure to
|
||
interact with them using `%...>%` or other promise-aware operators and
|
||
functions.
|
||
|
||
**Wait, there are other promise-aware operators and functions?**
|
||
|
||
Yes. The `%...>%` is the one you’ll most commonly use, but there is a variant
|
||
`%...T>%`, which we call the *promise tee* operator (it’s analogous to the
|
||
magrittr `%T>%` operator). The `%...T>%` operator mostly acts like `%...>%`, but
|
||
instead of returning a promise for the result value, it returns the original
|
||
value instead. Meaning `p %...T>% cat("\n")` won’t return a promise for the
|
||
return value of `cat()` (which is always `NULL`) but instead the value of `p`.
|
||
This is useful for logging, or other “side effecty” operations.
|
||
|
||
There’s also `%...!%`, and its tee version, `%...T!%`, which are used for error
|
||
handling. I won’t confuse you with more about that now, but you can read more
|
||
[here](promises_03_overview.html#error-handling).
|
||
|
||
The `promises` package is where all of these operators live, and it also comes
|
||
with some additional functions for working with promises.
|
||
|
||
So far, the only actual async function we’ve talked about has been
|
||
`read.csv.async`, which doesn’t actually exist. To learn where actual async
|
||
functions come from, read [this guide to the `future` package](promises_04_futures.html).
|
||
|
||
There are the lower-level functions `then`, `catch`, and `finally`, which are
|
||
the non-pipe, non-operator equivalents of the promise operators we’ve been
|
||
discussing. See [reference](promises_03_overview.html#accessing-results-with-then).
|
||
|
||
And finally, there are `promise_all`, `promise_race`, and `promise_lapply`, used to combine
|
||
multiple promises into a single promise. Learn more about them [here](../reference/promise_all.html).
|
||
|
||
**OK, looks like I have a lot of stuff to read up on. And I’ll probably have to
|
||
reread this conversation a few times before it fully sinks in.**
|
||
|
||
Sorry. I told you it was complicated. If you make it through the rest of the guide, you’ll be 95% of the way there.
|
||
|
||
<div style="font-size: 20px; margin-top: 40px; text-align: right;">
|
||
Next: [Working with promises](promises_03_overview.html)
|
||
</div>
|