265 lines
7.7 KiB
Plaintext
Raw Permalink Normal View History

2025-01-12 00:52:51 +08:00
---
title: "Simple variable interpolation"
author: "Zuguang Gu (z.gu@dkfz.de)"
date: '`r Sys.Date()`'
output:
html_document:
fig_caption: true
toc: true
toc_depth: 2
toc_float: true
vignette: >
%\VignetteIndexEntry{Simple variable interpolation}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
---------------------------------------------------------------------
```{r, echo = FALSE, message = FALSE}
library(markdown)
options(markdown.HTML.options = c(options('markdown.HTML.options')[[1]], "toc"))
library(knitr)
knitr::opts_chunk$set(
error = FALSE,
tidy = FALSE,
message = FALSE,
fig.align = "center")
options(markdown.HTML.stylesheet = "custom.css")
options(width = 100)
```
There are several ways to construct strings in **R** such as `paste()`.
However, when the string which is going to be constructed is too complex,
using `paste()` can be a pain. For example, we want to put some parameters as
title in a plot.
```{r}
region = c(1, 2)
value = 4
name = "name"
str = paste("region = (", region[1], ", ", region[2], "), value = ", value,
", name = '", name, "'", sep = "")
cat(str)
```
As you can see, it is hard to read and very easy to make mistakes. (Syntax
highlighting may be helpful to match brackets, but it is still quite annoying
to see so many commas and quotes.)
In **Perl**, we always use variable interpolation to construct complex strings
in which variables are started with special marks (sigil), and variables will
be replaced with their real values. In this package, we aim to implement
variable interpolation in R. The idea is rather simple: use special marks to
identify variables and then replace with their values. The function here is
`qq()` which is named from the subroutine with the same name in **Perl** (It
stands for double quote). Using variable interpolation, above example can be
written as:
```{r}
library(GetoptLong)
str = qq("region = (@{region[1]}, @{region[2]}), value = @{value}, name = '@{name}'")
cat(str)
```
Or use the shortcut function `qqcat()`:
```{r}
qqcat("region = (@{region[1]}, @{region[2]}), value = @{value}, name = '@{name}'")
```
One feature of `qqcat()` is you can set a global prefix to the messages by `qq.options("cat_prefix")`,
either a string or a function. If it is set as a function, the value will be generated at real time
by executing the function.
```{r}
qq.options("cat_prefix" = "[INFO] ")
qqcat("This is a message")
qq.options("cat_prefix" = function() format(Sys.time(), "[%Y-%m-%d %H:%M:%S] "))
qqcat("This is a message")
Sys.sleep(2)
qqcat("This is a message after 2 seconds")
qq.options("cat_prefix" = "")
qqcat("This is a message")
```
You can shut down all messages produced by `qqcat()` by `qq.options("cat_verbose" = FALSE)`.
```{r}
qq.options("cat_prefix" = "[INFO] ", "cat_verbose" = FALSE)
qqcat("This is a message")
```
Also you can set a prefix which has local effect.
```{r}
qq.options(RESET = TRUE)
qq.options("cat_prefix" = "[DEBUG] ")
qqcat("This is a message", cat_prefix = "[INFO] ")
qqcat("This is a message")
```
From version 1.1.2, `qq.options()` can work in a local mode in which
the copy of the options only work in a local chunk.
```{r}
qq.options("cat_prefix" = "[DEBUG] ")
qq.options(LOCAL = TRUE)
qq.options("cat_prefix" = "[INFO] ")
qqcat("This is the first message")
qqcat("This is the second message")
qq.options(LOCAL = FALSE)
qqcat("This is the third message")
```
Reset the options so that it does not affect example code in following part of the vignette.
```{r, eval = TRUE, results = 'hide', echo = TRUE}
qq.options(RESET = TRUE)
```
Not only simple scalars but also pieces of codes can be interpolated:
```{r}
n = 1
qqcat("There @{ifelse(n == 1, 'is', 'are')} @{n} dog@{ifelse(n == 1, '', 's')}.\n")
n = 2
qqcat("There @{ifelse(n == 1, 'is', 'are')} @{n} dog@{ifelse(n == 1, '', 's')}.\n")
```
If the text is too long, it can be wrapped into lines.
```{r}
qq.options("cat_strwrap" = TRUE)
qqcat("one, two, three, four, five, six, seven, eight, nine, ten, eleven, twelve, thirteen, fourteen, fifteen, sixteen, seventeen, eighteen, nineteen, twenty.")
```
There can be multiple templates:
```{r}
a = 1; b = 2; c = 3
txt = qq("command -a @{a}",
" -b @{b}",
" -c @{c}\n", sep = " \\\n")
cat(txt)
```
**NOTE:** Since `qq` as the function name is very easy to be used by other packages
(E.g., in **lattice**, there is a `qq()` function as well) and if so, you can enforce
`qq()` in your working environment as the function in **GetoptLong** by:
```{r, eval = FALSE}
qq = GetoptLong::qq
```
## Code patterns
In above exmaple, `@{}` is used to mark variables. Later, variable names will be extracted from these marks
and replaced with their real values.
The marking code pattern can be any type. But you should make sure it is easy to tell the difference
from other part in the string. You can set your code pattern as an argument in `qq()`. The default pattern
is `@\\{CODE\\}` because we only permit `CODE`
to return simple vectors and `@` is a sigil representing array in **Perl**.
In following example, the code pattern is `#{}`.
```{r}
x = 1
qqcat("x = #{x}", code.pattern = "#\\{CODE\\}")
```
Or set in `qq.options()` as a global setting:
```{r, eval = FALSE}
qq.options("code.pattern" = "#\\{CODE\\}")
```
As you can guess, in `@\\{CODE\\}`, `CODE` will be replaced with `.*?` to construct a regular
expression and to match variable names in the string. So if your `code.pattern` contains special characters,
make sure to escape them. Some candidate `code.pattern` are:
```{r, eval = FALSE}
code.pattern = "@\\{CODE\\}" # default style
code.pattern = "@\\[CODE\\]"
code.pattern = "@\\(CODE\\)"
code.pattern = "%\\{CODE\\}"
code.pattern = "%\\[CODE\\]"
code.pattern = "%\\(CODE\\)"
code.pattern = "\\$\\{CODE\\}"
code.pattern = "\\$\\[CODE\\]"
code.pattern = "\\$\\(CODE\\)"
code.pattern = "#\\{CODE\\}"
code.pattern = "#\\[CODE\\]"
code.pattern = "#\\(CODE\\)"
code.pattern = "\\[%CODE%\\]" # Template Toolkit (Perl module) style :)
```
Since we just replace `CODE` to `.*?`, the function will only match to the first right parentheses/brackets.
(In **Perl**, I always use recursive regular expression to extract such pairing parentheses. But in **R**, it seems difficult.)
So, for example, if you are using `@\\[CODE\\]` and your string is `"@[a[1]]"`, it will fail to
extract the correct variable name while only extracts `a[1`, finally it generates an error when executing `a[1`. In such condition, you should use other pattern styles
that do not contain `[]`.
Finally, I suggest a more safe code pattern style that you do not need to worry about parentheses stuff:
```{r, eval = FALSE}
code.pattern = "`CODE`"
```
## Where to look for variables
It will first look up in the envoking environment, then through searching path. Users can also pass values of variables
as a list like:
```{r}
x = 1
y = 2
qqcat("x = @{x}, y = @{y}", envir = list(x = "a", y = "b"))
```
If variables are passed through list, `qq()` only looks up in the specified list.
## Variables should only return vectors
`qq()` only allows variables to return vectors. The whole string will be interpolated repeatedly according to longest vectors,
and finally concatenated into a single long string.
```{r}
x = 1:6
qqcat("@{x} is an @{ifelse(x %% 2, 'odd', 'even')} number.\n")
y = c("a", "b")
z = c("A", "B", "C", "D", "E")
qqcat("@{x}, @{y}, @{z}\n")
```
This feature is especially useful if you want to generate a report such as formatted in a HTML table:
```{r}
name = letters[1:4]
value = 1:4
qqcat("<tr><td>@{name}</td><td>@{value}</td><tr>\n")
```
The returned value can also be a vector while not collapsed into one string:
```{r}
str = qq("@{x}, @{y}, @{z}", collapse = FALSE)
length(str)
str
```
## Session info
```{r}
sessionInfo()
```