62 lines
1.8 KiB
Plaintext
62 lines
1.8 KiB
Plaintext
|
---
|
||
|
title: "Cryptographic Hashing in R"
|
||
|
date: "`r Sys.Date()`"
|
||
|
vignette: >
|
||
|
%\VignetteEngine{knitr::rmarkdown}
|
||
|
%\VignetteIndexEntry{Cryptographic Hashing in R}
|
||
|
\usepackage[utf8]{inputenc}
|
||
|
output:
|
||
|
html_document
|
||
|
---
|
||
|
|
||
|
```{r, echo = FALSE, message = FALSE}
|
||
|
knitr::opts_chunk$set(comment = "")
|
||
|
library(openssl)
|
||
|
```
|
||
|
|
||
|
The functions `sha1`, `sha256`, `sha512`, `md4`, `md5` and `ripemd160` bind to the respective [digest functions](https://docs.openssl.org/1.1.1/man1/openssl-dgst/) in OpenSSL's libcrypto. Both binary and string inputs are supported and the output type will match the input type.
|
||
|
|
||
|
```{r}
|
||
|
md5("foo")
|
||
|
md5(charToRaw("foo"))
|
||
|
```
|
||
|
|
||
|
Functions are fully vectorized for the case of character vectors: a vector with n strings will return n hashes.
|
||
|
|
||
|
```{r}
|
||
|
# Vectorized for strings
|
||
|
md5(c("foo", "bar", "baz"))
|
||
|
```
|
||
|
|
||
|
Besides character and raw vectors we can pass a connection object (e.g. a file, socket or url). In this case the function will stream-hash the binary contents of the connection.
|
||
|
|
||
|
```{r}
|
||
|
# Stream-hash a file
|
||
|
myfile <- system.file("CITATION")
|
||
|
md5(file(myfile))
|
||
|
```
|
||
|
|
||
|
Same for URLs. The hash of the [`R-installer.exe`](https://cran.r-project.org/bin/windows/base/old/4.0.0/R-4.0.0-win.exe) below should match the one in [`md5sum.txt`](https://cran.r-project.org/bin/windows/base/old/4.0.0/md5sum.txt)
|
||
|
|
||
|
```{r eval=FALSE}
|
||
|
# Stream-hash from a network connection
|
||
|
as.character(md5(url("https://cran.r-project.org/bin/windows/base/old/4.0.0/R-4.0.0-win.exe")))
|
||
|
|
||
|
# Compare
|
||
|
readLines('https://cran.r-project.org/bin/windows/base/old/4.0.0/md5sum.txt')
|
||
|
```
|
||
|
|
||
|
## Compare to digest
|
||
|
|
||
|
Similar functionality is also available in the **digest** package, but with a slightly different interface:
|
||
|
|
||
|
```{r}
|
||
|
# Compare to digest
|
||
|
library(digest)
|
||
|
digest("foo", "md5", serialize = FALSE)
|
||
|
|
||
|
# Other way around
|
||
|
digest(cars, skip = 0)
|
||
|
md5(serialize(cars, NULL))
|
||
|
```
|