112 lines
3.7 KiB
Plaintext
112 lines
3.7 KiB
Plaintext
---
|
||
title: "Fun with bignum: how RSA encryption works"
|
||
date: "`r Sys.Date()`"
|
||
vignette: >
|
||
%\VignetteEngine{knitr::rmarkdown}
|
||
%\VignetteIndexEntry{Fun with bignum: how RSA encryption works}
|
||
\usepackage[utf8]{inputenc}
|
||
output:
|
||
html_document
|
||
---
|
||
|
||
```{r setup, include=FALSE}
|
||
library(openssl)
|
||
knitr::opts_chunk$set(echo = TRUE)
|
||
```
|
||
|
||
|
||
Primitive types such as `int` or `double` store numbers in exactly 4 or 8 bytes, with finite precision. This suffices for most applications, but cryptography requires arithmetic on very large numbers, without loss of precision. Therefore OpenSSL uses a __bignum__ data type which holds arbitrary sized integers and implements all basic arithmetic and comparison operators such as `+`, `-`, `*`, `^`, `%%`, `%/%`, `==`, `!=`, `<`, `<=`, `>` and `>=`.
|
||
|
||
One special case, the [__modular exponent__](https://en.wikipedia.org/wiki/Modular_exponentiation) `a^b %% m` can be calculated using `bignum_mod_exp`, even when `b` is too large for calculating `a^b`.
|
||
|
||
```{r}
|
||
# create a bignum
|
||
y <- bignum("123456789123456789")
|
||
z <- bignum("D41D8CD98F00B204E9800998ECF8427E", hex = TRUE)
|
||
|
||
# size grows
|
||
print(y * z)
|
||
|
||
# Basic arithmetic
|
||
div <- z %/% y
|
||
mod <- z %% y
|
||
z2 <- div * y + mod
|
||
stopifnot(z2 == z)
|
||
stopifnot(div < z)
|
||
```
|
||
|
||
RSA involves a public key and a private key. The public key should be known by everyone and is used for encrypting messages. Messages encrypted with the public key can only be decrypted in a reasonable amount of time using the private key. In RSA, this asymmetry is based on the practical difficulty of factoring the product of two large prime numbers.
|
||
|
||
### RSA key generation
|
||
|
||
An RSA key-pair is generated as follows (adapted from [wikipedia](https://en.wikipedia.org/wiki/RSA_(cryptosystem))):
|
||
|
||
- Choose two distinct prime numbers $p$ and $q$. Keep these secret.
|
||
- Compute the product $n = p*q$. This $n$ value is public and used as the modulus.
|
||
- Compute $\phi(n) = (p − 1)(q − 1)$.
|
||
- Choose an integer $e$ smaller than $\phi(n)$ such that $e$ and $\phi(n)$ are coprime. OpenSSL always uses $65537$.
|
||
- Compute a value for $d$ such that $(d * e)\pmod{\phi(n)} = 1$.
|
||
|
||
OpenSSL has a key generator that does these things for us.
|
||
|
||
```{r}
|
||
(key <- rsa_keygen(512))
|
||
(pubkey <- key$pubkey)
|
||
```
|
||
|
||
Usually we would use `rsa_encrypt` and `rsa_decrypt` to perform the encryption:
|
||
|
||
```{r}
|
||
msg <- charToRaw("hello world")
|
||
ciphertext <- rsa_encrypt(msg, pubkey)
|
||
rawToChar(rsa_decrypt(ciphertext, key))
|
||
```
|
||
|
||
Let's look at how this works under the hood.
|
||
|
||
### How RSA encryption works
|
||
|
||
The `data` field of the private key extracts the underlying bignum integers:
|
||
|
||
```{r}
|
||
key$data
|
||
```
|
||
|
||
You can verify that the equations above hold for this key. The public key is simply a subset of the key which only contains $n$ and $e$:
|
||
|
||
```{r}
|
||
pubkey$data
|
||
```
|
||
|
||
In order to encrypt a message into ciphertext we have to treat the message data as an integer. The message cannot be larger than the key size. For example convert the text `hello world` into an integer:
|
||
|
||
```{r}
|
||
m <- bignum(charToRaw("hello world"))
|
||
print(m)
|
||
```
|
||
|
||
To encrypt this message $m$ into ciphertext $c$ we calculate $c = m^e\pmod n$. Using the public key from above:
|
||
|
||
```{r}
|
||
e <- pubkey$data$e
|
||
n <- pubkey$data$n
|
||
c <- (m ^ e) %% n
|
||
print(c)
|
||
```
|
||
|
||
This number represents our encrypted message! It is usually exchanged using base64 notation for human readability:
|
||
|
||
```{r}
|
||
base64_encode(c)
|
||
```
|
||
|
||
The ciphertext can be decrypted using $d$ from the corresponding private key via $m = c^d \pmod{n}$. Note that `c^d` is too large to calculate directly so we need to use `bignum_mod_exp` instead.
|
||
|
||
```{r}
|
||
d <- key$data$d
|
||
out <- bignum_mod_exp(c, d, n)
|
||
rawToChar(out)
|
||
```
|
||
|
||
The only difference with the actual `rsa_encrypt` and `rsa_decrypt` functions is that these add some additional padding to the data.
|