# Monthly Archives: January 2017

While idly thinking while heading back from the office, and then more later while thinking after dinner with my academic little brother Alex Walker and my future academic little sister-in-law Sara Schulz, we began to think about $2017$, the number.

## General Patterns

• 2017 is a prime number. 2017 is the 306th prime. The 2017th prime is 17539.
• As 2011 is also prime, we call 2017 a sexy prime.
• 2017 can be written as a sum of two squares,
$$2017 = 9^2 +44^2,$$
and this is the only way to write it as a sum of two squares.
• Similarly, 2017 appears as the hypotenuse of a primitive Pythagorean triangle,
$$2017^2 = 792^2 + 1855^2,$$
and this is the only such right triangle.
• 2017 is uniquely identified as the first odd prime that leaves a remainder of $2$ when divided by $5$, $13$, and $31$. That is,
$$2017 \equiv 2 \pmod {5, 13, 31}.$$
• In different bases,
\begin{align} (2017)_{10} &= (2681)_9 = (3741)_8 = (5611)_7 = (13201)_6 \notag \\ &= (31032)_5 = (133201)_4 = (2202201)_3 = (11111100001)_2 \notag \end{align}
The base $2$ and base $3$ expressions are sort of nice, including repetition.
Posted in Mathematics | Tagged , | 1 Comment

## Revealing zero in fully homomorphic encryption is a Bad Thing

When I was first learning number theory, cryptography seemed really fun and really practical. I thought elementary number theory was elegant, and that cryptography was an elegant application. As I continued to learn more about mathematics, and in particular modern mathematics, I began to realize that decades of instruction and improvement (and perhaps of more useful points of view) have simplified the presentation of elementary number theory, and that modern mathematics is less elegant in presentation.

Similarly, as I learned more about cryptography, I learned that though the basic ideas are very simple, their application is often very inelegant. For example, the basis of RSA follows immediately from Euler’s Theorem as learned while studying elementary number theory, or alternately from Lagrange’s Theorem as learned while studying group theory or abstract algebra. And further, these are very early topics in these two areas of study!

But a naive implementation of RSA is doomed (For that matter, many professional implementations have their flaws too). Every now and then, a very clever expert comes up with a new attack on popular cryptosystems, generating new guidelines and recommendations. Some guidelines make intuitive sense [e.g. don’t use too small of an exponent for either the public or secret keys in RSA], but many are more complicated or designed to prevent more sophisticated attacks [especially side-channel attacks].

In the summer of 2013, I participated in the ICERM IdeaLab working towards more efficient homomorphic encryption. We were playing with existing homomorphic encryption schemes and trying to come up with new methods. One guideline that we followed is that an attacker should not be able to recognize an encryption of zero. This seems like a reasonable guideline, but I didn’t really understand why, until I was chatting with others at the 2017 Joint Mathematics Meetings in Atlanta.

It turns out that revealing zero isn’t just against generally sound advice. Revealing zero is a capital B capital T Bad Thing.

## Basic Setup

For the rest of this note, I’ll try to identify some of this reasoning.

In a typical cryptosystem, the basic setup is as follows. Andrew has a message that he wants to send to Beatrice. So Andrew converts the message into a list of numbers $M$, and uses some sort of encryption function $E(\cdot)$ to encrypt $M$, forming a ciphertext $C$. We can represent this as $C = E(M)$. Andrew transmits $C$ to Beatrice. If an eavesdropper Eve happens to intercept $C$, it should be very hard for Eve to recover any information about the original message from $C$. But when Beatrice receives $C$, she uses a corresponding decryption function $D(\cdot)$ to decrypt $C$, $M = d(C)$.

Often, the encryption and decryption techniques are based on number theoretic or combinatorial primitives. Some of these have extra structure (or at least they do with basic implementation). For instance, the RSA cryptosystem involves a public exponent $e$, a public mod $N$, and a private exponent $d$. Andrew encrypts the message $M$ by computing $C = E(M) \equiv M^e \bmod N$. Beatrice decrypts the message by computing $M = C^d \equiv M^{ed} \bmod N$.

Notice that in the RSA system, given two messages $M_1, M_2$ and corresponding ciphertexts $C_1, C_2$, we have that
\begin{equation}
E(M_1 M_2) \equiv (M_1 M_2)^e \equiv M_1^e M_2^e \equiv E(M_1) E(M_2) \pmod N. \notag
\end{equation}
The encryption function $E(\cdot)$ is a group homomorphism. This is an example of extra structure.

A fully homomorphic cryptosystem has an encryption function $E(\cdot)$ satisfying both $E(M_1 + M_2) = E(M_1) + E(M_2)$ and $E(M_1M_2) = E(M_1)E(M_2)$ (or more generally an analogous pair of operations). That is, $E(\cdot)$ is a ring homomorphism.

This extra structure allows for (a lot of) extra utility. A fully homomorphic $E(\cdot)$ would allow one to perform meaningful operations on encrypted data, even though you can’t read the data itself. For example, a clinic could store (encrypted) medical information on an external server. A doctor or nurse could pull out a cellphone or tablet with relatively little computing power or memory and securely query the medical data. Fully homomorphic encryption would allow one to securely outsource data infrastructure.

A different usage model suggests that we use a different mental model. So suppose Alice has sensitive data that she wants to store for use on EveCorp’s servers. Alice knows an encryption method $E(\cdot)$ and a decryption method $D(\cdot)$, while EveCorp only ever has mountains of ciphertexts, and cannot read the data [even though they have it].

## Why revealing zero is a Bad Thing

Let us now consider some basic cryptographic attacks. We should assume that EveCorp has access to a long list of plaintext messages $M_i$ and their corresponding ciphertexts $C_i$. Not everything, but perhaps from small leaks or other avenues. Among the messages $M_i$ it is very likely that there are two messages $M_1, M_2$ which are relatively prime. Then an application of the Euclidean Algorithm gives a linear combination of $M_1$ and $M_2$ such that
\begin{equation}
M_1 x + M_2 y = 1 \notag
\end{equation}
for some integers $x,y$. Even though EveCorp doesn’t know the encryption method $E(\cdot)$, since we are assuming that they have access to the corresponding ciphertexts $C_1$ and $C_2$, EveCorp has access to an encryption of $1$ using the ring homomorphism properties:
\begin{equation}\label{eq:encryption_of_one}
E(1) = E(M_1 x + M_2 y) = x E(M_1) + y E(M_2) = x C_1 + y C_2.
\end{equation}
By multiplying $E(1)$ by $m$, EveCorp has access to a plaintext and encryption of $m$ for any message $m$.

Now suppose that EveCorp can always recognize an encryption of $0$. Then EveCorp can mount a variety of attacks exposing information about the data it holds.

For example, EveCorp can test whether a particular message $m$ is contained in the encrypted dataset. First, EveCorp generates a ciphertext $C_m$ for $m$ by multiplying $E(1)$ by $m$, as in \eqref{eq:encryption_of_one}. Then for each ciphertext $C$ in the dataset, EveCorp computes $C – C_m$. If $m$ is contained in the dataset, then $C – C_m$ will be an encryption of $0$ for the $C$ corresponding to $m$. EveCorp recognizes this, and now knows that $m$ is in the data. To be more specific, perhaps a list of encrypted names of medical patients appears in the data, and EveCorp wants to see if JohnDoe is in that list. If they can recognize encryptions of $0$, then EveCorp can access this information.

And thus it is unacceptable for external entities to be able to consistently recognize encryptions of $0$.

Up to now, I’ve been a bit loose by saying “an encryption of zero” or “an encryption of $m$”. The reason for this is that to protect against recognition of encryptions of $0$, some entropy is added to the encryption function $E(\cdot)$, making it multivalued. So if we have a message $M$ and we encrypt it once to get $E(M)$, and we encrypt $M$ later and get $E'(M)$, it is often not true that $E(M) = E'(M)$, even though they are both encryptions of the same message. But these systems are designed so that it is true that $C(E(M)) = C(E'(M)) = M$, so that the entropy doesn’t matter.

This is a separate matter, and something that I will probably return to later.

## My Teaching

I am currently not teaching anything. Though, I am supervising a few students at the University of Warwick, I have no teaching responsibilities here.

In fall 2016, I taught Math 100 (second semester calculus, starting with integration by parts and going through sequences and series) at Brown University. Here are my concluding remarks.

In spring 2016, I designed and taught Math 42 (elementary number theory) at Brown University. My students were exceptional — check out a showcase of some of their final projects. Here are my concluding remarks.

In fall 2014, I taught Math 170 (advanced placement second semester calculus) at Brown University.

I taught number theory in the Summer@Brown program for high school students in the summers of 2013-2015.

I taught a privately requested course in precalculus in the summer of 2013.

I have served as a TA (many, many, many times) for

• Math 90 (first semester calculus) at Brown University
• Math 100 (second semester calculus) at Brown University
• Math 1501 (first semester calculus) at Georgia Tech
• Math 1502 (second semester calculus, starting with sequences and series but also with 7 weeks of linear algebra) at Georgia Tech
• Math 2401 (multivariable calculus) at Georgia Tech (there’s essentially no content on this site about this – this was just before I began to maintain a website)

I sometimes tutor at Brown (but not limited to Brown students) and around Boston, on a wide variety of topics (not just the ordinary, boring ones). I charge \$80/hour, but I am not currently looking for tutees.

Below, you can find my most recent posts tagged under “Teaching”.

I maintain the following programming projects:

LMFDB: (source), the code running the L-functions and Modular Forms website LMFDB. I’m one of the major contributors to the project, and the grant supporting my postdoc at the University of Warwick includes support for me to contribute to the LMFDB.

HNRSS: (source), a HackerNews RSS generator written in python. HNRSS periodically updates RSS feeds from the HN frontpage and best list. It also attempts to automatically summarize the link (if there is a link) and includes the top five comments, all to make it easier to determine whether it’s worth checking out.

LaTeX2Jax: (source), a tool to convert LaTeX documents to HTML with MathJax. This is a modification of the earlier MSE2WP, which converts Math.StackExchange flavored markdown to WordPress+MathJax compatible html. In particular, this is more general, and allows better control of the resulting html by exposing more CSS elements (that generically aren’t available on free WordPress setups). This is what is used for all math posts on this site.

MSE2WP: (source), a tool to convert Math.Stackexchange flavored markdown to WordPress+MathJax compatible html. This was once written for the Math.Stackexchange Community Blog. But as that blog is shutting down, there is much less of a purpose for this script. Note that this began as a modified version of latex2wp.

I actively contribute to:

vundle: (source), a minimalist but powerful vim plugin manager. Now that vim8 comes with a native plugin manager, there are fewer reasons to use vundle. But I find the vundle interface and code very straightforward. Users with lots of plugins, who frequently change plugins, or with complicated dependencies may want to consider vim-plug, which was inspired by vundle, but which takes a more sophisticated and complicated look at plugins and is often faster.

python-markdown2: (source),  a fast and complete python implementation of markdown, with a few additional features.

And I generally support or have contributed to:

SageMath: (main site), a free and open source system of tools for mathematics. Some think of it as a free alternative to the “Big M’s” — Maple, Mathematica, Magma.

Matplotlib: (main site), a plotting library in python. Most of the static plots on this site were creating using matplotlib.

crouton: (source), a tool for making Chromebooks, which by default are very limited in capability, into hackable linux laptops. This lets you directly run Linux on the device at the same time as having ChromeOS installed. The only cost is that there is absolutely no physical security at all (and every once in a while a ChromeOS update comes around and breaks lots of things). It’s great!

Below, you can find my most recent posts tagged “Programming” on this site.

I will note the following posts which have received lots of positive feedback.

1. A Notebook Preparing for a Talk at Quebec-Maine
2. A Brief Notebook on Cryptography
3. Computing pi with Tools from Calculus (which includes computational tidbits, though no actual programming).