We now have a variety of results concerning the behavior of the partial sums

$$ S_f(X) = \sum_{n \leq X} a(n) $$

where $f(z) = \sum_{n \geq 1} a(n) e(nz)$ is a GL(2) cuspform. The primary focus of our previous work was to understand the Dirichlet series

$$ D(s, S_f \times S_f) = \sum_{n \geq 1} \frac{S_f(n)^2}{n^s} $$

completely, to give its meromorphic continuation to the plane (this was the major topic of the first paper in the series), and to perform classical complex analysis on this object in order to describe the behavior of $S_f(n)$ and $S_f(n)^2$ (this was done in the first paper, and was the major topic of the second paper of the series). One motivation for studying this type of problem is that bounding $S_f(n)$ is analogous to bounding the error term in lattice point discrepancy with circles.

That is, let $S_2(R)$ denote the number of lattice points in a circle of radius $\sqrt{R}$ centered at the origin. Then we expect that $S_2(R)$ is approximately the area of the circle, plus or minus some error term. We write this as

$$ S_2(R) = \pi R + P_2(R),$$

where $P_2(R)$ is the error term. We refer to $P_2(R)$ as the “lattice point discrepancy” — it describes the discrepancy between the number of lattice points in the circle and the area of the circle. Determining the size of $P_2(R)$ is a very famous problem called the Gauss circle problem, and it has been studied for over 200 years. We believe that $P_2(R) = O(R^{1/4 + \epsilon})$, but that is not known to be true.

The Gauss circle problem can be cast in the language of modular forms. Let $\theta(z)$ denote the standard Jacobi theta series,

$$ \theta(z) = \sum_{n \in \mathbb{Z}} e^{2\pi i n^2 z}.$$

Then

$$ \theta^2(z) = 1 + \sum_{n \geq 1} r_2(n) e^{2\pi i n z},$$

where $r_2(n)$ denotes the number of representations of $n$ as a sum of $2$ (positive or negative) squares. The function $\theta^2(z)$ is a modular form of weight $1$ on $\Gamma_0(4)$, but it is not a cuspform. However, the sum

$$ \sum_{n \leq R} r_2(n) = S_2(R),$$

and so the partial sums of the coefficients of $\theta^2(z)$ indicate the number of lattice points in the circle of radius $\sqrt R$. Thus $\theta^2(z)$ gives access to the Gauss circle problem.
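
To make the count concrete, here is a minimal brute-force sketch of $S_2(R)$ and the discrepancy $P_2(R) = S_2(R) - \pi R$ for integer $R$. The function names are my own, and this is only a direct count, not the method of the paper.

```python
import math

def lattice_points_in_disc(R):
    """Count integer pairs (x, y) with x^2 + y^2 <= R, i.e. S_2(R) for integer R >= 0."""
    count = 0
    x_max = math.isqrt(R)
    for x in range(-x_max, x_max + 1):
        # For fixed x, y ranges over |y| <= floor(sqrt(R - x^2)).
        y_max = math.isqrt(R - x * x)
        count += 2 * y_max + 1
    return count

def discrepancy(R):
    """P_2(R) = S_2(R) - pi * R."""
    return lattice_points_in_disc(R) - math.pi * R

for R in (10, 100, 1000, 10000):
    print(R, lattice_points_in_disc(R), round(discrepancy(R), 3))
```

The count includes the origin, which corresponds to the constant term $1$ of $\theta^2(z)$.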

More generally, one can consider the number of lattice points in a $k$-dimensional sphere of radius $\sqrt R$ centered at the origin, which should approximately be the volume of that sphere,

$$ S_k(R) = \mathrm{Vol}(B(\sqrt R)) + P_k(R) = \sum_{n \leq R} r_k(n),$$

giving a $k$-dimensional lattice point discrepancy. For large dimension $k$, one should expect that the circle method is sufficient to give good bounds and understanding of the size and error of $S_k(R)$. For $k \geq 5$, the true order of growth for $P_k(R)$ is known (up to constants).

Therefore it happens to be that the small (meaning 2 or 3) dimensional cases are both the most interesting, given our predilection for 2 and 3 dimensional geometry, and the most enigmatic. For a variety of reasons, the three dimensional case is very challenging to understand, and is perhaps even more enigmatic than the two dimensional case.

Strong evidence for the conjectured size of the lattice point discrepancy comes in the form of mean square estimates. By looking at the square, one doesn’t need to worry about oscillation from positive to negative values. And by averaging over many radii, one hopes to smooth out some of the individual bumps. These mean square estimates take the form

$$\begin{align}

\int_0^X P_2(t)^2 dt &= C X^{3/2} + O(X \log^2 X) \\

\int_0^X P_3(t)^2 dt &= C' X^2 \log X + O(X^2 \sqrt{ \log X}).

\end{align}$$

These indicate that the average size of $P_2(R)$ is $R^{1/4}$, and that the average size of $P_3(R)$ is $R^{1/2}$. In the two dimensional case, notice that the error term in the mean square asymptotic has pretty significant separation. It has essentially a $\sqrt X$ power-savings over the main term. But in the three dimensional case, there is no power separation. Even with significant averaging, we are only just capable of distinguishing a main term at all.

It is also interesting, but for more complicated reasons, that the main term in the three dimensional case has a log term within it. This is unique to the three dimensional case. But that is a description for another time.

In a paper that we recently posted to the arXiv, we show that the Dirichlet series

$$ \sum_{n \geq 1} \frac{S_k(n)^2}{n^s} $$

and

$$ \sum_{n \geq 1} \frac{P_k(n)^2}{n^s} $$

for $k \geq 3$ have understandable meromorphic continuation to the plane. Of particular interest is the $k = 3$ case, of course. We then investigate smoothed and unsmoothed mean square results. In particular, we prove the following result.

Theorem.

$$\begin{align} \int_0^\infty P_k(t)^2 e^{-t/X} dt &= C_3 X^2 \log X + C_4 X^{5/2} \\ &\quad + C_kX^{k-1} + O(X^{k-2}) \end{align}$$

In this statement, the term with $C_3$ only appears in dimension $3$, and the term with $C_4$ only appears in dimension $4$. This should really be thought of as saying that we understand the Laplace transform of the square of the lattice point discrepancy as well as can be desired.

We are also able to improve the sharp second mean in the dimension 3 case, showing in particular the following.

Theorem. There exists $\lambda > 0$ such that

$$\int_0^X P_3(t)^2 dt = C X^2 \log X + D X^2 + O(X^{2 - \lambda}).$$

We do not actually compute what we might take $\lambda$ to be, but we believe (informally) that $\lambda$ can be taken as $1/5$.

The major themes behind these new results are already present in the first paper in the series. The new ingredient involves handling the behavior of non-cuspforms at the cusps on the analytic side, and handling the apparent main terms (in this case, the volume of the ball) on the combinatorial side.

There is an additional difficulty that arises in the dimension 2 case which makes it distinct. But soon I will describe a different forthcoming work in that case.

]]>Disclaimer: There are several greenhouse gasses, and lots of other things that we’re throwing wantonly into the environment. Considering them makes things incredibly complicated incredibly quickly, so I blithely ignore them in this note.

Such rapid changes have side effects, many of which lead to bad things. That’s why nearly 150 countries ratified the Paris Agreement on Climate Change.^{1} Even if we assume that all these countries will accomplish what they agreed to (which might be challenging for the US),^{2} most nations and advocacy groups are focusing on *increasing efficiency* and *reducing emissions.* These are good goals! But what about all the carbon that is already in the atmosphere?^{3}

You know what else is a problem? Obesity! How are we to solve all of these problems?

Looking at this (very unscientific) graph,^{4} we see that the red isn’t keeping up! Maybe we aren’t using the valuable resource of our own bodies enough! Fat has carbon in it — often over 20% by weight. What if we took advantage of our propensity to become propense? How fat would we need to get to balance last year’s carbon emissions?

That’s what we investigate here.

We need some data. It turns out that, despite knowing that we put *a lot* of carbon into the atmosphere, I don’t have any idea how much *a lot* actually is. Usually it’s given in nice, relatable terms that we’re supposed to be able to make sense of — like estimates on the number of degrees of warming to expect given a certain amount of emissions. So question number one: how much carbon do we put into the atmosphere?

This uses real data from the US Energy Information Administration (in the “International Energy Statistics” dataset). This shows the highest carbon contributors from the year 2014 (the year with the most recent complete data). All countries not explicitly displayed are included in “All Others.”

What does this tell us?^{5} The vertical bars are measured in terms of “Million Metric Tons of CO2”. In total, the world released 33716 MMTons CO2.^{6}

This unit is a bit hard to wrap my head around: MMTon CO2, a million metric tons of CO2. Firstly, we should note that only 9195 MMTons of that is carbon, which is what we’re focusing on. To put this in proper perspective, that’s 2700 pounds per person alive today (or 1226 kilograms, for that crowd).^{7}
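
Here is a rough sketch of that arithmetic. The carbon fraction $12/44$ (from the atomic masses of carbon and oxygen) and the world population of 7.5 billion are assumptions I am supplying to make the computation explicit; only the 33716 MMTons figure comes from the data.

```python
CO2_MMT = 33716            # 2014 world emissions, million metric tons of CO2 (from the data)
CARBON_FRACTION = 12 / 44  # assumed mass fraction of carbon in CO2 (C = 12, O = 16)
POPULATION = 7.5e9         # assumed world population
KG_PER_TONNE = 1000
LB_PER_KG = 2.20462

carbon_mmt = CO2_MMT * CARBON_FRACTION
kg_per_person = carbon_mmt * 1e6 * KG_PER_TONNE / POPULATION
lb_per_person = kg_per_person * LB_PER_KG

print(f"carbon released: {carbon_mmt:.0f} MMTons")
print(f"per person: {kg_per_person:.0f} kg, or {lb_per_person:.0f} lb")
```

With these assumptions the numbers land very close to the 9195 MMTons and 1226 kilograms quoted above.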

So how fat would we need to get to balance one year of carbon emissions? If every man, woman, child, and elder gained a mere 2700 pounds (1226 kilograms!) of pure carbon, we would successfully sequester one year’s worth of carbon.

Unfortunately, that means about 13000 pounds (6000 kilograms) of fat, which is a bit much. So the chart really looks like this.

Wow. So this isn’t a reasonable carbon sequestration plan.^{8} We toss an **unbelievable** amount of carbon into the atmosphere. According to LiveScience, a fully grown T-Rex could weigh as much as 18000 pounds (8160 kilograms). If we assume that the overall body composition of a dinosaur is about the same as a human,^{9} so that roughly 20% of a T-Rex’s weight is carbon, then a fully grown T-Rex might have 3590 pounds of carbon within his or her body. This is approximately the same amount of carbon that corresponds to each man, woman, child, and elder’s carbon use in 2014.

That’s a weird thought. How much carbon did we pull out of the ground and burn in 2014? About the same as if every human dug up a fully grown T-Rex, burned it, and then resumed their normal lives.

A fully grown male African elephant can weigh as much as 6000 kilograms. So we might grasp the magnitude of this by thinking of every person unearthing a fully grown male African elephant each year. Alternately, although we can’t gain enough weight to sequester enough carbon, elephants can. We could initiate a policy where every human adopts and raises a new African elephant each year.

I think I’m starting to get a bigger idea of just how daunting a task of large scale carbon sequestration will actually be. 2700 pounds per person per year. Whoa. Let’s move away from fat, towards better ideas.

Following guidelines set by the US Forest Service for computing tree weight, a fully grown oak tree can weigh as much as 14 metric tons, with as much as 4 metric tons (8800 pounds) being carbon. Thus one fully grown oak tree can hold three people’s average yearly carbon emissions.

Instead of an elephant a year, every person could plant an oak tree every year. (Actually, it just takes one in every three people). If these trees never died and were able to grow to complete size, then this would also offset carbon emissions. Conversely, when we cut and burn down trees, these release lots and lots and lots of carbon.

Suppose we did this. So this year, we were to plant 2.5 billion oak trees. That’s one for every three people on Earth. According to Penn State’s Forestry Extension School, a healthy, mature, hardwood forest can have as many as 120 trees per acre. If all 2.5 billion trees were planted at this density together, then this would cover 32552 square miles. The area of South Carolina is 32020 square miles, so we could cover the entire state of South Carolina with newly planted oak trees.^{10}
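
The forest area is a quick unit conversion. In this sketch the only number I am adding is the standard 640 acres per square mile; the tree count, the density, and the area of South Carolina are the figures quoted above.

```python
TREES = 2.5e9                 # one oak for every three people
TREES_PER_ACRE = 120          # density of a mature hardwood forest (figure quoted above)
ACRES_PER_SQUARE_MILE = 640   # standard conversion
SOUTH_CAROLINA_SQ_MILES = 32020

acres = TREES / TREES_PER_ACRE
square_miles = acres / ACRES_PER_SQUARE_MILE

print(f"{square_miles:.0f} square miles of new forest per year")
print(f"{square_miles / SOUTH_CAROLINA_SQ_MILES:.2f} South Carolinas")
```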

Of course, oak trees are probably not the best choice for a carbon sequestration tree, and there are probably plants that, in optimal growth conditions, hold a much higher carbon per square mile concentration.^{11} Perhaps some trees are three times as effective (a Maryland per year), or maybe even ten times as effective (a Delaware per year).

But that is the magnitude of the effort. Now if you’ll excuse me, I’m going to go hug a tree.

]]>Here are the slides from my defense.

After the defense, I gave Jeff and Jill a poster of our family tree. I made this using data from Math Genealogy, which has so much data.

]]>$$ \int_0^1 f(x) dx = F(1) – F(0). $$

The dream of latex2html5 is to be able to describe a diagram using the language of PSTricks inside LaTeX, throw in a bit of sugar to describe how interactivity should work on the web, and then render this to a beautiful svg using javascript.

Unfortunately, I did not try to make this work on WordPress (as WordPress is a bit finicky about how it interacts with javascript). So instead, I wrote a more detailed description about latex2html5, including some examples and some criticisms, on my non-Wordpress website david.lowryduda.com.

]]>

The story began when (with Tom Hulse, Chan Ieong Kuan, and Alex Walker — and with helpful input from Mehmet Kiral, Jeff Hoffstein, and others) we introduced and studied the Dirichlet series

$$\begin{equation}

\sum_{n \geq 1} \frac{S(n)^2}{n^s}, \notag

\end{equation}$$

where $S(n)$ is a sum of the first $n$ Fourier coefficients of an automorphic form on GL(2). We’ve done this successfully with a variety of automorphic forms, leading to new results for averages, short-interval averages, sign changes, and mean-square estimates of the error for several classical problems. Many of these papers and results have been discussed in other places on this site.

Ultimately, the problem becomes acquiring sufficiently detailed understandings of the spectral behavior of various forms (or more correctly, the behavior of the spectral expansion of a Poincare series against various forms).

We are continuing to research and study a variety of problems through this general approach.

The slides for this talk are available here.

]]>In application, this is somewhat more complicated. But to show the technique, I apply it to reprove some classic bounds on $\text{GL}(2)$ $L$-functions.

This note is also available as a pdf. This was first written as a LaTeX document, and then modified to fit into wordpress through latex2jax.

Consider a Dirichlet series

$$\begin{equation}

D(s) = \sum_{n \geq 1} \frac{a(n)}{n^s}. \notag

\end{equation}$$

Suppose that this Dirichlet series converges absolutely for $\Re s > 1$, has meromorphic continuation to the complex plane, and satisfies a functional equation of shape

$$\begin{equation}

\Lambda(s) := G(s) D(s) = \epsilon \Lambda(1-s), \notag

\end{equation}$$

where $\lvert \epsilon \rvert = 1$ and $G(s)$ is a product of Gamma factors.

Dirichlet series are often used as a tool to study number theoretic functions with multiplicative properties. By studying the analytic properties of the Dirichlet series, one hopes to extract information about the coefficients $a(n)$. Some of the most common interesting information within Dirichlet series comes from partial sums

$$\begin{equation}

S(n) = \sum_{m \leq n} a(m). \notag

\end{equation}$$

For example, the Gauss Circle and Dirichlet Divisor problems can both be stated as problems concerning sums of coefficients of Dirichlet series.

One can try to understand the partial sum directly by understanding the integral transform

$$\begin{equation}

S(n) = \frac{1}{2\pi i} \int_{(2)} D(s) \frac{n^s}{s} ds, \notag

\end{equation}$$

a Perron integral. However, it is often challenging to understand this integral, as delicate properties concerning the convergence of the integral often come into play.

Instead, one often tries to understand a smoothed sum of the form

$$\begin{equation}

\sum_{m \geq 1} a(m) v(m) \notag

\end{equation}$$

where $v(m)$ is a smooth function that vanishes or decays extremely quickly for values of $m$ larger than $n$. A large class of smoothed sums can be obtained by starting with a very nicely behaved weight function $v(m)$ and taking its Mellin transform

$$\begin{equation}

V(s) = \int_0^\infty v(x) x^s \frac{dx}{x}. \notag

\end{equation}$$

Then Mellin inversion gives that

$$\begin{equation}

\sum_{m \geq 1} a(m) v(m/X) = \frac{1}{2\pi i} \int_{(2)} D(s) X^s V(s) ds, \notag

\end{equation}$$

as long as $v$ and $V$ are nice enough functions.

In this note, we will use two smoothing integral transforms and corresponding smoothed sums. We will use one smooth function $v_1$ (which depends on another parameter $Y$) with the property that

$$\begin{equation}

\sum_{m \geq 1} a(m) v_1(m/X) \approx \sum_{\lvert m - X \rvert < X/Y} a(m). \notag

\end{equation}$$

And we will use another smooth function $v_2$ (which also depends on $Y$) with the property that

$$\begin{equation}

\sum_{m \geq 1} a(m) v_2(m/X) = \sum_{m \leq X} a(m) + \sum_{X < m < X + X/Y} a(m) v_2(m/X). \notag

\end{equation}$$

Further, as long as the coefficients $a(m)$ are nonnegative, it will be true that

$$\begin{equation}

\sum_{X < m < X + X/Y} a(m) v_2(m/X) \ll \sum_{\lvert m - X \rvert < X/Y} a(m), \notag

\end{equation}$$

which is exactly what $\sum a(m) v_1(m/X)$ estimates. Therefore

$$\begin{equation}\label{eq:overall_plan}

\sum_{m \leq X} a(m) = \sum_{m \geq 1} a(m) v_2(m/X) + O\Big(\sum_{m \geq 1} a(m) v_1(m/X) \Big).

\end{equation}$$

Hence sufficient understanding of $\sum a(m) v_1(m/X)$ and $\sum a(m) v_2(m/X)$ allows one to understand the sharp sum

$$\begin{equation}

\sum_{m \leq X} a(m). \notag

\end{equation}$$

Let us now introduce the two cutoff functions that we will use.

We use the Mellin transform

$$\begin{equation}

\frac{1}{2\pi i} \int_{(2)} \exp \Big( \frac{\pi s^2}{Y^2} \Big) \frac{X^s}{Y} ds = \frac{1}{2\pi} \exp \Big( - \frac{Y^2 \log^2 X}{4\pi} \Big). \notag

\end{equation}$$

Then

$$\begin{equation}

\frac{1}{2\pi i} \int_{(2)} D(s) \exp \Big( \frac{\pi s^2}{Y^2} \Big) \frac{X^s}{Y} ds = \frac{1}{2\pi} \sum_{n \geq 1} a(n) \exp \Big( - \frac{Y^2 \log^2 (X/n)}{4\pi} \Big). \notag

\end{equation}$$

For $n \in [X - X/Y, X + X/Y]$, the exponential damping term is essentially constant. However, for $n$ with $\lvert n - X \rvert > X/Y$, the damping term decays exponentially quickly. Therefore this integral is very nearly the sum over those $n$ with $\lvert n - X \rvert < X/Y$.

For this reason we sometimes call this transform a concentrating integral transform. All of the mass of the integral is concentrated in a small interval of width $X/Y$ around the point $X$.
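
To see the concentration numerically, here is a small sketch that evaluates the damping weight $\exp(-Y^2 \log^2(X/n)/4\pi)$ at a few points. The function name and the sample values of $X$ and $Y$ are illustrative choices of mine.

```python
import math

def concentrating_weight(n, X, Y):
    """The damping factor exp(-Y^2 log^2(X/n) / (4 pi)) attached to a(n)."""
    return math.exp(-(Y ** 2) * math.log(X / n) ** 2 / (4 * math.pi))

X, Y = 10_000, 100
for n in (X, X + X // Y, X + 5 * X // Y, 2 * X):
    print(n, concentrating_weight(n, X, Y))
```

At the edge of the window, $n = X + X/Y$, the weight is still above $0.9$, while at $n = 2X$ it is far smaller than any power of $X$ one might care about.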

Note that if $a(n)$ is nonnegative, then we have the trivial bound

$$\begin{equation}

\sum_{\lvert n - X \rvert < X/Y} a(n) \ll \sum_{n \geq 1} a(n) \exp \Big( - \frac{Y^2 \log^2 (X/n)}{4\pi} \Big). \notag

\end{equation}$$

As this is a bit less known, we include a brief proof of this transform.

Write $X^s = e^{s\log X}$ and complete the square in the exponents. Since the integrand is entire and the integral is absolutely convergent, we may perform a change of variables $s \mapsto s-Y^2 \log X/2\pi$ and shift the line of integration back to the imaginary axis. This yields

$$\begin{equation}

\frac{1}{2\pi i} \exp\left( - \frac{Y^2 \log^2 X}{4\pi}\right) \int_{(0)} e^{\pi s^2/Y^2} \frac{ds}{Y}. \notag

\end{equation}$$

The change of variables $s \mapsto isY$ transforms the integral into the standard Gaussian, completing the proof.
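
As a sanity check on the constants, one can compare the contour integral with the closed form numerically. This is only a sketch: the trapezoid rule, the truncation height `T`, and the sample values of $X$ and $Y$ are my own choices, and nothing here is part of the proof.

```python
import cmath
import math

def concentrating_transform(X, Y, sigma=2.0, T=60.0, steps=12000):
    """Trapezoid approximation of (1/(2 pi i)) * int_{(sigma)} exp(pi s^2/Y^2) X^s / Y ds."""
    h = 2 * T / steps
    total = 0j
    for j in range(steps + 1):
        t = -T + j * h
        s = complex(sigma, t)
        value = cmath.exp(math.pi * s * s / Y ** 2 + s * math.log(X)) / Y
        weight = 0.5 if j in (0, steps) else 1.0
        total += weight * value
    # On the line s = sigma + it we have ds = i dt, so 1/(2 pi i) becomes 1/(2 pi).
    return total * h / (2 * math.pi)

def closed_form(X, Y):
    """The claimed value (1/(2 pi)) * exp(-Y^2 log^2 X / (4 pi))."""
    return math.exp(-(Y ** 2) * math.log(X) ** 2 / (4 * math.pi)) / (2 * math.pi)

for X, Y in ((1.3, 5.0), (0.9, 4.0)):
    print(X, Y, concentrating_transform(X, Y), closed_form(X, Y))
```

The integrand has Gaussian decay in $t$, so a plain trapezoid rule on a truncated line is already very accurate for moderate $Y \log X$.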


For $X, Y > 0$, let $v_Y(X)$ denote a smooth non-negative function with maximum value $1$ satisfying

- $v_Y(X) = 1$ for $X \leq 1$,
- $v_Y(X) = 0$ for $X \geq 1 + \frac{1}{Y}$.

Let $V(s)$ denote the Mellin transform of $v_Y(X)$, given by

$$\begin{equation}

V(s)=\int_0^\infty t^s v_Y(t) \frac{dt}{t} \notag

\end{equation}$$

when $\Re(s) > 0$. Through repeated applications of integration by parts, one can show that $V(s)$ satisfies the following properties:

1. $V(s) = \frac{1}{s} + O_s(\frac{1}{Y})$.
2. $V(s) = -\frac{1}{s}\int_1^{1 + \frac{1}{Y}}v_Y'(t)t^s dt$.
3. For all positive integers $m$, and with $s$ constrained to within a vertical strip where $\lvert s\rvert >\epsilon$, we have

$$\begin{equation} \label{vbound}

V(s) \ll_\epsilon \frac{1}{Y}\left(\frac{Y}{1 + \lvert s \rvert}\right)^m.

\end{equation}$$

Property $(3)$ above can be extended to real $m > 1$ through the Phragmén-Lindelöf principle.

Then we have that

$$\begin{equation}

\frac{1}{2\pi i} \int_{(2)} D(s) V(s) X^s ds = \sum_{n \leq X} a(n) + \sum_{X < n < X + X/Y} a(n) v_Y(n/X). \notag

\end{equation}$$

In other words, the sharp sum $\sum_{n \leq X} a(n)$ is captured perfectly, and then there is an amount of smooth fuzz for an additional $X/Y$ terms. As long as the short sum of length $X/Y$ isn’t as large as the sum over the first $X$ terms, then this transform gives a good way of understanding the sharp sum.

When $a(n)$ is nonnegative, we have the trivial bound that

$$\begin{equation}

\sum_{X < n < X + X/Y} a(n) v_Y(n/X) \ll \sum_{\lvert n - X \rvert < X/Y} a(n). \notag

\end{equation}$$
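
Here is a small sketch of this bookkeeping for a concrete nonnegative sequence. I take $a(n) = d(n)$, the divisor function, and build $v_Y$ from the standard $e^{-1/u}$ smooth transition; both are illustrative choices of mine, and any smooth cutoff with the two properties above would do.

```python
import math

def smooth_cutoff(x, Y):
    """A choice of v_Y: equal to 1 for x <= 1, 0 for x >= 1 + 1/Y, smooth in between."""
    u = (x - 1) * Y  # rescale the transition region to (0, 1)
    if u <= 0:
        return 1.0
    if u >= 1:
        return 0.0
    a = math.exp(-1 / (1 - u))
    b = math.exp(-1 / u)
    return a / (a + b)

def divisor_counts(N):
    """Return a list d with d[n] = number of divisors of n for 1 <= n <= N."""
    d = [0] * (N + 1)
    for k in range(1, N + 1):
        for m in range(k, N + 1, k):
            d[m] += 1
    return d

X, Y = 1000, 10
N = X + X // Y + 1
d = divisor_counts(N)

sharp = sum(d[n] for n in range(1, X + 1))
smoothed = sum(d[n] * smooth_cutoff(n / X, Y) for n in range(1, N + 1))
fuzz = smoothed - sharp
short_window = sum(d[n] for n in range(1, N + 1) if abs(n - X) < X / Y)

print(sharp, smoothed, fuzz, short_window)
```

The smoothed sum exceeds the sharp sum by the fuzz, and the fuzz is at most the short sum over $\lvert n - X \rvert < X/Y$.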

We have the equality

$$\begin{align}

\sum_{n \geq 1} a(n) v_Y(n/X) &= \sum_{n \leq X} a(n) + \sum_{X < n < X + X/Y} a(n) v_Y(n/X) \notag \\

&= \sum_{n \leq X} a(n) + O\Big( \sum_{\lvert n - X \rvert < X/Y} a(n) \Big) \notag \\

&= \sum_{n \leq X} a(n) + O\bigg( \sum_{n \geq 1} a(n) \exp \Big( - \frac{Y^2 \log^2 (X/n)}{4\pi} \Big)\bigg).\notag

\end{align}$$

Rearranging, we have

$$\begin{equation}

\sum_{n \leq X} a(n) = \sum_{n \geq 1} a(n) v_Y(n/X) + O\bigg( \sum_{n \geq 1} a(n) \exp \Big( - \frac{Y^2 \log^2 (X/n)}{4\pi} \Big)\bigg). \notag

\end{equation}$$

In terms of integral transforms, we then have that

$$\begin{align}

\sum_{n \leq X} a(n) &= \frac{1}{2\pi i} \int_{(2)} D(s) V(s) X^s ds \notag \\

&\quad + O \bigg( \frac{1}{2\pi i} \int_{(2)} D(s) \exp \Big( \frac{\pi s^2}{Y^2} \Big) \frac{X^s}{Y} ds \bigg). \notag

\end{align}$$

Fortunately, the process of understanding these two integral transforms often boils down to the same fundamental task: determine how quickly Dirichlet series grow in vertical strips.

Suppose that $f(z) = \sum_{n \geq 1} a(n) e(nz)$ is a $\text{GL}(2)$ holomorphic cusp form of weight $k$. We do not restrict $k$ to be an integer, and in fact $k$ might be any rational number as long as $k > 2$. Then the Rankin-Selberg convolution

$$\begin{equation}

L(s, f \otimes \overline{f}) = \zeta(2s) \sum_{n \geq 1} \frac{\lvert a(n) \rvert^2}{n^{s + k - 1}} \notag

\end{equation}$$

is an $L$-function satisfying a functional equation of shape

$$\begin{equation}

\Lambda(s, f \otimes \overline{f}) := (2\pi)^{-2s} L(s, f \otimes \overline{f}) \Gamma(s) \Gamma(s + k - 1) = \epsilon \Lambda(1 - s, f\otimes \overline{f}), \notag

\end{equation}$$

where $\lvert \epsilon \rvert = 1$ (and in fact the right hand side $L$-function may actually correspond to a related pair of forms $\widetilde{f} \otimes \overline{\widetilde{f}}$, though this does not affect the computations done here).

It is a classically interesting question to consider the sizes of the coefficients $a(n)$. The Ramanujan-Petersson conjecture states that $a(n) \ll n^{\frac{k-1}{2} + \epsilon}$. The Ramanujan-Petersson conjecture is known for forms of full-integral weight on $\text{GL}(2)$, but this is a very deep and very technical result. In general, this type of question is very deep, and very hard.

Using nothing more than the functional equation and the pair of integral transforms, let us analyze the sizes of

$$\begin{equation}

\sum_{n \leq X} \frac{\lvert a(n) \rvert^2}{n^{k-1}}. \notag

\end{equation}$$

Note that the power $n^{k-1}$ serves to normalize the summands to be of size $1$ on average (at least conjecturally).

As described above, it is now apparent that

$$\begin{align}

\sum_{n \leq X} \frac{\lvert a(n) \rvert^2}{n^{k-1}} &= \frac{1}{2\pi i} \int_{(2)} \frac{L(s, f \otimes \overline{f})}{\zeta(2s)} V(s) X^s ds \notag \\

&\quad + O \bigg( \frac{1}{2\pi i} \int_{(2)} \frac{L(s, f \otimes \overline{f})}{\zeta(2s)} \exp \Big( \frac{\pi s^2}{Y^2} \Big) \frac{X^s}{Y} ds \bigg). \notag

\end{align}$$

We now seek to understand the two integral transforms. Due to the $\zeta(2s)^{-1}$ in the denominator, and due to the mysterious nature of the zeroes of the zeta function, it will only be possible to shift each line of integration to $\Re s = \frac{1}{2}$. Note that $L(s, f\otimes \overline{f})$ has a simple pole at $s = 1$ with a residue that I denote by $R$.

By the Phragmén-Lindelöf Convexity principle, it is known from the functional equation that

$$\begin{equation}

L(\frac{1}{2} + it, f \otimes \overline{f}) \ll (1 + \lvert t \rvert)^{3/4}. \notag

\end{equation}$$

Then we have by Cauchy’s Theorem that

$$\begin{align}

&\frac{1}{2\pi i} \int_{(2)} \frac{L(s, f\otimes \overline{f})}{\zeta(2s)} \exp \Big( \frac{\pi s^2}{Y^2} \Big) \frac{X^s}{Y} ds \notag \\

&\quad = \frac{RX e^{\pi/Y^2}}{Y\zeta(2)} + \frac{1}{2\pi i} \int_{(1/2)} \frac{L(s, f\otimes \overline{f})}{\zeta(2s)} \exp \Big( \frac{\pi s^2}{Y^2} \Big) \frac{X^s}{Y} ds. \notag

\end{align}$$

The shifted integral can be written

$$\begin{equation}\label{eq:exp_shift1}

\int_{-\infty}^\infty \frac{L(\frac{1}{2} + it, f \otimes \overline{f})}{\zeta(1 + 2it)} \exp \Big( \frac{\pi (\frac{1}{4} - t^2 + it)}{Y^2}\Big) \frac{X^{\frac{1}{2} + it}}{Y}dt.

\end{equation}$$

It is known that

$$\begin{equation}

\zeta(1 + 2it)^{-1} \ll \log (1 + \lvert t \rvert). \notag

\end{equation}$$

Therefore, bounding by absolute values shows that \eqref{eq:exp_shift1} is bounded by

$$\begin{equation}

\int_{-\infty}^\infty (1 + \lvert t \rvert)^{\frac{3}{4} + \epsilon} e^{-t^2/Y^2} \frac{X^{\frac{1}{2}}}{Y}dt. \notag

\end{equation}$$

Heuristically, the exponential decay causes this to be an integral over $t \in [-Y, Y]$, as outside this interval there is exponential decay. We can recognize this more formally by performing the change of variables $t \mapsto tY$. Then we have

$$\begin{equation}

\int_{-\infty}^\infty (1 + \lvert tY \rvert)^{\frac{3}{4} + \epsilon} e^{-t^2} X^{\frac{1}{2}} dt \ll X^{\frac{1}{2}} Y^{\frac{3}{4}+\epsilon}. \notag

\end{equation}$$

In total, this means that

$$\begin{equation}

\frac{1}{2\pi i} \int_{(2)} \frac{L(s, f\otimes \overline{f})}{\zeta(2s)} \exp \Big( \frac{\pi s^2}{Y^2} \Big) \frac{X^s}{Y} ds = \frac{RX e^{\pi/Y^2}}{Y\zeta(2)} + O(X^{\frac{1}{2}}Y^{\frac{3}{4}+\epsilon}). \notag

\end{equation}$$

Working now with the other integral transform, Cauchy’s theorem gives

$$\begin{align}

&\frac{1}{2\pi i} \int_{(2)} \frac{L(s, f\otimes \overline{f})}{\zeta(2s)} V(s) X^s ds \notag \\

&\quad = \frac{RX V(1)}{\zeta(2)} + \frac{1}{2\pi i} \int_{(1/2)} \frac{L(s, f\otimes \overline{f})}{\zeta(2s)} V(s)X^s ds. \notag

\end{align}$$

The shifted integral can again be written

$$\begin{equation}\label{eq:exp_shift2}

\int_{-\infty}^\infty \frac{L(\frac{1}{2} + it, f \otimes \overline{f})}{\zeta(1 + 2it)} V(\tfrac{1}{2} + it) X^{\frac{1}{2} + it} dt,

\end{equation}$$

and, bounding \eqref{eq:exp_shift2} by absolute values as above, we get

$$\begin{equation}

\int_{-\infty}^\infty (1 + \lvert t \rvert)^{\frac{3}{4} + \epsilon} \lvert V(\tfrac{1}{2} + it) \rvert X^{\frac{1}{2}} dt \ll \int_{-\infty}^\infty (1 + \lvert t \rvert)^{\frac{3}{4} + \epsilon} \frac{1}{Y} \bigg(\frac{Y}{1 + \lvert t \rvert}\bigg)^m X^{\frac{1}{2}} dt \notag

\end{equation}$$

for any $m \geq 0$. In order to make the integral converge, we choose $m = \frac{7}{4} + 2\epsilon$, which shows that

$$\begin{equation}

\int_{-\infty}^\infty (1 + \lvert t \rvert)^{\frac{3}{4} + \epsilon} \lvert V(\tfrac{1}{2} + it) \rvert X^{\frac{1}{2}} dt \ll X^{\frac{1}{2}}Y^{\frac{3}{4} + \epsilon}. \notag

\end{equation}$$

Therefore, we have in total that

$$\begin{equation}

\frac{1}{2\pi i} \int_{(2)} \frac{L(s, f\otimes \overline{f})}{\zeta(2s)} V(s) X^s ds = \frac{RX V(1)}{\zeta(2)} + O(X^{\frac{1}{2}}Y^{\frac{3}{4} + \epsilon}). \notag

\end{equation}$$

Notice that the $X$ and $Y$ bounds are the exact same for the two separate integral bounds, and that the bounding process was essentially identical. Heuristically, this should generally be true (although in practice there may be some advantage to one over the other).

Now that we have estimated these two integrals, we can say that

$$\begin{equation}

\sum_{n \leq X} \frac{\lvert a(n) \rvert^2}{n^{k-1}} = cX + O\big(\frac{X}{Y}\big) + O(X^{\frac{1}{2}}Y^{\frac{3}{4}+\epsilon}) \notag

\end{equation}$$

for some computable constant $c$. This is optimized when

$$\begin{equation}

X^{\frac{1}{2}} = Y^{\frac{7}{4} + \epsilon} \implies Y \approx X^{\frac{2}{7}}, \notag

\end{equation}$$

leading to

$$\begin{equation}

\sum_{n \leq X} \frac{\lvert a(n) \rvert^2}{n^{k-1}} = cX + O(X^{\frac{5}{7} + \epsilon}). \notag

\end{equation}$$
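
The balancing step is just linear algebra on exponents. As a sketch (ignoring the $\epsilon$'s, and with a helper name of my own), write $Y = X^\theta$ and equate the exponents of $X/Y$ and $X^{a} Y^{b}$:

```python
from fractions import Fraction

def balance_exponents(a, b):
    """Balance X/Y against X^a * Y^b with Y = X^theta.

    Equating exponents gives 1 - theta = a + b * theta, so theta = (1 - a) / (1 + b).
    Returns (theta, exponent of X in the resulting error term).
    """
    theta = (1 - a) / (1 + b)
    return theta, 1 - theta

theta, error_exponent = balance_exponents(Fraction(1, 2), Fraction(3, 4))
print(theta, error_exponent)
```

With $a = \frac{1}{2}$ and $b = \frac{3}{4}$ this recovers $Y = X^{2/7}$ and the error exponent $\frac{5}{7}$. The same helper shows how any improvement in the exponent $\frac{3}{4}$ would feed into the final error term.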

This isn’t the best possible or best-known result, but it came for almost free! (So one can’t complain too much.) Smooth cutoffs and understood polynomial growth allow sharp cutoffs with a polynomial-savings error term.

In a forthcoming note, we will revisit this example and be slightly more clever in our application of this technique of comparing two smooth integral transforms together. We will discuss an improved (almost still free) result, and some of the limitations of the technique.

]]>- 2017 is a prime number. 2017 is the 306th prime. The 2017th prime is 17539.
- As 2011 is also prime, we call 2017 a sexy prime.
- 2017 can be written as a sum of two squares,

$$ 2017 = 9^2 +44^2,$$

and this is the only way to write it as a sum of two squares.
- Similarly, 2017 appears as the hypotenuse of a primitive Pythagorean triangle,

$$ 2017^2 = 792^2 + 1855^2,$$

and this is the only such right triangle.
- 2017 is uniquely identified as the first odd prime that leaves a remainder of $2$ when divided by $5$, $13$, and $31$. That is,

$$ 2017 \equiv 2 \pmod {5, 13, 31}.$$
- In different bases,

$$ \begin{align} (2017)_{10} &= (2681)_9 = (3741)_8 = (5611)_7 = (13201)_6 \notag \\ &= (31032)_5 = (133201)_4 = (2202201)_3 = (11111100001)_2 \notag \end{align}$$

The base $2$ and base $3$ expressions are sort of nice, including repetition.
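
Several of these facts are easy to check by machine. Here is a short sketch; the helper names are mine, and it only verifies the prime index, the two-squares representation, and the base expansions.

```python
import math

def primes_up_to(N):
    """Sieve of Eratosthenes: list of primes <= N."""
    is_prime = [True] * (N + 1)
    is_prime[0:2] = [False, False]
    for p in range(2, math.isqrt(N) + 1):
        if is_prime[p]:
            for m in range(p * p, N + 1, p):
                is_prime[m] = False
    return [n for n in range(N + 1) if is_prime[n]]

def two_square_representations(n):
    """All pairs (a, b) with 0 <= a <= b and a^2 + b^2 = n."""
    reps = []
    a = 0
    while 2 * a * a <= n:
        b = math.isqrt(n - a * a)
        if a * a + b * b == n:
            reps.append((a, b))
        a += 1
    return reps

def to_base(n, base):
    """Digits of n >= 0 in the given base (2 through 10), as a string."""
    digits = []
    while n:
        n, r = divmod(n, base)
        digits.append(str(r))
    return "".join(reversed(digits)) or "0"

primes = primes_up_to(2017)
print(len(primes), primes[-1])
print(two_square_representations(2017))
print({b: to_base(2017, b) for b in range(2, 10)})
```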

$$\begin{array}{ll}

1 = 2\cdot 0 + 1^7 & 11 = 2 + 0! + 1 + 7 \\

2 = 2 + 0 \cdot 1 \cdot 7 & 12 = 20 - 1 - 7 = -2 + (0! + 1)\cdot 7 \\

3 = (20 + 1)/7 = 20 - 17 & 13 = 20 - 1 \cdot 7 \\

4 = -2 + 0 - 1 + 7 & 14 = 20 - (-1 + 7) \\

5 = -2 + 0\cdot 1 + 7 & 15 = -2 + 0 + 17 \\

6 = -2 + 0 + 1 + 7 & 16 = -(2^0) + 17 \\

7 = 2^0 - 1 + 7 & 17 = 2\cdot 0 + 17 \\

8 = 2 + 0 - 1 + 7 & 18 = 2^0 + 17 \\

9 = 2 + 0\cdot 1 + 7 & 19 = 2\cdot 0! + 17 \\

10 = 2 + 0 + 1 + 7 & 20 = 2 + 0! + 17.

\end{array}$$

In each expression, the digits $2, 0, 1, 7$ appear, in order, with basic mathematical symbols. I wonder what the first number is that can’t be nicely expressed (subjectively, of course)?

Now let’s look at less-common manipulations with numbers.

- The digit sum of $2017$ is $10$, which has digit sum $1$.
- Take $2017$ and its reverse, $7102$. The difference between these two numbers is $5085$. Repeating gives $720$. Continuing, we get

$$ 2017 \mapsto 5085 \mapsto 720 \mapsto 693 \mapsto 297 \mapsto 495 \mapsto 99 \mapsto 0.$$

So it takes seven iterations to hit $0$, where the iteration stabilizes.
- Take $2017$ and its reverse, $7102$. Add them. We get $9119$, a palindromic number. Continuing, we get

$$ \begin{align} 2017 &\mapsto 9119 \mapsto 18238 \mapsto 101519 \notag \\ &\mapsto 1016620 \mapsto 1282721 \mapsto 2555542 \mapsto 5011094 \mapsto 9912199. \notag \end{align}$$

It takes one map to get to the first palindrome, and then seven more maps to get to the next palindrome. Another five maps would yield the next palindrome.
- Rearrange the digits of $2017$ into decreasing order, $7210$, and subtract the digits in increasing order, $0127$. This gives $7083$. Repeating once gives $8352$. Repeating again gives $6174$, at which point the iteration stabilizes. This is called Kaprekar’s Constant.
- Consider Collatz: If $n$ is even, replace $n$ by $n/2$. Otherwise, replace $n$ by $3\cdot n + 1$. On $2017$, this gives

$$\begin{align}
2017 &\mapsto 6052 \mapsto 3026 \mapsto 1513 \mapsto 4540 \notag \\
&\mapsto 2270 \mapsto 1135 \mapsto 3406 \mapsto 1703 \mapsto 5110 \notag \\
&\mapsto 2555 \mapsto 7666 \mapsto 3833 \mapsto 11500 \mapsto 5750 \notag \\
&\mapsto 2875 \mapsto 8626 \mapsto 4313 \mapsto 12940 \mapsto 6470 \notag \\
&\mapsto 3235 \mapsto 9706 \mapsto 4853 \mapsto 14560 \mapsto 7280 \notag \\
&\mapsto 3640 \mapsto 1820 \mapsto 910 \mapsto 455 \mapsto 1366 \notag \\
&\mapsto 683 \mapsto 2050 \mapsto 1025 \mapsto 3076 \mapsto 1538 \notag \\
&\mapsto 769 \mapsto 2308 \mapsto 1154 \mapsto 577 \mapsto 1732 \notag \\
&\mapsto 866 \mapsto 433 \mapsto 1300 \mapsto 650 \mapsto 325 \notag \\
&\mapsto 976 \mapsto 488 \mapsto 244 \mapsto 122 \mapsto 61 \notag \\
&\mapsto 184 \mapsto 92 \mapsto 46 \mapsto 23 \mapsto 70 \notag \\
&\mapsto 35 \mapsto 106 \mapsto 53 \mapsto 160 \mapsto 80 \notag \\
&\mapsto 40 \mapsto 20 \mapsto 10 \mapsto 5 \mapsto 16 \notag \\
&\mapsto 8 \mapsto 4 \mapsto 2 \mapsto 1 \notag
\end{align}$$

It takes $68$ steps (the chain above has $69$ terms, counting $2017$ itself) to reach the seemingly inevitable $1$. This is much shorter than the $112$ steps necessary for $2016$ or the $112$ (yes, same number) steps necessary for $2018$.

- Consider the digits $2,1,7$ (in that order). To generate the next number, take the units digit of the product of the previous $3$. This yields

$$2,1,7,4,8,4,8,6,2,6,2,4,8,4,\ldots$$

This immediately jumps into a periodic pattern of length $8$, but $217$ is not part of the period. So this is preperiodic.

- Consider the digits $2,0,1,7$. To generate the next number, take the units digit of the sum of the previous $4$. This yields

$$ 2,0,1,7,0,8,6,1,5,0,2,8,\ldots, 2,0,1,7.$$

After $1560$ steps, this produces $2,0,1,7$ again, yielding a cycle. Interestingly, the loops starting with $2018$ and $2019$ also repeat after $1560$ steps.

- Take the digits $2,0,1,7$, square them, and add the result. This gives $2^2 + 0^2 + 1^2 + 7^2 = 54$. Repeating, this gives

$$ \begin{align} 2017 &\mapsto 54 \mapsto 41 \mapsto 17 \mapsto 50 \mapsto 25 \mapsto 29 \notag \\ &\mapsto 85 \mapsto 89 \mapsto 145 \mapsto 42 \mapsto 20 \mapsto 4 \notag \\ &\mapsto 16 \mapsto 37 \mapsto 58 \mapsto 89\notag\end{align}$$

and then it reaches a cycle.

- Take the digits $2,0,1,7$, cube them, and add the result. This gives $352$. Repeating, we get $160$, and then $217$, and then $352$. This is a very tight loop.
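The digit manipulations above are easy to check by machine. The following is a minimal sketch rather than anything from the original computations: the helper names (`orbit_until_repeat`, `kaprekar_step`, and so on) are made up for illustration, and `collatz_steps` counts applications of the map, so a chain with $k$ terms takes $k - 1$ steps.

```python
def reverse_digits(n):
    """Return n with its decimal digits reversed (leading zeros are dropped)."""
    return int(str(n)[::-1])


def orbit_until_repeat(step, start):
    """Apply `step` repeatedly, stopping just before a value would repeat."""
    orbit = [start]
    while True:
        nxt = step(orbit[-1])
        if nxt in orbit:
            return orbit
        orbit.append(nxt)


def reverse_subtract(n):
    """Absolute difference between n and its digit reversal."""
    return abs(n - reverse_digits(n))


def kaprekar_step(n, width=4):
    """Digits in decreasing order minus digits in increasing order."""
    digits = sorted(str(n).zfill(width))
    return int("".join(reversed(digits))) - int("".join(digits))


def digit_power_sum(n, power):
    """Sum of the given power of each decimal digit of n."""
    return sum(int(d) ** power for d in str(n))


def maps_to_next_palindrome(n, max_maps=1000):
    """Count reverse-and-add maps needed to reach a palindrome after n."""
    for count in range(1, max_maps + 1):
        n += reverse_digits(n)
        if n == reverse_digits(n):
            return count, n
    return None  # gave up: some starting values are not known to terminate


def collatz_steps(n):
    """Number of applications of the Collatz map needed to reach 1."""
    steps = 0
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps


print(orbit_until_repeat(reverse_subtract, 2017))
print(orbit_until_repeat(kaprekar_step, 2017))
print(orbit_until_repeat(lambda n: digit_power_sum(n, 3), 2017))
print(maps_to_next_palindrome(2017), maps_to_next_palindrome(9119))
print(collatz_steps(2017))
```

Stopping at the first repeated value (instead of at $0$ or at a fixed point) keeps the same helper usable for the reverse-and-subtract, Kaprekar, and digit-power iterations, each of which ends in a loop of a different length.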

- One can make $2017$ from determinants of basic matrices in a few ways. For instance,

$$ \begin{align}

\left \lvert \begin{pmatrix} 1&2&3 \\ 4&6&7 \\ 5&8&9 \end{pmatrix}\right \rvert &= 2, \qquad

\left \lvert \begin{pmatrix} 1&2&3 \\ 4&5&6 \\ 7&8&9 \end{pmatrix}\right \rvert &= 0\notag \\

\left \lvert \begin{pmatrix} 1&2&3 \\ 4&7&6 \\ 5&9&8 \end{pmatrix}\right \rvert &= 1 , \qquad

\left \lvert \begin{pmatrix} 1&2&3 \\ 4&5&7 \\ 6&8&9 \end{pmatrix}\right \rvert &= 7\notag

\end{align}$$

The matrix with determinant $0$ has the numbers $1$ through $9$ in the most obvious configuration. The other matrices are very close in configuration.

- Alternately,

$$ \begin{align}

\left \lvert \begin{pmatrix} 1&2&3 \\ 5&6&9 \\ 4&8&7 \end{pmatrix}\right \rvert &= 20 \notag \\

\left \lvert \begin{pmatrix} 1&2&3 \\ 6&8&9 \\ 5&7&4 \end{pmatrix}\right \rvert &= 17 \notag

\end{align}$$

So one can form $20$ and $17$ separately from determinants.

- One cannot make $2017$ from a determinant using the digits $1$ through $9$ (without repetition).
- If one uses the digits from the first $9$ primes, it is interesting that one can choose configurations with determinants equal to $2016$ or $2018$, but there is no such configuration with determinant equal to $2017$.
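The determinant claims can be checked by brute force, since there are only $9! = 362880$ ways to place nine entries in a $3 \times 3$ grid. This is a rough sketch under that assumption of exhaustive search; the function names `det3` and `achievable_determinants` are invented here for illustration.

```python
from itertools import permutations


def det3(a, b, c, d, e, f, g, h, i):
    """Determinant of the 3x3 matrix with rows (a, b, c), (d, e, f), (g, h, i)."""
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def achievable_determinants(entries):
    """Set of determinants over all arrangements of nine entries in a 3x3 grid."""
    return {det3(*arrangement) for arrangement in permutations(entries)}


# The two matrices giving 20 and 17, read row by row.
print(det3(1, 2, 3, 5, 6, 9, 4, 8, 7), det3(1, 2, 3, 6, 8, 9, 5, 7, 4))

# Every determinant that the digits 1 through 9 can produce.
digit_dets = achievable_determinants(range(1, 10))
print(2017 in digit_dets)
```

Calling `achievable_determinants([2, 3, 5, 7, 11, 13, 17, 19, 23])` and testing membership of $2016$, $2017$, and $2018$ is one way to examine the statement about the first nine primes.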

Similarly, as I learned more about cryptography, I learned that though the basic ideas are very simple, their application is often very inelegant. For example, the basis of RSA follows immediately from Euler’s Theorem as learned while studying elementary number theory, or alternately from Lagrange’s Theorem as learned while studying group theory or abstract algebra. And further, these are very early topics in these two areas of study!

But a naive implementation of RSA is doomed (for that matter, many professional implementations have their flaws too). Every now and then, a very clever expert comes up with a new attack on popular cryptosystems, generating new guidelines and recommendations. Some guidelines make intuitive sense [e.g. don’t use too small an exponent for either the public or secret key in RSA], but many are more complicated or designed to prevent more sophisticated attacks [especially side-channel attacks].

In the summer of 2013, I participated in the ICERM IdeaLab working towards more efficient homomorphic encryption. We were playing with existing homomorphic encryption schemes and trying to come up with new methods. One guideline that we followed is that an attacker should not be able to recognize an encryption of zero. This seems like a reasonable guideline, but I didn’t really understand why, until I was chatting with others at the 2017 Joint Mathematics Meetings in Atlanta.

It turns out that revealing zero isn’t just against generally sound advice. Revealing zero is a capital B capital T Bad Thing.

For the rest of this note, I’ll try to identify some of this reasoning.

In a typical cryptosystem, the basic setup is as follows. Andrew has a message that he wants to send to Beatrice. So Andrew converts the message into a list of numbers $M$, and uses some sort of encryption function $E(\cdot)$ to encrypt $M$, forming a ciphertext $C$. We can represent this as $C = E(M)$. Andrew transmits $C$ to Beatrice. If an eavesdropper Eve happens to intercept $C$, it should be very hard for Eve to recover any information about the original message from $C$. But when Beatrice receives $C$, she uses a corresponding decryption function $D(\cdot)$ to decrypt $C$, recovering $M = D(C)$.

Often, the encryption and decryption techniques are based on number theoretic or combinatorial primitives. Some of these have extra structure (or at least they do in a basic implementation). For instance, the RSA cryptosystem involves a public exponent $e$, a public modulus $N$, and a private exponent $d$. Andrew encrypts the message $M$ by computing $C = E(M) \equiv M^e \bmod N$. Beatrice decrypts the message by computing $M \equiv C^d \equiv M^{ed} \bmod N$.

Notice that in the RSA system, given two messages $M_1, M_2$ and corresponding ciphertexts $C_1, C_2$, we have that

\begin{equation}

E(M_1 M_2) \equiv (M_1 M_2)^e \equiv M_1^e M_2^e \equiv E(M_1) E(M_2) \pmod N. \notag

\end{equation}

The encryption function $E(\cdot)$ is a group homomorphism. This is an example of extra structure.
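To see the multiplicative structure concretely, here is a small sketch of textbook RSA. The primes, exponent, and messages are toy values chosen for illustration (far too small for real use), and the code omits padding entirely, which is exactly why the homomorphism is visible.

```python
# Toy parameters (assumed for illustration only).
p, q = 61, 53
N = p * q                   # public modulus
phi = (p - 1) * (q - 1)     # Euler's totient of N
e = 17                      # public exponent, coprime to phi
d = pow(e, -1, phi)         # private exponent (modular inverse, Python 3.8+)


def encrypt(M):
    """Textbook RSA encryption: M^e mod N."""
    return pow(M, e, N)


def decrypt(C):
    """Textbook RSA decryption: C^d mod N."""
    return pow(C, d, N)


M1, M2 = 42, 55
C1, C2 = encrypt(M1), encrypt(M2)

# Multiplying ciphertexts gives a ciphertext of the product of the messages.
product_ciphertext = (C1 * C2) % N
print(decrypt(product_ciphertext) == (M1 * M2) % N)
```

Anyone holding only $C_1$ and $C_2$ can form a valid ciphertext of $M_1 M_2$ without knowing $d$, which is the extra structure in question.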

A fully homomorphic cryptosystem has an encryption function $E(\cdot)$ satisfying both $E(M_1 + M_2) = E(M_1) + E(M_2)$ and $E(M_1M_2) = E(M_1)E(M_2)$ (or more generally an analogous pair of operations). That is, $E(\cdot)$ is a ring homomorphism.

This extra structure allows for (a lot of) extra utility. A fully homomorphic $E(\cdot)$ would allow one to perform meaningful operations on encrypted data, even though you can’t read the data itself. For example, a clinic could store (encrypted) medical information on an external server. A doctor or nurse could pull out a cellphone or tablet with relatively little computing power or memory and securely query the medical data. Fully homomorphic encryption would allow one to securely outsource data infrastructure.

A different usage model suggests that we use a different mental model. So suppose Alice has sensitive data that she wants to store for use on EveCorp’s servers. Alice knows an encryption method $E(\cdot)$ and a decryption method $D(\cdot)$, while EveCorp only ever has mountains of ciphertexts, and cannot read the data [even though they have it].

Let us now consider some basic cryptographic attacks. We should assume that EveCorp has access to a long list of plaintext messages $M_i$ and their corresponding ciphertexts $C_i$. Not everything, but perhaps from small leaks or other avenues. Among the messages $M_i$ it is very likely that there are two messages $M_1, M_2$ which are relatively prime. Then an application of the Euclidean Algorithm gives a linear combination of $M_1$ and $M_2$ such that

\begin{equation}

M_1 x + M_2 y = 1 \notag

\end{equation}

for some integers $x,y$. Even though EveCorp doesn’t know the encryption method $E(\cdot)$, since we are assuming that they have access to the corresponding ciphertexts $C_1$ and $C_2$, EveCorp has access to an encryption of $1$ using the ring homomorphism properties:

\begin{equation}\label{eq:encryption_of_one}

E(1) = E(M_1 x + M_2 y) = x E(M_1) + y E(M_2) = x C_1 + y C_2.

\end{equation}

By multiplying $E(1)$ by $m$, EveCorp has access to a plaintext and encryption of $m$ for any message $m$.
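The derivation of $E(1)$ can be traced with a deliberately insecure stand-in for $E(\cdot)$. In the sketch below, the scheme $E(m) = km \bmod N$ with secret $k$ is an assumption made purely for illustration (it is only additively homomorphic, which is all this step uses), and the leaked messages are hypothetical values.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with a*x + b*y == g == gcd(a, b)."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        quotient = old_r // r
        old_r, r = r, old_r - quotient * r
        old_x, x = x, old_x - quotient * x
        old_y, y = y, old_y - quotient * y
    return old_r, old_x, old_y


# Toy stand-in scheme (hypothetical): E(m) = k*m mod N, with k known only to Alice.
N = 2**31 - 1
_SECRET_K = 123456789


def encrypt(m):
    """Alice's encryption; EveCorp never calls this."""
    return (_SECRET_K * m) % N


# Leaked plaintext/ciphertext pairs with relatively prime plaintexts.
M1, M2 = 2017, 1560
C1, C2 = encrypt(M1), encrypt(M2)

# EveCorp's side: only M1, M2, C1, C2 and N are used from here on.
g, x, y = extended_gcd(M1, M2)
enc_one = (x * C1 + y * C2) % N      # an encryption of 1
enc_m = (42 * enc_one) % N           # an encryption of the chosen message 42
print(g, enc_one == encrypt(1), enc_m == encrypt(42))
```

The final line calls `encrypt` only to confirm the result; the attack itself never needs the key.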

Now suppose that EveCorp can always recognize an encryption of $0$. Then EveCorp can mount a variety of attacks exposing information about the data it holds.

For example, EveCorp can test whether a particular message $m$ is contained in the encrypted dataset. First, EveCorp generates a ciphertext $C_m$ for $m$ by multiplying $E(1)$ by $m$, as in \eqref{eq:encryption_of_one}. Then for each ciphertext $C$ in the dataset, EveCorp computes $C - C_m$. If $m$ is contained in the dataset, then $C - C_m$ will be an encryption of $0$ for the $C$ corresponding to $m$. EveCorp recognizes this, and now knows that $m$ is in the data. To be more specific, perhaps a list of encrypted names of medical patients appears in the data, and EveCorp wants to see if JohnDoe is in that list. If they can recognize encryptions of $0$, then EveCorp can access this information.

And thus it is unacceptable for external entities to be able to consistently recognize encryptions of $0$.
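The membership test can be sketched in a few lines. Everything here is a hypothetical toy: the scheme $E(m) = km \bmod N$, the patient names, and the zero-recognition oracle `looks_like_zero` are assumptions chosen so that the attack is easy to follow, not a model of any real system.

```python
# Toy stand-in scheme (hypothetical): E(m) = k*m mod N, with k known only to Alice.
N = 2**127 - 1
_SECRET_K = 0x1234567890ABCDEF


def encode(name):
    """Turn a short name into an integer message (assumed smaller than N)."""
    return int.from_bytes(name.encode("utf-8"), "big")


def encrypt(m):
    """Alice's encryption; EveCorp never calls this."""
    return (_SECRET_K * m) % N


def looks_like_zero(c):
    """The assumed ability to recognize an encryption of 0."""
    return c % N == 0


# Alice uploads encrypted names; EveCorp stores only the ciphertexts.
dataset = [encrypt(encode(name)) for name in ["AliceSmith", "JohnDoe", "MaryMajor"]]

# Stands in for the encryption of 1 that EveCorp derived from leaked pairs.
enc_one = encrypt(1)


def eve_contains(name):
    """EveCorp's test: is some C - C_m an encryption of zero?"""
    c_m = (encode(name) * enc_one) % N
    return any(looks_like_zero(c - c_m) for c in dataset)


print(eve_contains("JohnDoe"), eve_contains("JaneRoe"))
```

In this toy the scheme is deterministic, so zero is trivially recognizable. The point is only that once zero can be recognized, the loop in `eve_contains` works against any scheme with the same additive structure.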

Up to now, I’ve been a bit loose by saying “an encryption of zero” or “an encryption of $m$”. The reason for this is that to protect against recognition of encryptions of $0$, some entropy is added to the encryption function $E(\cdot)$, making it multivalued. So if we have a message $M$ and we encrypt it once to get $E(M)$, and we encrypt $M$ later and get $E'(M)$, it is often not true that $E(M) = E'(M)$, even though they are both encryptions of the same message. But these systems are designed so that it is true that $D(E(M)) = D(E'(M)) = M$, so that the entropy doesn’t affect decryption.

This is a separate matter, and something that I will probably return to later.

In fall 2016, I taught Math 100 (second semester calculus, starting with integration by parts and going through sequences and series) at Brown University. Here are my concluding remarks.

In spring 2016, I designed and taught Math 42 (elementary number theory) at Brown University. My students were exceptional — check out a showcase of some of their final projects. Here are my concluding remarks.

In fall 2014, I taught Math 170 (advanced placement second semester calculus) at Brown University.

I taught number theory in the Summer@Brown program for high school students in the summers of 2013-2015.

I taught a privately requested course in precalculus in the summer of 2013.

I have served as a TA (many, many, many times) for

- Math 90 (first semester calculus) at Brown University
- Math 100 (second semester calculus) at Brown University
- Math 1501 (first semester calculus) at Georgia Tech
- Math 1502 (second semester calculus, starting with sequences and series but also with 7 weeks of linear algebra) at Georgia Tech
- Math 2401 (multivariable calculus) at Georgia Tech (there’s essentially no content on this site about this – this was just before I began to maintain a website)

I sometimes tutor at Brown (but not limited to Brown students) and around Boston, on a wide variety of topics (not just the ordinary, boring ones). I charge $80/hour, but I am not currently looking for tutees.

Below, you can find my most recent posts tagged under “Teaching”.

**HNRSS**: (source), a HackerNews RSS generator written in python. HNRSS periodically updates RSS feeds from the HN frontpage and best list. It also attempts to automatically summarize the link (if there is a link) and includes the top five comments, all to make it easier to determine whether it’s worth checking out.

**LaTeX2Jax**: (source), a tool to convert LaTeX documents to HTML with MathJax. This is a modification of the earlier MSE2WP, which converts Math.StackExchange flavored markdown to WordPress+MathJax compatible html. In particular, this is more general, and allows better control of the resulting html by exposing more CSS elements (that generically aren’t available on free WordPress setups). This is what is used for all math posts on this site.

**MSE2WP**: (source), a tool to convert Math.StackExchange flavored markdown to WordPress+MathJax compatible html. This was originally written for the Math.StackExchange Community Blog. But as that blog is shutting down, there is much less of a purpose for this script. Note that this began as a modified version of latex2wp.

I actively contribute to:

**python-markdown2**: (source), a fast and complete python implementation of markdown, with a few additional features.

And I generally support or have contributed to:

**SageMath**: (main site), a free and open source system of tools for mathematics. Some think of it as a free alternative to the “Big M’s” — Maple, Mathematica, Magma.

**Matplotlib**: (main site), a plotting library in python. Most of the static plots on this site were created using matplotlib.

**crouton**: (source), a tool for making Chromebooks, which by default are very limited in capability, into hackable linux laptops. This lets you directly run Linux on the device at the same time as having ChromeOS installed. The only cost is that there is absolutely no physical security at all (and every once in a while a ChromeOS update comes around and breaks lots of things). It’s great!

Below, you can find my most recent posts tagged “Programming” on this site.

I will note the following posts which have received lots of positive feedback.

- A Notebook Preparing for a Talk at Quebec-Maine
- A Brief Notebook on Cryptography
- Computing pi with Tools from Calculus (which includes computational tidbits, though no actual programming).