## Mathematics Category Archive

Below you will find the most recent posts tagged “Mathematics”, arranged in reverse chronological order.

Posted in Mathematics

I’m giving a talk today on my recent and forthcoming work in collaboration with Theresa Anderson, Ayla Gafni, Robert Lemke Oliver, George Shakan, Frank Thorne, Jiuya Wang, and Ruixiang Zhang. The slides for my talk can be found here.

This talk includes some discussion of our paper to appear in IMRN (link to the arXiv version, which is mostly the same as what will be published). (See also my previous discussion on this paper). But I’ll note that in this talk I lean towards a few ideas that did not make it into the paper, but which we are using in current work.

In particular, in our paper we don’t need to use group actions or classify orbit sizes, but it turns out that this is a very strong idea! I’ll note that in a very particular case, Thorne and Taniguchi have applied this type of orbit counting method in their paper “Orbital exponential sums for prehomogeneous vector spaces” to gain an extremely strong, specific understanding of the Fourier transform for their application.

This summer, I proposed a research project for PROMYS (the PROgram in Mathematics for Young Scientists), a six-week intensive summer program at Boston University where highly motivated high school students explore mathematics. Three students (Nir Elber, Raymond Feng, and Henry Xie) chose to work on this project, and previous PROMYSer Anupam Datta gave additional guidance. Their summary of their findings can be found here. (**UPDATE**: a version of this now appears on the arXiv too).

Here I briefly describe the project and the work of Nir, Raymond, and Henry.

The project was organized around understanding why the following picture has so much structure.

Fundamentally, this image depicts differences between sums related to primes. Let $p_n$ denote the $n$th prime. It follows from the Prime Number Theorem that $p_n \approx n \log n$, and thus that $n p_n \approx n^2 \log n$. One can also show that $$ \sum_{m \leq n} p_m \approx \frac{1}{2} n^2 \log n,$$ and thus we should have that $$ \frac{n p_n}{\sum_{m \leq n} p_m} \to 2.\tag{1}$$

The vertical axis in the image above examines differences between consecutive $n$ in $(1)$ (in log scale), while the horizontal axis gives $n$ (also in log scale).
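To make the quantities behind the plot concrete, here is a minimal sketch that computes the ratio in $(1)$ and the differences between consecutive values. The function names `first_primes` and `ratios` are my own for illustration, and the cutoff of $10^4$ primes is an arbitrary choice rather than the range used for the image.

```python
import math

def first_primes(n):
    """Return the first n primes using a sieve of Eratosthenes."""
    # For n >= 6 one has p_n < n (log n + log log n); use a small fixed bound otherwise.
    limit = 15 if n < 6 else int(n * (math.log(n) + math.log(math.log(n)))) + 1
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(limit) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i in range(limit + 1) if sieve[i]][:n]

def ratios(n):
    """Return r_1, ..., r_n where r_k = k * p_k / (p_1 + ... + p_k)."""
    out, total = [], 0
    for k, p in enumerate(first_primes(n), start=1):
        total += p
        out.append(k * p / total)
    return out

r = ratios(10_000)
# Differences between consecutive values of the ratio; the plot shows these in log scale.
diffs = [abs(b - a) for a, b in zip(r, r[1:])]
print(r[-1], diffs[-1])
```

Plotting `diffs` against the index on log-log axes should give a picture of the same general kind as the one above, though I make no claim that it matches the original image point for point.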

The fact that the ratio in $(1)$ converges, so that differences between consecutive terms tend to $0$, corresponds to the overall downwards trend in the graph. But there is so much more structure! Why do the points fall into “troughs” or along “curtains”? Does each line mean something?

In this version, I’ve colored differences coming from when $p_n$ is a twin prime (in blue), a cousin prime (in green), a sexy prime (in red), or a prime $p$ such that the next prime is $p+8$ (in cyan). The first dot is black because it comes from $2$. The next two correspond to $3$ and $5$ (both twin primes), and the fourth dot corresponds to $7$ and is green because the next prime after $7$ is $11$, and so on.
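The coloring rule can be sketched in a few lines. This is my reading of the description above: each prime is colored by the gap to the next prime. The name `GAP_COLORS` and the choice to leave every other gap black are assumptions made for illustration.

```python
# Colors keyed by the gap to the next prime, following the description in the text.
GAP_COLORS = {2: "blue", 4: "green", 6: "red", 8: "cyan"}

def next_prime_gap_colors(primes):
    """Color each prime (except the last) by the gap to the following prime."""
    colors = []
    for p, q in zip(primes, primes[1:]):
        # Gaps not listed above (including the gap 1 after the prime 2) stay black.
        colors.append(GAP_COLORS.get(q - p, "black"))
    return colors

primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31]
print(list(zip(primes, next_prime_gap_colors(primes))))
```

With this rule the first four primes come out black, blue, blue, green, which agrees with the first four dots described above.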

This is a strong hint at distributional aspects alluded to within the plots.

Nir, Raymond, and Henry proved many things! They quantified the rate of convergence in $(1)$, and thus quantified the guaranteed downward trend in the images, and they found images that better convey the structure of what’s going on. I was already very impressed, but then they branched out and studied more!

We chose to investigate a nuanced question: what aspects of the initial plots depend strongly on the fact that the underlying data consists of *primes*, and what aspects depend only on the fact that the underlying data consists of integers with the same *density as the primes*?

To study this, one can create a new set of distinguished elements called Promys Primes (PPrimes) with the same density as true primes using probabilistic ideas of Cramér. Let’s call $2$ and $3$ PPrimes, and then for each odd $m \geq 5$, we call $m$ a PPrime with probability $2 / \log m$. Do this for a large sequence of $m$, and we get a collection of PPrimes that has (with very high probability) the same density as true primes, but none of the multiplicative structure.
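Here is a minimal sketch of this random model. The function name `promys_primes` and the fixed seed are my own choices. Note that $2 / \log m > 1$ for $m = 5$ and $m = 7$, so in this sketch those two are always selected.

```python
import math
import random

def promys_primes(limit, seed=0):
    """Sample a set of PPrimes up to limit, following the recipe in the text."""
    rng = random.Random(seed)  # fixed seed so that runs are reproducible
    pprimes = [2, 3]
    for m in range(5, limit + 1, 2):
        # Each odd m >= 5 is kept independently with probability 2 / log m.
        if rng.random() < 2 / math.log(m):
            pprimes.append(m)
    return pprimes

pp = promys_primes(100_000)
print(len(pp), pp[:10])
```

The count of PPrimes up to $10^5$ should land near the number of true primes up to $10^5$ (which is $9592$), since both are governed by the same density.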

It turns out that for sets of PPrimes, there are analogous pictures and the asymptotics are even better! This is in section 3 of their write-up.

We also thought to study analogous situations in related sets of primes, such as the Gaussian integers. Recall that the Gaussian integers $\mathbb{Z}[i] = \{ a + bi : a, b \in \mathbb{Z} \}$ are a unique factorization domain and have a rich theory of primes. Sometimes this theory is very similar to the standard theory of primes over $\mathbb{Z}$. But there are challenges.

One significant challenge is that $\mathbb{C}$ is not ordered. A related challenge is that there are more *units*. Over $\mathbb{Z}$, both $2$ and $-2$ are primes, but we typically recognize $2$ as being more “simple”. For Gaussian primes, there isn’t such a choice; for example each of $1 + i, 1 - i, -1 + i, -1 - i$ is a Gaussian prime, but none is more simple or fundamental than the others.

More concretely, one has to be careful even with how to define the “sum of the first $n$ primes”. One natural thought might be to sum all Gaussian primes $\pi$ that have norm up to $X$. But one can quickly see that this sum is $0$ for analogous reasons to why the sum of all the typical primes with absolute values up to $X$ must vanish ($p + (-p) = 0$). In the Gaussian case, it is also true that $$ \sum_{N(\pi) \leq X} \pi^2 = 0.$$
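The vanishing is easy to check numerically. The sketch below enumerates Gaussian primes using the standard classification: $a + bi$ with $a, b$ both nonzero is prime exactly when $a^2 + b^2$ is a rational prime, and an element on an axis is prime exactly when its nonzero coordinate is, up to sign, a rational prime congruent to $3 \bmod 4$. The function names are hypothetical, and the arithmetic is done on integer pairs to keep everything exact.

```python
import math

def is_rational_prime(n):
    """Trial-division primality test for ordinary integers."""
    if n < 2:
        return False
    return all(n % d for d in range(2, math.isqrt(n) + 1))

def gaussian_primes(max_norm):
    """All Gaussian primes a + bi with a^2 + b^2 <= max_norm, as (a, b) pairs."""
    bound = math.isqrt(max_norm)
    found = []
    for a in range(-bound, bound + 1):
        for b in range(-bound, bound + 1):
            norm = a * a + b * b
            if norm > max_norm:
                continue
            if a == 0 or b == 0:
                # On the axes: a unit times a rational prime congruent to 3 mod 4.
                q = abs(a + b)
                if is_rational_prime(q) and q % 4 == 3:
                    found.append((a, b))
            elif is_rational_prime(norm):
                found.append((a, b))
    return found

def power_sum(max_norm, k):
    """Exact (real, imaginary) parts of the sum of pi^k over primes of norm <= max_norm."""
    re_total, im_total = 0, 0
    for a, b in gaussian_primes(max_norm):
        re, im = 1, 0
        for _ in range(k):
            re, im = re * a - im * b, re * b + im * a
        re_total += re
        im_total += im
    return re_total, im_total

print(power_sum(1000, 1), power_sum(1000, 2), power_sum(1000, 4))
```

The cancellation for $k = 1, 2$ comes from the four units: multiplying $\pi$ by $i^j$ multiplies $\pi^k$ by $i^{jk}$, and these factors sum to zero unless $4 \mid k$. For $k = 4$ this trivial mechanism is absent.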

But they considered higher powers, where there aren’t trivial or obvious reasons for massive cancellation, and they showed that there is *always* nontrivial cancellation. This is interesting on its own!

Then they combined the two directions: they constructed a Cramér-type model for Gaussian primes and showed that one should expect nontrivial cancellation there for purely distributional reasons.

I leave the details to their write-up. But they’ve done great work, and I look forward to seeing what they come up with in the future.

Posted in Expository, Math.NT, Mathematics

At this year’s Maine-Québec Number Theory Conference, I’m giving a talk on **Zeros of half-integral weight Dirichlet series**. Here are the slides. I note that the references for the slides are included here at the end.

I’ll also note a few open problems that I don’t know how to handle and that I briefly describe during the talk.

- Is it possible to show that every (symmetrized) Dirichlet series associated to a half-integral weight modular form must have zeros off the critical line? This is true in practice, but seems hard to show.
- Is it possible to determine whether a given Dirichlet series has zeros in the half-plane of absolute convergence? If there is one zero, there are infinitely many – but is there a way of determining if there are any?
- Why does there seem to be a gap around the critical line in zero distribution?
- Can one explain why the pair correlation seems well-behaved (even heuristically)?

If you have any ideas, let me know!

Posted in Math.NT, Mathematics
Tagged dirichlet series, half-integral weight modular form, talk

I gave a talk on visualizations of modular forms made with Adam Sakareassen at Bridges 2021. This talk goes with our short article. In this talk, I describe the line of ideas going towards producing three dimensional visualizations of modular forms, which I like to call *modular terrains*. When we first wrote that talk, we were working towards the following video visualization.

We are now working in a few different directions, involving informational visualizations of different forms and different types of forms, as well as purely artistic visualizations.

The slides for this talk can be found here.

I’ve recently been very fond of including renderings based on a picture of my wife and me in Iceland (from the beforetimes). This is us as a wallpaper (preserving many of the symmetries) for a particular modular form.

I reused a few images from Painted Modular Terrains, which I made a few months ago.

If you’re interested, you might also like a few previous talks and papers of mine:

- Slides from a talk on Visualizing Modular Forms
- Slides from a talk on computing Maass forms
- Notes behind a talk: visualizing modular forms
- Trace form 3.32.a.a
- phase_mag_plot: a sage package for plotting complex functions
- A paper: Visualizing modular forms
- A paper: Computing classical modular forms
- Bridges paper: Towards flying through modular forms

Posted in Art, Expository, Mathematics

Theresa Anderson, Ayla Gafni, Robert Lemke Oliver, George Shakan, Ruixiang Zhang, and I have just uploaded a preprint to the arXiv called *Quantitative Hilbert Irreducibility and Almost Prime Values of Polynomial Discriminants.*

George has also written about this paper on his site.

This project began at an AIM workshop on Fourier analysis, arithmetic statistics, and discrete restriction.

Our guiding question was very open. For some *nice* local polynomial conditions, can we make sense of the Fourier transforms of these local conditions well enough to have arithmetic application?

This is partly inspired by *Orbital exponential sums for prehomogeneous vector spaces* by Takashi Taniguchi and Frank Thorne (preprint available on the arXiv). In this paper, Frank and Takashi algebraically compute Fourier transforms of a couple of arithmetically interesting functions on prehomogeneous vector spaces over finite fields. It turns out that one can, for example, explicitly and completely compute the Fourier transform of the characteristic function of singular binary cubic forms over $\mathbb{F}_{q}$.

In a companion paper, Takashi and Frank combine those computations with sieves to prove that there are $\gg X / \log X$ cubic fields whose discriminant is squarefree, bounded above by $X$, and has at most $3$ prime factors. They also show there are $\gg X / \log X$ quartic fields whose discriminant is squarefree, bounded above by $X$, and has at most $8$ prime factors.

We have two classes of result. Both rely on similar types of analysis, and are each centered on a study of a particular indicator-type function, its Fourier transform, and a sieve.

First, we prove a bound on the number of polynomials whose Galois group is a subgroup of $A_n$. For $H > 1$, define \begin{equation*} V_n(H) = \{ f \in \mathbb{Z}[x] : \mathrm{ht}(f) \leq H \} \end{equation*} and \begin{equation*} E_n(H, A_n) := \# \{ f \in V_n(H) : \mathrm{Gal}(f) \subseteq A_n \}. \end{equation*} We show that \begin{equation} E_n(H, A_n) \ll H^{n - \frac{2}{3} + O(1/n)}. \end{equation} This is an improvement on progress towards a conjecture of Van der Waerden and is a quantitative form of Hilbert’s Irreducibility Theorem, which shows (among other applications) that most monic irreducible polynomials have full Galois group.

However I should note that Bhargava has announced a proof of a (slightly weakened form of) Van der Waerden’s conjecture, and his result is strictly stronger than our result.

Secondly, we prove that for any $n \geq 3$ and $r \geq 2n - 3$, we have \begin{equation} \# \{ f \in \mathbb{Z}[x] : \mathrm{ht}(f) \leq H, f \, \text{monic}, \omega(\mathrm{Disc}(f)) \leq r \} \gg_{n, r} \frac{H^n}{\log H}, \end{equation} where $\omega(\cdot)$ denotes the number of distinct prime divisors. Qualitatively, this says that there are lots of polynomials with almost prime discriminants.

As a corollary of this second result, we prove that for $n \geq 3$ and $r \geq 2n - 3$, \begin{equation} \# \{ F / \mathbb{Q} : [F \colon \mathbb{Q}] = n, \mathrm{Disc}(F) \leq X, \omega(\mathrm{Disc}(F)) \leq r \} \gg_{n, r, \epsilon} X^{\frac{1}{2} + \delta_n - \epsilon} \end{equation} for explicit $\delta_n > 0$ and any $\epsilon > 0$. This shows that there are at least $X^{1/2}$ cubic fields whose discriminants are divisible by at most $3$ primes, or at least $X^{1/2}$ quartic fields whose discriminants are divisible by at most $5$ primes, for example. We guarantee fewer fields than Taniguchi and Thorne, but we guarantee fields with fewer prime factors and cover all degrees.

In the remainder of this post, I’ll describe a line of thinking that went towards proving our first result.

We initially studied the Fourier transform of the *odd-polynomial* indicator function. We call a function $f(x) \in \mathbb{F}_p[x]$ *odd* if it has no repeated roots and the factorization type of $f$ corresponds to an odd permutation in the Galois group. That is, we can write $f$ as \begin{equation*} f(x) = f_1(x) f_2(x) \cdots f_r(x) \bmod p, \end{equation*} and there will be an element of the Galois group with cycle type $(\deg f_1) (\deg f_2) \cdots (\deg f_r)$. For *odd* $f$, this cycle must be an odd permutation.

A more convenient description of *oddness* is in terms of the Möbius function on $\mathbb{F}_p[x]$. A degree $n$ polynomial $f$ is odd precisely if $\mu_p(f) = (-1)^{n+1}$. Define $1^p_{sf}(f)$ to be the squarefree indicator function on $\mathbb{F}_p[x]$, and define $1^p_{odd, n}$ to be the odd indicator function on degree $n$ polynomials on $\mathbb{F}_p[x]$. Then \begin{equation*} 1^p_{odd, n}(f) = 1^p_n(f)\frac{(-1)^{n+1}\mu_p(f) + 1^p_{sf}(f)}{2}. \end{equation*} (Here, $1^p_n(f)$ keeps only the degree $n$ polynomials).
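To see the relation between oddness and $\mu_p$ in action, here is a brute-force sketch for small $p$ and $n$. To keep it short I restrict to monic polynomials, which is a simplification relative to the text; the function names and the trial-division approach are mine and are not how one would compute at scale. Polynomials are stored as coefficient tuples from low to high degree.

```python
from itertools import product

def divmod_monic(f, g, p):
    """Divide f by a monic g over F_p; returns (quotient, remainder coefficients)."""
    rem = list(f)
    quot = [0] * (len(f) - len(g) + 1)
    for shift in range(len(f) - len(g), -1, -1):
        coeff = rem[shift + len(g) - 1] % p
        quot[shift] = coeff
        for i, gi in enumerate(g):
            rem[shift + i] = (rem[shift + i] - coeff * gi) % p
    return tuple(quot), rem[: len(g) - 1]

def monic_polys(deg, p):
    """All monic polynomials of the given degree over F_p."""
    for lower in product(range(p), repeat=deg):
        yield lower + (1,)

def mobius(f, p):
    """Mobius function of a monic polynomial f over F_p, by trial division."""
    deg = len(f) - 1
    if deg == 0:
        return 1
    for d in range(1, deg // 2 + 1):
        for g in monic_polys(d, p):
            quot, rem = divmod_monic(f, g, p)
            if not any(rem):
                # A divisor of least positive degree is automatically irreducible.
                _, rem2 = divmod_monic(quot, g, p)
                if not any(rem2):
                    return 0  # g^2 divides f, so f is not squarefree
                return -mobius(quot, p)
    return -1  # no divisor of degree <= deg/2, so f is irreducible

def is_odd(f, p):
    """Oddness in the sense of the text: mu_p(f) = (-1)^(n+1) with n = deg f."""
    n = len(f) - 1
    return mobius(f, p) == (-1) ** (n + 1)

p, n = 3, 3
odd_count = sum(is_odd(f, p) for f in monic_polys(n, p))
print(odd_count)
```

Since there are $p^n - p^{n-1}$ monic squarefree polynomials of degree $n \geq 2$ and $\mu_p$ sums to zero over monic polynomials of each such degree, exactly half of the squarefree ones should be odd.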

We then studied the Fourier transform of $1^p_{odd, n}$. Identifying the vector space of polynomials of degree at most $n$ over $\mathbb{F}_p$, which we denote by $V_n(\mathbb{Z}/p\mathbb{Z})$, with $(\mathbb{Z}/p\mathbb{Z})^{n+1}$, we can study the Fourier transform of a function $\psi:V_n(\mathbb{Z}/p\mathbb{Z}) \longrightarrow \mathbb{C}$, \begin{equation*} \widehat{\psi}(\mathbf{u}) = \frac{1}{p^{n+1}} \sum_{f \in V_n(\mathbb{Z}/p\mathbb{Z})} \psi(f) e_p(\langle f, \mathbf{u} \rangle). \end{equation*} Here, $e_p(x) = e^{2 \pi i x / p}$.
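For small $p$ and $n$ this Fourier transform can be computed by brute force directly from the definition. The sketch below is only meant to make the normalization concrete; the function name `fourier_transform` is hypothetical, and `dim` plays the role of $n + 1$.

```python
import cmath
from itertools import product

def fourier_transform(psi, p, dim):
    """Return a dict u -> hat(psi)(u) for a function psi on (Z/pZ)^dim."""
    points = list(product(range(p), repeat=dim))
    scale = 1 / p ** dim
    hat = {}
    for u in points:
        total = 0
        for f in points:
            # <f, u> is the usual dot product of coefficient vectors, taken mod p.
            inner = sum(fi * ui for fi, ui in zip(f, u)) % p
            total += psi(f) * cmath.exp(2j * cmath.pi * inner / p)
        hat[u] = scale * total
    return hat

# Sanity check on the constant function 1, with p = 5 and n = 2.
hat_one = fourier_transform(lambda f: 1, p=5, dim=3)
print(abs(hat_one[(0, 0, 0)]), max(abs(v) for u, v in hat_one.items() if any(u)))
```

With this normalization the constant function $1$ transforms to the indicator of $\mathbf{u} = 0$, up to floating-point error. The cost is $p^{2(n+1)}$ operations, so this is only a toy.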

It is possible to understand this Fourier transform using ideas similar to those of Taniguchi and Thorne. $\mathrm{GL}(2)$ acts on these polynomials in a similar way as it acts on quadratic forms, *and* $1^p_{odd, n}$ is invariant under this action. As in Taniguchi and Thorne, one can study the sizes of the Fourier transform on each orbit. This leads to several classical polynomial counting problems.

But unlike the prehomogeneous vector space context of Taniguchi and Thorne, we can’t *completely* determine the Fourier transform. For general degree, there are too many other terms.

Ultimately, we intend to use the knowledge of this Fourier transform as an ingredient in a sieve. An old theorem of Dedekind shows that if $\mathrm{Gal}(f) \subseteq A_n$, then $f$ is never *odd* mod any prime $p$.

We could use a Selberg sieve in the following form. Take a nonnegative weight function $\phi: V_n(\mathbb{R}) \longrightarrow \mathbb{R}$ (roughly supported on the box $[-1, 1]^{n+1}$), and consider \begin{equation}\label{eq:basic_sieve} \sum_{f \in V_n(\mathbb{Z})} \phi(f/H) \Big(\sum_{d: f \bmod p \text{ is odd } \forall p \mid d} \lambda_d \Big)^2 \geq 0 \end{equation} for some real weights $\lambda_d$ to be chosen later, but where $\lambda_1 = 1$.

For $f$ with $\mathrm{Gal}(f) \subseteq A_n$, $f$ is never odd. Thus the sum of weights $\lambda_d$ is exactly $\lambda_1 = 1$ for those $f$, and we get that \eqref{eq:basic_sieve} is bounded below by \begin{equation}\label{eq:basic_sieve_LHS} \sum_{\substack{f \in V_n(\mathbb{Z}) \\ \mathrm{Gal}(f) \subseteq A_n}} \phi(f/H). \end{equation} On the other hand, \eqref{eq:basic_sieve} is equal to \begin{equation}\label{eq:basic_sieve_RHS} \sum_{d_1, d_2} \lambda_{d_1} \lambda_{d_2} \sum_{f \in V_n(\mathbb{Z})} \phi(f / H) \prod_{p \mid [d_1, d_2]} 1^p_{odd, n}(f). \end{equation} Thus we have that \eqref{eq:basic_sieve_LHS} $\leq$ \eqref{eq:basic_sieve_RHS}. To bound \eqref{eq:basic_sieve_RHS}, we use Poisson summation to transform the sum of $\phi 1^p_{odd, n}$ into a dualized sum of $\widehat{\phi} \widehat{1}^p_{odd, n}$ and use our understanding of the Fourier transform of $1^p_{odd, n}$ to (try to) get good bounds. Then one plays a game of optimizing over the weights $\lambda_d$.

There is a major problem with this approach. As we’re unable to completely determine the Fourier transform, it’s necessary to determine where it’s large and small and to handle the regions where it’s large well. Let’s look again at the expression \begin{equation*} 1^p_{odd, n}(f) = 1^p_n(f)\frac{(-1)^{n+1}\mu_p(f) + 1^p_{sf}(f)}{2}. \end{equation*} The Fourier transform of $\mu_p$ is expected to behave very well away from $0$. But the Fourier transform of $1^p_{sf}$ can be shown to have large Fourier coefficients away from $0$, strongly affecting the resulting bounds.

Instead of studying the indicator function $1^p_{odd, n}$, we chose to study a sort of *graded* indicator function \begin{equation*} \psi_p(f) = \frac{(-1)^{n+1}1^p_n(f)\mu_p(f) + 1}{2}. \end{equation*} This is $1$ if $f$ is odd and squarefree, $0$ if $f$ is squarefree and even, and $1/2$ if $f$ is not squarefree.

On the Fourier transform side, we completely understand the Fourier transform of $1$ and we can hope to have good understanding of the Möbius function. So we should expect much better bounds.

But on the other side, this is not as clean of an indicator function as $1^p_{odd, n}$. In comparison to the basic sieve inequality \eqref{eq:basic_sieve_LHS} $\leq$ \eqref{eq:basic_sieve_RHS}, the product of indicator functions on the right hand side now becomes much messier, and the basic setup no longer applies.

Instead, in \eqref{eq:basic_sieve}, we replace $\big( \sum \lambda_d \big)^2$ by a positive semidefinite quadratic form in $\lambda_{d_1}, \lambda_{d_2}$ to get a modified Selberg sieve inequality similar to \eqref{eq:basic_sieve_LHS} $\leq$ \eqref{eq:basic_sieve_RHS}. The tail of the argument remains largely the same. Instead of bounding \eqref{eq:basic_sieve_RHS}, we bound

\begin{equation*} \sum_{d_1, d_2} \lambda_{d_1} \lambda_{d_2} \sum_{f \in V_n(\mathbb{Z})} \phi(f / H) \prod_{p \mid [d_1, d_2]} \psi_p(f). \end{equation*}

After Poisson summation, the goal becomes controlling $\widehat{\psi_p}(f)$, which essentially boils down to understanding $\widehat{\mu_p}(f)$.

In explicit coordinates, this is the task of understanding \begin{equation*} \widehat{\mu_p}(u_0, \ldots, u_n) = \frac{1}{p^{n+1}} \sum_{t_i \in \mathbb{F}_p} \mu_p(t_n x^n + \cdots + t_0) e_p(u_n t_n + \cdots + u_0 t_0). \end{equation*} This is a $\mathbb{F}_p[x]$-analogue of the classical question of bounding \begin{equation*} \sum_{n \leq x} \mu(n) e(n\theta) \end{equation*} for some real $\theta$. Baker and Harman have proved that GRH implies that\begin{equation*} \Big \lvert \sum_{n \leq x} \mu(n) e(n\theta) \Big \rvert \ll x^{\frac{3}{4} + \epsilon}, \end{equation*} and Porritt has proved the analogous result holds over function fields (where RH is known).
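The classical sum is easy to experiment with. Here is a minimal sketch that sieves the Möbius function and evaluates $\sum_{n \leq x} \mu(n) e(n\theta)$ numerically, with $e(t) = e^{2\pi i t}$. The function names, the cutoff $x = 10^4$, and the choice $\theta = \sqrt{2}$ are arbitrary illustrations, and a numerical experiment like this says nothing rigorous about the bound.

```python
import cmath
import math

def mobius_sieve(x):
    """Return a list mu with mu[n] the Mobius function of n for 1 <= n <= x."""
    mu = [1] * (x + 1)
    mu[0] = 0
    is_prime = [True] * (x + 1)
    for p in range(2, x + 1):
        if is_prime[p]:
            for m in range(p, x + 1, p):
                if m > p:
                    is_prime[m] = False
                mu[m] = -mu[m]  # one sign flip per distinct prime factor
            for m in range(p * p, x + 1, p * p):
                mu[m] = 0  # divisible by a square
    return mu

def mobius_exponential_sum(x, theta):
    """Evaluate sum_{n <= x} mu(n) e(n * theta)."""
    mu = mobius_sieve(x)
    return sum(mu[n] * cmath.exp(2j * cmath.pi * n * theta) for n in range(1, x + 1))

x = 10_000
value = mobius_exponential_sum(x, math.sqrt(2))
print(abs(value), x ** 0.75)
```

Taking $\theta = 0$ recovers the Mertens function, which gives a quick correctness check on the sieve.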

Applying this bound in our modified form of the Selberg sieve is what allows us to prove our first theorem.

Posted in Math.NT, Mathematics

Yesterday I gave a talk at the University of Oregon Number Theory seminar on *Visualizing Modular Forms*. This is a spiritual successor to my paper on Visualizing modular forms that is to appear in Simons Symposia volume *Arithmetic Geometry, Number Theory, and Computation*.

I’ve worked with modular forms for almost 10 years now, but I’ve only known what a modular form looks like for about 2 years. In this talk, I explored visual representations of modular forms, with lots of examples.

The slides are available here.

I’ll share one visualization here that I liked a lot: a visualization of a particular Maass form on $\mathrm{SL}(2, \mathbb{Z})$.

When asked if I might contribute an image for MSRI program 332, I thought it would be fun to investigate a modular form with a label roughly formed from the program number, 332. We investigate the trace form `3.32.a.a`.

The space of weight $32$ modular forms on $\Gamma_0(3)$ with trivial central character is an $11$-dimensional vector space. The subspace of newforms is a $5$-dimensional vector space.

These newforms break down into two groups: the two embeddings of an abstract newform whose coefficients lie in a quadratic field, and the three embeddings of an abstract newform whose coefficients lie in a cubic field. The label `3.32.a.a` is a label for the two newforms with coefficients in a quadratic field.

These images are for the trace form, made by summing the two conjugate newforms in `3.32.a.a`. This trace form is a newform of weight $32$ on $\Gamma_1(3)$.

Each modular form is naturally defined on the upper half-plane. In these images, the upper half-plane has been mapped to the unit disk. This mapping is uniquely specified by the following pieces of information: the real line $y = 0$ in the plane is mapped to the boundary of the disk, and the three points $(0, i, \infty)$ map to the (bottom, center, top) of the disk.
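The three point conditions pin down a single Möbius transformation, and it can be written explicitly as $z \mapsto i (z - i)/(z + i)$. The sketch below checks the conditions numerically; the function name `to_disk` is my own, and the orientation convention (bottom is $-i$, top is $i$) is an assumption about how the disk is drawn.

```python
def to_disk(z):
    """Map the upper half-plane to the unit disk via z -> i (z - i) / (z + i)."""
    return 1j * (z - 1j) / (z + 1j)

# 0 goes to the bottom of the disk, i to the center, and large z approaches the top.
print(to_disk(0), to_disk(1j), to_disk(1e9j))

# Points on the real line land on the unit circle.
print([abs(to_disk(x)) for x in (-3.0, -0.5, 0.25, 2.5)])
```

To render an image one usually goes the other way: for each pixel $w$ in the disk, pull back to $z = i(1 - iw)/(1 + iw)$... or more simply invert the formula numerically, then evaluate the modular form at $z$.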

This is a relatively high weight modular form, meaning that magnitudes can change very quickly. In the contoured image, each contour indicates a multiplicative change in elevation: points on one contour are $32$ times larger or smaller than points on adjacent contours.

I have a bit more about this and related visualizations on my visualization site.

On Thursday, 18 March, I gave a talk on half-integral weight Dirichlet series at the Ole Miss number theory seminar.

This talk is a description of ongoing explicit computational experimentation with Mehmet Kiral, Tom Hulse, and Li-Mei Lim on various aspects of half-integral weight modular forms and their Dirichlet series.

These Dirichlet series behave like typical beautiful automorphic L-functions in many ways, but are very different in other ways.

The first third of the talk is largely about the “typical” story. The general definitions are abstractions designed around the objects that number theorists have been playing with, and we also briefly touch on some of these examples to have an image in mind.

The second third is mostly about how half-integral weight Dirichlet series aren’t quite as well-behaved as L-functions associated to GL(2) automorphic forms, but sufficiently well-behaved to be comprehensible. Unlike the case of a full-integral weight modular form, there isn’t a canonical choice of “nice” forms to study, but we identify a particular set of forms with symmetric functional equations to study. There are several small details that can be considered here, and I largely ignore them for this talk. This is something that I hope to return to in the future.

In the final third of the talk, we examine the behavior and zeros of a handful of half-integral weight Dirichlet series. There are plots of zeros, including a plot of approximately the first 150k zeros of one particular form. These are also interesting, and I intend to investigate and describe these more on this site later.

I was recently examining a technical hurdle in my project on “Uniform bounds for lattice point counting and partial sums of zeta functions” with Takashi Taniguchi and Frank Thorne. There is a version on the arxiv, but it currently has a mistake in its handling of bounds for small $X$.

In this note, I describe an aspect of this paper that I found surprising. In fact, I’ve found it continually surprising, as I’ve reproven it to myself three times now, I think. By writing this here and in my note system, I hope to perhaps remember this better.

In this paper, we revisit an application of “Landau’s Method” to estimate partial sums of coefficients of Dirichlet series. We model this paper off of an earlier application by Chandrasekharan and Narasimhan, except that we explicitly track dependence of the several implicit constants and we prove these results uniformly for all partial sums, as opposed to sufficiently large partial sums.

The only structure is that we have a Dirichlet series $\phi(s)$, some Gamma factors $\Delta(s)$, and a functional equation of the shape $$ \phi(s) \Delta(s) = \psi(\delta - s) \Delta(\delta - s) $$ for some fixed real $\delta$. This is relatively structureless, and correspondingly our attack is very general. We use some smoothed approximation to the sum of coefficients, shift lines of integration to pick up polar main terms, apply the functional equation and change variables to work with the dual, and then get some collection of error terms and error integrals.

It happens to be that it’s much easier to work with a $k$-Riesz smoothed approximation. That is, if $$ \phi(s) = \sum_{n \geq 1} \frac{a(n)}{\lambda_n^s} $$ is our Dirichlet series, and we are interested in the partial sums $$ A_0(X) = \sum_{\lambda_n \leq X} a(n), $$ then it happens to be easier to work with the smoothed approximations $$ A_k(X) = \frac{1}{\Gamma(k+1)}\sum_{\lambda_n \leq X} a(n) (X - \lambda_n)^k, $$ and to somehow combine several of these smoothed sums together.
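As a concrete illustration of the Riesz smoothed sums, here is a minimal sketch that evaluates $A_k(X)$ directly from the definition for a finite list of coefficients. The function name `riesz_mean` and the toy data $a(n) = 1$, $\lambda_n = n$ are assumptions for the example.

```python
import math

def riesz_mean(coeffs, X, k):
    """Evaluate A_k(X) = (1 / Gamma(k+1)) * sum_{lambda_n <= X} a(n) (X - lambda_n)^k.

    coeffs is an iterable of (lambda_n, a_n) pairs.
    """
    total = sum(a * (X - lam) ** k for lam, a in coeffs if lam <= X)
    return total / math.gamma(k + 1)

# Toy data: a(n) = 1 and lambda_n = n for n = 1, ..., 20.
coeffs = [(n, 1) for n in range(1, 21)]
for k in range(3):
    print(k, riesz_mean(coeffs, 10.5, k))
```

For $k = 0$ this is the sharp count of $\lambda_n \leq X$, and for larger $k$ terms near the cutoff are weighted less and less, which is the source of the improved analytic behavior.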

This smoothed sum is recognizable as $$ A_k(X) = \frac{1}{2\pi i}\int_{c - i\infty}^{c + i\infty} \phi(s) \frac{\Gamma(s)}{\Gamma(s + k + 1)} X^{s + k} \, ds $$ for $c$ somewhere in the half-plane of convergence of the Dirichlet series. As $k$ gets large, these integrals become better behaved. In application, one takes $k$ sufficiently large to guarantee desired convergence properties.

The process of taking several of these smoothed approximations for large $k$ together, studying them through basic functional equation methods, and combinatorially combining these smoothed approximations via finite differencing to get good estimates for the sharp sum $A_0(X)$ is roughly what I think of as “Landau’s Method”.

In our paper, as we apply Landau’s method, it becomes necessary to understand certain bounds coming from the dual Dirichlet series $$ \psi(s) = \sum_{n \geq 1} \frac{b(n)}{\mu_n^s}. $$ Specifically, it works out that the (combinatorially finite differenced) difference between the $k$-smoothed sum $A_k(X)$ and its $k$-smoothed main term $S_k(X)$ can be written as $$ \Delta_y^k [A_k(X) - S_k(X)] = \sum_{n \geq 1} \frac{b(n)}{\mu_n^{\delta + k}} \Delta_y^k I_k(\mu_n X),\tag{1} $$ where $\Delta_y^k$ is a *finite differencing operator* that we should think of as a sum of several shifts of its input function.

More precisely, $\Delta_y F(X) := F(X + y) - F(X)$, and iterating gives $$ \Delta_y^k F(X) = \sum_{j = 0}^k (-1)^{k - j} {k \choose j} F(X + jy). $$ The $I_k(\cdot)$ term on the right of $(1)$ is an inverse Mellin transform $$ I_k(t) = \frac{1}{2 \pi i} \int_{c - i\infty}^{c + i\infty} \frac{\Gamma(\delta - s)}{\Gamma(k + 1 + \delta - s)} \frac{\Delta(s)}{\Delta(\delta - s)} t^{\delta + k - s} \, ds. $$ Good control for this inverse Mellin transform yields good control of the error for the overall approximation. Via the method of finite differencing, there are two basic choices: either bound $I_k(t)$ directly, or understand bounds for $(\mu_n y)^k I_k^{(k)}(t)$ for $t \approx \mu_n X$. Here, $I_k^{(k)}(t)$ means the $k$th derivative of $I_k(t)$.
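The finite differencing operator $\Delta_y^k$ defined above is simple to implement, and it is reassuring to check the binomial formula against literal iteration of $\Delta_y$. This is a small sketch with hypothetical function names; the test function $F(t) = t^3$ is just an example.

```python
from math import comb

def finite_difference(F, X, y, k):
    """Apply Delta_y^k to F at X using the binomial formula."""
    return sum((-1) ** (k - j) * comb(k, j) * F(X + j * y) for j in range(k + 1))

def iterated_difference(F, X, y, k):
    """Apply Delta_y to F exactly k times, straight from the definition."""
    if k == 0:
        return F(X)
    return iterated_difference(F, X + y, y, k - 1) - iterated_difference(F, X, y, k - 1)

cube = lambda t: t ** 3
print(finite_difference(cube, 5, 2, 3), iterated_difference(cube, 5, 2, 3))
```

Applying $\Delta_y^k$ to a polynomial of degree $k$ returns $k! \, y^k$ times its leading coefficient. In particular $\Delta_y^k$ sends $(X - \lambda)^k / \Gamma(k+1)$ to $y^k$, which is the mechanism by which differencing the smoothed sums $A_k$ recovers information about the sharp sum.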

In the classical application (as in the paper of CN), one worries about this asymptotic mostly as $t \to \infty$. In this region, $I_k(t)$ can be well-approximated by a $J$-Bessel function, which is sufficiently well understood in large argument to give good bounds. Similarly, $I_k^{(k)}(t)$ can be contour-shifted in a way that still ends up being well-approximated by $J$-Bessel functions.

The shape of the resulting bounds ends up being that $\Delta_y^k I_k(\mu_n X)$ is bounded by either

- $(\mu_n X)^{\alpha + k(1 - \frac{1}{2A})}$, where $A$ is a fixed parameter that isn’t worth describing fully, and $\alpha$ is a bound coming from the direct bound of $I_k(t)$, or
- $(\mu_n y)^k (\mu_n X)^\beta$, where $\beta$ is a bound coming from bounding $I_k^{(k)}(t)$.

In both, there is a certain $k$-dependence that comes from the $k$-th Riesz smoothing factors, either directly (from $(\mu_n y)^k$), or via its corresponding inverse Mellin transform (in the bound from $I_k(t)$). But these are the only aspects that depend on $k$.

At this point in the classical argument, one determines when one bound is better than the other, and this happens to be something that can be done exactly, and (surprisingly) independently of $k$. Using this pair of bounds and examining what comes out the other side gives the original result.

In our application, we also worry about asymptotics as $t \to 0$. While it may still be true that $I_k$ can be approximated by a $J$-Bessel function, the “well-known” asymptotics for the $J$-Bessel function behave substantially worse for small argument. Thus different methods are necessary.

It turns out that $I_k$ can be approximated in a relatively trivial way for $t \leq 1$, so the only remaining hurdle is $I_k^{(k)}(t)$ as $t \to 0$.

We’ve proved a variety of different bounds that hold in slightly different circumstances. And for each sort of bound, the next steps would be the same as before: determine when each bound is better, bound by absolute values, sum together, and then choose the various parameters to best shape the final result.

But unlike before, the boundary between the regions where $I_k$ is best bounded directly or bounded via $I_k^{(k)}$ depends on $k$. Aside from choosing $k$ sufficiently large for convergence properties (which relate to the locations of poles and growth properties of the Dirichlet series and gamma factors), any sufficiently large $k$ would suffice.

After I step away from this paper and argument for a while and come back, I wonder about the right way to choose the balancing error. That is, I rework when to use bounds coming from studying $I_k(t)$ directly vs bounds coming from studying $I_k^{(k)}(t)$.

But it turns out that there is always a reasonable heuristic choice. Further, this heuristic gives the same choice of balancing as in the case when $t \to \infty$ (although this is not the source of the heuristic).

Making these bounds will still give bounds for $\Delta_y^k I_k(\mu_n X)$ of shape

- $(\mu_n X)^{\alpha + k(1 - \frac{1}{2A})}$, where $A$ is a fixed parameter that isn’t worth describing fully, and $\alpha$ is a bound coming from the direct bound of $I_k(t)$, or
- $(\mu_n y)^k (\mu_n X)^\beta$, where $\beta$ is a bound coming from bounding $I_k^{(k)}(t)$.

The actual bounds for $\alpha$ and $\beta$ will differ between the case of small $\mu_n X$ and large $\mu_n X$ ($J$-Bessel asymptotics for large, different contour shifting analysis for small), but in both cases it turns out that $\alpha$ and $\beta$ are independent of $k$.

This is relatively easy to see when bounding $I_k^{(k)}(t)$, as repeatedly differentiating under the integral shows essentially that $$ I_k^{(k)}(t) = \frac{1}{2\pi i} \int \frac{\Delta(s)}{(\delta - s)\Delta(\delta - s)} t^{\delta - s} \, ds. $$ (I’ll note that the contour does vary with $k$ in a certain way that doesn’t affect the shape of the result for $t \to 0$).

When balancing the error terms $(\mu_n X)^{\alpha + k(1 - \frac{1}{2A})}$ and $(\mu_n y)^k (\mu_n X)^\beta$, the heuristic comes from taking arbitrarily large $k$. As $k \to \infty$, the point where the two error terms balance is independent of $\alpha$ and $\beta$.
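This limiting behavior can be checked numerically. Setting the two bounds equal and taking logarithms gives a linear equation in $\log \mu_n$, and dividing by $k$ and letting $k \to \infty$ gives the balance point $\mu_n = X^{2A - 1} / y^{2A}$. The sketch below solves for the exact balance point at finite $k$; the function names and all numerical parameter values are hypothetical and chosen only for illustration.

```python
import math

def balance_point(X, y, A, alpha, beta, k):
    """Solve (mu X)^(alpha + k(1 - 1/(2A))) = (mu y)^k (mu X)^beta for mu > 0."""
    exponent = alpha - beta + k * (1 - 1 / (2 * A))
    # Taking logs: exponent * (log mu + log X) = k * (log mu + log y).
    log_mu = (k * math.log(y) - exponent * math.log(X)) / (exponent - k)
    return math.exp(log_mu)

def limiting_balance_point(X, y, A):
    """The k -> infinity balance point, which does not involve alpha or beta."""
    return X ** (2 * A - 1) / y ** (2 * A)

X, y, A = 1000.0, 10.0, 1.5  # hypothetical values
print(limiting_balance_point(X, y, A))
for k in (5, 50, 500, 5000):
    print(k, balance_point(X, y, A, alpha=0.5, beta=0.1, k=k))
```

When $\alpha = \beta$ the finite-$k$ balance point agrees with the limiting one for every $k$, which matches the situation where the two exponents cancel and no limiting argument is needed.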

This reasoning applies to the case when $\mu_n X \to \infty$ as well, and gives the same point. Coincidentally, the actual $\alpha$ and $\beta$ values we proved for $\mu_n X \to \infty$ perfectly cancel in practice, so this limiting argument is not necessary — but it does still apply!

I suppose it might be possible to add another parameter to tune in the final result — a parameter measuring deviation from the heuristic, that can be refined for any particular error bound in a region of particular interest.

But we haven’t done that.

In fact, we were slightly lossy in how we bounded $I_k^{(k)}(t)$ as $t \to 0$, and (for complicated reasons that I’ll probably also forget and reprove to myself later) the heuristic choice assuming $k \sim \infty$ and our slightly lossy bound introduce the same order of imprecision to the final result.

We’re updating our preprint and will have that up soon. But as I’ve been thinking about this a lot recently, I realize there are a few other things I should note down. I intend to write more on this in the near future.

I’m currently at an AIM workshop on Arithmetic Statistics, Discrete Restriction, and Fourier Analysis. This morning (AIM time)/afternoon (USEast time), I’ll be giving a talk on *Lattice points and sums of Fourier Coefficients of modular forms*.

The theme of this talk is embodied in the statement that several lattice counting problems like the Gauss circle problem are essentially the same as very modular-form-heavy problems, sometimes very closely similar and sometimes appearing slightly different.

In this talk, I describe several recent adventures, successes and travails, in my studies of problems related to the Gauss circle problem and the task of producing better bounds for the sum of the first several coefficients of holomorphic cuspforms.

Here are the slides for my talk.

I’ll note that various parts of this talk have appeared in several previous talks of mine, but since it’s the pandemic era this is the first time much of this has appeared in slides.

Posted in Expository, Math.NT, Mathematics