## Slides from a talk on Quantitative Hilbert Irreducibility

I’m giving a talk today on my recent and forthcoming work in collaboration with Theresa Anderson, Ayla Gafni, Robert Lemke Oliver, George Shakan, Frank Thorne, Jiuya Wang, and Ruixiang Zhang. The slides for my talk can be found here.

This talk includes some discussion of our paper to appear in IMRN (link to the arXiv version, which is mostly the same as what will be published). (See also my previous discussion on this paper). But I’ll note that in this talk I lean towards a few ideas that did not make it into the paper, but which we are using in current work.

In particular, in our paper we don’t need to use group actions or classify orbit sizes, but it turns out that this is a very strong idea! I’ll note that in a very particular case, Thorne and Taniguchi have applied this type of orbit counting method in their paper “Orbital exponential sums for prehomogeneous vector spaces” to gain an extremely strong, specific understanding of the Fourier transform for their application.

## Project Report on “Prime Sums”

This summer, I proposed a research project for PROMYS (the PROgram in Mathematics for Young Scientists), a six-week intensive summer program at Boston University where highly motivated high school students explore mathematics. Three students (Nir Elber, Raymond Feng, and Henry Xie) chose to work on this project, and previous PROMYSer Anupam Datta gave additional guidance. Their summary of their findings can be found here. (UPDATE: a version of this now appears on the arXiv too).

Here I briefly describe the project and the work of Nir, Raymond, and Henry.

The project was organized around understanding why the following picture has so much structure.

Fundamentally, this image depicts differences between sums related to primes. Let $p_n$ denote the $n$th prime. It follows from the Prime Number Theorem that $p_n \approx n \log n$, and thus that $n p_n \approx n^2 \log n$. One can also show that $$\sum_{m \leq n} p_m \approx \frac{1}{2} n^2 \log n,$$ and thus we should have that $$\frac{n p_n}{\sum_{m \leq n} p_m} \to 2.\tag{1}$$

The vertical axis in the image above examines differences between consecutive $n$ in $(1)$ (in log scale), while the horizontal axis gives $n$ (also in log scale).

The fact that the ratio in $(1)$ converges, so that the differences between consecutive values tend to $0$, corresponds to the overall downwards trend in the graph. But there is so much more structure! Why do the points fall into “troughs” or along “curtains”? Does each line mean something?
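As a rough illustration of the quantities being plotted, here is a minimal Python sketch that computes the ratio in $(1)$ and its consecutive differences. The function names are my own and are not taken from the students' write-up.

```python
def primes_up_to(limit):
    """Return all primes <= limit using a sieve of Eratosthenes."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            # Cross off the multiples of i, starting at i*i.
            sieve[i * i::i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i, flag in enumerate(sieve) if flag]


def prime_sum_ratios(primes):
    """Return the list of n * p_n / (p_1 + ... + p_n) for n = 1, 2, ..."""
    ratios = []
    running_sum = 0
    for n, p in enumerate(primes, start=1):
        running_sum += p
        ratios.append(n * p / running_sum)
    return ratios


def consecutive_differences(values):
    """Absolute differences between consecutive entries (the plotted quantity)."""
    return [abs(b - a) for a, b in zip(values, values[1:])]


ratios = prime_sum_ratios(primes_up_to(100_000))
diffs = consecutive_differences(ratios)
```

Plotting `diffs` against the index on log-log axes should give a picture of the same general shape as the one discussed here, under the assumption that this is the quantity on the vertical axis.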

In this version, I’ve colored differences coming from when $p_n$ is a twin prime (in blue), a cousin prime (in green), a sexy prime (in red), or a prime $p$ such that the next prime is $p+8$ (in cyan). The first dot is black because it comes from $2$. The next two correspond to $3$ and $5$ (both twin primes), and the fourth dot corresponds to $7$ and is green because the next prime after $7$ is $11$, and so on.

This is a strong hint at distributional aspects alluded to within the plots.

Nir, Raymond, and Henry proved many things! They quantified the rate of convergence in $(1)$, and thus quantified the guaranteed downward trend in the images, and they found images that better convey the structure of what’s going on. I was already very impressed, but then they branched out and studied more!

### Cramér’s Model

We chose to investigate a nuanced question: what aspects of the initial plots depend strongly on the fact that the underlying data consists of primes, and what aspects depend only on the fact that the underlying data consists of integers with the same density as the primes?

To study this, one can create a new set of distinguished elements called Promys Primes (PPrimes) with the same density as true primes using probabilistic ideas of Cramér. Let’s call $2$ and $3$ PPrimes, and then for each odd $m \geq 5$, we call $m$ a PPrime with probability $2 / \log m$. Do this for a large sequence of $m$, and we get a collection of PPrimes that has (with very high probability) the same density as true primes, but none of the multiplicative structure.
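A minimal sketch of this sampling procedure in Python follows. The function name `pprimes` and the seeding convention are my own choices. Note that $2/\log m > 1$ for $m = 5$ and $m = 7$, so those two are always selected.

```python
import math
import random


def pprimes(limit, seed=0):
    """Sample a set of 'PPrimes' up to limit (assumes limit >= 3).

    2 and 3 are always included; each odd m >= 5 is included
    independently with probability 2 / log(m), capped at 1.
    """
    rng = random.Random(seed)  # seeded for reproducibility
    chosen = [2, 3]
    for m in range(5, limit + 1, 2):
        # rng.random() lies in [0, 1), so a probability >= 1 always succeeds.
        if rng.random() < 2 / math.log(m):
            chosen.append(m)
    return chosen


sample = pprimes(100_000, seed=1)
```

The length of `sample` should be close to the number of true primes up to the same bound (there are $9592$ primes below $10^5$), though any individual sample will differ.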

It turns out that for sets of PPrimes, there are analogous pictures and the asymptotics are even better! This is in section 3 of their write-up.

### Gaussian Integers

We also thought to study analogous situations in related sets of primes, such as the Gaussian integers. Recall that the Gaussian integers $\mathbb{Z}[i] = \{ a + bi : a, b \in \mathbb{Z} \}$ are a unique factorization domain and have a rich theory of primes. Sometimes this theory is very similar to the standard theory of primes over $\mathbb{Z}$. But there are challenges.

One significant challenge is that $\mathbb{C}$ is not ordered. A related challenge is that there are more units. Over $\mathbb{Z}$, both $2$ and $-2$ are primes, but we typically recognize $2$ as being more “simple”. For Gaussian primes, there isn’t such a choice; for example each of $1 + i, 1 - i, -1 + i, -1 - i$ is a Gaussian prime, but none is more simple or fundamental than the others.

More concretely, one has to be careful even with how to define the “sum of the first $n$ primes”. One natural thought might be to sum all Gaussian primes $\pi$ that have norm up to $X$. But one can quickly see that this sum is $0$ for analogous reasons to why the sum of all the typical primes with absolute values up to $X$ must vanish ($p + (-p) = 0$). In the Gaussian case, it is also true that $$\sum_{N(\pi) \leq X} \pi^2 = 0.$$
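The vanishing of the first two power sums can be checked directly. The sketch below is my own illustration and is not taken from the write-up. It enumerates Gaussian primes of bounded norm using the standard characterization ($a + bi$ with $a, b \neq 0$ is prime exactly when $a^2 + b^2$ is prime, and a real or purely imaginary Gaussian integer is prime exactly when its absolute value is a rational prime congruent to $3 \bmod 4$), and it uses exact integer arithmetic.

```python
from math import isqrt


def is_prime(n):
    """Trial-division primality test for small rational integers."""
    if n < 2:
        return False
    return all(n % d for d in range(2, isqrt(n) + 1))


def gaussian_primes(max_norm):
    """Yield Gaussian primes a + bi, as pairs (a, b), with a^2 + b^2 <= max_norm."""
    r = isqrt(max_norm)
    for a in range(-r, r + 1):
        for b in range(-r, r + 1):
            norm = a * a + b * b
            if norm == 0 or norm > max_norm:
                continue
            if a != 0 and b != 0:
                if is_prime(norm):
                    yield (a, b)
            else:
                m = abs(a) + abs(b)  # the nonzero coordinate
                if m % 4 == 3 and is_prime(m):
                    yield (a, b)


def gaussian_power(z, k):
    """Return z^k for z = (a, b) representing a + bi, exactly."""
    re, im = 1, 0
    for _ in range(k):
        re, im = re * z[0] - im * z[1], re * z[1] + im * z[0]
    return (re, im)


def power_sum(max_norm, k):
    """Sum of pi^k over Gaussian primes pi of norm at most max_norm."""
    total_re, total_im = 0, 0
    for z in gaussian_primes(max_norm):
        re, im = gaussian_power(z, k)
        total_re, total_im = total_re + re, total_im + im
    return (total_re, total_im)
```

For instance, `power_sum(X, 1)` and `power_sum(X, 2)` return `(0, 0)` for any `X`, since the four associates $\pi, i\pi, -\pi, -i\pi$ cancel. For fourth powers the units no longer force cancellation, and `power_sum(10, 4)` is `(252, 0)`.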

But they considered higher powers, where there aren’t trivial or obvious reasons for massive cancellation, and they showed that there is always nontrivial cancellation. This is interesting on its own!

Then they also constructed a mixture, a Cramér-type model for Gaussian primes and showed that one should expect nontrivial cancellation there for purely distributional reasons.

I leave the details to their write-up. But they’ve done great work, and I look forward to seeing what they come up with in the future.

## Slides from a talk at Maine-Québec

At this year’s Maine-Québec Number Theory Conference, I’m giving a talk on Zeros of half-integral weight Dirichlet series. Here are the slides. I note that the references for the slides are included here at the end.

I’ll also note a few open problems that I don’t know how to handle and that I briefly describe during the talk.

1. Is it possible to show that every (symmetrized) Dirichlet series associated to a half-integral weight modular form must have zeros off the critical line? This is true in practice, but seems hard to show.
2. Is it possible to determine whether a given Dirichlet series has zeros in the half-plane of absolute convergence? If there is one zero, there are infinitely many – but is there a way of determining if there are any?
3. Why does there seem to be a gap around the critical line in zero distribution?
4. Can one explain why the pair correlation seems well-behaved (even heuristically)?

If you have any ideas, let me know!

## Slides from a talk at Bridges 2021

I gave a talk on visualizations of modular forms made with Adam Sakareassen at Bridges 2021. This talk goes with our short article. In this talk, I describe the line of ideas going towards producing three dimensional visualizations of modular forms, which I like to call modular terrains. When we first wrote that talk, we were working towards the following video visualization.

We are now working in a few different directions, involving informational visualizations of different forms and different types of forms, as well as purely artistic visualizations.

I’ve recently been very fond of including renderings based on a picture of my wife and me in Iceland (from the beforetimes). This is us as a wallpaper (preserving many of the symmetries) for a particular modular form.

I reused a few images from Painted Modular Terrains, which I made a few months ago.

If you’re interested, you might also like a few previous talks and papers of mine:

## Paper announcement: Quantitative HIT and Almost Prime Polynomial Discriminants

Theresa Anderson, Ayla Gafni, Robert Lemke Oliver, George Shakan, Ruixiang Zhang, and I have just uploaded a preprint to the arXiv called Quantitative Hilbert Irreducibility and Almost Prime Values of Polynomial Discriminants.

This project began at an AIM workshop on Fourier analysis, arithmetic statistics, and discrete restriction.

Our guiding question was very open. For some nice local polynomial conditions, can we make sense of the Fourier transforms of these local conditions well enough to have arithmetic application?

This is partly inspired by Orbital exponential sums for prehomogeneous vector spaces by Takashi Taniguchi and Frank Thorne (preprint available on the arXiv). In this paper, Frank and Takashi algebraically compute Fourier transforms of a couple of arithmetically interesting functions on prehomogeneous vector spaces over finite fields. It turns out that one can, for example, explicitly and completely compute the Fourier transform of the characteristic function of singular binary cubic forms over $\mathbb{F}_{q}$.

In a companion paper, Takashi and Frank combine those computations with sieves to prove that there are $\gg X / \log X$ cubic fields whose discriminant is squarefree, bounded above by $X$, and has at most $3$ prime factors. They also show there are $\gg X / \log X$ quartic fields whose discriminant is squarefree, bounded above by $X$, and has at most $8$ prime factors.

## Results

We have two classes of result. Both rely on similar types of analysis, and are each centered on a study of a particular indicator-type function, its Fourier transform, and a sieve.

First, we prove a bound on the number of polynomials whose Galois group is a subgroup of $A_n$. For $H > 1$, define \begin{equation*} V_n(H) = \{ f \in \mathbb{Z}[x] : \mathrm{ht}(f) \leq H \} \end{equation*} and \begin{equation*} E_n(H, A_n) := \# \{ f \in V_n(H) : \mathrm{Gal}(f) \subseteq A_n \}. \end{equation*} We show that $$E_n(H, A_n) \ll H^{n - \frac{2}{3} + O(1/n)}.$$ This is an improvement on progress towards a conjecture of Van der Waerden and is a quantitative form of Hilbert’s Irreducibility Theorem, which shows (among other applications) that most monic irreducible polynomials have full Galois group.

However I should note that Bhargava has announced a proof of a (slightly weakened form of) Van der Waerden’s conjecture, and his result is strictly stronger than our result.

Secondly, we prove that for any $n \geq 3$ and $r \geq 2n - 3$, we have $$\# \{ f \in \mathbb{Z}[x] : \mathrm{ht}(f) \leq H, f \, \text{monic }, \omega(\mathrm{Disc}(f)) \leq r \} \gg_{n, r} \frac{H^n}{\log H},$$ where $\omega(\cdot)$ denotes the number of distinct prime divisors. Qualitatively, this says that there are lots of polynomials with almost prime discriminants.

As a corollary of this second result, we prove that for $n \geq 3$ and $r \geq 2n - 3$, $$\# \{ F / \mathbb{Q} : [F \colon \mathbb{Q}] = n, \mathrm{Disc}(F) \leq X, \omega(\mathrm{Disc}(F)) \leq r \} \gg_{n, r, \epsilon} X^{\frac{1}{2} + \delta_n - \epsilon}$$ for explicit $\delta_n > 0$ and any $\epsilon > 0$. This shows that there are at least $X^{1/2}$ cubic fields whose discriminants are divisible by at most $3$ primes, or at least $X^{1/2}$ quartic fields whose discriminants are divisible by at most $5$ primes, for example. We guarantee fewer fields than Taniguchi and Thorne, but we guarantee fields with fewer prime factors and cover all degrees.

In the remainder of this post, I’ll describe a line of thinking that went towards proving our first result.

## Odd polynomials

We initially studied the Fourier transform of the odd-polynomial indicator function. We call a function $f(x) \in \mathbb{F}_p[x]$ odd if it has no repeated roots and the factorization type of $f$ corresponds to an odd permutation in the Galois group. That is, we can write $f$ as \begin{equation*} f(x) = f_1(x) f_2(x) \cdots f_r(x) \bmod p, \end{equation*} and there will be an element of the Galois group with cycle type $(\deg f_1) (\deg f_2) \cdots (\deg f_r)$. For odd $f$, this cycle must be an odd permutation.

A more convenient description of oddness is in terms of the Möbius function on $\mathbb{F}_p[x]$. A degree $n$ polynomial $f$ is odd precisely if $\mu_p(f) = (-1)^{n+1}$. Define $1^p_{sf}(f)$ to be the squarefree indicator function on $\mathbb{F}_p[x]$, and define $1^p_{odd, n}$ to be the odd indicator function on degree $n$ polynomials on $\mathbb{F}_p[x]$. Then \begin{equation*} 1^p_{odd, n}(f) = 1^p_n(f)\frac{(-1)^{n+1}\mu_p(f) + 1^p_{sf}(f)}{2}. \end{equation*} (Here, $1^p_n(f)$ keeps only the degree $n$ polynomials).
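To make the relationship between oddness and the Möbius function concrete, here is a small brute-force check in Python. It is a sketch under simplifying assumptions of my own: it only considers monic polynomials, it represents a polynomial by its tuple of coefficients (lowest degree first), and it finds factorizations by multiplying out all pairs of lower-degree polynomials, which is only feasible for tiny $p$ and $n$.

```python
from itertools import product


def monic_polys(p, d):
    """All monic degree-d polynomials over F_p, as coefficient tuples (low degree first)."""
    return [coeffs + (1,) for coeffs in product(range(p), repeat=d)]


def poly_mul(a, b, p):
    """Multiply two polynomials with coefficients mod p."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % p
    return tuple(out)


def factor_table(p, n):
    """Map each monic polynomial of degree 1..n to its tuple of monic irreducible factors."""
    table = {}
    for d in range(1, n + 1):
        for e in range(1, d // 2 + 1):
            for g in monic_polys(p, e):
                for h in monic_polys(p, d - e):
                    table[poly_mul(g, h, p)] = tuple(sorted(table[g] + table[h]))
        for f in monic_polys(p, d):
            table.setdefault(f, (f,))  # not a product of lower degrees: irreducible
    return table


def mobius(factors):
    """Moebius function: 0 if a factor repeats, else (-1)^(number of factors)."""
    if len(set(factors)) < len(factors):
        return 0
    return (-1) ** len(factors)


def is_odd(factors):
    """Squarefree, and the cycle type given by the factor degrees is an odd permutation."""
    if len(set(factors)) < len(factors):
        return False
    # A cycle of length deg(f) is a product of deg(f) - 1 transpositions,
    # and deg(f) = len(f) - 1 for a coefficient tuple f.
    return sum(len(f) - 2 for f in factors) % 2 == 1


def count_odd(p, n):
    """Count odd monic degree-n polynomials, checking the indicator identity on each."""
    table = factor_table(p, n)
    count = 0
    for f in monic_polys(p, n):
        mu = mobius(table[f])
        squarefree = 1 if mu != 0 else 0
        indicator = 1 if is_odd(table[f]) else 0
        assert 2 * indicator == (-1) ** (n + 1) * mu + squarefree
        count += indicator
    return count
```

For example, `count_odd(3, 3)` is $9$: the odd monic cubics over $\mathbb{F}_3$ are exactly the products of a linear factor and an irreducible quadratic.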

## Fourier transform of odd indicator function: a first approach

We then studied the Fourier transform of $1^p_{odd, n}$. Identifying the vector space of polynomials of degree at most $n$ over $\mathbb{F}_p$, which we denote by $V_n(\mathbb{Z}/p\mathbb{Z})$, with $(\mathbb{Z}/p\mathbb{Z})^{n+1}$, we can study the Fourier transform of a function $\psi:V_n(\mathbb{Z}/p\mathbb{Z}) \longrightarrow \mathbb{C}$, \begin{equation*} \widehat{\psi}(\mathbf{u}) = \frac{1}{p^{n+1}} \sum_{f \in V_n(\mathbb{Z}/p\mathbb{Z})} \psi(f) e_p(\langle f, \mathbf{u} \rangle). \end{equation*} Here, $e_p(x) = e^{2 \pi i x / p}$.

It is possible to understand this Fourier transform using ideas similar to those of Taniguchi and Thorne. $\mathrm{GL}(2)$ acts on these polynomials in a similar way as it acts on quadratic forms, and $1^p_{odd, n}$ is invariant under this action. As in Taniguchi and Thorne, one can study the sizes of the Fourier transform on each orbit. This leads to several classical polynomial counting problems.

But unlike the prehomogeneous vector space context of Taniguchi and Thorne, we can’t completely determine the Fourier transform. For general degree, there are too many other terms.

Ultimately, we intend to use the knowledge of this Fourier transform as an ingredient in a sieve. An old theorem of Dedekind shows that if $\mathrm{Gal}(f) \subseteq A_n$, then $f$ is never odd mod any prime $p$.

We could use a Selberg sieve in the following form. Let $\phi: V_n(\mathbb{R}) \longrightarrow \mathbb{R}$ be a nonnegative weight function (roughly supported on the box $[-1, 1]^{n+1}$). Then consider $$\label{eq:basic_sieve} \sum_{f \in V_n(\mathbb{Z})} \phi(f/H) \Big(\sum_{d: f \bmod p \text{ is odd } \forall p \mid d} \lambda_d \Big)^2 \geq 0$$ for some real weights $\lambda_d$ to be chosen later, but where $\lambda_1 = 1$.

For $f$ with $\mathrm{Gal}(f) \subseteq A_n$, $f$ is never odd mod any prime. Thus the sum of weights $\lambda_d$ is exactly $\lambda_1 = 1$ for those $f$, and we get that \eqref{eq:basic_sieve} is bounded below by $$\label{eq:basic_sieve_LHS} \sum_{\substack{f \in V_n(\mathbb{Z}) \\ \mathrm{Gal}(f) \subseteq A_n}} \phi(f/H).$$ On the other hand, \eqref{eq:basic_sieve} is equal to $$\label{eq:basic_sieve_RHS} \sum_{d_1, d_2} \lambda_{d_1} \lambda_{d_2} \sum_{f \in V_n(\mathbb{Z})} \phi(f / H) \prod_{p \mid [d_1, d_2]} 1^p_{odd, n}(f).$$ Thus we have that \eqref{eq:basic_sieve_LHS} $\leq$ \eqref{eq:basic_sieve_RHS}. To bound \eqref{eq:basic_sieve_RHS}, we use Poisson summation to transform the sum of $\phi 1^p_{odd, n}$ into a dualized sum of $\widehat{\phi} \widehat{1}^p_{odd, n}$ and use our understanding of the Fourier transform $\widehat{1}^p_{odd, n}$ to (try to) get good bounds. Then one plays a game of optimizing over the weights $\lambda_d$.

### Problem

There is a major problem with this approach. As we’re unable to completely determine the Fourier transform, it’s necessary to determine where it’s large and small and to handle the regions where it’s large well. Let’s look again at the expression \begin{equation*} 1^p_{odd, n}(f) = 1^p_n(f)\frac{(-1)^{n+1}\mu_p(f) + 1^p_{sf}(f)}{2}. \end{equation*} The Fourier transform of $\mu_p$ is expected to behave very well away from $0$. But the Fourier transform of $1^p_{sf}$ can be shown to have large Fourier coefficients away from $0$, strongly affecting the resulting bounds.

## Graded indicator function: a second approach

Instead of studying the indicator function $1^p_{odd, n}$, we chose to study a sort of graded indicator function \begin{equation*} \psi_p(f) = \frac{(-1)^{n+1}1^p_n(f)\mu_p(f) + 1}{2}. \end{equation*} This is $1$ if $f$ is odd and squarefree, $0$ if $f$ is squarefree and even, and $1/2$ if $f$ is not squarefree.

On the Fourier transform side, we completely understand the Fourier transform of $1$ and we can hope to have good understanding of the Möbius function. So we should expect much better bounds.

But on the other side, this is not as clean of an indicator function as $1^p_{odd, n}$. In comparison to the basic sieve inequality \eqref{eq:basic_sieve_LHS} $\leq$ \eqref{eq:basic_sieve_RHS}, the product of indicator functions on the right hand side now becomes much messier, and the basic setup no longer applies.

Instead, in \eqref{eq:basic_sieve}, we replace $\big( \sum \lambda_d \big)^2$ by a positive semidefinite quadratic form in $\lambda_{d_1}, \lambda_{d_2}$ to get a modified Selberg sieve inequality similar to \eqref{eq:basic_sieve_LHS} $\leq$ \eqref{eq:basic_sieve_RHS}. The tail of the argument remains largely the same. Instead of bounding \eqref{eq:basic_sieve_RHS}, we bound

\begin{equation*} \sum_{d_1, d_2} \lambda_{d_1} \lambda_{d_2} \sum_{f \in V_n(\mathbb{Z})} \phi(f / H) \prod_{p \mid [d_1, d_2]} \psi_p(f). \end{equation*}

After Poisson summation, the goal becomes controlling $\widehat{\psi_p}(f)$, which essentially boils down to understanding $\widehat{\mu_p}(f)$.

In explicit coordinates, this is the task of understanding \begin{equation*} \widehat{\mu_p}(u_0, \ldots, u_n) = \frac{1}{p^{n+1}} \sum_{t_i \in \mathbb{F}_p} \mu_p(t_n x^n + \cdots + t_0) e_p(u_n t_n + \cdots + u_0 t_0). \end{equation*} This is a $\mathbb{F}_p[x]$-analogue of the classical question of bounding \begin{equation*} \sum_{n \leq x} \mu(n) e(n\theta) \end{equation*} for some real $\theta$. Baker and Harman have proved that GRH implies that\begin{equation*} \Big \lvert \sum_{n \leq x} \mu(n) e(n\theta) \Big \rvert \ll x^{\frac{3}{4} + \epsilon}, \end{equation*} and Porritt has proved the analogous result holds over function fields (where RH is known).
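For intuition about the classical sum over the integers (not the function field version used in the paper), the following Python sketch computes $\sum_{n \leq x} \mu(n) e(n\theta)$ numerically. The function names are my own, and this is only a numerical illustration; it proves nothing about the bound.

```python
import cmath


def mobius_table(limit):
    """Return a list mu with mu[n] the Moebius function of n for 1 <= n <= limit."""
    mu = [1] * (limit + 1)
    is_prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if is_prime[p]:
            for m in range(p, limit + 1, p):
                if m > p:
                    is_prime[m] = False
                mu[m] = -mu[m]  # one more prime factor
            for m in range(p * p, limit + 1, p * p):
                mu[m] = 0  # divisible by a square
    mu[0] = 0  # unused placeholder
    return mu


def mobius_exponential_sum(x, theta):
    """Return sum_{n <= x} mu(n) * exp(2 pi i n theta)."""
    mu = mobius_table(x)
    return sum(mu[n] * cmath.exp(2j * cmath.pi * n * theta) for n in range(1, x + 1))
```

With `theta = 0` this reduces to the Mertens function, so `mobius_exponential_sum(10, 0.0)` is $-1$. For other `theta`, one can compare `abs(mobius_exponential_sum(x, theta))` against `x ** 0.75` to see how much room the conditional bound leaves.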

Applying this bound in our modified form of the Selberg sieve is what allows us to prove our first theorem.


## Slides from a talk on Visualizing Modular Forms

Yesterday I gave a talk at the University of Oregon Number Theory seminar on Visualizing Modular Forms. This is a spiritual successor to my paper on Visualizing modular forms that is to appear in the Simons Symposia volume Arithmetic Geometry, Number Theory, and Computation.

I’ve worked with modular forms for almost 10 years now, but I’ve only known what a modular form looks like for about 2 years. In this talk, I explored visual representations of modular forms, with lots of examples.

I’ll share one visualization here that I liked a lot: a visualization of a particular Maass form on $\mathrm{SL}(2, \mathbb{Z})$.


## Trace form 3.32.a.a

When asked if I might contribute an image for MSRI program 332, I thought it would be fun to investigate a modular form with a label roughly formed from the program number, 332. We investigate the trace form 3.32.a.a.

The space of weight $32$ modular forms on $\Gamma_0(3)$ with trivial central character is an $11$-dimensional vector space. The subspace of newforms is a $5$-dimensional vector space.

These newforms break down into two groups: the two embeddings of an abstract newform whose coefficients lie in a quadratic field, and the three embeddings of an abstract newform whose coefficients lie in a cubic field. The label 3.32.a.a is a label for the two newforms with coefficients in a quadratic field.

These images are for the trace form, made by summing the two conjugate newforms in 3.32.a.a. This trace form is a newform of weight $32$ on $\Gamma_1(3)$.

Each modular form is naturally defined on the upper half-plane. In these images, the upper half-plane has been mapped to the unit disk. This mapping is uniquely specified by the following pieces of information: the real line $y = 0$ in the plane is mapped to the boundary of the disk, and the three points $(0, i, \infty)$ map to the (bottom, center, top) of the disk.
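One way to write down a map with these properties is the Möbius transformation $z \mapsto i \frac{z - i}{z + i}$. The sketch below is derived from the three stated conditions, under the assumption that "bottom", "center", and "top" mean $-i$, $0$, and $i$; it is not taken from the code used to make the images.

```python
def to_disk(z):
    """Map the upper half-plane to the unit disk, sending 0 -> -i, i -> 0, infinity -> i."""
    return 1j * (z - 1j) / (z + 1j)
```

For real `z`, the numerator and denominator have the same absolute value, so `abs(to_disk(z))` is $1$ and the real line lands on the boundary circle.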

This is a relatively high weight modular form, meaning that magnitudes can change very quickly. In the contoured image, each contour indicates a multiplicative change in elevation: points on one contour are $32$ times larger or smaller than points on adjacent contours.

## Slides from a talk on Half Integral Weight Dirichlet Series

On Thursday, 18 March, I gave a talk on half-integral weight Dirichlet series at the Ole Miss number theory seminar.

This talk is a description of ongoing explicit computational experimentation with Mehmet Kiral, Tom Hulse, and Li-Mei Lim on various aspects of half-integral weight modular forms and their Dirichlet series.

These Dirichlet series behave like typical beautiful automorphic L-functions in many ways, but are very different in other ways.

The first third of the talk is largely about the “typical” story. The general definitions are abstractions designed around the objects that number theorists have been playing with, and we also briefly touch on some of these examples to have an image in mind.

The second third is mostly about how half-integral weight Dirichlet series aren’t quite as well-behaved as L-functions associated to GL(2) automorphic forms, but sufficiently well-behaved to be comprehensible. Unlike the case of a full-integral weight modular form, there isn’t a canonical choice of “nice” forms to study, but we identify a particular set of forms with symmetric functional equations to study. There are several small details that can be considered here, and I largely ignore them for this talk. This is something that I hope to return to in the future.

In the final third of the talk, we examine the behavior and zeros of a handful of half-integral weight Dirichlet series. There are plots of zeros, including a plot of approximately the first 150k zeros of one particular form. These are also interesting, and I intend to investigate and describe these more on this site later.

## A balancing act in “Uniform bounds for lattice point counting”

I was recently examining a technical hurdle in my project on “Uniform bounds for lattice point counting and partial sums of zeta functions” with Takashi Taniguchi and Frank Thorne. There is a version on the arXiv, but it currently has a mistake in its handling of bounds for small $X$.

In this note, I describe an aspect of this paper that I found surprising. In fact, I’ve found it continually surprising, as I’ve reproven it to myself three times now, I think. By writing this here and in my note system, I hope to perhaps remember this better.

## Landau’s Method

In this paper, we revisit an application of “Landau’s Method” to estimate partial sums of coefficients of Dirichlet series. We model this paper off of an earlier application by Chandrasekharan and Narasimhan, except that we explicitly track dependence of the several implicit constants and we prove these results uniformly for all partial sums, as opposed to sufficiently large partial sums.

The only structure is that we have a Dirichlet series $\phi(s)$, some Gamma factors $\Delta(s)$, and a functional equation of the shape $$\phi(s) \Delta(s) = \psi(s) \Delta(1-s).$$ This is relatively structureless, and correspondingly our attack is very general. We use some smoothed approximation to the sum of coefficients, shift lines of integration to pick up polar main terms, apply the functional equation and change variables to work with the dual, and then get some collection of error terms and error integrals.

It happens to be that it’s much easier to work with a $k$-Riesz smoothed approximation. That is, if $$\phi(s) = \sum_{n \geq 1} \frac{a(n)}{\lambda_n^s}$$
is our Dirichlet series, and we are interested in the partial sums $$A_0(X) = \sum_{\lambda_n \leq X} a(n),$$
then it happens to be easier to work with the smoothed approximations $$A_k(X) = \frac{1}{\Gamma(k+1)}\sum_{\lambda_n \leq X} a(n) (X - \lambda_n)^k,$$
and to somehow combine several of these smoothed sums together.

This smoothed sum is recognizable as $$A_k(X) = \frac{1}{2\pi i}\int_{c - i\infty}^{c + i\infty} \phi(s) \frac{\Gamma(s)}{\Gamma(s + k + 1)} X^{s + k}ds$$
for $c$ somewhere in the half-plane of convergence of the Dirichlet series. As $k$ gets large, these integrals become better behaved. In application, one takes $k$ sufficiently large to guarantee desired convergence properties.

The process of taking several of these smoothed approximations for large $k$ together, studying them through basic functional equation methods, and combinatorially combining these smoothed approximations via finite differencing to get good estimates for the sharp sum $A_0(X)$ is roughly what I think of as “Landau’s Method”.

## Application and shape of the error

In our paper, as we apply Landau’s method, it becomes necessary to understand certain bounds coming from the dual Dirichlet series $$\psi(s) = \sum_{n \geq 1} \frac{b(n)}{\mu_n^s}.$$
Specifically, it works out that the (combinatorially finite differenced) difference between the $k$-smoothed sum $A_k(X)$ and its $k$-smoothed main term $S_k(X)$ can be written as $$\Delta_y^k [A_k(X) - S_k(X)] = \sum_{n \geq 1} \frac{b(n)}{\mu_n^{\delta + k}} \Delta_y^k I_k(\mu_n X),\tag{1}$$
where $\Delta_y^k$ is a finite differencing operator that we should think of as a sum of several shifts of its input function.

More precisely, $\Delta_y F(X) := F(X + y) - F(X)$, and iterating gives $$\Delta_y^k F(X) = \sum_{j = 0}^k (-1)^{k - j} {k \choose j} F(X + jy).$$
The $I_k(\cdot)$ term on the right of $(1)$ is an inverse Mellin transform $$I_k(t) = \frac{1}{2 \pi i} \int_{c - i\infty}^{c + i\infty} \frac{\Gamma(\delta - s)}{\Gamma(k + 1 + \delta - s)} \frac{\Delta(s)}{\Delta(\delta - s)} t^{\delta + k - s} ds.$$
Good control for this inverse Mellin transform yields good control of the error for the overall approximation. Via the method of finite differencing, there are two basic choices: either bound $I_k(t)$ directly, or understand bounds for $(\mu_n y)^k I_k^{(k)}(t)$ for $t \approx \mu_n X$. Here, $I_k^{(k)}(t)$ means the $k$th derivative of $I_k(t)$.
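The finite differencing operator and its interaction with Riesz-smoothed sums can be illustrated numerically. The sketch below is a toy example under assumptions of my own: it takes $\lambda_n = n$, and it checks the elementary fact that when no $\lambda_n$ lies in $(X, X + ky]$, the differenced smoothed sum $\Delta_y^k A_k(X)$ equals $y^k A_0(X)$. It does not involve the functional equation or the dual series.

```python
from math import comb, factorial


def finite_difference(F, X, y, k):
    """Delta_y^k F(X) via the binomial formula sum_j (-1)^(k-j) C(k, j) F(X + j*y)."""
    return sum((-1) ** (k - j) * comb(k, j) * F(X + j * y) for j in range(k + 1))


def iterated_difference(F, X, y, k):
    """Delta_y^k F(X) by applying Delta_y F(X) = F(X + y) - F(X) a total of k times."""
    if k == 0:
        return F(X)
    return iterated_difference(F, X + y, y, k - 1) - iterated_difference(F, X, y, k - 1)


def riesz_sum(a, X, k):
    """A_k(X) = (1/k!) * sum_{n <= X} a(n) (X - n)^k, assuming lambda_n = n and X > 0."""
    return sum(a(n) * (X - n) ** k for n in range(1, int(X) + 1)) / factorial(k)


# Toy coefficients a(n) = 1, so the sharp sum A_0(X) counts integers up to X.
ones = lambda n: 1
X, y, k = 10.2, 0.1, 3
recovered = finite_difference(lambda t: riesz_sum(ones, t, k), X, y, k) / y ** k
```

Here `recovered` agrees with $A_0(10.2) = 10$ up to floating point error, because the interval $(10.2, 10.5]$ contains no integers. In the real argument the interval does contain terms, and controlling their contribution is part of what the error analysis handles.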

## Large input errors

In the classical application (as in the paper of CN), one worries about this asymptotic mostly as $t \to \infty$. In this region, $I_k(t)$ can be well-approximated by a $J$-Bessel function, which is sufficiently well understood in large argument to give good bounds. Similarly, $I_k^{(k)}(t)$ can be contour-shifted in a way that still ends up being well-approximated by $J$-Bessel functions.

The shape of the resulting bounds ends up being that $\Delta_y^k I_k(\mu_n X)$ is bounded by either

• $(\mu_n X)^{\alpha + k(1 - \frac{1}{2A})}$, where $A$ is a fixed parameter that isn’t worth describing fully, and $\alpha$ is a bound coming from the direct bound of $I_k(t)$, or
• $(\mu_n y)^k (\mu_n X)^\beta$, where $\beta$ is a bound coming from bounding $I_k^{(k)}(t)$.

In both, there is a certain $k$-dependence that comes from the $k$-th Riesz smoothing factors, either directly (from $(\mu_n y)^k$), or via its corresponding inverse Mellin transform (in the bound from $I_k(t)$). But these are the only aspects that depend on $k$.

At this point in the classical argument, one determines when one bound is better than the other, and this happens to be something that can be done exactly, and (surprisingly) independently of $k$. Using this pair of bounds and examining what comes out the other side gives the original result.

## Small input errors

In our application, we also worry about asymptotics as $t \to 0$. While it may still be true that $I_k$ can be approximated by a $J$-Bessel function, the “well-known” asymptotics for the $J$-Bessel function behave substantially worse for small argument. Thus different methods are necessary.

It turns out that $I_k$ can be approximated in a relatively trivial way for $t \leq 1$, so the only remaining hurdle is $I_k^{(k)}(t)$ as $t \to 0$.

We’ve proved a variety of different bounds that hold in slightly different circumstances. And for each sort of bound, the next steps would be the same as before: determine when each bound is better, bound by absolute values, sum together, and then choose the various parameters to best shape the final result.

But unlike before, the boundary between the regions where $I_k$ is best bounded directly or bounded via $I_k^{(k)}$ depends on $k$. Aside from choosing $k$ sufficiently large for convergence properties (which relate to the locations of poles and growth properties of the Dirichlet series and gamma factors), any sufficiently large $k$ would suffice.

## Limiting behavior gives a heuristic region

After I step away from this paper and argument for a while and come back, I wonder about the right way to choose the balancing error. That is, I rework when to use bounds coming from studying $I_k(t)$ directly vs bounds coming from studying $I_k^{(k)}(t)$.

But it turns out that there is always a reasonable heuristic choice. Further, this heuristic gives the same choice of balancing as in the case when $t \to \infty$ (although this is not the source of the heuristic).

Making these bounds will still give bounds for $\Delta_y^k I_k(\mu_n X)$ of shape

• $(\mu_n X)^{\alpha + k(1 - \frac{1}{2A})}$, where $A$ is a fixed parameter that isn’t worth describing fully, and $\alpha$ is a bound coming from the direct bound of $I_k(t)$, or
• $(\mu_n y)^k (\mu_n X)^\beta$, where $\beta$ is a bound coming from bounding $I_k^{(k)}(t)$.

The actual bounds for $\alpha$ and $\beta$ will differ between the case of small $\mu_n X$ and large $\mu_n X$ ($J$-Bessel asymptotics for large, different contour shifting analysis for small), but in both cases it turns out that $\alpha$ and $\beta$ are independent of $k$.

This is relatively easy to see when bounding $I_k^{(k)}(t)$, as repeatedly differentiating under the integral shows essentially that $$I_k^{(k)}(t) = \frac{1}{2\pi i} \int \frac{\Delta(s)}{(\delta - s)\Delta(\delta - s)} t^{\delta - s} ds.$$
(I’ll note that the contour does vary with $k$ in a certain way that doesn’t affect the shape of the result for $t \to 0$).

When balancing the error terms $(\mu_n X)^{\alpha + k(1 - \frac{1}{2A})}$ and $(\mu_n y)^k (\mu_n X)^\beta$, the heuristic comes from taking arbitrarily large $k$. As $k \to \infty$, the point where the two error terms balance is independent of $\alpha$ and $\beta$.
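To spell out one way to read this limiting argument (a sketch, treating $\mu_n X$ as fixed while $k$ grows), set the two bounds equal and take $k$-th roots: \begin{equation*} (\mu_n X)^{\alpha + k(1 - \frac{1}{2A})} = (\mu_n y)^k (\mu_n X)^{\beta} \quad \Longrightarrow \quad (\mu_n X)^{\frac{\alpha - \beta}{k}} (\mu_n X)^{1 - \frac{1}{2A}} = \mu_n y. \end{equation*} As $k \to \infty$, the factor $(\mu_n X)^{(\alpha - \beta)/k}$ tends to $1$, leaving \begin{equation*} (\mu_n X)^{1 - \frac{1}{2A}} = \mu_n y, \qquad \text{that is,} \qquad \mu_n = X^{2A - 1} y^{-2A}. \end{equation*} In this reading, neither $\alpha$ nor $\beta$ appears in the balancing point.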

This reasoning applies to the case when $\mu_n X \to \infty$ as well, and gives the same point. Coincidentally, the actual $\alpha$ and $\beta$ values we proved for $\mu_n X \to \infty$ perfectly cancel in practice, so this limiting argument is not necessary — but it does still apply!

I suppose it might be possible to add another parameter to tune in the final result — a parameter measuring deviation from the heuristic, that can be refined for any particular error bound in a region of particular interest.

But we haven’t done that.

In fact, we were slightly lossy in how we bounded $I_k^{(k)}(t)$ as $t \to 0$, and (for complicated reasons that I’ll probably also forget and reprove to myself later) the heuristic choice assuming $k \sim \infty$ and our slightly lossy bound introduce the same order of imprecision to the final result.

## More coming soon

We’re updating our preprint and will have that up soon. But as I’ve been thinking about this a lot recently, I realize there are a few other things I should note down. I intend to write more on this in the near future.

## Setting up a Wacom Intuos CTL-4100 drawing tablet on Ubuntu 20.04 LTS

For much of the pandemic, when it has come time to write things by hand, I could write on my (old, inexpensive) tablet, or write on paper and point a camera. But more recently I’ve begun to use collaborative whiteboards, and my tablet simply cannot handle it. To be fair, it’s several years old, I got it on sale, and it was even then quite inexpensive. But it’s just not up to the task.

So I bought a Wacom drawing tablet to plug into my computer. Specifically, I bought a Wacom Intuos CTL-4100 (about 70 dollars) and have gotten it working on my multiple monitor Ubuntu 20.04 LTS setup.

For many, that would be the end of the story — as these work very well and are just plug-and-play. Or at least, that’s the story on supported systems. I use Linux as my daily driver, and on my main machine I use Ubuntu. This is explicitly unsupported by Wacom, but there has long been community support and community drivers.

I note here the various things that I’ve done to make this tablet work out well.

My Ubuntu distribution (20.04 LTS) already had drivers installed, so I could just plug it in and “use” the drawing tablet. But there were problems.

Firstly, it turns out that when Wacom Intuos CTL-4100 is first plugged in, the status light on the Wacom dims and indicates that it’s miscalibrated. This is immediately noticeable, as the left third of the tablet corresponds to the whole writing area on the screen (which also happens to be incorrect at first — this is the second point handled below).

This is caused by the tablet mis-identifying my operating system as Android, and the dimmed light is one way the tablet indicates it’s in Android mode. (I’ll note that this is also indicated with a different vendor ID in lsusb, where it’s reported as 0x2D1F instead of 0x056A. This doesn’t actually matter, but it did help me track down the problem).

Thus after plugging in my tablet, it is necessary to restart the tablet in “PC Mode”. This is done by holding the two outer keys on the tablet for a few seconds until the light turns off and on again. After it turns on, it should be at full brightness.

Secondly, I also have multiple screens set up. Although it looks fine, in practice what actually happens is that I have a single “screen” of a certain dimension and the X window system partitions the screen across my monitors. But the Wacom tablet initially was mapped to the whole “screen”, and thus the left side of the tablet was at the far left of my left monitor, and 7 inches or so to the right on the tablet corresponded to the far right of my right monitor. All of my writing had the wrong aspect ratio and this was totally unwieldy.

But this is fixable. After plugging in the tablet and having it in PC Mode (described above), it is possible to map its output to a region of the “screen”. This is easiest done through xrandr and xsetwacom.

First, I used xrandr --listactivemonitors to get the name of my monitors. I see that my right monitor is labelled DP-2. I’ve decided that my monitor labelled DP-2 will be the monitor in which I use this tablet — the area on the tablet will correspond to the area mapped to my right monitor.

Now I will map the STYLUS to this monitor. First I need to find the id of the stylus. To do this, I use xsetwacom --list devices, whose output for me was

Wacom Intuos S Pad pad id: 21 type: PAD
Wacom Intuos S Pen stylus id: 22 type: STYLUS
Wacom Intuos S Pen eraser id: 23 type: ERASER
Wacom Intuos S Pen cursor id: 24 type: CURSOR


I want to map the stylus. (I don’t currently know the effect of mapping anything else, and that hasn’t been necessary, but I suppose this is a thing to keep in mind). Thus I note the id 22.

Then I run xsetwacom --set "22" MapToOutput DP-2, and all works excellently.
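Since the device ids can change between sessions, one might want to look up the stylus id rather than hard-coding it. The following Python sketch parses text in the format of the xsetwacom --list devices output shown above and builds the corresponding command. The helper names are hypothetical, and the parsing assumes each line contains `id: <number>` followed by `type: <NAME>` as in that listing.

```python
import re

# Sample listing, copied from the output shown above.
SAMPLE_OUTPUT = """\
Wacom Intuos S Pad pad id: 21 type: PAD
Wacom Intuos S Pen stylus id: 22 type: STYLUS
Wacom Intuos S Pen eraser id: 23 type: ERASER
Wacom Intuos S Pen cursor id: 24 type: CURSOR
"""


def device_ids(listing):
    """Return a dict mapping device type (e.g. 'STYLUS') to its id string."""
    ids = {}
    for line in listing.splitlines():
        match = re.search(r"id:\s*(\d+)\s+type:\s*(\w+)", line)
        if match:
            ids[match.group(2)] = match.group(1)
    return ids


def map_stylus_command(listing, output_name):
    """Build the xsetwacom command (as an argument list) mapping the stylus to a monitor."""
    stylus_id = device_ids(listing)["STYLUS"]
    return ["xsetwacom", "--set", stylus_id, "MapToOutput", output_name]
```

The resulting list, here `["xsetwacom", "--set", "22", "MapToOutput", "DP-2"]` for the sample listing and `DP-2`, could then be passed to `subprocess.run`. I have only checked the parsing against the listing above.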

I’m sure that I’ll encounter more problems at some point in the future. When I do, I’ll update these notes accordingly.