mixedmath

Explorations in math and programming
David Lowry-Duda



I recently wrote an answer to a question on MSE about estimating the number of squarefree integers up to $X$. Although the result is known and not too hard, I very much like the proof and my approach. So I write it down here.

First, let's see if we can understand why this "should" be true from an analytic perspective.

We know that $$ \sum_{n \geq 1} \frac{\mu(n)^2}{n^s} = \frac{\zeta(s)}{\zeta(2s)},$$ and a general way of extracting information from Dirichlet series is to perform a cutoff integral transform (or a type of Mellin transform). In this case, we get that $$ \sum_{n \leq X} \mu(n)^2 = \frac{1}{2\pi i} \int_{(2)} \frac{\zeta(s)}{\zeta(2s)} X^s \frac{ds}{s},$$ where the contour is the vertical line $\text{Re }s = 2$. By Cauchy's theorem, we shift the line of integration left and poles contribute terms or large order. The pole of $\zeta(s)$ at $s = 1$ has residue $$ \frac{X}{\zeta(2)},$$ so we expect this to be the leading order. Naively, since we know that there are no zeroes of $\zeta(2s)$ on the line $\text{Re } s = \frac{1}{2}$, we might expect to push our line to exactly there, leading to an error of $O(\sqrt X)$. But in fact, we know more. We know the zero-free region, which allows us to extend the line of integration ever so slightly inwards, leading to a $o(\sqrt X)$ result (or more specifically, something along the lines of $O(\sqrt X e^{-c (\log X)^\alpha})$ where $\alpha$ and $c$ come from the strength of our known zero-free region.

In this heuristic analysis, I have omitted bounding the top, bottom, and left boundaries of the rectangles of integration. But proceeding in a similar way as in the proof of the analytic prime number theorem, you could proceed here. So we expect the answer to look like $$ \frac{X}{\zeta(2)} + O(\sqrt X e^{-c (\log X)^\alpha})$$ using no more than the zero-free region that goes into the prime number theorem.

We will now prove this result, but in an entirely elementary way (except that I will refer to a result from the prime number theorem).

We do this in a series of steps.

\begin{equation*} \sum_{d^2 \mid n} \mu(d) = \begin{cases} 1 & \text{if } n \text{ is squarefree} \\ 0 & \text{else} \end{cases} \end{equation*}

Proof. This comes almost immediately upon noticing that this is a multiplicative function, and it's trivial to prove it for prime powers. $\spadesuit$

So to sum up the squarefree numbers up to $X$, we look at $$ \sum_{n \leq X} \sum_{d^2 \mid n} \mu(d) = \sum_{d^2e \leq X} \mu(d)= \sum_{d^2 \leq X} \mu(d) \left\lfloor \frac{X}{d^2} \right\rfloor.$$

This last expression is written in one of the links in Marty's answer, and they prove it with inclusion-exclusion. I happen to find this derivation more intuitive, but it's our launching point forwards.

We approximate the floored bit. Notice that $$ \left \lfloor \frac{X}{d^2} \right \rfloor = \frac{X}{d^2} + E(X,d)$$ with $\lvert E(x,d) \rvert \leq 1$ (we think of it as the Error of the approximation). So the number of squarefree numbers up to $X$ is $$ \sum_{d^2 \leq X} \mu(d) \frac{X}{d^2} + \sum_{d^2 \leq X}\mu(d) E(x,d).$$ We look at the two terms separately.

The first term

The first term can be approximated by the infinite series plus an error term. $$ \sum_{d^2 \leq X} \frac{\mu(d)}{d^2} = X\sum_{d \geq 1} \frac{\mu(d)}{d^2} - X\sum_{d > \sqrt X} \frac{\mu(d)}{d^2} = \frac{X}{\zeta(2)} - X\sum_{d > \sqrt X} \frac{\mu(d)}{d^2}.$$

We must now be a bit careful. If we perform the naive bound, by bounding $\mu(n) \leq 1$, then this last sum is of size $O(\sqrt X)$. That's too big!

So instead, we integrate by parts (Riemann-Stieltjes integration) or equivalently we perform summation by parts to see that $$ \sum_{d > \sqrt X} \frac{\mu(d)}{d^2} = O\left( \int_{\sqrt X}^\infty \frac{M(t)}{t^3} dt \right)$$ where $$ M(t) = \sum_{n \leq t} \mu(n).$$ By the prime number theorem (and as you mention in your question), we know that $M(t) = o(t)$. (In fact, the analytic prime number theorem in one of the easy forms is that $M(X) = O(X e^{-c (\log X)^{1/9}})$, which we might use here). This means that this last term is bounded by $$ o(\sqrt X)$$ if we just use that $M(t) = o(t)$, or $$ O(\sqrt X e^{-c (\log X)^{1/9}})$$ if we use more. This completes the first term $\spadesuit$

Second term

This is easier now. We again use integration by parts. Notice that $$ \begin{align} \sum_{d \leq \sqrt X} \mu(d) E(X,d) &= \sum_{d \leq \sqrt X} (M(d) - M(d - 1))E(X,d) \\ &= M(\lfloor \sqrt X \rfloor) E(X, \lfloor \sqrt X \rfloor) + \sum_{d \leq \sqrt X - 1} M(d) (E(X, d) - E(X, d+1)). \end{align}$$

By using that $E(X,d) \leq 1$ and $M(X) = o(\sqrt X)$ (or the stronger version), we match the results from the first part. $\spadesuit$

Putting these two results together, we have proven that the number of squarefree integers up to $X$ is $$\frac{X}{\zeta(2)} + o(\sqrt X)$$ using only that $M(X) = o(\sqrt X)$, and alternately $$ \frac{X}{\zeta(2)} + O(\sqrt X e^{-c (\log X)^{1/9}})$$ using a bit of the zero-free region. This completes the proof. $\diamondsuit$


Leave a comment

Info on how to comment

To make a comment, please send an email using the button below. Your email address won't be shared (unless you include it in the body of your comment). If you don't want your real name to be used next to your comment, please specify the name you would like to use. If you want your name to link to a particular url, include that as well.

bold, italics, and plain text are allowed in comments. A reasonable subset of markdown is supported, including lists, links, and fenced code blocks. In addition, math can be formatted using $(inline math)$ or $$(your display equation)$$.

Please use plaintext email when commenting. See Plaintext Email and Comments on this site for more. Note also that comments are expected to be open, considerate, and respectful.

Comment via email