mixedmath

Explorations in math and programming
David Lowry-Duda

Examining Excess in the Schmidt Bound

Schmidt1 1Wolfgang Schmidt. Number fields of given degree and bounded discriminant. Astérisque. 1995. Available here, with free download (thanks Numdam!) devised a straightforward way to estimate the quantity of number fields of a given degree up to a certain discriminant.

Schmidt shows that there are at most $O(X^{\frac{n+2}{4}})$ number fields of degree $n$ with discriminant up to $X$. We expect the real answer to be $O(X)$ (or really, $\asymp cX$ for some $c$).

To produce this bound, Schmidt shows that it suffices to count polynomials $$\label{eq:poly} f(x) = x^n + 0 \cdot x^{n-1} + \cdots + c_{n-2} x^{2} + c_1 x + c_n$$ where $c_1 = 0$, $\lvert c_j \rvert \ll H^{j}$, and where $H \asymp X^{1/(2n-2)}$. We typically call these trace $0$ polynomials, as the sum of the roots (as in Vièta's formulas) is $0$.

A bit more detail

Schmidt shows that if $K/\mathbb{Q}$ is a number field of degree $n$, then there is an element $\alpha \in O_K$ with certain bounds on the coefficients of the minimal polynomial of $\alpha$. With a clever choice of basis (and at the cost of an implicit constant that I ignore), we can guarantee that the trace of $\alpha$ is $0$. The worst-counting case ends up being when the given $\alpha$ is primitive (i.e. it generates $K$).

The number of such minimal polynomials then gives an upper bound for the number of number fields. Counting them all shows that there are at most

\begin{equation*} \prod_{j = 2}^n H^j = H^{\frac{n(n+1)}{2} - 1} = X^{\frac{n + 2}{4}}. \end{equation*}

I'm going to begin including more of the tiny evaluations I do all the time with my computing environments. These may be easy manual calculations — but I've been known to make simple mistakes.

# Short sympy verification of exponent simplification.
import sympy
n = sympy.var('n')
g = (n*(n + 1)/2 - 1) / (2*n - 2)
print(sympy.simplify(g))
# >> n/4 + 1/2

# Alternately, in sage.
n = var('n')
g = (n*(n + 1)/2 - 1) / (2*n - 2)
print(g.full_simplify())
# >> n/4 + 1/2
# Direct comparison (sage bonus)
print(bool( g == ((n + 2) / 4) ))
# >> True

Overcounting

There are two ways that Schmidt's argument overcounts.

1. We could be counting fields more than once.
2. We could be counting fields with discriminant larger than $X$.

Heuristically, a generic field with coefficients constrained only by this size limit should be of size $X^{n/2}$, which is much larger than $X$.

Why? The discrimant of a monic polynomial as in \eqref{eq:poly} can be written as \begin{equation*} (-1)^{n(n-1)/2} \prod_{i \neq j} (r_i - r_j), \end{equation*} where the product is over the roots $r_i$ of $f$. The product has $n \cdot (n-1)$ terms. The bound $\lvert c_j \rvert \leq H^j$ above implies that $\left\lvert \sum r_i \right\rvert \leq H$. (More generally, we have \begin{equation*} \left \lvert \sum_{i \neq j} r_i r_j \right \rvert \lesssim H^2, \end{equation*} and similarly for the other basic symmetric polynomials in the roots).

For a generic polynomial, we shouldn't expect severe cancellation among the symmetric polynomials in the roots. Hence we should expect $\lvert r_i \rvert \lesssim H$. For the discriminant this implies that \begin{equation*} \lvert \mathrm{Disc}(f) \rvert \lesssim H^{n(n-1)/2} \asymp X^{n/2}. \end{equation*}

Aside on Conspiratorial Polynomials

The heuristic that $\lvert \mathrm{Disc}(f) \rvert \lesssim X^{n/2}$ is based on generic polynomials. As we all know, a generic polynomial is nice, not out to get you.

What can we say about conspiratorial polynomials? That is, what is the largest that $\lvert \mathrm{Disc}(f) \rvert$ can be for a polynomial in the Schmidt region, with $\lvert c_j \rvert \leq H^j$ (and optionally with $c_1 = 0$)?

I'm not sure! I've never thought about it until now. An easy upper bound is on the order of $X^n$. This is because the discriminant of $f$ is a homogeneous polynomial of degree $2n - 2$ in the coefficients $c_j$, and the largest coefficient is of size $H^n \asymp X^{n/(2n-2)}$.

According to a short, simple argument by Mahler,2 2K. Mahler. An inequality for the discriminant of a polynomial. Michigan Math. J 11(3): 257-262. DOI: 10.1307/mmj/1028999140. Available on projecteuclid, also with free download (Thanks ProjectEuclid!).

the easy upper bound above is achieved (up to multiplication by a constant that I ignore but Mahler makes explicit) for nonmonic polynomials.

Mahler considers polynomials \begin{equation*} f(x) = a_0 x^n + a_1 x^{n-1} + \cdots + a_n, \end{equation*} and defines two notions of height for the polynomials, $$L(f) = \sum \lvert a_j \rvert, \qquad M(f) = \lvert a_0 \rvert \prod_{1 \leq j \leq n} \max(1, \lvert a_j \rvert).$$ He shows that $$\lvert \mathrm{Disc}(f) \rvert \leq n^n M(f)^{2n - 2}$$ (or the same, with $L(f)$ in pace of $M(f)$). Further, equality occurs if and only if $f = a_0 x^n + a_n$ with $\lvert a_0 \rvert = \lvert a_m \rvert > 0$.

In the extremal case, where the equality, the discriminant is made large by having all roots be of the same size and as equally-spaced as possible, while the height bound is as small as possible.

It's not clear to me how to produce better bounds for conspiratorial polynomials in the Schmidt region $\lvert c_j \rvert \leq H^j$. I think this is an interesting question — maybe it's already been considered.

Experimentation on Overcounting

We have on the order of $X^{\frac{n + 2}{4}}$ polynomials in the Schmidt region $\lvert c_j \rvert \leq H^j$ with trace $0$, and each field of discriminant up to $O_n(X)$ is guaranteed to be generated by one of these polynomials. We expect $\Theta(X)$ many such polynomials.

The heuristic that the generic polynomial here will have too large a discriminant shows that much of the overcounting should come from polynomials generating fields that are too large.

Heuristically, this suggests that the right way to improve the count would be to reduce this overcounting, perhaps by considering in addition other invariants attached to the polynomials. I'm burying the lede here — this is what was done by Ellenberg and Venkatesh, Couveignes, and Lemke Oliver and Thorne when they made further improvements for large degree. And in a different direction, this is what Anderson, Gafni, Hugnes, Lemke Oliver, Thorne, Wang, Zhang, and I do in our paper for small degree (and what Bhargava, Shankar, and Wang do in their stronger paper.

But I wonder: does repeated counting contribute meaningfully? I do this with one eye looking forwards: could it be that there is significant repeat counting even in the more recent work studying additional invariants? I don't know!

For the remainder of this note, I perform a set of (naive) experiments.

In the first, I'm going to exhaustively look at small heights and small degrees. (This is doomed to not work well, as there are $H^{\frac{n(n+1)}{2} - 1}$ polynomials, which grows very quickly).

I do this using sage.

Exhaustive Search

First, I implement an iterator over monic polynomials with given height.The keyword yield in this expression makes this function return a generator, which can be iterated over. The nice thing about this approach is that it is lazy, and the list of polynomials isn't precomputed.

import itertools

def polys_of_height_H(H, degree=3):
"""
Returns an iterator over monic polynomials of degree degree
with height up to H.
"""
l = []
for j in range(1, degree + 1):
if j != degree - 1:
l.append(range(-H^j, H^j + 1))
for partial_coeff_list in itertools.product(*l):  # Cartesian product
coeffs = [1, 0] + list(partial_coeff_list)
# Polynomials want coefficients in reverse order
yield ZZ[x](list(reversed(coeffs)))
# list(.) makes iterator explicit for sage

We verify on two small examples.

iter2 = polys_of_height_H(2, degree=2)
for poly in iter2:
print(poly)
# Cubic, height 1
iter1 = polys_of_height_H(1, degree=3)
for poly in iter1:
print(poly)
x^2 - 4
x^2 - 3
x^2 - 2
x^2 - 1
x^2
x^2 + 1
x^2 + 2
x^2 + 3
x^2 + 4

x^3 - x - 1
x^3 - x
x^3 - x + 1
x^3 - 1
x^3
x^3 + 1
x^3 + x - 1
x^3 + x
x^3 + x + 1

Good!

We are only interested in irreducible polynomials. And we want to make sure that we detect when two different polynomials generate the same field extension.

For the former, we use sage's is_irreducible.

For the latter, we convert the polynomial to a PARI/GP polynomial and use its polredabs.3 3I don't think this can be done directly from a sage polynomial object? Interesting.

poly = ZZ[x](x^3 + x^2 + x)
print(poly.is_irreducible())
# >> False
poly1 = ZZ[x](x^2 + 1)
poly2 = ZZ[x](x^2 + 4)

pari_poly1 = pari.Pol(poly1)
pari_poly2 = pari.Pol(poly2)

print("Poly1", pari_poly1.polredabs())
# >> Poly1 x^2 + 1
print("Poly2", pari_poly2.polredabs())
# >> Poly2 x^2 + 1

# We can use a set to see what we have and haven't already seen
seen = set()
for poly in [poly1, poly2]:
pari_poly = pari.Pol(poly)
reduced = pari_poly.polredabs()
if reduced not in seen:
print("New extension", poly)
else:
# >> New extension x^2 + 1
# >> Already seen  x^2 + 4

Now we set up the experiment itself. I am going to cheat in one small way: to generate exactly those fields of discriminant up to $X$, one can take $H$ to be a constant (depending on $n$) times $X^{1/(2n - 2)}$. I'm going to ignore that constant as it's a bit orthogonal to what I'm investigating (unless somehow all the duplication or all the excess occurs at the periphery... which isn't the case).

def perform_experiment(H, degree=2):
seen = set()
num_total_polys = 0
num_repeat_polys = 0
num_large_polys = 0
num_both_polys = 0
discriminant_bound = H^(2*degree - 2)
polyiter = polys_of_height_H(H, degree=degree)
for poly in polyiter:
is_too_large = False
is_repeat = False
if poly.is_irreducible() == False:
continue
# It's irreducible
num_total_polys += 1
pari_poly = pari.Pol(poly)
reduced = pari_poly.polredabs()
disc = reduced.nfdisc()
if abs(disc) > discriminant_bound:
is_too_large = True
if reduced in seen:
is_repeat = True
if is_too_large and is_repeat:
num_both_polys += 1
if is_too_large:
num_large_polys += 1
if is_repeat:
num_repeat_polys += 1
return (
num_total_polys,
num_repeat_polys,
num_large_polys,
num_both_polys
)

Degree $2$

I run this first in the trivial case for (monic, trace $0$) quadratic polynomials of increasing heights. This is really just keeping track of polynomials $x^2 + c_2$ for $\lvert c_2 \rvert \leq H$, which we already understand.

The tabulation for this is below.

$H$ Total Repeat Large Both Time (seconds)
2 6 1 3 0 0.45841026306152344
4 28 7 11 0 0.02414727210998535
8 120 43 37 0 0.093841552734375
16 496 183 157 0 0.5661072731018066
32 2016 769 624 0 1.5321400165557861
64 8128 3147 2488 0 6.250866889953613
128 32640 12717 9962 0 25.209779262542725
256 130816 51129 39840 0 100.79308009147644
512 523776 205057 159360 0 404.81032609939575
1024 2096128 821207 637466 0 1642.1120653152466

Note that the ratio $\text{Repeat}/\text{Total} \approx 0.3914$. In fact, a careful eye might notice that \begin{equation*} \frac{\text{Repeat}}{\text{Total}} + \frac{6}{\pi^2} \approx 1. \end{equation*} Of course, this should be expected.

Degree $3$

Now for degree $3$. This is rather small, but already the Schmidt counting is too large. Schmidt would guess $X^{5/4}$ cubic fields of discriminant up to $X$. What do we see?

The table for this is below.

$H$ Total Repeat Large Both Time(seconds)
2 68 37 68 37 0.5513186454772949
4 1094 671 950 552 1.9168543815612793
8 17162 10350 15174 8759 28.84293484687805
16 269322 165186 245680 146342 464.929158449173
32 4255708 2596875 4012452 2405501 7612.138286828995
48 21445478 13175063 20502652 12428495 40084.37070131302

Notice how radically large Large is — as expected, most polynomials sampled generate number fields with too large a discriminant. Further, the percentages of sampled polynomials is rising as $H$ increases (not counting $H = 2$):

\begin{equation*} 0.86, 0.88, 0.91, 0.94, 0.95. \end{equation*}

In contrast, the percentage of polynomials yielding number fields that we've already seen appears to be limiting to a number of size approximately $0.614$.

Is this provable? The degree is small, so it might be directly computable (as in the quadratic case).

Degree $4$

Now for degree $4$. This is already large in terms of the number of polynomials we encounter. We're running into the limits of exhaustive search already.

$H$ Total Repeat Large Both Time(seconds)
2 1284 584 1284 584 4.150874853134155
4 149323 70384 147563 68923 440.8601644039154
8 17920563 8654549 17816959 8573910 56531.99887394905

The observations here are morally the same as with cubic. A positive proportion seem to come from repeat polynomials, while almost all seem to come from polynomials that are too large.

I note here that it's unclear apriori whether we should expect any correlation between repeat polynomials and large polynomials. One way to make a heuristic guess would be to check whether \begin{equation*} P(f = \text{large AND } f = \text{repeat}) \approx P(f = \text{large})P(f = \text{repeat}). \end{equation*} Both here and in previous tabulations, this seems true. That is, it seems that being a repeat polynomial and being too large are independent.

Degree $5$

Finally, we look at degree $5$. This is already far too large.

$H$ Total Repeat Large Both Time(seconds)
1 34 20 34 20 0.5939536094665527
2 46156 23500 46156 23500 203.79982089996338
3 3509494 1765764 3509156 1765466 15944.9661591053

It's certainly not true that every polynomial is too large. Exhaustive consideration in the degree $5$ becomes computationally expensive before we can escape the law of small numbers scenario.

Context

I think I was primed4 4This is completely unrelated, but apparently the psychological theory of priming is "effectively dead" and somewhat debunked, to paraphrase Kahneman. The experiments aren't replicable. to expect that there was something to be gained.

In our paper AGHLOLDTWZ, we count number fields in the Schmidt region but without the trace $0$ constraint. One of the things that we show is that incorporating the factor of $H$ additional polynomials by allowing $\lvert c_1 \rvert \leq H$ (instead of $c_1 = 0$) overcounts each number field by a factor of $H$.

We did this out of a certain sense of laziness — it makes certain boxes more symmetric and it's easier to study orbits of polynomials under a certain group action (which we had been using in an earlier version of the paper... but which we don't use in the final version at all). I decided it could be easy to overcount, and then possibly easy to account for that overcounting.

I wouldn't have been surprised if it was possible to save, say, $X^{\frac{1}{2n - A}}$ for some $A > 0$, for $n \gg 1$, or something like that.

But no, that is not supported by this experimentation at all!

Conclusion

As we knew would occur, 100 percent of polynomials give number fields that have discriminants that are too large.

On the other hand, these tabulations suggest that counting repeated polynomials might account for a percentage of the fields seen. This amounts to a savings by a constant factor, and isn't very interesting.

Further, it seems that being a repeat polynomial and having too large of a discriminant do not correlate.

In summary, it's not worth trying to account for sets of polynomials that generate the same number field.

Info on how to comment

bold, italics, and plain text are allowed in comments. A reasonable subset of markdown is supported, including lists, links, and fenced code blocks. In addition, math can be formatted using $(inline math)$ or $$(your display equation)$$.

Please use plaintext email when commenting. See Plaintext Email and Comments on this site for more. Note also that comments are expected to be open, considerate, and respectful.

Comment via email

1. 2024-02-13 David Lowry-Duda

I experiment with mirroring some commentary from Mastodon and here. This is from @davidlowryduda@mathstodon.xyz.

I study excess in the Schmidt bound for number fields. Schmidt estimated that there are at most $X^{\frac{n + 2}{4}}$ number fields of degree $n$ with discriminant up to $X$. But we expect the real number is really of the form $cX$ for some constant $c$.

To do this, Schmidt gives a region of polynomials to count, where some polynomial is guaranteed to generate each number field of discriminant up to $X$. There are two sorts of error there: counting polynomials that yield number fields with discriminant larger than $X$, and sets of polynomials that all count the same number field.

Which error is larger?

For heuristic reasons, we should expect counting polynomials that are too large to be a major source of overcounting. So perhaps a meaningful question would be: does counting the same number field multiple times account for much of the error at all?

In this note I describe an experiment and show that no, repeated counting isn't a significant factor.

2. 2024-02-13 David Lowry-Duda

I will also note that I hadn't expected this to be so cut and dry, and I ran a separate experiment that sampled randomly from the space of polynomials and went in that direction.

The setup and conclusion are more complicated. But the conclusion is the same. So I chose to not document or include it.