{ "cells": [ { "cell_type": "markdown", "id": "d1d5608f-dcc5-432d-a58f-db2dc5083111", "metadata": {}, "source": [ "# Out of Distribution Data for \"ML the vanishing order of rational L-functions\" #\n", "\n", "At Harvard's CMSA [recent workshop on Math and ML](https://cmsa.fas.harvard.edu/event/mml_2025/), Alyson Deines and Tamara Deenstra gave a talk that included some of our recent work. (See [arxiv link to ML the vanishing order](https://arxiv.org/pdf/2502.10360) and the reference [BBCDLLDOQV25b](https://davidlowryduda.com/research/#BBCDLLDOQV25b) on my research page.)\n", "\n", "To summarize one observation in few words: an $L$-function can be written as\n", "\n", "$$ L(s) = \\sum_{n \\geq 1} \\frac{a(n)}{n^s} $$\n", "\n", "and satisfies a functional equation of the shape $\\Lambda(s) := N^{s} G(s) L(s) = \\varepsilon \\Lambda(1 - s)$, where $N$ and $\\varepsilon$ are distinguished numbers called the conductor and root number, respectively, and where $G(s)$ is a product of gamma factors. We're interested in rational $L$-functions, where the coefficients $a(n)$ are rational numbers when appropriately normalized. These arise from counting questions in number theory and arithmetic geometry.\n", "\n", "Conjecturally, the analytic behavior of $L(s)$ at $s = 1/2$ contains deep arithmetic information about the associated counting questions. One such conjecture is the [Birch and Swinnerton-Dyer Millenium Conjecture](https://en.wikipedia.org/wiki/Birch_and_Swinnerton-Dyer_conjecture), which says that the order of vanishing at $s = 1/2$ of the $L$-function associated to an elliptic curve agrees with the arithmetic rank of the curve.\n", "\n", "No one knows how to prove those results. In [BBCDLLDOQV25b](https://davidlowryduda.com/research/#BBCDLLDOQV25b), we looked to see if ML might successfully learn something about the order of vanishing from the first several coefficients.\n", "\n", "Aside: model success or failure wouldn't say something conclusive about BSD or related conjectures. But in practice, ML can act like a one-sided oracle: if model performance on a particular set of features is very high, this indicates that the arithmetic information is contained within those set of features. If mathematicians don't understand *why* or *how*, then at least this can point to a place where we can look for more.\n", "\n", "One suggestive graph comes from looking at the 2D PCA. Let's make it here. The data is available at https://zenodo.org/records/14774042. In the code below, this data is in the file `lfun_rat_withap.txt`." 
] }, { "cell_type": "code", "execution_count": 1, "id": "dce75a97-f365-43b4-a09f-7c69b6de8b85", "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "import numpy as np" ] }, { "cell_type": "code", "execution_count": 2, "id": "3272b5ba-1219-4937-94ec-684aec08eeeb", "metadata": {}, "outputs": [], "source": [ "def is_prime(n):\n", " \"Naive prime detection\"\n", " if n < 2:\n", " return False\n", " if n < 4:\n", " return True\n", " for d in range(2, int(n**.5) + 2):\n", " if n % d == 0:\n", " return False\n", " return True\n", "\n", "\n", "assert [p for p in range(20) if is_prime(p)] == [2, 3, 5, 7, 11, 13, 17, 19]\n", "\n", "\n", "def write_to_float(ap_list):\n", " \"\"\"\n", " Convert the ap string containing aps to a list of ints.\n", " \"\"\"\n", " ap_list = ap_list.replace('[','')\n", " ap_list = ap_list.replace(']','')\n", " ap_list = [float(ap) for ap in ap_list.split(',')]\n", " return ap_list\n", "\n", "\n", "ALL_COLUMNS = [str(n) for n in range(1,1001)]\n", "PRIME_COLUMNS=[str(p) for p in range(1000) if is_prime(p)]\n", "\n", "\n", "def write_to_murm_normalized_primes(an_list, w, d = 1):\n", " \"\"\"\n", " Convert the ap strings to a list of normalized floats.\n", " \"\"\"\n", " an_list = an_list.strip('[')\n", " an_list = an_list.strip(']')\n", " an_list = [int(an) for an in an_list.split(',')]\n", " normalized_list = []\n", " for n, an in enumerate(an_list):\n", " p=int(PRIME_COLUMNS[n])\n", " normalization_quotient = d * (p**(w/2.))\n", " normalized_list.append(np.float32(round(an / normalization_quotient, 5)))\n", " return normalized_list\n", "\n", "\n", "def build_murm_ap_df(DF):\n", " \"\"\"\n", " Create a dataframe expanding the single column list of Dirichlet coefficients\n", " into only prime coefficients, with a single column per prime.\n", " \"\"\"\n", " # Copy all the existing columns except 'ap'\n", " base = DF.drop(columns=[\"ap\"]).copy()\n", " \n", " # Expand prime coefficients for each row\n", " prime_expanded = pd.DataFrame(\n", " [write_to_murm_normalized_primes(a, w, d) \n", " for w, a, d in zip(DF[\"motivic_weight\"], DF[\"ap\"], DF[\"degree\"])],\n", " columns=PRIME_COLUMNS,\n", " index=DF.index\n", " )\n", " DF_new = pd.concat([base, prime_expanded], axis=1)\n", " return DF_new" ] }, { "cell_type": "markdown", "id": "a03dbd74-e766-4010-8482-c261c515f000", "metadata": {}, "source": [ "In the code above, the coefficients are normalized. Coefficient normalization in $L$-functions is an easy source of confusion. Different applicatoins and contexts suggest different obvious normalizations. Instead of detailing this normalization, I'll note that the effect is that the functional equations in this normalization have shape $s \\mapsto 1 - s$, and the normalized coefficients vary between $[-1, 1]$." ] }, { "cell_type": "code", "execution_count": 3, "id": "829e7ca1-0891-45c7-bd9a-4fc0cd2f3dcd", "metadata": {}, "outputs": [], "source": [ "fname = \"lfun_rat_withap.txt\"\n", "DF_base = pd.read_table(fname, delimiter=\":\")" ] }, { "cell_type": "code", "execution_count": 4, "id": "5d9a651f-e92f-45c3-b740-d0b4f5e64e0a", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", " | label | \n", "primitive | \n", "conductor | \n", "central_character | \n", "motivic_weight | \n", "degree | \n", "order_of_vanishing | \n", "z1 | \n", "root_angle | \n", "root_analytic_conductor | \n", "instance_types | \n", "ap | \n", "
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | \n", "1-1-1.1-r0-0-0 | \n", "True | \n", "1 | \n", "1.10 | \n", "0 | \n", "1 | \n", "0 | \n", "14.134725 | \n", "0.0 | \n", "0.004644 | \n", "['NF', 'DIR', 'Artin', 'Artin'] | \n", "[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ... | \n", "
1 | \n", "1-5-5.4-r0-0-0 | \n", "True | \n", "5 | \n", "5.40 | \n", "0 | \n", "1 | \n", "0 | \n", "6.648453 | \n", "0.0 | \n", "0.023220 | \n", "['DIR'] | \n", "[-1, -1, 0, -1, 1, -1, -1, 1, -1, 1, 1, -1, 1,... | \n", "
2 | \n", "1-2e3-8.5-r0-0-0 | \n", "True | \n", "8 | \n", "8.50 | \n", "0 | \n", "1 | \n", "0 | \n", "4.899974 | \n", "0.0 | \n", "0.037152 | \n", "['DIR'] | \n", "[0, -1, -1, 1, -1, -1, 1, -1, 1, -1, 1, -1, 1,... | \n", "
3 | \n", "1-12-12.11-r0-0-0 | \n", "True | \n", "12 | \n", "12.11 | \n", "0 | \n", "1 | \n", "0 | \n", "3.804628 | \n", "0.0 | \n", "0.055728 | \n", "['DIR'] | \n", "[0, 0, -1, -1, 1, 1, -1, -1, 1, -1, -1, 1, -1,... | \n", "
4 | \n", "1-13-13.12-r0-0-0 | \n", "True | \n", "13 | \n", "13.12 | \n", "0 | \n", "1 | \n", "0 | \n", "3.119341 | \n", "0.0 | \n", "0.060372 | \n", "['DIR'] | \n", "[-1, 1, -1, -1, -1, 0, 1, -1, 1, 1, -1, -1, -1... | \n", "
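The expanded frame in the next preview comes from applying `build_murm_ap_df` (the variable name `DF` is my choice here; the sketches below reuse it):

```python
DF = build_murm_ap_df(DF_base)
DF.head()
```

This keeps the 11 metadata columns and adds one normalized column for each of the 168 primes below 1000, for 179 columns in total.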
\n", " | label | \n", "primitive | \n", "conductor | \n", "central_character | \n", "motivic_weight | \n", "degree | \n", "order_of_vanishing | \n", "z1 | \n", "root_angle | \n", "root_analytic_conductor | \n", "... | \n", "937 | \n", "941 | \n", "947 | \n", "953 | \n", "967 | \n", "971 | \n", "977 | \n", "983 | \n", "991 | \n", "997 | \n", "
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | \n", "1-1-1.1-r0-0-0 | \n", "True | \n", "1 | \n", "1.10 | \n", "0 | \n", "1 | \n", "0 | \n", "14.134725 | \n", "0.0 | \n", "0.004644 | \n", "... | \n", "1.0 | \n", "1.0 | \n", "1.0 | \n", "1.0 | \n", "1.0 | \n", "1.0 | \n", "1.0 | \n", "1.0 | \n", "1.0 | \n", "1.0 | \n", "
1 | \n", "1-5-5.4-r0-0-0 | \n", "True | \n", "5 | \n", "5.40 | \n", "0 | \n", "1 | \n", "0 | \n", "6.648453 | \n", "0.0 | \n", "0.023220 | \n", "... | \n", "-1.0 | \n", "1.0 | \n", "-1.0 | \n", "-1.0 | \n", "-1.0 | \n", "1.0 | \n", "-1.0 | \n", "-1.0 | \n", "1.0 | \n", "-1.0 | \n", "
2 | \n", "1-2e3-8.5-r0-0-0 | \n", "True | \n", "8 | \n", "8.50 | \n", "0 | \n", "1 | \n", "0 | \n", "4.899974 | \n", "0.0 | \n", "0.037152 | \n", "... | \n", "1.0 | \n", "-1.0 | \n", "-1.0 | \n", "1.0 | \n", "1.0 | \n", "-1.0 | \n", "1.0 | \n", "1.0 | \n", "1.0 | \n", "-1.0 | \n", "
3 | \n", "1-12-12.11-r0-0-0 | \n", "True | \n", "12 | \n", "12.11 | \n", "0 | \n", "1 | \n", "0 | \n", "3.804628 | \n", "0.0 | \n", "0.055728 | \n", "... | \n", "1.0 | \n", "-1.0 | \n", "1.0 | \n", "-1.0 | \n", "-1.0 | \n", "1.0 | \n", "-1.0 | \n", "1.0 | \n", "-1.0 | \n", "1.0 | \n", "
4 | \n", "1-13-13.12-r0-0-0 | \n", "True | \n", "13 | \n", "13.12 | \n", "0 | \n", "1 | \n", "0 | \n", "3.119341 | \n", "0.0 | \n", "0.060372 | \n", "... | \n", "1.0 | \n", "-1.0 | \n", "-1.0 | \n", "1.0 | \n", "-1.0 | \n", "1.0 | \n", "-1.0 | \n", "-1.0 | \n", "1.0 | \n", "1.0 | \n", "
5 rows × 179 columns
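A minimal sketch of one way the feature-importance table below could be produced, assuming a random-forest classifier and scikit-learn's permutation importance (the `feature`, `importance_mean`, and `importance_std` column names match the output, but the model, split, and hyperparameters here are my assumptions, not necessarily what we used in the paper):

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Predict the order of vanishing from the normalized prime coefficients.
X = DF[PRIME_COLUMNS]
y = DF["order_of_vanishing"]
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Shuffle each feature column on held-out data and record the drop in score.
result = permutation_importance(
    model, X_test, y_test, n_repeats=10, random_state=0
)
importances = pd.DataFrame({
    "feature": PRIME_COLUMNS,
    "importance_mean": result.importances_mean,
    "importance_std": result.importances_std,
}).sort_values("importance_mean", ascending=False)
importances
```

Permutation importance measures how much the held-out score degrades when a single feature column is shuffled, so larger values mean the model leans more heavily on that prime's coefficient.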
\n", "\n", " | feature | \n", "importance_mean | \n", "importance_std | \n", "
---|---|---|---|
5 | \n", "13 | \n", "0.042678 | \n", "0.000862 | \n", "
4 | \n", "11 | \n", "0.040870 | \n", "0.000925 | \n", "
3 | \n", "7 | \n", "0.040661 | \n", "0.000815 | \n", "
7 | \n", "19 | \n", "0.037700 | \n", "0.000879 | \n", "
2 | \n", "5 | \n", "0.037221 | \n", "0.000829 | \n", "
... | \n", "... | \n", "... | \n", "... | \n", "
110 | \n", "607 | \n", "0.000384 | \n", "0.000293 | \n", "
146 | \n", "853 | \n", "0.000311 | \n", "0.000282 | \n", "
153 | \n", "887 | \n", "0.000268 | \n", "0.000240 | \n", "
165 | \n", "983 | \n", "0.000192 | \n", "0.000247 | \n", "
159 | \n", "941 | \n", "0.000175 | \n", "0.000270 | \n", "
168 rows × 3 columns
\n", "