mixedmath

Explorations in math and programming
David Lowry-Duda



The Möbius function in python+CPP: basic ctypes

I've been computing several experiments1 1See for example notes from my talk or my general or technical notes on recent ML experiments leading to my preprint. with the Möbius function $\mu(n)$. I've done some of these experiments on machines without computer algebra systems like sage installed.

Instead of installing sage or PARI/GP, I wrote my own routines to compute $\mu(n)$. The method I use now is based on wheel factorization using the primes $2, 3$, and $5$ (a mod $30$ wheel).

def mobius(n):
    if n < 1:
        return 0
    if n == 1:
        return 1
    ret = 1
    if n % 2 == 0:
        ret *= -1
        n //= 2
    if n % 2 == 0:
        return 0
    if n % 3 == 0:
        ret *= -1
        n //= 3
    if n % 3 == 0:
        return 0
    if n % 5 == 0:
        ret *= -1
        n //= 5
    if n % 5 == 0:
        return 0

    incs = [4, 2, 4, 2, 4, 6, 2, 6]
    p = 7
    i = 0
    while p*p <= n:
        if n % p == 0:
            ret *= -1
            n //= p
        if n % p == 0:
            return 0
        p += incs[i]
        i = (i + 1) % len(incs)
    if n > 1:
        ret *= -1
    return ret

Most of the time when $n$ is divisible by a square, it's divisible by a small square (say $2^2, 3^2, 5^2$, or $7^2$ — these account for just over $95\%$ of squarefull numbers). For these, this implementation terminates very quickly and correctly returns $0$ without doing extra work. When $n$ isn't squarefree, this function essentially factors the number using wheel factorization.

(Many implementations in the wild always factor $n$ completely, and so this simple algorithm has similar or better performance than many implementations in CAS).

This is almost fast enough for my purposes. Fortunately, it's trivial2 2Despite knowing it's easy, I frequently forget how to do this. I write this note for future me. to rewrite this program in C or CPP and call it from python using the standard library ctypes module.

CPP Version

I prefer CPP to C, so I give the CPP version here.3 3There is almost no difference here except the mysterious extern "C" incantation. We write a CPP version.

/* mobius.cpp */

extern "C" {

int mobius(long long n) {
  if (n < 1) { return 0; }
  if (n == 1) { return 1; }
  int ret = 1;

  if (n % 2 == 0) { ret *= -1; n /= 2; }
  if (n % 2 == 0) { return 0; }

  if (n % 3 == 0) { ret *= -1; n /= 3; }
  if (n % 3 == 0) { return 0; }

  if (n % 5 == 0) { ret *= -1; n /= 5; }
  if (n % 5 == 0) { return 0; }

  long long incs[] = {4, 2, 4, 2, 4, 6, 2, 6};
  int i = 0;
  long long p = 7;

  while (p*p <= n) {
    if (n % p == 0) { ret *= -1; n /= p; }
    if (n % p == 0) { return 0; }
    p += incs[i];
    i = (i + 1) % 8;
  }
  if (n > 1) { return -ret; }
  return ret;
}

}

This is the same implementation as above. What we'll do is compile this into a shared library and load this library into python. We can do this with

g++ –std=c++17 -O3 -shared -o mobius.so -fPIC mobius.cpp

This outputs a shared library mobius.so. Now we need to load this into python.

import ctypes
import os


dldlib = ctypes.CDLL(os.path.abspath('mobius.so'))
dldmobius = dldlib.mobius
dldmobius.argtypes = [ctypes.c_longlong]
dldmobius.restype = ctypes.c_int


print(dldmobius(1001))
#> -1

That's all there is to it.

I strongly advise setting argtypes and restype. In older versions of python and ctypes, this was necessary to maintain the integrity of the stack. It is unclear to me from the documentation what automatic type conversions ctypes does. I know it doesn't handle pointers automatically, but will int, long, long long all be handled correctly? I have no idea! So just set the types and know that it will work now.

On my machine, computing $\sum_{n = 1}^{10^6} \mu(n)$ in pure python, python+ctypes, and pure cpp took the following times:

program execution time
python 6.504s
ctypes 0.695s
cpp 0.329s

In other words, this is a radical speed improvement. For larger $n$, the difference in speed becomes even more pronounced.

Moral

The moral of the story is that for simple functions, i.e. functions that operate on standard types and return standard types, using ctypes to wrap a speedy version written in C or CPP is easy.

It's so easy that any basic LLM would almost certainly be able to write the C or CPP version from the python version. (Yes! And this makes it even easier to do).

Aside: Testing Gemini

Actually, let's try this now. I'm asking gemini-2.0-flash to convert the python code above into a C version and a CPP version. Here's what it says (with some linebreaks and comments removed manually).

#include <stdio.h>

int mobius(int n) {
    if (n < 1) { return 0; }
    if (n == 1) { return 1; }
    int ret = 1;
    if (n % 2 == 0) { ret *= -1; n /= 2; }
    if (n % 2 == 0) { return 0; }
    if (n % 3 == 0) { ret *= -1; n /= 3; }
    if (n % 3 == 0) { return 0; }
    if (n % 5 == 0) { ret *= -1; n /= 5; }
    if (n % 5 == 0) { return 0; }

    int incs[] = {4, 2, 4, 2, 4, 6, 2, 6};
    int p = 7;
    int i = 0;
    while (p * p <= n) {
        if (n % p == 0) { ret *= -1; n /= p; }
        if (n % p == 0) { return 0; }
        p += incs[i];
        i = (i + 1) % (sizeof(incs) / sizeof(incs[0]));
    }
    if (n > 1) { ret *= -1; }
    return ret;
}

int main() {
    printf("mobius(30) = %d\n", mobius(30));
    printf("mobius(31) = %d\n", mobius(31));
    printf("mobius(32) = %d\n", mobius(32));
    return 0;
}

Essentially correct, except for the int types. But how would it know?

Now for CPP.

#include <iostream>

int mobius(int n) {
  if (n < 1) { return 0; }
  if (n == 1) { return 1; }
  int ret = 1;
  if (n % 2 == 0) { ret *= -1; n /= 2; }
  if (n % 2 == 0) { return 0; }
  if (n % 3 == 0) { ret *= -1; n /= 3; }
  if (n % 3 == 0) { return 0; }
  if (n % 5 == 0) { ret *= -1; n /= 5; }
  if (n % 5 == 0) { return 0; }

  int incs[] = {4, 2, 4, 2, 4, 6, 2, 6};
  int p = 7;
  int i = 0;
  while (p * p <= n) {
    if (n % p == 0) { ret *= -1; n /= p; }
    if (n % p == 0) { return 0; }
    p += incs[i];
    i = (i + 1) % (sizeof(incs) / sizeof(incs[0]));
  }
  if (n > 1) { ret *= -1; }
  return ret;
}

int main() {
  std::cout << "mobius(8): " << mobius(8) << std::endl;
  std::cout << "mobius(9): " << mobius(9) << std::endl;
  std::cout << "mobius(10): " << mobius(10) << std::endl;
  std::cout << "mobius(30): " << mobius(30) << std::endl;
  return 0;
}

Reasonable. Also int values and logs of stream flushing with std::endl, but that wouldn't actually make it to the python code.


Leave a comment

Info on how to comment

To make a comment, please send an email using the button below. Your email address won't be shared (unless you include it in the body of your comment). If you don't want your real name to be used next to your comment, please specify the name you would like to use. If you want your name to link to a particular url, include that as well.

bold, italics, and plain text are allowed in comments. A reasonable subset of markdown is supported, including lists, links, and fenced code blocks. In addition, math can be formatted using $(inline math)$ or $$(your display equation)$$.

Please use plaintext email when commenting. See Plaintext Email and Comments on this site for more. Note also that comments are expected to be open, considerate, and respectful.

Comment via email