Tristan Stérin, Maja Kądziołka
bbchallenge.org
Notes for those viewing the slides online:
- a recording of this talk is available on YouTube
- in particular, some explanations rely on things being drawn live, with spoken commentary
- during testing, we have noticed that having the Bitwarden extension installed
makes things choppy, at least on Firefox
- if you have it installed, consider using a private tab or a different browser profile
The busy beaver function
Tibor Radó, 1895 — 1965
\(BB(n)\): maximum number of steps done by a halting 2-symbol Turing machine
with \(n\) states starting from an all-0 tape.
T. Radó. On Non-computable Functions. Bell System Technical Journal, 41(3):877–884. 1962.
BB(n) = "Maximum algorithmic bang for your buck"
UNCOMPUTABLE
BB(5) winner, first 20,000 steps
Small busy beaver values:
- BB(1) = 1, BB(2) = 6 [Radó, 1962]
- BB(3) = 21 [Radó and Lin, 1963]
- BB(4) = 107 [Brady, 1983]
- BB(5) ≥ 47,176,870
[Marxen and Buntrock, 1989]
- BB(5) = 47,176,870
[bbchallenge, 2024]
For each 4-state Turing machine, witness that it halts
within 107 steps, or prove it never halts (which can be hard in general)
Mathematical hardness of busy beaver values
Any \(\Pi_1\) statement (i.e. \(\forall x, P(x)\) for computable \(P\)) can be converted
into a Turing machine that halts iff the statement is false,
by searching for counterexamples.
Every positive even integer \(n > 2\) can be written as a sum of two primes.
\(4 = 2 + 2 \quad 6 = 3 + 3 \quad 8 = 5 + 3 \quad 10 = 7 + 3 \quad 12 = 7 + 5
\quad 14 = 11 + 3 \quad \ldots\)
- Enumerate all even numbers \(n\)
- For each \(n\), enumerate all primes \(p < n\) and test all sums
- Halt if no sum reaches the current \(n\)
There is a 25-state Turing machine that implements the above.
[anonymous, 2016] Verified in Lean! [lengyijun, 2024]
Hence, knowing BB(25) is "at least as hard" as solving Goldbach's conjecture.
Other examples:
- Riemann Hypothesis \(\rightarrow BB(744)\) [Matiyassevich and O'Rear, 2016]
- Consistency of ZF \(\rightarrow BB(748)\) [O'Rear, 2017] [Riebel, 2024]
Georgi Georgiev's bbfind program
Georgi "Skelet" Georgiev
In 2003, Georgi Georgiev's (Skelet) publishes his bbfind program:
- solves all 5-state machines except ~150 hard ones
- best result since 1989 [Marxen and Buntrock, 1989]
- ~6000 lines of undocumented Pascal code
- concerns about correctness
- Note: failed to find a better machine
In 2020, Scott Aaronson conjectured that there is no better machine:
Marxen and Buntrock's machine is the BB(5) winner.
bbchallenge.org: The Busy Beaver Challenge
- Created by Tristan in 2022
- Online, asynchronous, almost exclusively communicating on Discord
- No "management": entropically driven research
- Even the formal proof happened through a grassroots initiative
- 800+ members on Discord
- ~50 active contributors
- ~15 contributors whose contributions were directly used in the Coq proof
- Galaxy of ~25 GitHub repositories, many languages used:
C++, Python, Rust, Go, Haskell, Pascal, Coq, Lean, Dafny

Proof outline
1. Enumerate all 5-state Turing machines
- There are \(21^{10} \approx 1.67 \cdot 10^{13}\) possible 5-state Turing machines
- Only \(181,385,789\) after symmetries and pruning
2. Try automated proof strategies
- Deciders: programs that, given a TM, output one of
HALTS
DOESN'T HALT
UNKNOWN
\(\Downarrow\)
try another decider
\[
\forall \mathrm{tm}. \mathrm{decider}(\mathrm{tm}) = \mathtt{NONHALT} \Longrightarrow \text{tm doesn't halt}
\]
3. Manually inspect what's left
- Holdouts: machines that the decider pipeline couldn't deal with
- Usually the most interesting ones
1. Efficient TM enumeration
Naive enumeration has many redundant machines:
Allen Brady, 1934 — 2024
Solution: enumerate-as-you-go, choosing transitions only when the TM actually reaches them.
This is known as Tree Normal Form enumeration [Brady, 1966]
Options for the next transition: \(\{\mathtt 0, \mathtt 1\} \times \{\mathtt L, \mathtt R\} \times \{\mathtt A, \mathtt B, \mathtt C\}\)
Rinse & repeat™
TNF enumeration is unreasonably effective
- \(21^{10} \approx 1.67 \cdot 10^{13}\) possible Turing machines
- Back-of-the envelope calculation: \(\frac{21^{10}}{4! \cdot 2 \cdot 2} \approx 1.74 \cdot 10^{11}\) after symmetry
- Actually enumerated: \(181,385,789 \approx 1.81 \cdot 10^8\)
2. Deciding the undecidable: a quick tutorial
This is impossible! Let's do it anyway.
Simplest idea: there's gotta be some TMs that just enter a cycle
A bit more general: Translated Cyclers
However, the tape left behind isn't always clean
Translated Cyclers
- This decider was first used for BB(3) [Radó and Lin, 1963]
- originally known as Lin recurrence
- However, we can do better:
Consider the sequence of (current state, symbol under head) pairs the machine
visits during its execution. Two repetitions in this sequence are enough
to declare this machine a (translated) cycler.
Side conditions apply. Not a legally binding algorithm description. Consult
a mathematician before use.
Translated Cyclers can get bigger

- 10,000-step space-time diagram
- Behaviour starts after about 1000 steps
- Period is about 600 steps
They can get a LOT worse
Skelet #1, a ginormous Translated Cycler

- Discovered by Georgi Georgiev (a.k.a Skelet) in 2003
- Behavior starts after \(10^{51}\) steps
- Period of more than 8 billion steps
- Required an individual Coq proof of nonhalting
- One of the most complex 5-state TMs
More deciders
Outside of Cyclers, there are two strategies:
Example of a Bouncer, one such concrete class
- choose a specific class of machines, tailor a decider to recognize them
- generic methods that attempt to approximate the behavior in some way
(abstract interpretation)
\(n\)-gram Closed Position Set
Machines decided by \(n\)-gram CPS
So, where are we at?

3. What are the holdouts like?
Skelet #10: counts in base Fibonacci on two sides of the tape, halts if the two counters are out of sync
All natural numbers can be expressed as a sum of Fibonacci numbers
in exactly
one way, if you forbid using numbers immediately adjacent in the Fibonacci
sequence.
\[
1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144
\]
\[
17 = 1 + 3 + 13
\]
Skelet #1: very chaotic, starts cycling after \(10^{51}\) steps
Skelet #17: complex discrete math problem involving counting in Gray code [Chris Xu, 2024, arXiv:2407.02426]
Initially, formal verification was considered too ambitious and out of scope.
Original correctness strategy:
- independent, interoperable implementations of each component
- consensus-based development
- modular pipeline with very loose interlinking
- in place of the decider pipeline, the TNF enumeration used a very simple criterion:
- if it halts within 47,176,870, then it halts
- otherwise, mark it as undecided
In 2022, Nathan Fenner wrote formally verified versions of some
of the deciders in Dafny:
- Finite Automata Reduction
- Halting Segment (since obsoleted by other work)

In 2023, mei joins the project.
- always been meaning to get into formal verification (seemed interesting)
- previous experience: small stuff in Isabelle/HOL
- read Adam Chlipala's Certified Programming with Dependent Types and forgot most of it
- learned Coq through Software Foundations
- found bbchallenge through OEIS immediately afterwards

Note: due to time constraints, the rest of the talk was given using preexisting slides outside of slipshow.
See continuation on Google Slides \(\Longrightarrow\)