kit

git clone https://git.ryansepassi.com/git/kit.git

Self-Build Bootstrap (current state and roadmap)

This roadmap covers the staged self-build of kit: building the compiler with itself until it reproduces its own output byte-for-byte. The mechanics and products of the build are described in ../BUILD.md; this document tracks the reproducibility goal, the current baseline, the open problems that remain, and the next steps for widening coverage. The bootstrap is the strongest end-to-end correctness oracle in the project, because it exercises the C frontend, every optimizer pass, the native backends, the object writers, the linker, and the archive tools, all on the compiler's own source.

Goal: a self-reproducing fixed point

The bootstrap builds kit three times and requires the last two stages to be identical:

The stage2 vs stage3 comparison is the fixed-point check: once the compiler reproduces itself, a further stage cannot change anything. The bootstrap drives the normal Makefile with CC/AR/LD repointed at each stage's symlinks, so there is no separate build system to maintain; the same rules run with kit as the toolchain. This depends on the reproducible-build guarantees in ../BUILD.md (deterministic ordering, no embedded timestamps or paths); any nondeterminism in codegen or object layout surfaces here as a stage2/stage3 mismatch.
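A minimal sketch of the per-object comparison behind the stage2/stage3 check, not the project's actual rule (that lives in the Makefile and ../BUILD.md). The function name diff_stage_objects and the two directory arguments are hypothetical; the only assumption is that each stage leaves its .o files under its own directory with matching relative paths.

```sh
# diff_stage_objects STAGE2_DIR STAGE3_DIR   (hypothetical helper)
# Prints every object that differs or is missing in STAGE3_DIR.
# Returns 0 only when each .o under STAGE2_DIR has a byte-identical
# counterpart under STAGE3_DIR. Objects that exist only in STAGE3_DIR
# are not detected by this one-directional walk.
diff_stage_objects() {
    a=$1; b=$2; rc=0
    list=$(mktemp) || return 2
    # Sorted so the report order is stable from run to run.
    (cd "$a" && find . -type f -name '*.o' | sort) > "$list"
    while IFS= read -r rel; do
        if [ ! -f "$b/$rel" ]; then
            echo "missing in stage3: $rel"; rc=1
        elif ! cmp -s "$a/$rel" "$b/$rel"; then
            echo "differs: $rel"; rc=1
        fi
    done < "$list"
    rm -f "$list"
    return "$rc"
}
```

Reporting the differing objects by name, instead of stopping at the first mismatch, gives triage a list of suspect translation units to start from.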

Driving targets (see ../BUILD.md):

Current baseline

Done (baseline) on aarch64-macos:

Done on aarch64-linux (ELF), run natively inside an arm64 Linux container from the macOS host (see "Bootstrapping a Linux target from a non-Linux host" below):

Done on aarch64-freebsd (ELF), run natively inside the FreeBSD aarch64 VM from the macOS host (scripts/freebsd_bootstrap.sh aarch64; see "Bootstrapping a Linux target from a non-Linux host" — the FreeBSD VM path is the same shape):

This gives four fully self-hosting configurations — aarch64-macos, aarch64-linux (musl + glibc), and aarch64-freebsd — each at both -O0 and -O1. The remaining work is breadth: the other native targets (x86-64, rv64), and guarding the property over time.

Open problems and next steps

Widen target and platform coverage

The fixed point currently holds for aarch64-macos, aarch64-linux (musl + glibc), and aarch64-freebsd. It should hold for every supported native target and object format. Until a target is green, it remains an open question whether its backend and object writer are fully deterministic and self-consistent.

Bootstrapping a Linux target from a non-Linux host

make bootstrap keys off the build host's own uname (HOST_OS + machine), so it selects the native toolchain and object format with no cross-compilation. To bootstrap aarch64-linux from the macOS dev host, run the normal three-stage build inside an arm64 Linux container, where it is an ordinary native build:

Three host-environment differences from the macOS reference, all handled by the script / mk/bootstrap.mk:

Non-bootstrap gaps surfaced by the aarch64-linux Toy run (tracked, not fixed-point blockers — the per-object diff is byte-identical):

The native-ELF Toy L lane links hosted (kit cc -lc, so the crt provides _start) rather than freestanding kit ld, because an ELF executable needs a crt entry where Mach-O drives LC_MAIN straight to main; test/toy/run.sh selects this automatically on non-Darwin hosts (KIT_TOY_L_HOSTED).
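As a rough illustration of the host-keyed selection this section relies on (make bootstrap choosing its configuration from the build host's own uname), the mapping could look like the sketch below. The function name host_config and the exact uname spellings it accepts are assumptions for illustration; the real logic is in mk/bootstrap.mk and may differ.

```sh
# host_config OS MACHINE   (hypothetical helper)
# Maps uname-style values to a configuration name such as "aarch64-macos".
# On the build host it would be called as:
#   host_config "$(uname -s)" "$(uname -m)"
host_config() {
    case $1 in
        Darwin)  os=macos ;;
        Linux)   os=linux ;;
        FreeBSD) os=freebsd ;;
        *) echo "unsupported OS: $1" >&2; return 1 ;;
    esac
    case $2 in
        # The same CPU is reported as arm64 on some hosts and aarch64 on others.
        arm64|aarch64) arch=aarch64 ;;
        x86_64|amd64)  arch=x86-64 ;;
        riscv64)       arch=rv64 ;;
        *) echo "unsupported machine: $2" >&2; return 1 ;;
    esac
    echo "$arch-$os"
}
```

Because the result depends only on the host, running the same command inside an arm64 Linux container yields aarch64-linux with no cross-compilation involved.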

These connect to the per-arch backend state tracked in ../CODEGEN.md and ../ARCH.md, and to the object/format paths in ../OBJ.md and ../LINK.md / LINKER.md. A new arch's first bootstrap is also the most thorough regression test those components get.

Guard the property over time

The fixed point is easy to break with a single nondeterministic or miscompiling change, and a regression is expensive to bisect after the fact.

Cross-bootstrap (stretch)

The current chains are native (host arch building host arch). A cross-bootstrap — host kit building a stage2 for a different target, then validating that stage2 reproduces a stage3 when run under ../EMU.md or on hardware — would prove the backends independent of the host. This is a stretch goal that depends on the emulator being able to host the full compiler.

Triage playbook for fixed-point regressions

When a stage2/stage3 mismatch (or a stage3 link failure) appears, the following approach has proven effective and should be the default starting point.

Use object reproduction, not just "does it link", as the oracle. A stage3 link failure is usually a symptom of a malformed object emitted earlier, not a linker bug. The decisive question is whether stage2, used as a compiler, reproduces the same .o that the host-built compiler produces. Compile one suspect TU with both the host kit and the stage2 kit using identical flags, then cmp the two objects. This separates malformed-object bugs from link-driver symptoms and points straight at the diverging codegen.
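The object-reproduction check can be sketched as a small shell helper. This is an illustration under assumptions, not the project's tooling: same_object is a hypothetical name, and both compilers are assumed to accept the conventional cc interface (-c SRC -o OUT), which is what a Makefile CC normally needs.

```sh
# same_object CC_A CC_B SRC [FLAGS...]   (hypothetical helper)
# Compiles SRC with both compilers using identical FLAGS, then compares the
# two objects byte-for-byte.
# Returns 0 if identical, 1 if they differ (cmp prints the first differing
# byte), 2 if either compile fails.
same_object() {
    cc_a=$1; cc_b=$2; src=$3
    shift 3
    tmp=$(mktemp -d) || return 2
    # $cc_a and $cc_b are unquoted on purpose so a command like "kit cc"
    # splits into separate words.
    if $cc_a "$@" -c "$src" -o "$tmp/a.o" &&
       $cc_b "$@" -c "$src" -o "$tmp/b.o"; then
        cmp "$tmp/a.o" "$tmp/b.o"
        rc=$?
    else
        rc=2
    fi
    rm -rf "$tmp"
    return "$rc"
}
```

A call might look like `same_object "path/to/host/kit cc" "path/to/stage2/kit cc" suspect.c -O1`, where both paths and the file name are placeholders. A result of 1 points at diverging codegen for that TU; a result of 0 moves suspicion to another TU or to the link step.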

Narrow with hybrid relinks. Relink stage2 after replacing one suspect TU (or one piece of a split TU) with a clang-built object, then use that stage2 to compile the known-differing target object. This isolates whether a failure is in the linker itself or in codegen for a specific source file.

Inspect MIR around the suspect symbol. A temporary filtered MIR dump around the target function, taken after lowering and the combine pass, is usually enough to see the divergence (e.g. a call argument that should reference an allocable register but instead references a backend scratch register).

Avoid -g while triaging -O1 codegen. Debug info changes object layout and can create or hide layout-sensitive bugs; one historical "regalloc" diagnosis was actually a -g artifact. Triage on the object built without -g first.

Root-cause classes seen at the fixed point

These bug classes were responsible for past -O1 fixed-point and stage3-link failures. They are fixed in the baseline, but they map the parts of the pipeline most likely to break the property again, so they are worth keeping in mind when a new arch or platform is brought up. See ../OPT.md for the passes.

The throughline: the fragile interactions are between the optimizer's register-level reasoning (../OPT.md) and the backend's scratch-register discipline (../ARCH.md), with the object/link layer (../OBJ.md, ../LINK.md) as where the symptom finally surfaces. New backends should expect to re-litigate these before reaching their own fixed point.