commit 83432bcb4404316ecbbbe9f0dca39391c2c3d648
parent a1e3df941d9d15283a5a552e21b87b3f70854deb
Author: Ryan Sepassi <rsepassi@gmail.com>
Date: Thu, 7 May 2026 11:06:30 -0700
Update README, add TOUR
Diffstat:
| M | README.md | | | 122 | ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--------- |
| A | docs/TOUR.md | | | 312 | +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ |
2 files changed, 420 insertions(+), 14 deletions(-)
diff --git a/README.md b/README.md
@@ -1,4 +1,12 @@
-# boot2, another bootstrap path
+# boot2 — a bootstrap chain you can read
+
+`boot2` brings a Linux/POSIX system up from a few hundred bytes of seed
+machine code to `tcc` + `musl` running on a tcc-built kernel. Every
+intermediate stage is small enough to read end-to-end. The compiler that
+builds the C compiler is in this repository. So is the kernel that runs
+it.
+
+## The chain
```
;; ── boot0.sh ── Bootstrap from seed ──────────────────────────────────
@@ -47,23 +55,90 @@
;; runtime for DRIVER=seed re-runs, closing the bootstrap loop.
```
-* P1: [docs/P1.md](docs/P1.md.html)
-* M1pp: [docs/M1PP.md](docs/M1PP.md.html)
-* hex2pp: [docs/HEX2pp.md](docs/HEX2pp.md.html)
-* P1PP: [docs/LIBP1PP.md](docs/LIBP1PP.md.html), [P1/P1.M1pp](P1/P1.M1pp.html), [P1/P1pp.P1pp](P1/P1pp.P1pp.html)
-* Scheme: [docs/SCHEME1.md](docs/SCHEME1.md.html), [scheme1/scheme1.P1pp](scheme1/scheme1.P1pp.html)
-* C: [docs/CC.md](docs/CC.md.html), [cc/cc.scm](cc/cc.scm.html)
-* OS: [docs/OS.md](docs/OS.md.html) (seed-kernel contract), [docs/MUSL.md](docs/MUSL.md.html)
+For a stage-by-stage walk-through of what is built, what becomes
+trustable, and what becomes unused at each step, see
+[docs/TOUR.md](docs/TOUR.md).
+
+## What you have to trust
+
+The trust boundary has two parts: a small set of **vendored bytes** that
+the chain starts from, and a small body of **hand-written source** that
+turns those bytes into everything else.
+
+### Vendored bytes (per architecture)
+
+Per arch, seven files from
+[live-bootstrap](https://github.com/fosslinux/live-bootstrap)'s
+stage0-posix; full provenance in [vendor/seed/README.md](vendor/seed/README.md).
+Sizes for `aarch64 / amd64 / riscv64`:
+
+| file | role | bytes (a/x/r) |
+| ----------- | ----------------------------------------- | ------------- |
+| `hex0-seed` | the only opaque ELF; assembles `hex0.hex0` | 526 / 229 / 392 |
+| `hex0.hex0` | hex assembler; source of `hex0` | 9763 / 6387 / 8065 |
+| `hex1.hex0` | hex assembler with labels | 18971 / 10784 / 27080 |
+| `hex2.hex1` | hex assembler with ELF-aware linking | 31017 / 24767 / 39860 |
+| `catm.hex2` | concatenates files | 6456 / 5468 / 6231 |
+| `M0.hex2` | macro stage above hex2 | 50189 / 43551 / 65364 |
+| `ELF.hex2` | ELF header preamble | 2981 / 2672 / 2661 |
+
+Every one of these except `hex0-seed` is a textual hex file you can
+read. `hex0-seed` itself is a few hundred bytes; it is the smallest
+opaque artifact in the trust path, and the vendored copies are the same
+bytes used by other live-bootstrap consumers.
+
+### Host envelope
+
+You also trust your runtime environment: `sh`, `podman` (or rootless
+equivalents), and the `qemu-user-static` binaries for any cross arches. With
+`DRIVER=seed`, you trust `qemu-system-<arch>` — but not your host's toolchain,
+since the seed driver runs each stage *inside* the kernel that boot6 produced.
+Of course, you can also run these directly (without podman) or on bare metal,
+which would reduce your trust base.
+
+### Hand-written source (LoC)
+
+Lines of code (comments and blanks stripped) that are loaded and
+executed during the bootstrap, not counting vendored sources flattened
+later in the chain (`tcc-0.9.26`, `musl-1.2.5`):
+
+| layer | files | LoC |
+| ------------- | -------------------------------------------- | ----- |
+| M1pp | `M1pp/M1pp.P1` | 5000 |
+| hex2pp | `hex2pp/hex2pp.P1` | 3087 |
+| P1 | `P1/{P1.M1pp, P1pp.P1pp, P1-<arch>.M1pp, …}` | 3236 |
+| catm | `catm/catm.P1pp` | 105 |
+| scheme1 | `scheme1/{scheme1.P1pp, prelude.scm}` | 4842 |
+| cc | `cc/cc.scm` | 5173 |
+| mes-libc | `vendor/mes-libc/libc.c` | 1019 |
+| seed-kernel | `seed-kernel/{kernel.c, arch/<arch>/*}` | ~1700 (incl. asm) |
+
+Every layer is small enough to read in an afternoon. The full list of
+files crossed by the chain, and the order in which to read them, is in
+[docs/TOUR.md](docs/TOUR.md).
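
The "comments and blanks stripped" counting rule can be approximated in a few lines of shell. This is a minimal sketch, not the project's own counting script: `count_loc` is a hypothetical helper, and it assumes a single whole-line comment prefix (for example `;;` or `#`), which will not match every source language in the chain.

```sh
# count_loc PREFIX FILE: count lines that are neither blank nor
# whole-line comments starting with PREFIX (after leading whitespace).
# Hypothetical helper for illustration only.
count_loc() {
    prefix=$1 file=$2
    awk -v p="$prefix" '
        { sub(/^[ \t]+/, "") }       # drop leading whitespace
        $0 == "" { next }            # skip blank lines
        index($0, p) == 1 { next }   # skip whole-line comments
        { n++ }
        END { print n + 0 }
    ' "$file"
}

# Demo on a throwaway file with one comment, one blank, one code line.
f=$(mktemp)
printf ';; a comment\n\n(define x 1)\n' > "$f"
count_loc ';;' "$f"
rm -f "$f"
```

Trailing comments on code lines still count as code here, so the numbers this produces may differ slightly from the table.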
+
+## Reading order
+
+* **5 minutes** — the chain pseudocode above, plus
+ [docs/TOUR.md §0 "Map"](docs/TOUR.md).
+* **An hour** — [docs/TOUR.md](docs/TOUR.md) end to end, then skim
+ [boot/boot0.sh](boot/boot0.sh) … [boot/boot6.sh](boot/boot6.sh).
+* **A day** — the component specs in dependency order:
+ [docs/P1.md](docs/P1.md), [docs/M1PP.md](docs/M1PP.md),
+ [docs/HEX2pp.md](docs/HEX2pp.md), [docs/LIBP1PP.md](docs/LIBP1PP.md),
+ [docs/SCHEME1.md](docs/SCHEME1.md), [docs/CC.md](docs/CC.md),
+ [docs/CCSCM.md](docs/CCSCM.md),
+ [docs/LIBC.md](docs/LIBC.md), [docs/MUSL.md](docs/MUSL.md),
+ [docs/TCC.md](docs/TCC.md), [docs/OS.md](docs/OS.md).
## Architectures × drivers
-`DRIVER={podman,seed} x {aarch64,amd64,riscv64}`
+`DRIVER={podman,seed} × ARCH={aarch64,amd64,riscv64}`
-`DRIVER` selects the runtime that executes each bootN stage:
+`DRIVER` selects the runtime that executes each `bootN` stage:
-* **podman** (default) — each stage runs in a per-arch container under
- `qemu-user`. One-time host setup: a working `podman` + the qemu-user
- static binaries for the cross arches.
+* **podman** (default) — each stage runs in a container with access only to its
+ input binaries and sources.
* **seed** — each stage runs inside the tcc-built seed-kernel under
`qemu-system-<arch>`. Closes the loop: the kernel built by
`DRIVER=podman` boot6 is the runtime for the next pass. First-time
@@ -119,4 +194,23 @@ in [tests/README.md](tests/README.md).
diffs seed-built vs podman-built artifacts for byte equivalence; see
that script's header for modes.
-If you'd like to chat, email me at hi at ryansepassi.com
+## Repository layout
+
+```
+boot/ bootN.sh stage drivers + the shared shell DSL
+bootprep/ source-tree prep: vendor flatten, run.scm gen, calibration
+catm/ catm.P1pp — concatenator, second tier
+M1pp/ M1pp.P1 — macro expander (M1pp.c is the readable C reference)
+hex2pp/ hex2pp.P1 — assembler/linker (hex2pp.c is the readable C reference)
+P1/ P1 frontend, libp1pp, per-arch backends
+scheme1/ scheme1.P1pp + prelude.scm — the Scheme interpreter and its R7RS layer
+cc/ cc.scm — the C compiler, in scheme1
+tcc/ tcc 0.9.26 patches, mem.c, host-cross asm fallback
+seed-kernel/ kernel.c + per-arch boot/MMU + user-mode tests
+docs/ component specs and the TOUR
+tests/ suite fixtures + harness
+vendor/ seed bytes, tcc/mes-libc/musl tarballs (provenance in vendor/*/README.md)
+tools/ small helper scripts (count-lines, disasm-elf, …)
+```
+
+If you'd like to chat, email me at hi at ryansepassi.com.
diff --git a/docs/TOUR.md b/docs/TOUR.md
@@ -0,0 +1,312 @@
+# Tour
+
+A walk through the bootstrap, stage by stage. Each stage takes about
+ten minutes to read. By the end, you'll have seen every binary the
+chain produces and how each one is built from the binaries before it.
+
+The READMEs and per-component specs in docs/ cover the *what* and
+*why* in depth. This tour is the *order* — the path that ties them
+together.
+
+## §0. Map
+
+The chain is a sequence of seven shell scripts under [boot/](../boot).
+Each `bootN.sh` produces one or two binaries from the binaries the
+prior stages produced, plus source from the canonical
+`build/<arch>/src/` tree (prepared once by `bootprep/prep-src.sh`).
+
+| Stage | Driver script | Produces | Trust extension |
+| ----- | ----------------------- | ----------------------------------------- | ------------------------------------------------------------- |
+| 0 | [boot0.sh](../boot/boot0.sh) | `hex2`, `catm`, `M0` | hex0-seed → hex assemblers → file concatenator → macro stage |
+| 1 | [boot1.sh](../boot/boot1.sh) | `M1pp`, `hex2pp` | first programs in the portable P1 pseudo-ISA |
+| 2 | [boot2.sh](../boot/boot2.sh) | `catm` (rebuilt), `scheme1` | seed `catm` retired; Scheme interpreter arrives |
+| 3 | [boot3.sh](../boot/boot3.sh) | `tcc0` | C arrives — `cc.scm` (in scheme1) compiles upstream tcc |
+| 4 | [boot4.sh](../boot/boot4.sh) | `tcc1`, `tcc2`, `tcc3`, `libc.a`, `libtcc1.a` | tcc self-host, byte-identical fixed point `tcc2 == tcc3`, minimal libc |
+| 5 | [boot5.sh](../boot/boot5.sh) | `libc.a`, `crt{1,i,n}.o` | musl-1.2.5 built by the self-hosted tcc |
+| 6 | [boot6.sh](../boot/boot6.sh) | `Image` (aarch64) / `kernel.elf` | a minimal kernel that can host the chain (`DRIVER=seed`) |
+
+Drivers (`DRIVER=podman` default, `DRIVER=seed` for the loop pass) only
+change *where* each stage executes; the inputs, outputs, and shell
+scripts are identical.
+
+## §1. boot0 — from a hex seed to a macro assembler
+
+**You arrive with**: nothing of ours. Just `sh`, `podman` or
+`qemu-user-static`, and the seven [vendored seed
+files](../vendor/seed/) per arch. `hex0-seed` is the only opaque
+artifact; it is a few hundred bytes (526 / 229 / 392 for
+aarch64 / amd64 / riscv64).
+
+**boot0 builds**: `hex2`, `catm`, `M0`.
+
+**How**:
+
+```
+hex0-seed hex0.hex0 → hex0 (hex assembler, no labels)
+hex0 hex1.hex0 → hex1 (hex assembler with labels)
+hex1 hex2.hex1 → hex2 (hex with ELF-aware linking)
+hex2 catm.hex2 → catm (concatenate files)
+catm ELF.hex2 + M0.hex2 → M0.combined.hex2
+hex2 M0.combined.hex2 → M0 (macro stage above hex2)
+```
+
+Each line is one `stage` call in [boot0.sh](../boot/boot0.sh). The
+script is 48 lines. Read it.
+
+**Trust extension**: from this point on you have a file concatenator
+(`catm`), a hex-with-labels assembler/linker (`hex2`), and a macro
+preprocessor (`M0`). Everything later is derived from these three.
+
+**Worth reading**: `boot0.sh` itself, then the
+[live-bootstrap stage0-posix
+documentation](https://github.com/oriansj/stage0-posix) for what each
+seed file is.
+
+## §2. boot1 — first self-hosted programs
+
+**You arrive with**: `hex2`, `catm`, `M0` from boot0.
+
+**boot1 builds**: `M1pp` and `hex2pp` — the M1 expander and
+hex2 assembler that all later P1 / P1pp source uses.
+
+**How**: a small build function, applied once each:
+
+```sh
+build_p1() { # $1 = source .P1, $2 = output binary name
+ stage catm combined.M1 P1.M1 "$1" -- P1.M1 "$1" -- combined.M1
+ stage M0 combined.M1 prog.hex2 -- combined.M1 -- prog.hex2
+ stage catm linked.hex2 ELF.hex2 prog.hex2 -- ELF.hex2 prog.hex2 -- linked.hex2
+ stage hex2 linked.hex2 "$2" -- linked.hex2 -- "$2"
+}
+```
+
+`P1.M1` is the per-arch backend that turns portable P1 instruction
+mnemonics into native machine code. `M1pp.P1` and `hex2pp.P1` are
+~5000 and ~3100 lines of P1 source. They are the first programs in
+this chain written in our own pseudo-ISA, and the first sources that
+are naturally human-readable (i.e., not hex bytes) and portable.
+
+**Trust extension**: M1pp accepts the macro flavour every later
+`.P1pp` file uses (function-like macros, struct/enum synthesis,
+compile-time integer eval, token paste, hygienic intra-macro labels).
+hex2pp adds nestable scopes, alignment, and pointer-size directives on
+top of hex2. Together they are enough to compile every later stage's
+source.
+
+**Worth reading**: `M1pp/M1pp.c` is a 2110-line C reference implementation kept
+in sync with `M1pp.P1`. Its preamble lays out the syntax in 60 lines:
+
+```
+/*
+ * Tiny single-pass M1pp macro expander. Output is consumed directly by
+ * hex2pp -- there is no intermediate M0/hex2 stage. All emission is in
+ * the byte/label/directive vocabulary hex2pp accepts.
+ *
+ * Syntax:
+ * %macro NAME(a, b)
+ * ... body ...
+ * %endm
+ * …
+```
+
+The actual bootstrap loads `M1pp.P1`; the C version is for reference.
+Same arrangement for `hex2pp/hex2pp.c` vs `hex2pp/hex2pp.P1`.
+
+The full M1pp / hex2pp specs are [docs/M1PP.md](M1PP.md) and
+[docs/HEX2pp.md](HEX2pp.md).
+
+## §3. boot2 — closing on catm, then a Scheme
+
+**You arrive with**: M1pp, hex2pp from boot1; the seed `catm` from
+boot0 (one last use).
+
+**boot2 builds**: `catm` (rebuilt from `catm.P1pp`) and `scheme1`.
+
+**How**: the universal P1pp build function appears here for the first
+time —
+
+```sh
+build_p1pp() { # $1 = catm-bin, $2 = src .P1pp, $3 = out
+ stage "$1" combined.M1pp backend.M1pp frontend.M1pp libp1pp.P1pp "$2" \
+ -- backend.M1pp frontend.M1pp libp1pp.P1pp "$2" -- combined.M1pp
+ stage M1pp combined.M1pp expanded.hex2pp -- combined.M1pp -- expanded.hex2pp
+ stage "$1" linked.hex2pp ELF.hex2 expanded.hex2pp -- ELF.hex2 expanded.hex2pp -- linked.hex2pp
+ stage hex2pp -B 0x600000 linked.hex2pp "$3" -- linked.hex2pp -- "$3"
+}
+```
+
+The four files concatenated into `combined.M1pp` are:
+`P1-<arch>.M1pp` (per-arch backend), `P1.M1pp` (portable frontend),
+`P1pp.P1pp` ("libp1pp" — standard macros and helpers), and the
+program source. M1pp expands all the macros; hex2pp turns the
+resulting bytes into an ELF.
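
The concatenation step is simple enough to imitate with `cat`. The sketch below assumes the output-first argument order that the `stage catm combined.M1 P1.M1 "$1"` calls suggest. `catm_sketch` and the file names are placeholders, not the real `catm` or the real sources.

```sh
# Hypothetical stand-in for catm, assuming the argument order
# catm OUT IN1 IN2 ... seen in the stage calls.
catm_sketch() {
    out=$1
    shift
    cat "$@" > "$out"
}

# Demo with placeholder files mirroring the four inputs, in order:
# backend, frontend, libp1pp, program source.
work=$(mktemp -d)
printf 'backend\n'  > "$work/backend.M1pp"
printf 'frontend\n' > "$work/frontend.M1pp"
printf 'libp1pp\n'  > "$work/libp1pp.P1pp"
printf 'program\n'  > "$work/prog.P1pp"
catm_sketch "$work/combined.M1pp" \
    "$work/backend.M1pp" "$work/frontend.M1pp" \
    "$work/libp1pp.P1pp" "$work/prog.P1pp"
cat "$work/combined.M1pp"
rm -rf "$work"
```

Order matters: macros must be defined before the program source that uses them, which is why the backend and frontend come first.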
+
+`catm.P1pp` is built first, using the seed `catm` one last time; the seed
+binary is then discarded. After that, `scheme1.P1pp` is built using the new `catm`.
+From here forward, stage0 is retired, and all binaries have been built from
+portable sources (no arch-specific code and no hex).
+
+**Trust extension**: `catm` is now self-built. `scheme1` is a small
+R7RS-subset Scheme — fixnums, pairs, symbols, bytevectors,
+closures, records. ~4300 LoC of P1pp; full surface in
+[docs/SCHEME1.md](SCHEME1.md). It is not a teaching toy: it runs the C
+compiler in the next stage.
+
+**Worth reading**: `catm/catm.P1pp` is 105 lines and is the most
+readable single P1pp file in the project; it demonstrates the whole
+P1pp surface in one short program. After that, read eval/apply and the
+dispatcher headers of
+[scheme1/scheme1.P1pp](../scheme1/scheme1.P1pp).
+
+## §4. boot3 — C arrives via a Scheme
+
+**You arrive with**: `M1pp`, `hex2pp` from boot1; `catm`, `scheme1`
+from boot2.
+
+**boot3 builds**: `tcc0` — a tcc-0.9.26 binary, compiled by
+`cc.scm` running inside `scheme1`, and assembled by M1pp and hex2pp.
+
+**How**: the boot3 driver hands `scheme1` a generated `run.scm` that
+loads `cc/cc.scm`, hands it the flattened tcc translation unit
+`tcc.flat.c`, captures the emitted P1pp, and runs the standard
+M1pp+hex2pp pipeline to turn that into an ELF. `tcc.flat.c` is one
+big TU produced by `bootprep/stage1-flatten.sh` from upstream
+tcc-0.9.26 plus a small set of patches (search for `our-patches/` in
+that script).
+
+`cc.scm` is the central piece. ~5200 LoC of Scheme implementing a
+streaming C compiler: lexer → preprocessor → parser → codegen →
+P1pp emission. Full code map in [docs/CCSCM.md](CCSCM.md); the
+accepted C subset in [docs/CC.md](CC.md).
+
+**Trust extension**: a C compiler that you can read end to end in an
+afternoon is now part of the chain. Its output, `tcc0`, is a real tcc
+— with all of tcc's faults plus whatever divergence cc.scm's codegen
+contributes; boot4 will iron that out.
+
+**Worth reading**: the cc.scm overview block, then the codegen
+section header (search for `;; ── Code generator ──`). The
+phase-1 milestone test
+[tests/cc/000-return-argc.c](../tests/cc/000-return-argc.c) is the
+smallest end-to-end exercise of the whole pipeline.
+
+## §5. boot4 — self-host to a fixed point
+
+**You arrive with**: `tcc0` from boot3.
+
+**boot4 builds**: `tcc1`, `tcc2`, `tcc3`, plus `crt1.o`,
+`libc.a` (mes-libc), and `libtcc1.a`.
+
+**How**: tcc compiles tcc, three more times.
+
+```
+tcc0 = tcc-source compiled by cc.scm ← boot3
+tcc1 = tcc-source compiled by tcc0 ← here
+tcc2 = tcc-source compiled by tcc1 ← here
+tcc3 = tcc-source compiled by tcc2 ← here
+```
+
+After the third bounce, the script asserts `tcc2 == tcc3` byte for
+byte. That's the fixed point: tcc compiling itself with no help from
+cc.scm reaches a stable image, and any future build that walks through
+this stage will produce the same `tcc3` from the same sources.
+
+Why four stages, not two? `cc.scm` and tcc's own codegen produce
+different — but both correct — code for the same source, so `tcc0`
+and `tcc1` differ. `tcc1` and `tcc2` are both built by tcc, but `tcc1`
+was built by `tcc0`, itself a cc.scm-shaped binary, so the code it emits
+into `tcc2` needs one more bounce to reach the tcc-shaped fixed point.
+
+**Trust extension**: a self-built tcc that demonstrates determinism
+under self-application. Plus mes-libc, a small libc that's good
+enough for tcc's own runtime, and `libtcc1.a` — tcc's helper archive
+(division/intrinsics that tcc emits calls to).
+
+**Worth reading**: the fixed-point check in
+[boot4.sh:114–123](../boot/boot4.sh) is the audit point of the entire
+chain.
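
A byte-for-byte assertion of this kind can be sketched with `cmp`. This is an illustration under assumptions, not the code in `boot4.sh`: `assert_fixed_point` is a hypothetical helper, and the demo files only stand in for `tcc2` and `tcc3`.

```sh
# Hypothetical helper: succeed only if two files are byte-identical.
# The real check in boot/boot4.sh may be written differently.
assert_fixed_point() {
    # cmp -s prints nothing; exit status 0 means identical contents.
    if cmp -s "$1" "$2"; then
        echo "fixed point: $1 == $2"
    else
        echo "fixed point FAILED: $1 != $2" >&2
        return 1
    fi
}

# Demo on throwaway files standing in for tcc2 and tcc3.
tmp=$(mktemp -d)
printf 'same bytes' > "$tmp/tcc2"
printf 'same bytes' > "$tmp/tcc3"
assert_fixed_point "$tmp/tcc2" "$tmp/tcc3"
rm -rf "$tmp"
```

A non-zero return here is what would stop a build script from going further with a compiler that has not converged.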
+
+## §6. boot5 — a real libc
+
+**You arrive with**: `tcc3` and `libtcc1.a` from boot4; `catm` and
+`scheme1` from boot2.
+
+**boot5 builds**: `libc.a` and `crt{1,i,n}.o` from upstream
+musl-1.2.5, plus a `hello` smoke binary linked statically against the
+result.
+
+**How**: musl is patched lightly during prep (`bootprep/musl-vendor.sh`
+and `bootprep/prep-src.sh`) to work around tcc's missing GCC extensions:
+register-asm-variable syscalls, `__attribute__((alias))` weak refs,
+`_Complex`, x86_64 SSE/x87 inline asm. The patch list lives in the
+musl prep script. Then a generated `run.scm` walks the per-source
+list and invokes `tcc -c` ~1300 times inside the driver, producing a
+static archive.
+
+**Trust extension**: a real libc. The chain no longer depends on
+mes-libc for anything but the seed-kernel's `hello` link path; every
+later binary you build with this `tcc3` can use musl.
+
+**Worth reading**: the musl skip / patch lists under
+[vendor/musl/](../vendor/musl/) and the calibration script
+[bootprep/boot5-calibrate.sh](../bootprep/boot5-calibrate.sh).
+Background: [docs/MUSL.md](MUSL.md), [docs/LIBC.md](LIBC.md).
+
+## §7. boot6 — a kernel that runs the chain
+
+**You arrive with**: `tcc3` from boot4 and `scheme1` from boot2.
+
+**boot6 builds**: a minimal kernel image — `Image` on aarch64, `kernel.elf`
+on amd64 / riscv64.
+
+**How**: tcc3 compiles `seed-kernel/kernel.c` plus the per-arch entry
+in `seed-kernel/arch/<arch>/{kernel.S, mmu.c, arch.h}` and `tcc/cc/mem.c`,
+linking through tcc directly — no separate linker, no objcopy. On
+aarch64, tcc's flat-binary mode produces an arm64 `Image`.
+
+The kernel is a simple OS satisfying [docs/OS.md](OS.md) Tier 1: it
+boots through an arch backend with two virtio-blk-MMIO disks, parses
+the DTB, brings up a polling virtio-blk driver, reads a cpio newc
+archive into an in-memory tmpfs, loads `/init` (a static target ELF),
+enters it through the trap-return path, and serialises the tmpfs to
+the second disk on exit. ~1300 lines of C plus per-arch entry / MMU
+in another ~250.
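
The cpio newc format mentioned above is easy to recognise: each entry header starts with the six ASCII characters `070701`. The following sketch checks only that leading magic. `is_newc` is a hypothetical helper and has nothing to do with how `kernel.c` parses the archive.

```sh
# is_newc FILE: succeed if FILE starts with the cpio "newc" magic.
# Hypothetical helper; it checks the first six bytes and nothing else.
is_newc() {
    magic=$(dd if="$1" bs=6 count=1 2>/dev/null)
    [ "$magic" = "070701" ]
}

# Demo on a placeholder file that only imitates the start of a header.
f=$(mktemp)
printf '070701rest-of-header' > "$f"
if is_newc "$f"; then
    echo "looks like newc"
fi
rm -f "$f"
```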
+
+**Trust extension**: a kernel built by the same chain it needs to
+run. With this kernel, `DRIVER=seed` lets you re-run *every* prior
+stage inside the kernel that boot6 just produced.
+
+**Worth reading**:
+[seed-kernel/kernel.c](../seed-kernel/kernel.c) — preamble plus the
+console / fs / trap dispatcher headers.
+The full OS contract is [docs/OS.md](OS.md).
+
+## §8. The loop
+
+The chain has now produced a kernel. The kernel is enough to run any
+of the earlier stages: under `DRIVER=seed`, each `bootN.sh` packs its
+inputs into a cpio and boots the kernel under `qemu-system-<arch>`. The
+kernel loads an init ELF that runs the same shell pipeline as the
+podman driver, and the resulting outputs are extracted from the
+write-disk on shutdown.
+
+```sh
+DRIVER=seed ./boot/boot.sh aarch64
+```
+
+`tests/seed-accept.sh` then diffs the seed-driver outputs against the
+podman-driver outputs byte for byte. When that diff is empty, the
+chain has run on itself and produced the same artifacts. That is the
+loop closure — the smallest interesting fact this project tries to
+demonstrate.
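
A whole-tree comparison of this sort can be sketched with `diff -rq`. This is an assumption-laden illustration, not the contents of `tests/seed-accept.sh`: `compare_trees` is a hypothetical helper, and the directory names are placeholders rather than the project's real output layout.

```sh
# Hypothetical sketch; the real tests/seed-accept.sh may work differently.
# compare_trees DIR_A DIR_B: exit 0 only if both trees hold the same
# files with the same bytes.
compare_trees() {
    # -r recurses into subdirectories; -q reports only which files differ.
    diff -rq "$1" "$2"
}

# Demo with placeholder trees standing in for the two drivers' outputs.
work=$(mktemp -d)
mkdir -p "$work/podman" "$work/seed"
printf 'artifact' > "$work/podman/tcc3"
printf 'artifact' > "$work/seed/tcc3"
if compare_trees "$work/podman" "$work/seed"; then
    echo "loop closed: outputs are byte-identical"
fi
rm -rf "$work"
```

Files present in only one tree also make `diff -rq` exit non-zero, so a missing artifact counts as a mismatch.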
+
+## What to read next
+
+* [docs/P1.md](P1.md) — the portable pseudo-ISA, the abstraction
+ every later stage rests on.
+* [docs/CCSCM.md](CCSCM.md) — code map for `cc.scm`, the C compiler.
+* [docs/OS.md](OS.md) — the seed-kernel / target-userland contract.
+* The tests under [tests/](../tests). Most suites are
+ fixture-driven; each fixture is a small program that demonstrates
+ one feature. `tests/cc-cg/`, `tests/cc/`, and `tests/cc-libc/`
+ together are a guided tour of the C surface.