boot2

Playing with the bootstrap
git clone https://git.ryansepassi.com/git/boot2.git

commit f487b8ca2037417036958009646a2482cd3d172c
parent 246c54871778e0157ee59765472d994c249ca2b3
Author: Ryan Sepassi <rsepassi@gmail.com>
Date:   Tue,  5 May 2026 07:48:14 -0700

docs/OS-TODO: split landed vs open; add tcc-paths TODO

- Promote the boot4 path-embedding caveat from a one-liner buried in
  the boot3/4 landed bullet into an explicit Open TODO with the two
  candidate fixes (TCC_EMBED_BASENAME define/patch, or basename-only
  in the podman path).
- Sharpen the boot5 TODO with the two specific unblockers (cache
  parsed prelude in the kernel; tcc batch mode for multi-TU compile).
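
The batch-mode unblocker in the second bullet amounts to paying one spawn per group of translation units instead of one per TU. A minimal sketch of that grouping, assuming tcc writes one .o per input when given several sources with `-c`; the `tcc_batches` helper, the batch size of 50, and the `src/tuNNN.c` names are hypothetical and not taken from the repository:

```python
from typing import Iterator, List, Sequence


def tcc_batches(sources: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield one tcc argv per batch of sources, instead of one per TU."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(sources), batch_size):
        chunk = list(sources[start:start + batch_size])
        # One spawn compiles the whole chunk; with -c, each input is
        # assumed to produce its own .o file.
        yield ["tcc", "-c", *chunk]


# Hypothetical source list standing in for the ~500 musl TUs.
sources = [f"src/tu{i:03d}.c" for i in range(500)]
commands = list(tcc_batches(sources, batch_size=50))
print(len(commands))  # 10 spawns instead of 500
```

The per-clone fixed cost is then paid once per batch, so the batch size trades spawn count against how much work is lost if one invocation fails.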

Diffstat:
M docs/OS-TODO.md | 49 ++++++++++++++++++++++++++++++++++++++-----------
1 file changed, 38 insertions(+), 11 deletions(-)

diff --git a/docs/OS-TODO.md b/docs/OS-TODO.md
@@ -163,17 +163,44 @@ scheme1 spawning the boot2-built catm via the .scm prelude.
   tier2-gate ≈ 22 s; seed-accept (boot0/1/2) ≈ 2 s; boot3 acceptance
   ≈ 5 min wall (was multi-hour under TCG).
 
-- **Port boot5 to the seed driver — deferred.** boot5 compiles ~500
-  musl TUs, each one a `(run "tcc" …)`. Even with HVF and the
-  pool-swap fix, the per-clone fixed cost (TLB flush, ELF reload,
-  scheme1 start-up) compounds to a long wall time. `scripts/boot5.sh`
-  rejects `DRIVER=seed` with a pointer here. The natural unblockers
-  are (a) caching the parsed prelude in the kernel (avoid re-parsing
-  24 KB scheme on every spawn), or (b) a "compile many sources" tcc
-  batch mode so one clone covers many TUs. Neither is in scope of
-  OS.md.
-
-- **NULL-page hardening**: slot 0 is unmapped so a NULL deref faults to
+## Open
+
+- **Port boot5 to the seed driver.** boot5 compiles ~500 musl TUs, each
+  one a `(run "tcc" …)`. Even with HVF and the pool-swap fix, the
+  per-clone fixed cost (TLB flush, ELF reload, scheme1 start-up)
+  compounds to a long wall time. `scripts/boot5.sh` rejects
+  `DRIVER=seed` today. Two natural unblockers, either of which would
+  make boot5 tractable on its own:
+  - **Cache the parsed prelude in the kernel** so each spawn doesn't
+    re-tokenise + re-build the AST for the 24 KB prelude.scm. The
+    parser output is per-process today; lift it into kernel state
+    keyed by the prelude's content hash, hand the child a fresh
+    pointer-to-AST at execve time.
+  - **tcc batch mode**: a single `(run "tcc" "-c" src1 src2 …)` that
+    emits one .o per TU, so one clone covers many translation units.
+    Upstream tcc already accepts multiple inputs in one invocation;
+    the boot4-gen-runscm path just doesn't use it. Likely the
+    cheaper of the two and worth trying first.
+
+- **tcc-emitted source paths embed in .o files.** Boot4's intermediate
+  artifacts (`crt1.o`, `libc.a`, `libtcc1.a`) differ from the podman
+  path by exactly the length of the embedded source-filename string,
+  so the seed-vs-podman byte-identity check is narrowed to tcc3 +
+  hello (the linker drops those strings). Two ways to close the gap:
+  - **Make tcc emit relative-only paths.** A `-DTCC_EMBED_BASENAME` or
+    similar guard in tcc.flat.c so the relocation/STT_FILE entry uses
+    `basename(input)` regardless of how it was passed. Either as a
+    define applied at boot3 build time, or a small upstream-style
+    patch carried in `simple-patches/`.
+  - **Make the podman path use basenames too.** `cd /work/in &&
+    tcc -c start.S` instead of `tcc -c /work/in/start.S`. Smaller diff
+    but pushes the constraint into the boot4-gen-runscm + boot4.sh
+    podman branch rather than the compiler itself.
+  The first is the more principled fix because it makes any future
+  bootN-on-seed comparison path-agnostic; the second is the lower-
+  risk one if we want to land it without touching tcc.
+
+- **NULL-page hardening.** Slot 0 is unmapped so a NULL deref faults to
   the kernel as a user sync; the kernel currently panics rather than
   delivering a SIGSEGV-equivalent. Acceptable per OS.md (default-action
   termination is sufficient) but a minor polish opportunity.
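
The path-embedding TODO in the hunk above turns on one observation: if the compiler records the input path exactly as it was passed, two drivers that pass different spellings of the same file get objects that differ by the length of that string. A small sketch of that arithmetic and of the basename normalisation both candidate fixes aim for; `embedded_name` is a hypothetical stand-in for whatever tcc records, and the bare `start.S` on the seed side is an assumption made for illustration:

```python
import posixpath


def embedded_name(input_path: str, basename_only: bool) -> str:
    """Return the file-name string a compiler would record for input_path."""
    return posixpath.basename(input_path) if basename_only else input_path


seed_input = "start.S"              # hypothetical: seed driver passes a bare name
podman_input = "/work/in/start.S"   # path form quoted in the TODO

# As passed: the recorded strings differ in length, so the .o bytes differ.
delta = len(embedded_name(podman_input, False)) - len(embedded_name(seed_input, False))
print(delta)  # 9, the length of "/work/in/"

# Basename-only: both drivers record the same string.
print(embedded_name(podman_input, True) == embedded_name(seed_input, True))  # True
```

Normalising inside the compiler fixes every caller at once, while normalising in the podman branch leaves the compiler untouched but has to be repeated for each invocation site.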