commit f487b8ca2037417036958009646a2482cd3d172c
parent 246c54871778e0157ee59765472d994c249ca2b3
Author: Ryan Sepassi <rsepassi@gmail.com>
Date: Tue, 5 May 2026 07:48:14 -0700
docs/OS-TODO: split landed vs open; add tcc-paths TODO
- Promote the boot4 path-embedding caveat from a one-liner buried in
the boot3/4 landed bullet into an explicit Open TODO with the two
candidate fixes (TCC_EMBED_BASENAME define/patch, or basename-only
in the podman path).
- Sharpen the boot5 TODO with the two specific unblockers (cache
parsed prelude in the kernel; tcc batch mode for multi-TU compile).
Diffstat:
1 file changed, 38 insertions(+), 11 deletions(-)
diff --git a/docs/OS-TODO.md b/docs/OS-TODO.md
@@ -163,17 +163,44 @@ scheme1 spawning the boot2-built catm via the .scm prelude.
tier2-gate ≈ 22 s; seed-accept (boot0/1/2) ≈ 2 s; boot3 acceptance
≈ 5 min wall (was multi-hour under TCG).
-- **Port boot5 to the seed driver — deferred.** boot5 compiles ~500
- musl TUs, each one a `(run "tcc" …)`. Even with HVF and the
- pool-swap fix, the per-clone fixed cost (TLB flush, ELF reload,
- scheme1 start-up) compounds to a long wall time. `scripts/boot5.sh`
- rejects `DRIVER=seed` with a pointer here. The natural unblockers
- are (a) caching the parsed prelude in the kernel (avoid re-parsing
- 24 KB scheme on every spawn), or (b) a "compile many sources" tcc
- batch mode so one clone covers many TUs. Neither is in scope of
- OS.md.
-
-- **NULL-page hardening**: slot 0 is unmapped so a NULL deref faults to
+## Open
+
+- **Port boot5 to the seed driver.** boot5 compiles ~500 musl TUs, each
+ one a `(run "tcc" …)`. Even with HVF and the pool-swap fix, the
+ per-clone fixed cost (TLB flush, ELF reload, scheme1 start-up)
+ compounds to a long wall time. `scripts/boot5.sh` rejects
+ `DRIVER=seed` today. Two natural unblockers, either of which would
+ make boot5 tractable on its own:
+ - **Cache the parsed prelude in the kernel** so each spawn doesn't
+ re-tokenise and rebuild the AST for the 24 KB prelude.scm. The
+ parser output is per-process today; lift it into kernel state
+ keyed by the prelude's content hash, and hand the child a fresh
+ pointer-to-AST at execve time.
+ - **tcc batch mode**: a single `(run "tcc" "-c" src1 src2 …)` that
+ emits one .o per TU, so one clone covers many translation units.
+ Upstream tcc already accepts multiple inputs in one invocation;
+ the boot4-gen-runscm path just doesn't use it. Likely the
+ cheaper of the two and worth trying first.
+
+- **tcc embeds source paths in its .o files.** Boot4's intermediate
+ artifacts (`crt1.o`, `libc.a`, `libtcc1.a`) differ from the podman
+ path's by exactly the length of the embedded source-filename string,
+ so the seed-vs-podman byte-identity check is narrowed to tcc3 +
+ hello (the linker drops those strings). Two ways to close the gap:
+ - **Make tcc emit basename-only paths.** A `-DTCC_EMBED_BASENAME` or
+ similar guard in tcc.flat.c so the relocation/STT_FILE entry uses
+ `basename(input)` regardless of how it was passed. This could be
+ either a define applied at boot3 build time or a small
+ upstream-style patch carried in `simple-patches/`.
+ - **Make the podman path use basenames too.** `cd /work/in &&
+ tcc -c start.S` instead of `tcc -c /work/in/start.S`. Smaller diff
+ but pushes the constraint into the boot4-gen-runscm + boot4.sh
+ podman branch rather than the compiler itself.
+ The first is the more principled fix because it makes any future
+ bootN-on-seed comparison path-agnostic; the second is the
+ lower-risk one if we want to land it without touching tcc.
+
+- **NULL-page hardening.** Slot 0 is unmapped so a NULL deref faults to
the kernel as a user sync; the kernel currently panics rather than
delivering a SIGSEGV-equivalent. Acceptable per OS.md (default-action
termination is sufficient) but a minor polish opportunity.