boot2

Playing with the bootstrap
git clone https://git.ryansepassi.com/git/boot2.git

commit 15391ae704f949821f6f8e7f7088a106e12b18d0
parent 16e81e5d753312600d9beaac52a9155841662cf0
Author: Ryan Sepassi <rsepassi@gmail.com>
Date:   Wed,  6 May 2026 14:32:09 -0700

B (scripts): unify bootN.sh; per-stage + cumulative timings

Sweep the scripts/boot{,0..6}.sh + lib-{arch,pipeline,runscm}.sh
surface for clarity and uniformity. No behavior change to the build
chain itself.

lib-arch.sh:
  - bootlib_init exports BOOT_STAGE; driver_init now sets OUT/STAGE
    automatically (was duplicated per script).
  - Add require_src and require_file <path> [<hint>] helpers,
    replacing seven copies of `[ -d "$SRC" ] || ...` and several
    ad-hoc file checks.
  - bootlib_init records BOOT_T0 and installs an EXIT trap; on
    success every stage prints `[BOOT_TAG] done in Xs (cum Ys)`
    where cum sums all per-stage .timing files for the current
    arch/driver. Works for both `make build/.../Image` and direct
    `boot.sh`.
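
The helpers described above might look roughly like the sketch below. It is illustrative only and is not the contents of scripts/lib-arch.sh: the `require_file <path> [<hint>]` signature, `BOOT_TAG`, and the `done in Xs (cum Ys)` format come from this message, while `sum_timings`, `done_line`, and the one-integer-per-`.timing`-file layout are assumptions made for the example.

```sh
#!/bin/sh
# Illustrative sketch only; not the contents of scripts/lib-arch.sh.
BOOT_TAG=${BOOT_TAG:-demo}

# require_file <path> [<hint>]: uniform "missing file" check.
# Returns 1 here so it can be tried interactively; a real helper would
# more likely exit 1, as the inline checks it replaces did.
require_file() {
  [ -e "$1" ] && return 0
  if [ -n "${2:-}" ]; then
    echo "[$BOOT_TAG] missing $1 ($2)" >&2
  else
    echo "[$BOOT_TAG] missing $1" >&2
  fi
  return 1
}

# sum_timings <dir>: add up the whole seconds stored in <dir>/*.timing.
# (Hypothetical helper; the file layout is an assumption.)
sum_timings() {
  total=0
  for f in "$1"/*.timing; do
    [ -e "$f" ] || continue          # glob matched nothing
    total=$((total + $(cat "$f")))
  done
  echo "$total"
}

# done_line <elapsed-seconds> <timing-dir>: format the per-stage summary.
# (Hypothetical helper name.)
done_line() {
  echo "[$BOOT_TAG] done in ${1}s (cum $(sum_timings "$2")s)"
}
```

In a real stage, something like `done_line` would presumably be called from the EXIT trap that bootlib_init installs, with the elapsed time computed from BOOT_T0.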

lib-pipeline.sh, lib-runscm.sh:
  - pipeline_export / runscm_export accept varargs.
  - *_input_from_src and runscm_input_tree_from_src drop the
    leading `_kind` arg (always `src`); the one bin case in boot0
    inlines pipeline_input.
  - Add runscm_gen <gen-script> <args...>: emits run.scm to
    $STAGE/run.scm, logs size, registers it. Replaces a 4-line
    pattern in boot4/5/6.
  - Document why `stage` needs explicit -- input/output lists.
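
A varargs export helper of the kind described above can be sketched as follows. This is a minimal illustration, not the code in lib-pipeline.sh: `PIPELINE_EXPORTS` is a hypothetical registry variable, and the real helper may record exports differently.

```sh
#!/bin/sh
# Illustrative sketch only; the real bookkeeping in lib-pipeline.sh is not
# shown in this commit view. PIPELINE_EXPORTS is a hypothetical registry.
PIPELINE_EXPORTS=""

# pipeline_export <name>...: register any number of outputs in one call.
pipeline_export() {
  for name in "$@"; do
    PIPELINE_EXPORTS="${PIPELINE_EXPORTS:+$PIPELINE_EXPORTS }$name"
  done
}

pipeline_export hex2 catm M0      # previously three separate calls
```

Looping over `"$@"` keeps single-argument callers working unchanged, so existing call sites can be migrated one at a time.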

boot.sh:
  - set -e -> set -eu; rename helper stage() -> time_stage(); then
    drop time_stage entirely since per-child timing now comes from
    each child's lib trap.
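
For context on the `set -e` -> `set -eu` change: with `-u`, expanding an unset parameter aborts the script, so optional arguments need a default expansion. A small sketch follows; `arch_or_empty` is a hypothetical name used only for this example.

```sh
#!/bin/sh
set -eu

# With -u, a bare "$1" aborts when no argument was passed; "${1:-}" yields
# an empty string instead, which is the form the boot scripts already use.
arch_or_empty() {
  echo "${1:-}"
}

arch_or_empty aarch64    # prints aarch64
```

This is why `case "${1:-}" in` and `bootlib_init bootN "${1:-}"` remain safe under `set -eu`.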

boot0..boot6:
  - Standardized header docstrings; trimmed boot4's tcc1!=tcc2 essay.
  - boot1: build_p1 helper; boot2: build_p1pp helper. The duplicated
    4-stage chains are now one call per output.

Verified by `sh -n`, shellcheck, and end-to-end smoke runs of
boot0/1/2 on aarch64/podman with timing output.

Diffstat:
M scripts/boot.sh | 33 +++++++++++++--------------------
M scripts/boot0.sh | 23 +++++++----------------
M scripts/boot1.sh | 52 +++++++++++++++++++++-------------------------------
M scripts/boot2.sh | 60 +++++++++++++++++++++++++-----------------------------------
M scripts/boot3.sh | 78 +++++++++++++++++++++++++++++++++---------------------------------------------
M scripts/boot4.sh | 117 +++++++++++++++++++++++++++++---------------------------------------------------
M scripts/boot5.sh | 55 ++++++++++++++++++++-----------------------------------
M scripts/boot6.sh | 70 +++++++++++++++++++++++++++++-----------------------------------------
M scripts/lib-arch.sh | 93 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--------
M scripts/lib-pipeline.sh | 32 +++++++++++++++++++++-----------
M scripts/lib-runscm.sh | 51 ++++++++++++++++++++++++++++++++++-----------------
11 files changed, 329 insertions(+), 335 deletions(-)

diff --git a/scripts/boot.sh b/scripts/boot.sh @@ -14,7 +14,7 @@ ## DRIVER=seed ./scripts/boot.sh <arch> # re-run on tcc-built kernel ## Subsequent DRIVER=seed runs reuse the Image directly — no stashing. -set -e +set -eu case "${1:-}" in -h|--help) @@ -63,35 +63,28 @@ fi # (the seed driver consumes build/$ARCH/podman/boot6/$KERNEL_NAME). rm -rf build/$ARCH/$DRIVER -T0=$(date +%s) -trap 'echo "[$BOOT_TAG] elapsed at exit: $(($(date +%s) - T0))s"' EXIT - -stage() { - name=$1; shift - s=$(date +%s) - "$@" - e=$(date +%s) - echo "[$BOOT_TAG] $name: $((e - s))s (cum $((e - T0))s)" -} +# Per-stage timing comes from each child's own bootlib EXIT trap +# (`[bootN/$DRIVER/$ARCH] done in Xs (cum Ys)`); this orchestrator only +# adds its own total at the end (also via the lib trap). # A0a: build the canonical generated source tree at build/$ARCH/src/. # Boot stages read source from there exclusively (no flatten/unpack/ # patch inside boot{N}.sh). -stage prep-src ./scripts/prep-src.sh $ARCH +./scripts/prep-src.sh $ARCH -stage boot0 ./scripts/boot0.sh $ARCH -stage boot1 ./scripts/boot1.sh $ARCH -stage boot2 ./scripts/boot2.sh $ARCH -stage boot3 ./scripts/boot3.sh $ARCH -stage boot4 ./scripts/boot4.sh $ARCH +./scripts/boot0.sh $ARCH +./scripts/boot1.sh $ARCH +./scripts/boot2.sh $ARCH +./scripts/boot3.sh $ARCH +./scripts/boot4.sh $ARCH # A0b: apply the per-arch musl skip filter (needs tcc3 from boot4 if # the calibration list is missing; the committed list is the common # case and runs without compiler). -stage prep-musl ./scripts/prep-musl.sh $ARCH +./scripts/prep-musl.sh $ARCH -stage boot5 ./scripts/boot5.sh $ARCH +./scripts/boot5.sh $ARCH # boot6 builds the seed-kernel ELF/Image with boot4's tcc3 (no `ld -T`, # no objcopy). -stage boot6 ./scripts/boot6.sh $ARCH +./scripts/boot6.sh $ARCH diff --git a/scripts/boot0.sh b/scripts/boot0.sh @@ -1,15 +1,14 @@ #!/bin/sh -## boot0.sh — standalone seed bootstrap. 
+## boot0.sh — seed bootstrap: hex0-seed → hex0 → hex1 → hex2 → catm → M0. ## ## Stage 0 of the README's chain. From the ~400-byte vendored hex0-seed, -## brings up: hex0 -> hex1 -> hex2 -> catm -> M0. Three of those (hex2, -## catm, M0) are the binaries every later stage depends on. +## brings up the three binaries every later stage depends on (hex2, catm, +## M0). ## ## ─── Inputs (sources, from canonical tree) ─────────────────────────── ## build/$ARCH/src/bin/hex0-seed ## build/$ARCH/src/src/vendor-seed/{hex0.hex0, hex1.hex0, hex2.hex1, ## catm.hex2, M0.hex2, ELF.hex2} -## (populated by scripts/prep-src.sh from vendor/seed/$ARCH/.) ## ## ─── Outputs ────────────────────────────────────────────────────────── ## build/$ARCH/$DRIVER/boot0/{hex2, catm, M0} @@ -22,20 +21,15 @@ set -eu . scripts/lib-arch.sh bootlib_init boot0 "${1:-}" driver_init scratch - -SRC=build/$ARCH/src -OUT=build/$ARCH/$DRIVER/boot0 -STAGE=build/$ARCH/$DRIVER/.boot0-stage - -[ -d "$SRC" ] || { echo "[$BOOT_TAG] missing $SRC — run scripts/prep-src.sh $ARCH" >&2; exit 1; } +require_src . 
scripts/lib-pipeline.sh pipeline_init "$STAGE" "$OUT" "$DRIVER" # ─── inputs (from canonical src tree) ───────────────────────────────── -pipeline_input_from_src bin hex0-seed +pipeline_input hex0-seed "build/$ARCH/src/bin/hex0-seed" for f in hex0.hex0 hex1.hex0 hex2.hex1 catm.hex2 M0.hex2 ELF.hex2; do - pipeline_input_from_src src "vendor-seed/$f" + pipeline_input_from_src "vendor-seed/$f" done # ─── pipeline ───────────────────────────────────────────────────────── @@ -48,10 +42,7 @@ stage hex2 catm.hex2 catm -- catm.hex2 stage catm M0.combined.hex2 ELF.hex2 M0.hex2 -- ELF.hex2 M0.hex2 -- M0.combined.hex2 stage hex2 M0.combined.hex2 M0 -- M0.combined.hex2 -- M0 -pipeline_export hex2 -pipeline_export catm -pipeline_export M0 - +pipeline_export hex2 catm M0 pipeline_run echo "[$BOOT_TAG] OK -> $OUT/{hex2, catm, M0}" diff --git a/scripts/boot1.sh b/scripts/boot1.sh @@ -1,9 +1,9 @@ #!/bin/sh -## boot1.sh — standalone build of M1pp + hex2pp. +## boot1.sh — build the self-hosted M1pp + hex2pp pair. ## -## Stage 1 of the README's chain: produces the self-hosted M1pp + -## hex2pp pair, built from their .P1 sources via the seed M0 + hex2 -## chain. catm is rebuilt from catm.P1pp in boot2. +## Stage 1 of the README's chain: produces M1pp and hex2pp from their +## .P1 sources via the seed M0 + hex2 chain. catm is rebuilt later in +## boot2 from catm.P1pp. ## ## ─── Inputs (sources, from canonical tree) ─────────────────────────── ## build/$ARCH/src/src/M1pp/M1pp.P1 @@ -25,14 +25,10 @@ set -eu . scripts/lib-arch.sh bootlib_init boot1 "${1:-}" driver_init scratch +require_src BOOT0=build/$ARCH/$DRIVER/boot0 -SRC=build/$ARCH/src -OUT=build/$ARCH/$DRIVER/boot1 -STAGE=build/$ARCH/$DRIVER/.boot1-stage - require_prev "$BOOT0" hex2 M0 catm -[ -d "$SRC" ] || { echo "[$BOOT_TAG] missing $SRC — run scripts/prep-src.sh $ARCH" >&2; exit 1; } . 
scripts/lib-pipeline.sh pipeline_init "$STAGE" "$OUT" "$DRIVER" @@ -41,35 +37,29 @@ pipeline_init "$STAGE" "$OUT" "$DRIVER" pipeline_input hex2 "$BOOT0/hex2" pipeline_input M0 "$BOOT0/M0" pipeline_input catm "$BOOT0/catm" -pipeline_input_from_src src "P1/P1-$ARCH.M1" P1.M1 -pipeline_input_from_src src vendor-seed/ELF.hex2 -pipeline_input_from_src src M1pp/M1pp.P1 -pipeline_input_from_src src hex2pp/hex2pp.P1 +pipeline_input_from_src "P1/P1-$ARCH.M1" P1.M1 +pipeline_input_from_src vendor-seed/ELF.hex2 +pipeline_input_from_src M1pp/M1pp.P1 +pipeline_input_from_src hex2pp/hex2pp.P1 # ─── pipeline ───────────────────────────────────────────────────────── -echo "[$BOOT_TAG] M1pp.P1 + hex2pp.P1 -> M1pp + hex2pp" - -# .P1 -> ELF via M0 + hex2: -# catm P1.M1 + src -> combined.M1 +# .P1 -> ELF, applied to each of M1pp.P1 and hex2pp.P1: +# catm P1.M1 + <src> -> combined.M1 # M0 combined.M1 -> prog.hex2 # catm ELF.hex2 + prog.hex2 -> linked.hex2 # hex2 linked.hex2 -> ELF binary +build_p1() { # $1 = source .P1, $2 = output binary name + stage catm combined.M1 P1.M1 "$1" -- P1.M1 "$1" -- combined.M1 + stage M0 combined.M1 prog.hex2 -- combined.M1 -- prog.hex2 + stage catm linked.hex2 ELF.hex2 prog.hex2 -- ELF.hex2 prog.hex2 -- linked.hex2 + stage hex2 linked.hex2 "$2" -- linked.hex2 -- "$2" +} -# M1pp.P1 -> M1pp -stage catm combined.M1 P1.M1 M1pp.P1 -- P1.M1 M1pp.P1 -- combined.M1 -stage M0 combined.M1 prog.hex2 -- combined.M1 -- prog.hex2 -stage catm linked.hex2 ELF.hex2 prog.hex2 -- ELF.hex2 prog.hex2 -- linked.hex2 -stage hex2 linked.hex2 M1pp -- linked.hex2 -- M1pp - -# hex2pp.P1 -> hex2pp -stage catm combined.M1 P1.M1 hex2pp.P1 -- P1.M1 hex2pp.P1 -- combined.M1 -stage M0 combined.M1 prog.hex2 -- combined.M1 -- prog.hex2 -stage catm linked.hex2 ELF.hex2 prog.hex2 -- ELF.hex2 prog.hex2 -- linked.hex2 -stage hex2 linked.hex2 hex2pp -- linked.hex2 -- hex2pp - -pipeline_export M1pp -pipeline_export hex2pp +echo "[$BOOT_TAG] M1pp.P1 + hex2pp.P1 -> M1pp + hex2pp" +build_p1 
M1pp.P1 M1pp +build_p1 hex2pp.P1 hex2pp +pipeline_export M1pp hex2pp pipeline_run echo "[$BOOT_TAG] OK -> $OUT/{M1pp, hex2pp}" diff --git a/scripts/boot2.sh b/scripts/boot2.sh @@ -1,10 +1,10 @@ #!/bin/sh ## boot2.sh — rebuild catm via M1pp+hex2pp, then build scheme1. ## -## Stage 2 of the README's chain: first rebuilds catm from catm.P1pp -## via the freshly-built M1pp + hex2pp pipeline (replacing the seed -## boot0 catm so later stages run with zero boot0 dependencies); then -## builds the scheme1 interpreter from scheme1.P1pp using the new catm. +## Stage 2 of the README's chain. First rebuilds catm from catm.P1pp via +## the freshly-built M1pp+hex2pp pipeline (replacing the seed boot0 catm +## so later stages have zero boot0 dependencies); then builds the +## scheme1 interpreter from scheme1.P1pp using the new catm. ## ## ─── Inputs (sources, from canonical tree) ─────────────────────────── ## build/$ARCH/src/src/catm/catm.P1pp @@ -27,16 +27,12 @@ set -eu . scripts/lib-arch.sh bootlib_init boot2 "${1:-}" driver_init scratch +require_src BOOT0=build/$ARCH/$DRIVER/boot0 BOOT1=build/$ARCH/$DRIVER/boot1 -SRC=build/$ARCH/src -OUT=build/$ARCH/$DRIVER/boot2 -STAGE=build/$ARCH/$DRIVER/.boot2-stage - require_prev "$BOOT0" catm require_prev "$BOOT1" M1pp hex2pp -[ -d "$SRC" ] || { echo "[$BOOT_TAG] missing $SRC — run scripts/prep-src.sh $ARCH" >&2; exit 1; } . 
scripts/lib-pipeline.sh pipeline_init "$STAGE" "$OUT" "$DRIVER" @@ -45,39 +41,33 @@ pipeline_init "$STAGE" "$OUT" "$DRIVER" pipeline_input catm0 "$BOOT0/catm" # bootstrap; replaced by output 'catm' pipeline_input M1pp "$BOOT1/M1pp" pipeline_input hex2pp "$BOOT1/hex2pp" -pipeline_input_from_src src "P1/P1-$ARCH.M1pp" backend.M1pp -pipeline_input_from_src src P1/P1.M1pp frontend.M1pp -pipeline_input_from_src src P1/P1pp.P1pp libp1pp.P1pp -pipeline_input_from_src src vendor-seed/ELF.hex2 -pipeline_input_from_src src catm/catm.P1pp -pipeline_input_from_src src scheme1/scheme1.P1pp +pipeline_input_from_src "P1/P1-$ARCH.M1pp" backend.M1pp +pipeline_input_from_src P1/P1.M1pp frontend.M1pp +pipeline_input_from_src P1/P1pp.P1pp libp1pp.P1pp +pipeline_input_from_src vendor-seed/ELF.hex2 +pipeline_input_from_src catm/catm.P1pp +pipeline_input_from_src scheme1/scheme1.P1pp # ─── pipeline ───────────────────────────────────────────────────────── -echo "[$BOOT_TAG] catm.P1pp -> catm; scheme1.P1pp -> scheme1" - -# .P1pp -> ELF: -# catm backend + frontend + libp1pp + src -> combined.M1pp +# .P1pp -> ELF, applied to each of catm.P1pp and scheme1.P1pp: +# catm backend + frontend + libp1pp + <src> -> combined.M1pp # M1pp combined.M1pp -> expanded.hex2pp # catm ELF.hex2 + expanded.hex2pp -> linked.hex2pp # hex2pp -B 0x600000 linked.hex2pp -> ELF binary +build_p1pp() { # $1 = catm-bin name (catm0 or catm), $2 = src .P1pp, $3 = out + _catm=$1; _src=$2; _out=$3 + stage "$_catm" combined.M1pp backend.M1pp frontend.M1pp libp1pp.P1pp "$_src" \ + -- backend.M1pp frontend.M1pp libp1pp.P1pp "$_src" -- combined.M1pp + stage M1pp combined.M1pp expanded.hex2pp -- combined.M1pp -- expanded.hex2pp + stage "$_catm" linked.hex2pp ELF.hex2 expanded.hex2pp -- ELF.hex2 expanded.hex2pp -- linked.hex2pp + stage hex2pp -B 0x600000 linked.hex2pp "$_out" -- linked.hex2pp -- "$_out" +} -# catm.P1pp -> catm (bootstrap with boot0 catm) -stage catm0 combined.M1pp backend.M1pp frontend.M1pp libp1pp.P1pp catm.P1pp 
\ - -- backend.M1pp frontend.M1pp libp1pp.P1pp catm.P1pp -- combined.M1pp -stage M1pp combined.M1pp expanded.hex2pp -- combined.M1pp -- expanded.hex2pp -stage catm0 linked.hex2pp ELF.hex2 expanded.hex2pp -- ELF.hex2 expanded.hex2pp -- linked.hex2pp -stage hex2pp -B 0x600000 linked.hex2pp catm -- linked.hex2pp -- catm - -# scheme1.P1pp -> scheme1 (uses just-built catm) -stage catm combined.M1pp backend.M1pp frontend.M1pp libp1pp.P1pp scheme1.P1pp \ - -- backend.M1pp frontend.M1pp libp1pp.P1pp scheme1.P1pp -- combined.M1pp -stage M1pp combined.M1pp expanded.hex2pp -- combined.M1pp -- expanded.hex2pp -stage catm linked.hex2pp ELF.hex2 expanded.hex2pp -- ELF.hex2 expanded.hex2pp -- linked.hex2pp -stage hex2pp -B 0x600000 linked.hex2pp scheme1 -- linked.hex2pp -- scheme1 - -pipeline_export catm -pipeline_export scheme1 +echo "[$BOOT_TAG] catm.P1pp -> catm; scheme1.P1pp -> scheme1" +build_p1pp catm0 catm.P1pp catm # bootstrap with boot0 catm +build_p1pp catm scheme1.P1pp scheme1 # uses just-built catm +pipeline_export catm scheme1 pipeline_run echo "[$BOOT_TAG] OK -> $OUT/{catm, scheme1}" diff --git a/scripts/boot3.sh b/scripts/boot3.sh @@ -1,11 +1,10 @@ #!/bin/sh -## boot3.sh — bootstrap tcc0 from cc.scm (Stage A of the four-stage tcc -## chain; the tcc0 → tcc1 → tcc2 → tcc3 rebuild lives in boot4.sh). +## boot3.sh — bootstrap tcc0 from cc.scm. ## -## README's `(define tcc (tcc1 tcc.c))`: cc.scm compiles tcc.flat.c into -## tcc0. boot3 stops there. boot4 picks up tcc0 and self-hosts the rest -## of the chain (tcc0 → tcc1 → tcc2 → tcc3, with tcc2 == tcc3 as the -## fixed-point check). +## Stage A of the four-stage tcc chain: cc.scm compiles tcc.flat.c into +## tcc0. boot4 picks up tcc0 and self-hosts the rest of the chain +## (tcc0 → tcc1 → tcc2 → tcc3, with tcc2 == tcc3 as the fixed-point +## check). 
## ## tcc0 = tcc-source compiled by cc.scm ← produced here ## tcc1 = tcc-source compiled by tcc0 ← boot4 @@ -13,36 +12,30 @@ ## tcc3 = tcc-source compiled by tcc2 ← boot4 ## ## ─── Inputs (sources, from canonical tree) ─────────────────────────── -## build/$ARCH/src/src/scheme1/{prelude.scm} scheme bundle -## build/$ARCH/src/src/cc/{cc.scm,main.scm} scheme bundle -## build/$ARCH/src/src/P1/{P1-$ARCH.M1pp,P1.M1pp,P1pp.P1pp} M1pp pipeline -## build/$ARCH/src/src/P1/{entry-libc.P1pp,elf-end.P1pp} link framing -## build/$ARCH/src/src/vendor-seed/ELF.hex2 ELF header -## build/$ARCH/src/src/tcc/tcc.flat.c flattened tcc TU -## build/$ARCH/src/src/libc/libc.flat.c flattened mes-libc TU -## (populated up-front by scripts/prep-src.sh; this stage does -## no flatten/unpack/patch.) +## build/$ARCH/src/src/scheme1/prelude.scm scheme bundle +## build/$ARCH/src/src/cc/{cc.scm, main.scm} scheme bundle +## build/$ARCH/src/src/P1/{P1-$ARCH.M1pp, P1.M1pp, P1pp.P1pp} M1pp pipeline +## build/$ARCH/src/src/P1/{entry-libc.P1pp, elf-end.P1pp} link framing +## build/$ARCH/src/src/vendor-seed/ELF.hex2 ELF header +## build/$ARCH/src/src/tcc/tcc.flat.c flattened tcc TU +## build/$ARCH/src/src/libc/libc.flat.c flattened mes-libc TU ## ## ─── Inputs (binaries from prior stages) ────────────────────────────── -## build/$ARCH/$DRIVER/boot1/{M1pp, hex2pp} — built by scripts/boot1.sh -## build/$ARCH/$DRIVER/boot2/{catm, scheme1} — built by scripts/boot2.sh +## build/$ARCH/$DRIVER/boot1/{M1pp, hex2pp} +## build/$ARCH/$DRIVER/boot2/{catm, scheme1} ## ## ─── Tools ──────────────────────────────────────────────────────────── -## In container: scratch + busybox (no libc, no /etc, no resolver). -## scheme1 evaluates scripts/boot3-run.scm against the -## flat staging root (cwd=/work for podman, cwd=/ for -## the seed kernel). Same run.scm drives both. -## On host: none — Stage A is pure scheme1 + M1pp + hex2pp; no -## asm step is required. 
+## scheme1 evaluates scripts/boot3-run.scm against the flat staging +## root. Same run.scm drives both DRIVER=podman (cwd=/work) and +## DRIVER=seed (cwd=/). Stage A is pure scheme1 + M1pp + hex2pp; no +## asm step. ## ## ─── Outputs ────────────────────────────────────────────────────────── -## build/$ARCH/$DRIVER/boot3/tcc0 — cc.scm-built bootstrap tcc, -## consumed by scripts/boot4.sh +## build/$ARCH/$DRIVER/boot3/tcc0 — cc.scm-built bootstrap tcc ## build/$ARCH/$DRIVER/boot3/libc.P1pp — cc.scm-built mes-libc (lib mode); ## consumed by the cc-libc test suite ## build/$ARCH/$DRIVER/boot3/tcc.flat.P1pp — cc.scm-built tcc TU (lib mode); -## retained as a debug/inspection -## artifact +## debug/inspection artifact ## ## Usage: scripts/boot3.sh <arch> ## <arch> ∈ {aarch64, amd64, riscv64} for either DRIVER (default podman). @@ -52,17 +45,14 @@ set -eu . scripts/lib-arch.sh bootlib_init boot3 "${1:-}" driver_init empty +require_src BOOT1=build/$ARCH/$DRIVER/boot1 BOOT2=build/$ARCH/$DRIVER/boot2 SRC=build/$ARCH/src -OUT=build/$ARCH/$DRIVER/boot3 -STAGE=build/$ARCH/$DRIVER/.boot3-stage -# ── prerequisites ───────────────────────────────────────────────────── require_prev "$BOOT1" M1pp hex2pp require_prev "$BOOT2" catm scheme1 -[ -d "$SRC" ] || { echo "[$BOOT_TAG] missing $SRC — run scripts/prep-src.sh $ARCH" >&2; exit 1; } # ── stage inputs and run scheme1 + boot3-run.scm under $DRIVER ──────── . scripts/lib-runscm.sh @@ -77,23 +67,21 @@ runscm_input hex2pp "$BOOT1/hex2pp" # scheme1 binary itself is staged by runscm_run (so a `(run "scheme1" …)` # inside boot3-run.scm finds it at cwd-relative ./scheme1). 
-runscm_input_from_src src scheme1/prelude.scm -runscm_input_from_src src cc/cc.scm -runscm_input_from_src src cc/main.scm +runscm_input_from_src scheme1/prelude.scm +runscm_input_from_src cc/cc.scm +runscm_input_from_src cc/main.scm -runscm_input_from_src src "P1/P1-$ARCH.M1pp" backend.M1pp -runscm_input_from_src src P1/P1.M1pp frontend.M1pp -runscm_input_from_src src P1/P1pp.P1pp libp1pp.P1pp -runscm_input_from_src src P1/entry-libc.P1pp -runscm_input_from_src src P1/elf-end.P1pp -runscm_input_from_src src vendor-seed/ELF.hex2 +runscm_input_from_src "P1/P1-$ARCH.M1pp" backend.M1pp +runscm_input_from_src P1/P1.M1pp frontend.M1pp +runscm_input_from_src P1/P1pp.P1pp libp1pp.P1pp +runscm_input_from_src P1/entry-libc.P1pp +runscm_input_from_src P1/elf-end.P1pp +runscm_input_from_src vendor-seed/ELF.hex2 -runscm_input_from_src src tcc/tcc.flat.c -runscm_input_from_src src libc/libc.flat.c +runscm_input_from_src tcc/tcc.flat.c +runscm_input_from_src libc/libc.flat.c -runscm_export tcc0 -runscm_export libc.P1pp -runscm_export tcc.flat.P1pp +runscm_export tcc0 libc.P1pp tcc.flat.P1pp runscm_run "${BOOT3_TIMEOUT:-1800}" echo "[$BOOT_TAG] sizes: tcc0=$(wc -c <"$OUT/tcc0") libc.P1pp=$(wc -c <"$OUT/libc.P1pp")" diff --git a/scripts/boot4.sh b/scripts/boot4.sh @@ -5,14 +5,8 @@ ## the four-stage chain: tcc0 → tcc1 → tcc2 → tcc3. The bootstrap ## fixed-point check is `tcc2 == tcc3`: once tcc is compiling itself ## with no help from cc.scm, the chain reaches a byte-identical fixed -## point. tcc0 ≠ tcc1 in *behavior* (not just in code size) because -## cc.scm's emitted machine code introduces subtle codegen-decision -## differences — e.g. on riscv64 cc.scm misses several immediate-folding -## peepholes that tcc applies, so tcc0(tcc.flat.c) emits ~200 more bytes -## of `.text` than tcc1(tcc.flat.c) does. 
tcc1 is faithful tcc behavior -## (its source is tcc.flat.c, run through the cc.scm-built tcc0 -## translator semantically intact); tcc2 is the first binary whose -## machine code was emitted by faithful tcc. +## point. (See docs/PLAN.md for the cc.scm vs tcc codegen-divergence +## reasoning behind needing four stages rather than two.) ## ## tcc0 = tcc-source compiled by cc.scm ← boot3 ## tcc1 = tcc-source compiled by tcc0 ← produced here @@ -20,53 +14,45 @@ ## tcc3 = tcc-source compiled by tcc2 ← produced here ## ## ─── Inputs (sources, from canonical tree) ─────────────────────────── -## build/$ARCH/src/src/tcc-libc/$ARCH/{start.S,sys_stubs.S} -## — _start, sys_* syscall wrappers -## build/$ARCH/src/src/tcc-cc/mem.c memcpy/memmove/memset/memcmp -## build/$ARCH/src/src/tcc/tcc-0.9.26-1147-gee75a10c/lib/libtcc1.c -## (amd64: generic compiler helper runtime) -## build/$ARCH/src/src/tcc/tcc-0.9.26-1147-gee75a10c/lib/lib-arm64.c -## (aarch64 + riscv64: TFmode soft-float) -## build/$ARCH/src/src/tcc/tcc-0.9.26-1147-gee75a10c/lib/va_list.c -## (amd64: __va_start / __va_arg) -## build/$ARCH/src/src/tcc/tcc-0.9.26-1147-gee75a10c/lib/alloca86_64*.S -## (amd64: alloca helpers) -## build/$ARCH/src/src/tcc/tcc.flat.c flattened tcc TU -## build/$ARCH/src/src/libc/libc.flat.c flattened mes-libc TU -## build/$ARCH/src/src/test-fixtures/boot-hello.c smoke binary -## (populated up-front by scripts/prep-src.sh; this stage does -## no flatten/unpack/patch.) 
+## build/$ARCH/src/src/tcc-libc/$ARCH/{start.S, sys_stubs.S} +## build/$ARCH/src/src/tcc-cc/mem.c +## build/$ARCH/src/src/tcc/tcc-0.9.26-1147-gee75a10c/lib/<arch-specific> +## build/$ARCH/src/src/tcc/tcc.flat.c +## build/$ARCH/src/src/libc/libc.flat.c +## build/$ARCH/src/src/test-fixtures/boot-hello.c ## ## ─── Inputs (binaries from prior stages) ────────────────────────────── -## build/$ARCH/$DRIVER/boot3/tcc0 — built by scripts/boot3.sh +## build/$ARCH/$DRIVER/boot3/tcc0 +## build/$ARCH/$DRIVER/boot2/{catm, scheme1} ## ## ─── Tools ──────────────────────────────────────────────────────────── -## In container: scratch + busybox (no libc, no /etc, no resolver). -## scheme1 evaluates a host-generated run.scm (emitted -## by scripts/boot4-gen-runscm.sh) against the flat -## staging root (cwd=/work for podman, cwd=/ for the -## seed kernel). Same run.scm drives both. -## On host: none — every arch has CONFIG_TCC_ASM and assembles -## .S inputs (start.S, sys_stubs.S) directly inside the -## container in stages B/D/E. The aarch64 assembler is -## the phase-1 arm64-asm.c that flatten patches into -## tcc-0.9.26 (see docs/TCC-ARM64-ASM.md). +## scheme1 evaluates a host-generated run.scm (from boot4-gen-runscm.sh) +## against the flat staging root. Every arch has CONFIG_TCC_ASM and +## assembles .S inputs (start.S, sys_stubs.S) directly inside the +## container; no host asm step. The aarch64 assembler is the phase-1 +## arm64-asm.c that flatten patches into tcc-0.9.26 (see +## docs/TCC-ARM64-ASM.md). ## ## ─── Outputs ────────────────────────────────────────────────────────── -## build/$ARCH/$DRIVER/boot4/tcc1 — tcc0-built tcc (stage-2 in tests) -## build/$ARCH/$DRIVER/boot4/tcc2 — tcc1-built tcc (stage-3 in tests) -## build/$ARCH/$DRIVER/boot4/tcc3 — final fixed-point self-host tcc +## build/$ARCH/$DRIVER/boot4/{tcc1, tcc2, tcc3} +## tcc2 and tcc3 are byte-identical (asserted +## below) — that equality is the fixed-point. 
## build/$ARCH/$DRIVER/boot4/crt1.o -## — tcc2-built startup object, kept outside -## libc.a because it must lead link lines +## tcc2-built startup object, kept outside +## libc.a because it must lead link lines. ## build/$ARCH/$DRIVER/boot4/libc.a -## — tcc2-built archive of sys_stubs.o + mem.o -## + libc.o +## tcc2-built archive of sys_stubs.o + mem.o +## + libc.o ## build/$ARCH/$DRIVER/boot4/libtcc1.a -## — tcc2-built tcc compiler helper archive +## tcc2-built tcc compiler helper archive ## build/$ARCH/$DRIVER/boot4/hello — mes-libc-linked smoke binary -## tcc2 and tcc3 are byte-identical (asserted at the end of this -## script) — that equality is the fixed-point check. +## +## ─── Env knobs ──────────────────────────────────────────────────────── +## TCC_BOOTSTRAP_RELAX_FIXEDPOINT=1 +## After a codegen-altering tcc patch, the two-stage rule needs a +## third bounce to converge. Set this to accept tcc3 even when +## tcc2 != tcc3; the next boot4 run, started from this run's +## tcc3, will reach tcc2 == tcc3 with no extra knob. ## ## Usage: scripts/boot4.sh <arch> ## <arch> ∈ {aarch64, amd64, riscv64} for either DRIVER (default podman). @@ -76,6 +62,7 @@ set -eu . 
scripts/lib-arch.sh bootlib_init boot4 "${1:-}" driver_init empty +require_src case "$ARCH" in aarch64) LIBTCC1_C_SRCS="lib-arm64.c"; LIBTCC1_ASM_SRCS="" ;; @@ -86,8 +73,6 @@ esac BOOT2=build/$ARCH/$DRIVER/boot2 BOOT3=build/$ARCH/$DRIVER/boot3 SRC=build/$ARCH/src -OUT=build/$ARCH/$DRIVER/boot4 -STAGE=build/$ARCH/$DRIVER/.boot4-stage TCC_PKG=tcc-0.9.26-1147-gee75a10c TCC_LIB_REL=tcc/$TCC_PKG/lib @@ -95,54 +80,36 @@ TCC_LIB_REL=tcc/$TCC_PKG/lib # ── prerequisites ───────────────────────────────────────────────────── require_prev "$BOOT3" tcc0 require_prev "$BOOT2" catm scheme1 -[ -d "$SRC" ] || { echo "[$BOOT_TAG] missing $SRC — run scripts/prep-src.sh $ARCH" >&2; exit 1; } for f in $LIBTCC1_C_SRCS $LIBTCC1_ASM_SRCS; do - [ -e "$SRC/src/$TCC_LIB_REL/$f" ] || { echo "[$BOOT_TAG] missing $SRC/src/$TCC_LIB_REL/$f" >&2; exit 1; } + require_file "$SRC/src/$TCC_LIB_REL/$f" done # ── stage inputs and run scheme1 + boot4 run.scm under $DRIVER ──────── . scripts/lib-runscm.sh runscm_init "$STAGE" "$OUT" - -RUNSCM=$STAGE/run.scm -scripts/boot4-gen-runscm.sh "$ARCH" "$RUNSCM" -echo "[$BOOT_TAG] generated run.scm: $(wc -l <"$RUNSCM") lines" +runscm_gen scripts/boot4-gen-runscm.sh "$ARCH" runscm_scheme1 "$BOOT2/scheme1" runscm_prelude "$SRC/src/scheme1/prelude.scm" -runscm_runscm "$RUNSCM" runscm_input tcc0 "$BOOT3/tcc0" runscm_input catm "$BOOT2/catm" -runscm_input_from_src src "tcc-libc/$ARCH/start.S" -runscm_input_from_src src "tcc-libc/$ARCH/sys_stubs.S" -runscm_input_from_src src tcc-cc/mem.c +runscm_input_from_src "tcc-libc/$ARCH/start.S" +runscm_input_from_src "tcc-libc/$ARCH/sys_stubs.S" +runscm_input_from_src tcc-cc/mem.c for f in $LIBTCC1_C_SRCS $LIBTCC1_ASM_SRCS; do - runscm_input_from_src src "$TCC_LIB_REL/$f" + runscm_input_from_src "$TCC_LIB_REL/$f" done -runscm_input_from_src src tcc/tcc.flat.c -runscm_input_from_src src libc/libc.flat.c -runscm_input_from_src src test-fixtures/boot-hello.c hello.c +runscm_input_from_src tcc/tcc.flat.c +runscm_input_from_src 
libc/libc.flat.c +runscm_input_from_src test-fixtures/boot-hello.c hello.c -runscm_export tcc1 -runscm_export tcc2 -runscm_export tcc3 -runscm_export s3-crt1.o -runscm_export s3-libc.a -runscm_export s3-libtcc1.a -runscm_export hello +runscm_export tcc1 tcc2 tcc3 s3-crt1.o s3-libc.a s3-libtcc1.a hello runscm_run "${BOOT4_TIMEOUT:-5400}" # ── fixed-point check (host-side) ───────────────────────────────────── -# After a codegen-altering tcc patch, tcc1 (built by tcc0 = pre-fix) and -# tcc2 (built by tcc1 = post-fix codegen but tcc0-emitted binary) won't -# agree byte-for-byte with tcc3 (built by tcc2 = post-fix throughout): -# the two-stage rule needs a third bounce to converge. Set -# TCC_BOOTSTRAP_RELAX_FIXEDPOINT=1 to skip the cmp; the very next boot4 -# run, started from this run's tcc3, will produce tcc2 == tcc3 with no -# extra knob. if ! cmp -s "$OUT/tcc2" "$OUT/tcc3"; then s2=$(wc -c <"$OUT/tcc2") s3=$(wc -c <"$OUT/tcc3") diff --git a/scripts/boot5.sh b/scripts/boot5.sh @@ -8,26 +8,19 @@ ## attribute(alias) weak refs, _Complex, x86_64 SSE/x87 inline asm). ## ## ─── Inputs ────────────────────────────────────────────────────────── -## build/$ARCH/$DRIVER/boot4/tcc3 -## — boot4's verified self-host tcc -## build/$ARCH/$DRIVER/boot4/libtcc1.a -## — boot4's tcc runtime archive -## build/$ARCH/src/src/musl/ — canonical musl tree (overrides merged, -## deletes applied, alltypes.h/syscall.h -## generated, per-arch skip filter -## applied). Built by prep-src.sh + -## prep-musl.sh. +## build/$ARCH/$DRIVER/boot4/tcc3 — boot4's verified self-host tcc +## build/$ARCH/$DRIVER/boot4/libtcc1.a — boot4's tcc runtime archive +## build/$ARCH/$DRIVER/boot2/{catm, scheme1} +## build/$ARCH/src/src/musl/ — canonical musl tree (overrides +## merged, deletes applied, +## alltypes.h/syscall.h generated, +## per-arch skip filter applied) ## build/$ARCH/src/src/tcc/stdarg-bridge.h -## — per-arch __builtin_va_list bridge. 
## build/$ARCH/src/src/test-fixtures/boot-hello.c -## — smoke binary linked at the end. ## ## ─── Tools ──────────────────────────────────────────────────────────── -## In container: scratch + busybox (no libc, no /etc, no resolver). -## scheme1 evaluates a host-generated run.scm (emitted -## by scripts/boot5-gen-runscm.sh) against the flat -## staging root (cwd=/work for podman, cwd=/ for the -## seed kernel). Same run.scm drives both. +## scheme1 evaluates a host-generated run.scm (from boot5-gen-runscm.sh) +## against the flat staging root. ## ## ─── Outputs ───────────────────────────────────────────────────────── ## build/$ARCH/$DRIVER/boot5/libc.a @@ -35,28 +28,27 @@ ## build/$ARCH/$DRIVER/boot5/hello — static, runs in the container ## ## Usage: scripts/boot5.sh <arch> -## <arch> ∈ {amd64, aarch64, riscv64} for either DRIVER (default podman). +## <arch> ∈ {aarch64, amd64, riscv64} for either DRIVER (default podman). set -eu . scripts/lib-arch.sh bootlib_init boot5 "${1:-}" driver_init empty +require_src BOOT2=build/$ARCH/$DRIVER/boot2 BOOT4=build/$ARCH/$DRIVER/boot4 SRC=build/$ARCH/src -OUT=build/$ARCH/$DRIVER/boot5 -STAGE=build/$ARCH/$DRIVER/.boot5-stage MUSL_DIR=$SRC/src/musl # ── prerequisites ───────────────────────────────────────────────────── require_prev "$BOOT4" tcc3 require_prev "$BOOT2" catm scheme1 -[ -e "$BOOT4/libtcc1.a" ] || { echo "[$BOOT_TAG] missing $BOOT4/libtcc1.a (run scripts/boot4.sh $ARCH)" >&2; exit 1; } -[ -d "$MUSL_DIR" ] || { echo "[$BOOT_TAG] missing $MUSL_DIR — run scripts/prep-src.sh $ARCH and scripts/prep-musl.sh $ARCH" >&2; exit 1; } -[ -e "$MUSL_DIR/skip.txt" ] || { echo "[$BOOT_TAG] missing $MUSL_DIR/skip.txt — run scripts/prep-musl.sh $ARCH" >&2; exit 1; } -[ -e "$SRC/src/tcc/stdarg-bridge.h" ] || { echo "[$BOOT_TAG] missing $SRC/src/tcc/stdarg-bridge.h — run scripts/prep-src.sh $ARCH" >&2; exit 1; } +require_file "$BOOT4/libtcc1.a" "run scripts/boot4.sh $ARCH" +require_file "$MUSL_DIR" "run scripts/prep-src.sh $ARCH and 
scripts/prep-musl.sh $ARCH" +require_file "$MUSL_DIR/skip.txt" "run scripts/prep-musl.sh $ARCH" +require_file "$SRC/src/tcc/stdarg-bridge.h" "run scripts/prep-src.sh $ARCH" # ── prepare staging dirs ────────────────────────────────────────────── # $STAGE/in/ — read-only inputs (becomes /work/in or in/ in tmpfs) @@ -151,31 +143,24 @@ mkdir -p "$COBJ/crt" while read -r d; do mkdir -p "$COBJ/$d"; done < "$STAGE/_host/build-objdirs.txt" # ── generate run.scm and stage chain binaries ───────────────────────── -RUNSCM=$STAGE/run.scm -scripts/boot5-gen-runscm.sh "$MUSL_ARCH" "$STAGE/_host" "$RUNSCM" -echo "[$BOOT_TAG] generated run.scm: $(wc -l <"$RUNSCM") lines, $(wc -c <"$RUNSCM") bytes" +runscm_gen scripts/boot5-gen-runscm.sh "$MUSL_ARCH" "$STAGE/_host" runscm_scheme1 "$BOOT2/scheme1" runscm_prelude "$SRC/src/scheme1/prelude.scm" -runscm_runscm "$RUNSCM" # Chain binaries staged at flat in/ root (cwd-relative names in run.scm). runscm_input tcc "$BOOT4/tcc3" runscm_input libtcc1.a "$BOOT4/libtcc1.a" runscm_input catm "$BOOT2/catm" -runscm_input_from_src src tcc/stdarg-bridge.h tcc-stdarg-bridge.h -runscm_input_from_src src test-fixtures/boot-hello.c hello.c +runscm_input_from_src tcc/stdarg-bridge.h tcc-stdarg-bridge.h +runscm_input_from_src test-fixtures/boot-hello.c hello.c # Stage the canonical musl tree under in/musl/. Both drivers pick it # up automatically (podman bind-mounts $STAGE/in; seed packs # `find in -type f` into the cpio). -runscm_input_tree_from_src musl src musl +runscm_input_tree_from_src musl musl -runscm_export libc.a -runscm_export crt1.o -runscm_export crti.o -runscm_export crtn.o -runscm_export hello +runscm_export libc.a crt1.o crti.o crtn.o hello # boot5 has ~1300 spawns + heavy tcc work; bump qemu memory + timeout for # the seed driver. Podman ignores QEMU_MEM and uses host memory directly. diff --git a/scripts/boot6.sh b/scripts/boot6.sh @@ -1,81 +1,69 @@ #!/bin/sh -## boot6.sh — build the seed-kernel ELF with boot4's tcc3. 
+## boot6.sh — build the seed-kernel ELF/Image with boot4's tcc3.
 ##
 ## Drives tcc3 to compile + link the seed kernel directly: no `ld -T
-## kernel.lds`, no objcopy. aarch64 emits the flat Image QEMU expects;
+## kernel.lds`, no objcopy. aarch64 emits the flat Image QEMU expects;
 ## amd64/riscv64 emit the ELF consumed by QEMU's -kernel path.
 ##
-## ─── Inputs ──────────────────────────────────────────────────────────
-## build/$ARCH/$DRIVER/boot4/tcc3 — boot4's verified self-host tcc
-## (compiler + linker)
-## build/$ARCH/$DRIVER/boot2/scheme1 — driver runtime
-## build/$ARCH/src/src/kernel/arch/$ARCH/{kernel.S,mmu.c,arch.h}
-## — per-arch boot stub, MMU setup, header
+## ─── Inputs (sources, from canonical tree) ───────────────────────────
+## build/$ARCH/src/src/kernel/arch/$ARCH/{kernel.S, mmu.c, arch.h}
 ## build/$ARCH/src/src/kernel/kernel.c
-## — DTB parse, MMU bring-up, syscalls,
-## virtio-blk, tmpfs, ELF loader
-## build/$ARCH/src/src/tcc-cc/mem.c memcpy/memset/memmove/memcmp
+## build/$ARCH/src/src/tcc-cc/mem.c
+##
+## ─── Inputs (binaries from prior stages) ──────────────────────────────
+## build/$ARCH/$DRIVER/boot4/tcc3
+## build/$ARCH/$DRIVER/boot2/scheme1
 ##
 ## ─── Tools ────────────────────────────────────────────────────────────
-## In container: scratch + busybox (boot2-empty:$ARCH).
-## scheme1 evaluates run.scm against the flat staging
-## root (cwd=/work for podman, cwd=/ for the seed
-## kernel). Same run.scm drives both.
+## scheme1 evaluates a host-generated run.scm (from boot6-gen-runscm.sh)
+## against the flat staging root.
 ##
 ## ─── Outputs ─────────────────────────────────────────────────────────
-## build/$ARCH/$DRIVER/boot6/Image — flat arm64 boot Image, byte-format
-## identical in shape to the gcc
-## Makefile's `objcopy -O binary`
-## output. QEMU's `-kernel` detects
-## `ARM\x64` magic at file offset 0x38
-## and follows the arm64 boot
-## protocol, putting DTB phys in x0
-## before jumping to _start.
+## build/$ARCH/$DRIVER/boot6/$KERNEL_NAME
+## aarch64: Image — flat boot Image, byte-format identical to the gcc
+## Makefile's `objcopy -O binary` output. QEMU's `-kernel`
+## detects `ARM\x64` magic at file offset 0x38 and follows
+## the arm64 boot protocol, putting DTB phys in x0 before
+## jumping to _start.
+## amd64/riscv64: kernel.elf — ELF consumed via QEMU's -kernel path.
 ##
 ## Usage: scripts/boot6.sh <arch>
-## <arch> ∈ {amd64,aarch64,riscv64} for either DRIVER (default podman).
+## <arch> ∈ {aarch64, amd64, riscv64} for either DRIVER (default podman).
 set -eu
 . scripts/lib-arch.sh
 bootlib_init boot6 "${1:-}"
 driver_init empty
+require_src
-OUT_FILE=$KERNEL_NAME
 BOOT2=build/$ARCH/$DRIVER/boot2
 BOOT4=build/$ARCH/$DRIVER/boot4
 SRC=build/$ARCH/src
-OUT=build/$ARCH/$DRIVER/boot6
-STAGE=build/$ARCH/$DRIVER/.boot6-stage
 # ── prerequisites ─────────────────────────────────────────────────────
 require_prev "$BOOT4" tcc3
 require_prev "$BOOT2" scheme1
-[ -d "$SRC" ] || { echo "[$BOOT_TAG] missing $SRC — run scripts/prep-src.sh $ARCH" >&2; exit 1; }
 for f in kernel/arch/$ARCH/kernel.S kernel/arch/$ARCH/mmu.c kernel/arch/$ARCH/arch.h kernel/kernel.c tcc-cc/mem.c; do
-  [ -f "$SRC/src/$f" ] || { echo "[$BOOT_TAG] missing $SRC/src/$f" >&2; exit 1; }
+  require_file "$SRC/src/$f"
 done
 # ── stage inputs and run scheme1 + run.scm under $DRIVER ──────────────
 . scripts/lib-runscm.sh
 runscm_init "$STAGE" "$OUT"
-
-RUNSCM=$STAGE/run.scm
-scripts/boot6-gen-runscm.sh "$ARCH" "$RUNSCM"
-echo "[$BOOT_TAG] generated run.scm: $(wc -l <"$RUNSCM") lines"
+runscm_gen scripts/boot6-gen-runscm.sh "$ARCH"
 runscm_scheme1 "$BOOT2/scheme1"
 runscm_prelude "$SRC/src/scheme1/prelude.scm"
-runscm_runscm "$RUNSCM"
 runscm_input tcc3 "$BOOT4/tcc3"
-runscm_input_from_src src "kernel/arch/$ARCH/kernel.S"
-runscm_input_from_src src kernel/kernel.c
-runscm_input_from_src src "kernel/arch/$ARCH/arch.h"
-runscm_input_from_src src "kernel/arch/$ARCH/mmu.c"
-runscm_input_from_src src tcc-cc/mem.c
+runscm_input_from_src "kernel/arch/$ARCH/kernel.S"
+runscm_input_from_src kernel/kernel.c
+runscm_input_from_src "kernel/arch/$ARCH/arch.h"
+runscm_input_from_src "kernel/arch/$ARCH/mmu.c"
+runscm_input_from_src tcc-cc/mem.c
-runscm_export "$OUT_FILE"
+runscm_export "$KERNEL_NAME"
 runscm_run "${BOOT6_TIMEOUT:-1200}"
-echo "[$BOOT_TAG] OK -> $OUT/$OUT_FILE ($(wc -c <"$OUT/$OUT_FILE") bytes)"
+echo "[$BOOT_TAG] OK -> $OUT/$KERNEL_NAME ($(wc -c <"$OUT/$KERNEL_NAME") bytes)"
diff --git a/scripts/lib-arch.sh b/scripts/lib-arch.sh
@@ -5,13 +5,16 @@
 #
 # bootlib_init <stage> <arch> # validate <arch>, cd to repo root,
 # # set ARCH/PLATFORM/KERNEL_NAME/
-# # MUSL_ARCH/DRIVER/BOOT_TAG.
-# driver_init [<image-kind>] # podman: build IMAGE if missing
-# # (image-kind ∈ scratch|empty;
-# # default scratch).
-# # seed: verify boot6 kernel exists.
-# require_prev <dir> <name>... # die helpfully if any <dir>/<name>
-# # is missing or non-executable.
+# # MUSL_ARCH/DRIVER/BOOT_STAGE/BOOT_TAG.
+# driver_init [<image-kind>] # set OUT/STAGE; podman: build IMAGE
+# # if missing (image-kind ∈ scratch|
+# # empty; default scratch); seed:
+# # verify boot6 kernel exists.
+# require_src # die if build/$ARCH/src/ missing.
+# require_prev <dir> <name>... # die if any <dir>/<name> is
+# # missing or non-executable.
+# require_file <path> [<hint>] # die if <path> missing; print a
+# # uniform diagnostic with hint.
 #
 # After bootlib_init, the following shell vars are set/exported:
 # ARCH input architecture token (aarch64|amd64|riscv64)
@@ -21,8 +24,11 @@
 # KERNEL_NAME Image (aarch64) | kernel.elf (amd64,riscv64)
 # MUSL_ARCH aarch64 | x86_64 | riscv64
 # BOOT_TAG "<stage>/<driver>/<arch>" for log prefixes
+# BOOT_STAGE stage name as passed in (boot0|boot1|...)
 #
-# After driver_init:
+# After driver_init (boot stages only — prep-* skip it):
+# OUT build/$ARCH/$DRIVER/$BOOT_STAGE (stage output dir)
+# STAGE build/$ARCH/$DRIVER/.$BOOT_STAGE-stage (scratch staging dir)
 # podman: IMAGE
 # seed: KERNEL_IMAGE, EXTRACT, SEED_ARCH
@@ -41,13 +47,56 @@ bootlib_init() {
     podman|seed) ;;
     *) echo "[$_stage/$DRIVER/$ARCH] unknown DRIVER=$DRIVER (expected podman|seed)" >&2; exit 2 ;;
   esac
+  BOOT_STAGE=$_stage
   BOOT_TAG="$_stage/$DRIVER/$ARCH"
+  BOOT_T0=$(date +%s)
   case "$ARCH" in
     aarch64) PLATFORM=linux/arm64; KERNEL_NAME=Image; MUSL_ARCH=aarch64 ;;
     amd64) PLATFORM=linux/amd64; KERNEL_NAME=kernel.elf; MUSL_ARCH=x86_64 ;;
     riscv64) PLATFORM=linux/riscv64; KERNEL_NAME=kernel.elf; MUSL_ARCH=riscv64 ;;
   esac
-  export ARCH ROOT DRIVER PLATFORM KERNEL_NAME MUSL_ARCH BOOT_TAG
+  export ARCH ROOT DRIVER PLATFORM KERNEL_NAME MUSL_ARCH BOOT_TAG BOOT_STAGE BOOT_T0
+  trap _bootlib_finish EXIT
+}
+
+# _bootlib_finish — EXIT trap installed by bootlib_init. Prints
+# `[$BOOT_TAG] done in Xs (cum Ys)` (or `failed after Xs` on error).
+# On success, records the elapsed time so later stages can sum the
+# chain. Cumulative = sum of all per-stage .timing files relevant to
+# the current $ARCH/$DRIVER.
+_bootlib_finish() {
+  _exit=$?
+  [ -n "${BOOT_T0:-}" ] || return 0
+  _elapsed=$(( $(date +%s) - BOOT_T0 ))
+  if [ "$_exit" != 0 ]; then
+    echo "[$BOOT_TAG] failed after ${_elapsed}s (exit=$_exit)" >&2
+    return 0
+  fi
+  # Record this stage's time. Boot stages have OUT (set by driver_init);
+  # the orchestrator (BOOT_STAGE=boot) doesn't write — its time would
+  # double-count. Other stages without OUT (prep-src, prep-musl) write
+  # to a per-arch sidecar dir.
+  if [ "$BOOT_STAGE" != boot ]; then
+    if [ -n "${OUT:-}" ] && [ -d "$OUT" ]; then
+      echo "$_elapsed" > "$OUT/.timing"
+    elif [ -d "build/$ARCH" ]; then
+      mkdir -p "build/$ARCH/.timings"
+      echo "$_elapsed" > "build/$ARCH/.timings/$BOOT_STAGE"
+    fi
+  fi
+  # Cumulative: sum boot-stage timings for this driver + driver-
+  # independent prep timings. Glob may not match — guard each path.
+  _cum=0
+  for _f in \
+    "build/$ARCH/$DRIVER"/*/.timing \
+    "build/$ARCH/.timings"/*
+  do
+    [ -f "$_f" ] || continue
+    _v=$(cat "$_f" 2>/dev/null) || continue
+    case "$_v" in *[!0-9]*|'') continue ;; esac
+    _cum=$((_cum + _v))
+  done
+  echo "[$BOOT_TAG] done in ${_elapsed}s (cum ${_cum}s)"
 }
 driver_init() {
@@ -56,6 +105,9 @@ driver_init() {
     scratch|empty) ;;
     *) echo "[$BOOT_TAG] driver_init: image-kind must be scratch|empty (got $_image_kind)" >&2; exit 2 ;;
   esac
+  OUT=build/$ARCH/$DRIVER/$BOOT_STAGE
+  STAGE=build/$ARCH/$DRIVER/.$BOOT_STAGE-stage
+  export OUT STAGE
   case "$DRIVER" in
     podman)
       IMAGE=boot2-$_image_kind:$ARCH
@@ -100,3 +152,26 @@ require_prev() {
     }
   done
 }
+
+# require_src — assert build/$ARCH/src/ exists (the canonical generated
+# source tree built by scripts/prep-src.sh). Every bootN.sh needs it.
+require_src() {
+  [ -d "build/$ARCH/src" ] || {
+    echo "[$BOOT_TAG] missing build/$ARCH/src — run scripts/prep-src.sh $ARCH" >&2
+    exit 1
+  }
+}
+
+# require_file <path> [<hint>] — assert <path> exists; print a uniform
+# "[$BOOT_TAG] missing <path> — <hint>" diagnostic on failure.
+require_file() {
+  _path=$1; _hint=${2:-}
+  [ -e "$_path" ] || {
+    if [ -n "$_hint" ]; then
+      echo "[$BOOT_TAG] missing $_path — $_hint" >&2
+    else
+      echo "[$BOOT_TAG] missing $_path" >&2
+    fi
+    exit 1
+  }
+}
diff --git a/scripts/lib-pipeline.sh b/scripts/lib-pipeline.sh
@@ -19,9 +19,10 @@
 # DSL (source as `. scripts/lib-pipeline.sh`):
 #
 # pipeline_init <staging-dir> <out-dir> <driver>
-# pipeline_input <name> <host-path> # may be called repeatedly
-# stage <bin> <argv1> <argv2>... -- <input-names...> -- <output-names...>
-# pipeline_export <name> # may be called repeatedly
+# pipeline_input <name> <host-path> # repeatable
+# pipeline_input_from_src <subpath> [<name>] # from build/$ARCH/src/src/
+# stage <bin> <argv...> -- <inputs...> -- <outputs...>
+# pipeline_export <name>... # one or more
 # pipeline_run
 #
 # `stage` semantics: invoke `<bin>` with argv=[<bin>, <argv1>, ...]; the
@@ -82,15 +83,16 @@ pipeline_input() {
   P_INPUT_NAMES="$P_INPUT_NAMES $name"
 }
-# pipeline_input_from_src — pull a file from the canonical generated
-# source tree at build/$ARCH/src/{bin,src}/<subpath>. Stages under
-# in/<name> where <name> defaults to basename(subpath); pass an
-# override as the optional third argument when the staged name must
-# differ (e.g. P1.M1 vs P1-aarch64.M1).
+# pipeline_input_from_src <subpath> [<name>]
+# Pull a file from the canonical generated source tree at
+# build/$ARCH/src/src/<subpath>. Stages it under in/<name>;
+# <name> defaults to basename(subpath). For the rare `bin/` case
+# (the seed hex0-seed binary), call pipeline_input directly with
+# build/$ARCH/src/bin/<file>.
 pipeline_input_from_src() {
-  _kind=$1; _subpath=$2; _name=${3:-}
+  _subpath=$1; _name=${2:-}
   [ -n "$_name" ] || _name=$(basename "$_subpath")
-  pipeline_input "$_name" "build/$ARCH/src/$_kind/$_subpath"
+  pipeline_input "$_name" "build/$ARCH/src/src/$_subpath"
 }
 # Look up a token: if it names an input, prefix `in/`; if it names a
@@ -112,6 +114,14 @@ _p_bin_path() {
   echo "$b"
 }
+# stage <bin> <argv...> -- <inputs...> -- <outputs...>
+#
+# The explicit input/output lists look redundant — most names already
+# appear in <argv...> — but they are not. argv positions are tool-
+# specific: a token like `M0.combined.hex2` is an output of one stage
+# (catm produces it) and an input of the next (hex2 reads it). The
+# framework cannot tell which from the token alone, so each stage
+# declares both lists. Don't try to "simplify" by inferring from argv.
 stage() {
   bin=$1; shift
   P_HEAD_RAW=""; P_IN=""; P_OUT=""; _s=head
@@ -258,7 +268,7 @@ in/$inp"
 }
 pipeline_export() {
-  P_EXPORTS="$P_EXPORTS $1"
+  for _n in "$@"; do P_EXPORTS="$P_EXPORTS $_n"; done
 }
 pipeline_run() {
diff --git a/scripts/lib-runscm.sh b/scripts/lib-runscm.sh
@@ -18,13 +18,15 @@
 # DSL (source as `. scripts/lib-runscm.sh`):
 #
 # runscm_init <staging-dir> <out-dir>
-# runscm_scheme1 <path> # init=scheme1 (boot2)
-# runscm_prelude <path> # scheme1/prelude.scm
-# runscm_runscm <path> # the host-generated driver
-# runscm_input <name> <host-path> # repeatable; staged at in/<name>
-# runscm_input_tree <prefix> <src-root> # repeatable; tree under in/<prefix>
-# runscm_export <name> # repeatable; out file
-# runscm_run [timeout-s] # default 600s
+# runscm_scheme1 <path> # init=scheme1 (boot2)
+# runscm_prelude <path> # scheme1/prelude.scm
+# runscm_runscm <path> # static driver script
+# runscm_gen <gen-script> <args...> # OR generate run.scm,
+# # log size, register it.
+# runscm_input <name> <host-path> # repeatable; staged at in/<name>
+# runscm_input_tree <prefix> <src-root> # repeatable; tree under in/<prefix>
+# runscm_export <name>... # one or more output names
+# runscm_run [timeout-s] # default 600s
 #
 # Required env per driver:
 # podman: IMAGE, PLATFORM
@@ -50,6 +52,18 @@ runscm_scheme1() { S_SCHEME1=$1; }
 runscm_prelude() { S_PRELUDE=$1; }
 runscm_runscm() { S_RUNSCM=$1; }
+# runscm_gen <gen-script> <args...>
+# Run a host-side generator that emits run.scm to $S_STAGE_DIR/run.scm,
+# log its size, and register it as the driver script. Used by
+# boot4/5/6 which build their run.scm dynamically.
+runscm_gen() {
+  _gen=$1; shift
+  _runscm=$S_STAGE_DIR/run.scm
+  "$_gen" "$@" "$_runscm"
+  echo "[$BOOT_TAG] generated run.scm: $(wc -l <"$_runscm") lines, $(wc -c <"$_runscm") bytes"
+  S_RUNSCM=$_runscm
+}
+
 runscm_input() {
   name=$1; src=$2
   case "$name" in
@@ -70,24 +84,27 @@ runscm_input_tree() {
   done
 }
-# runscm_input_from_src — pull a file from the canonical generated
-# source tree at build/$ARCH/src/{bin,src}/<subpath>. Stages under
-# in/<name> where <name> defaults to basename(subpath).
+# runscm_input_from_src <subpath> [<name>]
+# Pull a file from the canonical generated source tree at
+# build/$ARCH/src/src/<subpath>. Stages it under in/<name>;
+# <name> defaults to basename(subpath). For the rare `bin/` case,
+# call runscm_input directly with build/$ARCH/src/bin/<file>.
 runscm_input_from_src() {
-  _kind=$1; _subpath=$2; _name=${3:-}
+  _subpath=$1; _name=${2:-}
   [ -n "$_name" ] || _name=$(basename "$_subpath")
-  runscm_input "$_name" "build/$ARCH/src/$_kind/$_subpath"
+  runscm_input "$_name" "build/$ARCH/src/src/$_subpath"
 }
-# runscm_input_tree_from_src — same as runscm_input_tree, but the
-# source root is build/$ARCH/src/{bin,src}/<subpath>.
+# runscm_input_tree_from_src <prefix> <subpath>
+# Same as runscm_input_tree, but the source root is
+# build/$ARCH/src/src/<subpath>.
 runscm_input_tree_from_src() {
-  _prefix=$1; _kind=$2; _subpath=$3
-  runscm_input_tree "$_prefix" "build/$ARCH/src/$_kind/$_subpath"
+  _prefix=$1; _subpath=$2
+  runscm_input_tree "$_prefix" "build/$ARCH/src/src/$_subpath"
 }
 runscm_export() {
-  S_EXPORTS="$S_EXPORTS $1"
+  for _n in "$@"; do S_EXPORTS="$S_EXPORTS $_n"; done
 }
 runscm_run() {