boot2

Playing with the bootstrap
git clone https://git.ryansepassi.com/git/boot2.git

commit fb0a4e33c70a5965ed86b9c838e2b83df3827e93
parent 6c7c1475733065ae0d98ff556718651667d1891d
Author: Ryan Sepassi <rsepassi@gmail.com>
Date:   Wed,  6 May 2026 12:26:32 -0700

A0: bootN.sh read sources from canonical build/<arch>/src/ tree

Wires the canonical generated source tree (built by prep-src.sh +
prep-musl.sh in the prior commit) into the boot pipeline:

- lib-pipeline.sh + lib-runscm.sh: new *_input_from_src and
  runscm_input_tree_from_src helpers that pull from
  build/$ARCH/src/{bin,src}/...
- boot.sh: invokes prep-src before boot0 and prep-musl before boot5.
- boot{0..6}.sh: every source input switches to the canonical-tree
  helper. Inputs are now declarative.
- boot3 + boot4: drop the auto-invoke stage1-flatten/libc-flatten
  blocks; the canonical tree is the single source of flattened TUs.
- boot5: reads the canonical musl tree directly. The in-stage tarball
  unpack + overrides + deletes + alltypes/syscall header generation
  + skip-list subtraction are gone (prep-src + prep-musl own all of
  that); only the per-arch override-vs-base enumeration stays.
- boot5-gen-runscm.sh: emits in/musl/<rel> + out/obj/musl/<rel>
  instead of in/tmp/musl-1.2.5/... + out/obj/musl-1.2.5/...
- prep-src.sh: extends the canonical tree with src/tcc-libc/$ARCH/
  (start.S, sys_stubs.S), src/tcc-cc/ (mem.c), and src/test-fixtures/
  (boot-hello.c) so boot4/5/6 read every source from one place.
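
The helper shape implied by the call sites in this commit (for example
`pipeline_input_from_src bin hex0-seed` and
`pipeline_input_from_src src "P1/P1-$ARCH.M1" P1.M1`) can be sketched
roughly as below. This is a minimal sketch, not the code in
scripts/lib-pipeline.sh: the argument order is read off the call sites,
while the default-to-basename rule, the error message, and the stub
`pipeline_input` are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical sketch of a "from canonical tree" input helper.
# Assumption: $ARCH is already set by the caller, as in bootN.sh.

# Stand-in for the real pipeline_input (staged name, host path); this
# stub only prints what would be staged.
pipeline_input() {
    printf 'stage %s <- %s\n' "$1" "$2"
}

# pipeline_input_from_src KIND REL [NAME]
#   KIND  subtree of build/$ARCH/src/ ("bin" or "src")
#   REL   path relative to that subtree
#   NAME  staged name; assumed here to default to the basename of REL
pipeline_input_from_src() {
    _kind=$1
    _rel=$2
    _name=${3:-${_rel##*/}}
    _path=build/$ARCH/src/$_kind/$_rel
    # Fail loudly if the canonical tree lacks the file.
    [ -e "$_path" ] || { echo "missing input: $_path" >&2; return 1; }
    pipeline_input "$_name" "$_path"
}
```

Under these assumptions, a stage script lists its inputs as
(subtree, relative path) pairs and never spells out a vendor/ or
build/ path itself, which is what makes the inputs declarative.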

Validation: full scripts/boot.sh aarch64 (DRIVER=podman) green.
boot{0..4,6}/ are byte-identical to pre-A0 outputs. boot5/{libc.a,
crti.o,crtn.o} differ only in the embedded source-path string tcc
stamps into each .o (in/musl/... vs the legacy in/tmp/musl-1.2.5/...).
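
One way to check a byte-identical claim like the one above is to walk
one output tree and `cmp` each file against its counterpart in the
other. A minimal sketch, assuming the pre-A0 outputs were saved to a
separate directory; the function name `trees_identical` and both
directory arguments are hypothetical, not part of the repository:

```shell
#!/bin/sh
# Hypothetical helper: succeed only if every regular file under $1 has
# a byte-identical counterpart at the same relative path under $2.
# One-directional: files that exist only under $2 are not reported.
trees_identical() {
    _old=$1
    _new=$2
    _rc=0
    # Word-splitting means paths containing whitespace are not handled;
    # acceptable for this sketch, not for arbitrary trees.
    for _f in $(cd "$_old" && find . -type f | sort); do
        if ! cmp -s "$_old/$_f" "$_new/$_f"; then
            echo "differs: $_f" >&2
            _rc=1
        fi
    done
    return $_rc
}
```

It could be called as `trees_identical pre-a0/boot0 build/aarch64/podman/boot0`,
where `pre-a0/` stands for wherever the earlier outputs were kept. For
boot5 a check like this would be expected to report libc.a, crti.o, and
crtn.o, given the source-path difference described above.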

Diffstat:
M docs/PLAN.md | 169 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++----------------------
M scripts/boot.sh | 11 +++++++++++
M scripts/boot0.sh | 20 ++++++++++++--------
M scripts/boot1.sh | 18 +++++++++++-------
M scripts/boot2.sh | 23 +++++++++++++----------
M scripts/boot3.sh | 79 +++++++++++++++++++++++++------------------------------------------------------
M scripts/boot4.sh | 76 +++++++++++++++++++++++++++-------------------------------------------------
M scripts/boot5-gen-runscm.sh | 14 ++++++++------
M scripts/boot5.sh | 116 ++++++++++++++++++++++++-------------------------------------------------------
M scripts/boot6.sh | 38 +++++++++++++++++++-------------------
M scripts/lib-pipeline.sh | 11 +++++++++++
M scripts/lib-runscm.sh | 16 ++++++++++++++++
M scripts/prep-src.sh | 16 ++++++++++++++++
13 files changed, 326 insertions(+), 281 deletions(-)

diff --git a/docs/PLAN.md b/docs/PLAN.md @@ -52,7 +52,7 @@ are actually stage-specific. --- -### A3. Per-driver output trees +### [DONE] A3. Per-driver output trees **Goal.** `build/<arch>/<driver>/...` everywhere. Kill the `build/.seed-bootstrap/` shuffle in `boot.sh`. Two drivers can coexist on @@ -78,7 +78,7 @@ disk; nothing gets clobbered when you switch. --- -### A0. Canonical generated source tree (host source-prep) +### [DONE] A0. Canonical generated source tree (host source-prep) **Goal.** All host-side source preparation happens once, up front, into a single canonical tree at `build/<arch>/src/`. This tree is the audit @@ -211,54 +211,129 @@ unchanged from pre-A4. 0.9.26 is incomplete. After AT, every arch is on the same footing in seed-kernel and the build pipeline. +The four subitems are independent — none gates another. AT.3 was +previously claimed to gate AT.4 (per the "once asm-support patches +land" wording); audit found that wrong (AT.4 is a pure refactor). + **In scope.** -- **amd64 `.quad` truncation in `gen_le64`.** Currently `arch/amd64/mmu.c` - encodes GDT entries as `.long lo, hi` pairs. Patch tcc's amd64 - assembler to accept full 64-bit immediates; revert the workaround. - (Existing memory note: `project_tcc_arm64_svcul_truncation.md` - describes a related arm64 issue — track separately or in the same - pass.) -- **`.note.*` SHT_NOTE.** tcc emits `.note.*` sections as PROGBITS, - forcing the post-link `seed-kernel/scripts/elf-pvh-note.c` tool to - rewrite the ELF for amd64 PVH boot. Patch tcc to recognize `.note.*` - → SHT_NOTE and emit a PT_NOTE phdr. Delete `elf-pvh-note.c` and the - amd64 boot6 fixup. -- **riscv64 inline-asm constraints / register-asm silent drop.** Existing - memory `project_tcc_inline_asm_silent_drop.md` documents tcc 0.9.26 - silently dropping register-asm constraints. If this affects any - riscv64 codepath in the kernel or musl build, fix it here. 
-- **Audit `seed-kernel/simple-patches/` (if any) and per-arch `arch.h` - externs vs inlines.** amd64 and riscv64 declare arch helpers as - externs (workaround for tcc inline-asm gaps) while aarch64 inlines - them. Once the asm-support patches above land, normalize to the - aarch64 style. - -**Out of scope (still tracked separately).** -- riscv64 boot0 stage-4 user-trap bug (`docs/SEED-RISCV64-TODO.md`) — - may or may not be tcc; investigate after AT. -- amd64 boot3+ validation under DRIVER=seed - (`docs/SEED-AMD64-TODO.md`) — compute-bound, not a tcc issue. + +- **AT.1. amd64 `.quad` 64-bit-literal truncation.** `seed-kernel/arch/amd64/kernel.S:406–413` + encodes the GDT as `.long lo, hi` pairs because `.quad 0x00af9a000000ffff` + truncates. Root cause is **not** in tcc's `gen_le64` (that path is 64-bit + clean); it's in mes-libc's `vendor/mes-libc/mes/abtol.c:29`, which + accumulates `strtoull` into an `int`, so any 64-bit hex literal in + `tccasm.c:asm_expr_unary` loses its high bits at parse time. Two fixes: + - **(preferred)** one-line edit in `abtol.c` (`int i` → `long long i`). + Repairs every `strto*` caller, not just asm. + - **(alternative)** new patch `tccasm-asm-expr-parse-number.{before,after}` + that inlines a 64-bit hex/oct/dec accumulator in `asm_expr_unary`, + bypassing `strtoull`. + Then revert `kernel.S:406–413` to plain `.quad` GDT entries. (`mmu.c` + has no workaround — uses C-side `u64` constants — and needs no change.) + Existing memory `project_tcc_arm64_svcul_truncation.md` describes an + unrelated arm64 codegen-side truncation already patched in + `scripts/simple-patches/tcc-0.9.26/arm64-svcul-no-truncate*`; track + separately. + +- **AT.2. `.note.*` SHT_NOTE / PT_NOTE.** tcc emits `.note.*` sections + as `SHT_PROGBITS` and never writes a `PT_NOTE` phdr, forcing the + post-link `seed-kernel/scripts/elf-pvh-note.c` tool to rewrite the + ELF for amd64 PVH boot. 
Two patches: + - `note-section-sht-note.{before,after}` — in `tccelf.c:251` + `find_section()`, name-prefix-check `.note*` → create as `SHT_NOTE`. + - `pt-note-phdr.{before,after}` — in `tccelf.c:elf_output_file` + (~2202), gate `phnum++` on "any `SHT_NOTE+SHF_ALLOC` section + exists" (else aarch64/riscv64 phnum perturbs and breaks the A5 + reproducibility target); compute `min(sh_offset)` / + `max(sh_offset+sh_size)` over those sections; emit one `PT_NOTE`. + Then delete `seed-kernel/scripts/elf-pvh-note.c`, + `scripts/prep-src.sh:121–123`, `scripts/boot6.sh:80–89`, + `scripts/boot6-gen-runscm.sh:99–117`. `seed-kernel/arch/amd64/kernel.lds` + is unchanged (consumed by clang+ld.lld, not tcc). + +- **AT.3. riscv64 inline-asm — audit + memory update only, no patch.** + Audit found `riscv64-asm.c:801–819` ships only stubs for + `subst_asm_operand` (loud error), `asm_gen_code` (empty), and + `asm_compute_constraints` (empty) — same silent-drop class as arm64 + per memory `project_tcc_inline_asm_silent_drop.md`. But every + register-asm callsite in the repo is already worked around: musl + syscall/tp/atomics overrides under + `vendor/upstream/musl-1.2.5-overrides/arch/riscv64/`, and 13 externs + in `seed-kernel/arch/riscv64/arch.h` defined in `kernel.S`. There + is no live miscompile to fix. A real Phase-3 inline-asm port (~400+ + LOC, parallel to the unstarted arm64 Phase 3) is well out of scope. + AT.3 work: + - Update `project_tcc_inline_asm_silent_drop.md` to add a riscv64 + section. + - Document in `docs/TCC.md` that riscv64 and arm64 share the + silent-drop bug class until Phase 3 lands. + Three small wins that *don't* need any tcc work and can land here: + inline `arch_mmio_ptr` (pure pointer arithmetic), `arch_system_off` + (one MMIO write), and the `saved_user_sp` accessors in pure C — see + AT.4 for context. + +- **AT.4. arch.h API uniformity (refactor, not inline-asm).** The + previous framing — "amd64/riscv64 use externs, aarch64 inlines" — + was wrong. 
aarch64 also externs every primitive (arch.h:51–60); the + difference is one layer up: aarch64 exposes a small primitive set + (`sysreg_read/write`, `arm64_barrier(kind)`, `cpu_pause(kind)`, + `arm64_psci_call`) and synthesizes the `arch_*()` API as macros + (arch.h:62–72). amd64/riscv64 spell each `arch_*` as its own + dedicated extern. There is **zero** `__asm__`/inline-asm in any + seed-kernel C file, so AT.4 has no tcc dependency. Refactor: + - amd64: introduce `amd64_fence(kind)`, port-io and msr dispatcher + helpers; collapse the per-API externs in + `seed-kernel/arch/amd64/kernel.S:219–378` into the primitive set; + rewrite `arch_*()` in `arch.h` as macros. + - riscv64: introduce `riscv_csr_read/write(id)`, + `riscv_fence(kind)` (parallel to `sysreg_read`/`arm64_barrier`); + same macro rewrite for `arch_*()`. + - `arch_idle_forever` and `arch_system_off` may stay as their own + externs (don't compress into a `(kind)` dispatcher cleanly) — + aarch64 keeps `arm64_psci_call` in the same shape. **Touch list.** -- `vendor/tcc/…` (or wherever the tcc tree lives): the patches. -- `scripts/stage1-flatten.sh`: re-flatten with the new patches; commit - the resulting `tcc.flat.c` if it's vendored, otherwise just regenerate. -- `seed-kernel/arch/amd64/mmu.c`: revert `.quad` workarounds. -- `seed-kernel/arch/amd64/kernel.S`: reinstate native `.note.*` if the - workaround required emitting them differently. -- `seed-kernel/arch/{amd64,riscv64}/arch.h`: inline what was extern. -- `seed-kernel/arch/{amd64,riscv64}/{kernel.S,mmu.c}`: drop quirks. -- `seed-kernel/scripts/elf-pvh-note.c`: delete. -- `scripts/boot6.sh`, `scripts/boot6-gen-runscm.sh`: delete the - amd64-only post-link fixup block. -- `docs/TCC.md`: document the new tcc patch set; retire any - workaround-explanation sections that are now moot. -- Update memory entries `project_tcc_*` as patches eliminate the - recorded bugs. 
- -**Validation.** seed-kernel builds on all three arches with no -arch-conditional shell logic in boot6. `make build/<arch>/podman/boot6/Image` -produces a clean ELF without post-link fixup. All test suites still pass. +- `vendor/mes-libc/mes/abtol.c` (AT.1 preferred fix) **or** new patch + `scripts/simple-patches/tcc-0.9.26/tccasm-asm-expr-parse-number.*` + (AT.1 alternative). +- `scripts/simple-patches/tcc-0.9.26/note-section-sht-note.*` and + `pt-note-phdr.*` (AT.2). The patch directory is + `scripts/simple-patches/tcc-0.9.26/` (not `seed-kernel/simple-patches/`, + which doesn't exist). +- `scripts/stage1-flatten.sh`: list any newly-added patches in + `apply_our_patch` (around line ~223); regenerate the flattened tree. +- `seed-kernel/arch/amd64/kernel.S`: lines 406–413 (`.quad` GDT + revert, AT.1); lines 219–378 (refactor to primitive set, AT.4). +- `seed-kernel/arch/amd64/arch.h`: rewrite `arch_*()` as macros (AT.4). +- `seed-kernel/arch/riscv64/{kernel.S,arch.h}`: same refactor (AT.4). +- `seed-kernel/arch/riscv64/arch.h`: inline `arch_mmio_ptr`, + `arch_system_off`, `arch_read_user_sp`, `arch_write_user_sp` (AT.3 + side wins). +- `seed-kernel/scripts/elf-pvh-note.c`: delete (AT.2). +- `scripts/prep-src.sh`: drop lines 121–123 staging elf-pvh-note.c + (AT.2). +- `scripts/boot6.sh`: drop lines 80–89 (AT.2). +- `scripts/boot6-gen-runscm.sh`: drop lines 99–117 (AT.2). +- `docs/TCC.md`: document the AT patch set; add the riscv64+arm64 + silent-drop note (AT.3). +- Update `project_tcc_inline_asm_silent_drop.md` to cover riscv64 + (AT.3). Other `project_tcc_*` memories stay as historical record. + +**Validation.** +- AT.1: `readelf -x .data build/amd64/$DRIVER/boot1/M1pp` (or any + amd64 binary using a 64-bit hex literal) shows correct upper bits. + amd64 boots after `.quad` revert. +- AT.2: `readelf -S build/amd64/$DRIVER/boot6/Image` shows `.note.Xen` + type `NOTE`; `readelf -l` shows one `PT_NOTE` covering it. 
`cmp` + vs the pre-patch post-fixup ELF byte-identical at the affected + offsets. amd64 boots through QEMU PVH `-kernel`. +- AT.2 reproducibility: aarch64/riscv64 `readelf -l` phdr counts + unchanged from pre-patch (proves the SHT_NOTE-presence gate works). +- AT.3: docs+memory updated; no functional change expected. +- AT.4: seed-kernel builds on all three arches with the new macro + layer; amd64/riscv64 byte output unchanged (refactor only). +- All test suites still pass. `boot6` has no amd64-conditional shell + logic. --- diff --git a/scripts/boot.sh b/scripts/boot.sh @@ -43,11 +43,22 @@ stage() { echo "[$BOOT_TAG] $name: $((e - s))s (cum $((e - T0))s)" } +# A0a: build the canonical generated source tree at build/$ARCH/src/. +# Boot stages read source from there exclusively (no flatten/unpack/ +# patch inside boot{N}.sh). +stage prep-src ./scripts/prep-src.sh $ARCH + stage boot0 ./scripts/boot0.sh $ARCH stage boot1 ./scripts/boot1.sh $ARCH stage boot2 ./scripts/boot2.sh $ARCH stage boot3 ./scripts/boot3.sh $ARCH stage boot4 ./scripts/boot4.sh $ARCH + +# A0b: apply the per-arch musl skip filter (needs tcc3 from boot4 if +# the calibration list is missing; the committed list is the common +# case and runs without compiler). +stage prep-musl ./scripts/prep-musl.sh $ARCH + stage boot5 ./scripts/boot5.sh $ARCH # boot6 builds the seed-kernel ELF/Image with boot4's tcc3 (no `ld -T`, diff --git a/scripts/boot0.sh b/scripts/boot0.sh @@ -5,9 +5,11 @@ ## brings up: hex0 -> hex1 -> hex2 -> catm -> M0. Three of those (hex2, ## catm, M0) are the binaries every later stage depends on. 
## -## ─── Inputs (sources) ───────────────────────────────────────────────── -## vendor/seed/$ARCH/{hex0-seed, hex0.hex0, hex1.hex0, hex2.hex1, -## catm.hex2, M0.hex2, ELF.hex2} +## ─── Inputs (sources, from canonical tree) ─────────────────────────── +## build/$ARCH/src/bin/hex0-seed +## build/$ARCH/src/src/vendor-seed/{hex0.hex0, hex1.hex0, hex2.hex1, +## catm.hex2, M0.hex2, ELF.hex2} +## (populated by scripts/prep-src.sh from vendor/seed/$ARCH/.) ## ## ─── Outputs ────────────────────────────────────────────────────────── ## build/$ARCH/$DRIVER/boot0/{hex2, catm, M0} @@ -21,17 +23,19 @@ set -eu bootlib_init boot0 "${1:-}" driver_init scratch -SEED=vendor/seed/$ARCH +SRC=build/$ARCH/src OUT=build/$ARCH/$DRIVER/boot0 STAGE=build/$ARCH/$DRIVER/.boot0-stage +[ -d "$SRC" ] || { echo "[$BOOT_TAG] missing $SRC — run scripts/prep-src.sh $ARCH" >&2; exit 1; } + . scripts/lib-pipeline.sh pipeline_init "$STAGE" "$OUT" "$DRIVER" -# ─── inputs ─────────────────────────────────────────────────────────── -for f in hex0-seed hex0.hex0 hex1.hex0 hex2.hex1 catm.hex2 M0.hex2 ELF.hex2; do - [ -e "$SEED/$f" ] || { echo "[$BOOT_TAG] missing input: $SEED/$f" >&2; exit 1; } - pipeline_input "$f" "$SEED/$f" +# ─── inputs (from canonical src tree) ───────────────────────────────── +pipeline_input_from_src bin hex0-seed +for f in hex0.hex0 hex1.hex0 hex2.hex1 catm.hex2 M0.hex2 ELF.hex2; do + pipeline_input_from_src src "vendor-seed/$f" done # ─── pipeline ───────────────────────────────────────────────────────── diff --git a/scripts/boot1.sh b/scripts/boot1.sh @@ -5,9 +5,11 @@ ## hex2pp pair, built from their .P1 sources via the seed M0 + hex2 ## chain. catm is rebuilt from catm.P1pp in boot2. 
## -## ─── Inputs (sources) ───────────────────────────────────────────────── -## M1pp/M1pp.P1, hex2pp/hex2pp.P1 -## P1/P1-$ARCH.M1, vendor/seed/$ARCH/ELF.hex2 +## ─── Inputs (sources, from canonical tree) ─────────────────────────── +## build/$ARCH/src/src/M1pp/M1pp.P1 +## build/$ARCH/src/src/hex2pp/hex2pp.P1 +## build/$ARCH/src/src/P1/P1-$ARCH.M1 +## build/$ARCH/src/src/vendor-seed/ELF.hex2 ## ## ─── Inputs (binaries from prior stages) ────────────────────────────── ## build/$ARCH/$DRIVER/boot0/{hex2, M0, catm} @@ -25,10 +27,12 @@ bootlib_init boot1 "${1:-}" driver_init scratch BOOT0=build/$ARCH/$DRIVER/boot0 +SRC=build/$ARCH/src OUT=build/$ARCH/$DRIVER/boot1 STAGE=build/$ARCH/$DRIVER/.boot1-stage require_prev "$BOOT0" hex2 M0 catm +[ -d "$SRC" ] || { echo "[$BOOT_TAG] missing $SRC — run scripts/prep-src.sh $ARCH" >&2; exit 1; } . scripts/lib-pipeline.sh pipeline_init "$STAGE" "$OUT" "$DRIVER" @@ -37,10 +41,10 @@ pipeline_init "$STAGE" "$OUT" "$DRIVER" pipeline_input hex2 "$BOOT0/hex2" pipeline_input M0 "$BOOT0/M0" pipeline_input catm "$BOOT0/catm" -pipeline_input P1.M1 "P1/P1-$ARCH.M1" -pipeline_input ELF.hex2 "vendor/seed/$ARCH/ELF.hex2" -pipeline_input M1pp.P1 "M1pp/M1pp.P1" -pipeline_input hex2pp.P1 "hex2pp/hex2pp.P1" +pipeline_input_from_src src "P1/P1-$ARCH.M1" P1.M1 +pipeline_input_from_src src vendor-seed/ELF.hex2 +pipeline_input_from_src src M1pp/M1pp.P1 +pipeline_input_from_src src hex2pp/hex2pp.P1 # ─── pipeline ───────────────────────────────────────────────────────── echo "[$BOOT_TAG] M1pp.P1 + hex2pp.P1 -> M1pp + hex2pp" diff --git a/scripts/boot2.sh b/scripts/boot2.sh @@ -6,10 +6,11 @@ ## boot0 catm so later stages run with zero boot0 dependencies); then ## builds the scheme1 interpreter from scheme1.P1pp using the new catm. 
## -## ─── Inputs (sources) ───────────────────────────────────────────────── -## catm/catm.P1pp, scheme1/scheme1.P1pp -## P1/P1-$ARCH.M1pp, P1/P1.M1pp, P1/P1pp.P1pp -## vendor/seed/$ARCH/ELF.hex2 +## ─── Inputs (sources, from canonical tree) ─────────────────────────── +## build/$ARCH/src/src/catm/catm.P1pp +## build/$ARCH/src/src/scheme1/scheme1.P1pp +## build/$ARCH/src/src/P1/{P1-$ARCH.M1pp, P1.M1pp, P1pp.P1pp} +## build/$ARCH/src/src/vendor-seed/ELF.hex2 ## ## ─── Inputs (binaries from prior stages) ────────────────────────────── ## build/$ARCH/$DRIVER/boot0/catm (only to bootstrap catm.P1pp build) @@ -29,11 +30,13 @@ driver_init scratch BOOT0=build/$ARCH/$DRIVER/boot0 BOOT1=build/$ARCH/$DRIVER/boot1 +SRC=build/$ARCH/src OUT=build/$ARCH/$DRIVER/boot2 STAGE=build/$ARCH/$DRIVER/.boot2-stage require_prev "$BOOT0" catm require_prev "$BOOT1" M1pp hex2pp +[ -d "$SRC" ] || { echo "[$BOOT_TAG] missing $SRC — run scripts/prep-src.sh $ARCH" >&2; exit 1; } . scripts/lib-pipeline.sh pipeline_init "$STAGE" "$OUT" "$DRIVER" @@ -42,12 +45,12 @@ pipeline_init "$STAGE" "$OUT" "$DRIVER" pipeline_input catm0 "$BOOT0/catm" # bootstrap; replaced by output 'catm' pipeline_input M1pp "$BOOT1/M1pp" pipeline_input hex2pp "$BOOT1/hex2pp" -pipeline_input backend.M1pp "P1/P1-$ARCH.M1pp" -pipeline_input frontend.M1pp "P1/P1.M1pp" -pipeline_input libp1pp.P1pp "P1/P1pp.P1pp" -pipeline_input ELF.hex2 "vendor/seed/$ARCH/ELF.hex2" -pipeline_input catm.P1pp "catm/catm.P1pp" -pipeline_input scheme1.P1pp "scheme1/scheme1.P1pp" +pipeline_input_from_src src "P1/P1-$ARCH.M1pp" backend.M1pp +pipeline_input_from_src src P1/P1.M1pp frontend.M1pp +pipeline_input_from_src src P1/P1pp.P1pp libp1pp.P1pp +pipeline_input_from_src src vendor-seed/ELF.hex2 +pipeline_input_from_src src catm/catm.P1pp +pipeline_input_from_src src scheme1/scheme1.P1pp # ─── pipeline ───────────────────────────────────────────────────────── echo "[$BOOT_TAG] catm.P1pp -> catm; scheme1.P1pp -> scheme1" diff --git a/scripts/boot3.sh 
b/scripts/boot3.sh @@ -12,22 +12,16 @@ ## tcc2 = tcc-source compiled by tcc1 ← boot4 ## tcc3 = tcc-source compiled by tcc2 ← boot4 ## -## ─── Inputs (host-side, auto-built if missing) ──────────────────────── -## build/$ARCH/vendor/tcc/tcc.flat.c -## build/$ARCH/vendor/tcc/tcc-0.9.26-1147-gee75a10c/{include,lib} -## — flattened tcc TU + unpacked tree; built -## via scripts/stage1-flatten.sh --arch -## $ARCH (host cc -E, no container) -## build/$ARCH/vendor/mes-libc/libc.flat.c -## — flattened mes-libc TU; built via -## scripts/libc-flatten.sh --arch $ARCH -## (host cc -E, no container) -## -## ─── Inputs (sources, copied into staging) ──────────────────────────── -## scheme1/prelude.scm cc/cc.scm cc/main.scm — catm'd to cc.scm bundle -## P1/P1-$ARCH.M1pp P1/P1.M1pp P1/P1pp.P1pp — M1pp pipeline -## P1/entry-libc.P1pp P1/elf-end.P1pp — link-time framing -## vendor/seed/$ARCH/ELF.hex2 — ELF header fragment +## ─── Inputs (sources, from canonical tree) ─────────────────────────── +## build/$ARCH/src/src/scheme1/{prelude.scm} scheme bundle +## build/$ARCH/src/src/cc/{cc.scm,main.scm} scheme bundle +## build/$ARCH/src/src/P1/{P1-$ARCH.M1pp,P1.M1pp,P1pp.P1pp} M1pp pipeline +## build/$ARCH/src/src/P1/{entry-libc.P1pp,elf-end.P1pp} link framing +## build/$ARCH/src/src/vendor-seed/ELF.hex2 ELF header +## build/$ARCH/src/src/tcc/tcc.flat.c flattened tcc TU +## build/$ARCH/src/src/libc/libc.flat.c flattened mes-libc TU +## (populated up-front by scripts/prep-src.sh; this stage does +## no flatten/unpack/patch.) 
## ## ─── Inputs (binaries from prior stages) ────────────────────────────── ## build/$ARCH/$DRIVER/boot1/{M1pp, hex2pp} — built by scripts/boot1.sh @@ -56,43 +50,20 @@ driver_init empty BOOT1=build/$ARCH/$DRIVER/boot1 BOOT2=build/$ARCH/$DRIVER/boot2 +SRC=build/$ARCH/src OUT=build/$ARCH/$DRIVER/boot3 STAGE=build/$ARCH/$DRIVER/.boot3-stage -TCC_VENDOR=build/$ARCH/vendor/tcc -TCC_DIR=$TCC_VENDOR/tcc-0.9.26-1147-gee75a10c -TCC_FLAT=$TCC_VENDOR/tcc.flat.c -LIBC_FLAT=build/$ARCH/vendor/mes-libc/libc.flat.c - -# ── prerequisite: prior-stage binaries ──────────────────────────────── +# ── prerequisites ───────────────────────────────────────────────────── require_prev "$BOOT1" M1pp hex2pp require_prev "$BOOT2" catm scheme1 - -# ── prerequisite: host-flattened sources + unpacked tcc tree ────────── -# tcc.flat.c + the unpacked $TCC_DIR/{include,lib} tree are produced -# together by stage1-flatten.sh; libc.flat.c by libc-flatten.sh. Both -# run on the host (cc -E), no container — auto-invoke if missing. -if [ ! -e "$TCC_FLAT" ] || [ ! -d "$TCC_DIR/include" ] || [ ! -e "$TCC_VENDOR/stdarg-bridge.h" ]; then - echo "[$BOOT_TAG] flatten tcc.flat.c (host)" - scripts/stage1-flatten.sh --arch "$ARCH" -fi -if [ ! -e "$LIBC_FLAT" ]; then - echo "[$BOOT_TAG] flatten libc.flat.c (host)" - scripts/libc-flatten.sh --arch "$ARCH" -fi - -BACKEND_M1PP=P1/P1-$ARCH.M1pp -FRONTEND_M1PP=P1/P1.M1pp -LIBP1PP=P1/P1pp.P1pp -ENTRY_LIBC=P1/entry-libc.P1pp -ELF_END=P1/elf-end.P1pp -ELF_HEX2=vendor/seed/$ARCH/ELF.hex2 +[ -d "$SRC" ] || { echo "[$BOOT_TAG] missing $SRC — run scripts/prep-src.sh $ARCH" >&2; exit 1; } # ── stage inputs and run scheme1 + boot3-run.scm under $DRIVER ──────── . 
scripts/lib-runscm.sh runscm_init "$STAGE" "$OUT" runscm_scheme1 "$BOOT2/scheme1" -runscm_prelude scheme1/prelude.scm +runscm_prelude "$SRC/src/scheme1/prelude.scm" runscm_runscm scripts/boot3-run.scm runscm_input catm "$BOOT2/catm" @@ -101,19 +72,19 @@ runscm_input hex2pp "$BOOT1/hex2pp" # scheme1 binary itself is staged by runscm_run (so a `(run "scheme1" …)` # inside boot3-run.scm finds it at cwd-relative ./scheme1). -runscm_input prelude.scm scheme1/prelude.scm -runscm_input cc.scm cc/cc.scm -runscm_input main.scm cc/main.scm +runscm_input_from_src src scheme1/prelude.scm +runscm_input_from_src src cc/cc.scm +runscm_input_from_src src cc/main.scm -runscm_input backend.M1pp "$BACKEND_M1PP" -runscm_input frontend.M1pp "$FRONTEND_M1PP" -runscm_input libp1pp.P1pp "$LIBP1PP" -runscm_input entry-libc.P1pp "$ENTRY_LIBC" -runscm_input elf-end.P1pp "$ELF_END" -runscm_input ELF.hex2 "$ELF_HEX2" +runscm_input_from_src src "P1/P1-$ARCH.M1pp" backend.M1pp +runscm_input_from_src src P1/P1.M1pp frontend.M1pp +runscm_input_from_src src P1/P1pp.P1pp libp1pp.P1pp +runscm_input_from_src src P1/entry-libc.P1pp +runscm_input_from_src src P1/elf-end.P1pp +runscm_input_from_src src vendor-seed/ELF.hex2 -runscm_input tcc.flat.c "$TCC_FLAT" -runscm_input libc.flat.c "$LIBC_FLAT" +runscm_input_from_src src tcc/tcc.flat.c +runscm_input_from_src src libc/libc.flat.c runscm_export tcc0 runscm_run "${BOOT3_TIMEOUT:-1800}" diff --git a/scripts/boot4.sh b/scripts/boot4.sh @@ -19,32 +19,23 @@ ## tcc2 = tcc-source compiled by tcc1 ← produced here ## tcc3 = tcc-source compiled by tcc2 ← produced here ## -## ─── Inputs (host-side, auto-built if missing) ──────────────────────── -## build/$ARCH/vendor/tcc/tcc.flat.c -## build/$ARCH/vendor/tcc/tcc-0.9.26-1147-gee75a10c/{include,lib} -## — flattened tcc TU + unpacked tree; built -## via scripts/stage1-flatten.sh --arch -## $ARCH (host cc -E, no container) -## build/$ARCH/vendor/mes-libc/libc.flat.c -## — flattened mes-libc TU; built via -## 
scripts/libc-flatten.sh --arch $ARCH -## (host cc -E, no container) -## -## ─── Inputs (sources, copied into staging) ──────────────────────────── -## tcc-libc/$ARCH/start.S — _start, calls __libc_init+main -## tcc-libc/$ARCH/sys_stubs.S — sys_* syscall wrappers -## tcc-cc/mem.c — memcpy/memmove/memset/memcmp -## build/$ARCH/vendor/tcc/tcc-0.9.26-1147-gee75a10c/lib/libtcc1.c +## ─── Inputs (sources, from canonical tree) ─────────────────────────── +## build/$ARCH/src/src/tcc-libc/$ARCH/{start.S,sys_stubs.S} +## — _start, sys_* syscall wrappers +## build/$ARCH/src/src/tcc-cc/mem.c memcpy/memmove/memset/memcmp +## build/$ARCH/src/src/tcc/tcc-0.9.26-1147-gee75a10c/lib/libtcc1.c ## (amd64: generic compiler helper runtime) -## build/$ARCH/vendor/tcc/tcc-0.9.26-1147-gee75a10c/lib/lib-arm64.c +## build/$ARCH/src/src/tcc/tcc-0.9.26-1147-gee75a10c/lib/lib-arm64.c ## (aarch64 + riscv64: TFmode soft-float) -## build/$ARCH/vendor/tcc/tcc-0.9.26-1147-gee75a10c/lib/va_list.c +## build/$ARCH/src/src/tcc/tcc-0.9.26-1147-gee75a10c/lib/va_list.c ## (amd64: __va_start / __va_arg) -## build/$ARCH/vendor/tcc/tcc-0.9.26-1147-gee75a10c/lib/alloca86_64*.S +## build/$ARCH/src/src/tcc/tcc-0.9.26-1147-gee75a10c/lib/alloca86_64*.S ## (amd64: alloca helpers) -## build/$ARCH/vendor/tcc/tcc.flat.c — flattened tcc TU -## build/$ARCH/vendor/mes-libc/libc.flat.c — flattened mes-libc TU -## scripts/boot-hello.c — smoke binary +## build/$ARCH/src/src/tcc/tcc.flat.c flattened tcc TU +## build/$ARCH/src/src/libc/libc.flat.c flattened mes-libc TU +## build/$ARCH/src/src/test-fixtures/boot-hello.c smoke binary +## (populated up-front by scripts/prep-src.sh; this stage does +## no flatten/unpack/patch.) 
## ## ─── Inputs (binaries from prior stages) ────────────────────────────── ## build/$ARCH/$DRIVER/boot3/tcc0 — built by scripts/boot3.sh @@ -92,32 +83,19 @@ esac BOOT2=build/$ARCH/$DRIVER/boot2 BOOT3=build/$ARCH/$DRIVER/boot3 +SRC=build/$ARCH/src OUT=build/$ARCH/$DRIVER/boot4 STAGE=build/$ARCH/$DRIVER/.boot4-stage -TCC_VENDOR=build/$ARCH/vendor/tcc -TCC_DIR=$TCC_VENDOR/tcc-0.9.26-1147-gee75a10c -TCC_FLAT=$TCC_VENDOR/tcc.flat.c -LIBC_FLAT=build/$ARCH/vendor/mes-libc/libc.flat.c +TCC_PKG=tcc-0.9.26-1147-gee75a10c +TCC_LIB_REL=tcc/$TCC_PKG/lib -# ── prerequisite: prior-stage binaries ──────────────────────────────── +# ── prerequisites ───────────────────────────────────────────────────── require_prev "$BOOT3" tcc0 require_prev "$BOOT2" catm scheme1 - -# ── prerequisite: host-flattened sources + unpacked tcc tree ────────── -# Normally these were produced by boot3 (auto-invoked by stage1-flatten -# / libc-flatten there). Re-check here so boot4 runs standalone if a -# user has tcc0 but blew away build/$ARCH/vendor/tcc/. -if [ ! -e "$TCC_FLAT" ] || [ ! -d "$TCC_DIR/include" ] || [ ! -e "$TCC_DIR/lib/lib-arm64.c" ] || [ ! -e "$TCC_VENDOR/stdarg-bridge.h" ]; then - echo "[$BOOT_TAG] flatten tcc.flat.c (host)" - scripts/stage1-flatten.sh --arch "$ARCH" -fi -if [ ! 
-e "$LIBC_FLAT" ]; then - echo "[$BOOT_TAG] flatten libc.flat.c (host)" - scripts/libc-flatten.sh --arch "$ARCH" -fi +[ -d "$SRC" ] || { echo "[$BOOT_TAG] missing $SRC — run scripts/prep-src.sh $ARCH" >&2; exit 1; } for f in $LIBTCC1_C_SRCS $LIBTCC1_ASM_SRCS; do - [ -e "$TCC_DIR/lib/$f" ] || { echo "[$BOOT_TAG] missing $TCC_DIR/lib/$f" >&2; exit 1; } + [ -e "$SRC/src/$TCC_LIB_REL/$f" ] || { echo "[$BOOT_TAG] missing $SRC/src/$TCC_LIB_REL/$f" >&2; exit 1; } done # ── stage inputs and run scheme1 + boot4 run.scm under $DRIVER ──────── @@ -129,22 +107,22 @@ scripts/boot4-gen-runscm.sh "$ARCH" "$RUNSCM" echo "[$BOOT_TAG] generated run.scm: $(wc -l <"$RUNSCM") lines" runscm_scheme1 "$BOOT2/scheme1" -runscm_prelude scheme1/prelude.scm +runscm_prelude "$SRC/src/scheme1/prelude.scm" runscm_runscm "$RUNSCM" runscm_input tcc0 "$BOOT3/tcc0" runscm_input catm "$BOOT2/catm" -runscm_input start.S "tcc-libc/$ARCH/start.S" -runscm_input sys_stubs.S "tcc-libc/$ARCH/sys_stubs.S" -runscm_input mem.c tcc-cc/mem.c +runscm_input_from_src src "tcc-libc/$ARCH/start.S" +runscm_input_from_src src "tcc-libc/$ARCH/sys_stubs.S" +runscm_input_from_src src tcc-cc/mem.c for f in $LIBTCC1_C_SRCS $LIBTCC1_ASM_SRCS; do - runscm_input "$f" "$TCC_DIR/lib/$f" + runscm_input_from_src src "$TCC_LIB_REL/$f" done -runscm_input tcc.flat.c "$TCC_FLAT" -runscm_input libc.flat.c "$LIBC_FLAT" -runscm_input hello.c scripts/boot-hello.c +runscm_input_from_src src tcc/tcc.flat.c +runscm_input_from_src src libc/libc.flat.c +runscm_input_from_src src test-fixtures/boot-hello.c hello.c runscm_export tcc1 runscm_export tcc2 diff --git a/scripts/boot5-gen-runscm.sh b/scripts/boot5-gen-runscm.sh @@ -14,10 +14,12 @@ ## ## Conventions (cwd-relative; resolves to / under seed init, /work under ## podman bind-mount): -## musl tree in/tmp/musl-1.2.5/<rel-path> (read-only) -## pre-gen hdrs in/tmp/musl-1.2.5/obj/include/bits/{alltypes,syscall}.h, -## in/tmp/musl-1.2.5/obj/src/internal/version.h -## .o outputs 
out/obj/musl-1.2.5/<src-with-.o> (rw; pre-mkdir'd by host) +## musl tree in/musl/<rel-path> (read-only; canonical +## tree from prep-src/ +## prep-musl) +## pre-gen hdrs in/musl/obj/include/bits/{alltypes,syscall}.h, +## in/musl/obj/src/internal/version.h +## .o outputs out/obj/musl/<src-with-.o> (rw; pre-mkdir'd by host) ## tcc binary in/tcc (input) ## libtcc1.a in/libtcc1.a (input) ## stdarg bridge in/tcc-stdarg-bridge.h @@ -33,8 +35,8 @@ SRCS=$STAGE_HOST/build-srcs.txt CRT_MODE=$(cat "$STAGE_HOST/crt-mode") [ -e "$SRCS" ] || { echo "missing $SRCS" >&2; exit 1; } -CIN=in/tmp/musl-1.2.5 -COUT=out/obj/musl-1.2.5 +CIN=in/musl +COUT=out/obj/musl # Mirrors boot5.sh's CFLAGS_BASE exactly; the only difference is that # every per-arg token is quoted as its own scheme bytevector. The leading diff --git a/scripts/boot5.sh b/scripts/boot5.sh @@ -12,26 +12,15 @@ ## — boot4's verified self-host tcc ## build/$ARCH/$DRIVER/boot4/libtcc1.a ## — boot4's tcc runtime archive -## vendor/upstream/musl-1.2.5.tar.gz -## — pristine musl source -## vendor/upstream/musl-1.2.5-overrides/ -## — tree of files that replace upstream -## ones (tcc-compat patches; the post- -## patch state vendored directly so the -## build needs no `patch` binary). See -## docs/MUSL.md. -## vendor/upstream/musl-1.2.5-deletes.txt -## — list of upstream files removed by the -## same patch set (one path per line, -## relative to musl-1.2.5/). -## build/$ARCH/vendor/tcc/stdarg-bridge.h -## — per-arch __builtin_va_list bridge, -## generated by scripts/stage1-flatten.sh -## (shared with boot3/boot4; the file is -## byte-identical across arches but a -## per-arch copy is written so every -## artifact under build/$ARCH/ comes from -## a single boot.sh $ARCH invocation) +## build/$ARCH/src/src/musl/ — canonical musl tree (overrides merged, +## deletes applied, alltypes.h/syscall.h +## generated, per-arch skip filter +## applied). Built by prep-src.sh + +## prep-musl.sh. 
+## build/$ARCH/src/src/tcc/stdarg-bridge.h
+## — per-arch __builtin_va_list bridge.
+## build/$ARCH/src/src/test-fixtures/boot-hello.c
+## — smoke binary linked at the end.
 ##
 ## ─── Tools ────────────────────────────────────────────────────────────
 ## In container: scratch + busybox (no libc, no /etc, no resolver).
@@ -56,55 +45,33 @@ driver_init empty
 BOOT2=build/$ARCH/$DRIVER/boot2
 BOOT4=build/$ARCH/$DRIVER/boot4
+SRC=build/$ARCH/src
 OUT=build/$ARCH/$DRIVER/boot5
 STAGE=build/$ARCH/$DRIVER/.boot5-stage
-MUSL_TARBALL=vendor/upstream/musl-1.2.5.tar.gz
-MUSL_OVERRIDES=vendor/upstream/musl-1.2.5-overrides
-MUSL_DELETES=vendor/upstream/musl-1.2.5-deletes.txt
-MUSL_GENERATED=vendor/upstream/musl-1.2.5-generated/$MUSL_ARCH
-MUSL_SKIP=vendor/upstream/musl-1.2.5-skip-$ARCH.txt
-BRIDGE_FILE=build/$ARCH/vendor/tcc/stdarg-bridge.h
+MUSL_DIR=$SRC/src/musl
 # ── prerequisites ─────────────────────────────────────────────────────
 require_prev "$BOOT4" tcc3
 require_prev "$BOOT2" catm scheme1
 [ -e "$BOOT4/libtcc1.a" ] || { echo "[$BOOT_TAG] missing $BOOT4/libtcc1.a (run scripts/boot4.sh $ARCH)" >&2; exit 1; }
-[ -e "$MUSL_TARBALL" ] || { echo "[$BOOT_TAG] missing $MUSL_TARBALL" >&2; exit 1; }
-[ -d "$MUSL_OVERRIDES" ] || { echo "[$BOOT_TAG] missing $MUSL_OVERRIDES" >&2; exit 1; }
-[ -e "$MUSL_DELETES" ] || { echo "[$BOOT_TAG] missing $MUSL_DELETES" >&2; exit 1; }
-[ -d "$MUSL_GENERATED" ] || { echo "[$BOOT_TAG] missing $MUSL_GENERATED (run scripts/musl-vendor.sh)" >&2; exit 1; }
-[ -e "$MUSL_SKIP" ] || { echo "[$BOOT_TAG] missing $MUSL_SKIP (run scripts/boot5-calibrate.sh $ARCH)" >&2; exit 1; }
-[ -e "$BRIDGE_FILE" ] || { echo "[$BOOT_TAG] missing $BRIDGE_FILE (run scripts/stage1-flatten.sh)" >&2; exit 1; }
-
-# ── prepare staging dirs and musl tree on host ────────────────────────
+[ -d "$MUSL_DIR" ] || { echo "[$BOOT_TAG] missing $MUSL_DIR — run scripts/prep-src.sh $ARCH and scripts/prep-musl.sh $ARCH" >&2; exit 1; }
+[ -e "$MUSL_DIR/skip.txt" ] || { echo "[$BOOT_TAG] missing $MUSL_DIR/skip.txt — run scripts/prep-musl.sh $ARCH" >&2; exit 1; }
+[ -e "$SRC/src/tcc/stdarg-bridge.h" ] || { echo "[$BOOT_TAG] missing $SRC/src/tcc/stdarg-bridge.h — run scripts/prep-src.sh $ARCH" >&2; exit 1; }
+
+# ── prepare staging dirs ──────────────────────────────────────────────
 # $STAGE/in/ — read-only inputs (becomes /work/in or in/ in tmpfs)
 # $STAGE/out/ — writable outputs (becomes /work/out or out/ in tmpfs)
-# $STAGE/_host/ — host-side scratch (enumeration outputs, intermediates);
-# never visible to the container/kernel
-# runscm_init wipes $STAGE then mkdirs in/ and out/. Do that first so
-# we control the layout below.
+# $STAGE/_host/ — host-side scratch (enumeration outputs); never
+# visible to the container/kernel.
 . scripts/lib-runscm.sh
 runscm_init "$STAGE" "$OUT"
 mkdir -p "$STAGE/_host"
-# Extract musl directly into in/tmp/musl-1.2.5/, then apply overrides +
-# deletes — gives us a fully-prepared tree we can enumerate to drive the
-# (kaem-friendly) flat run.scm. The podman bind mount reads it in place;
-# the seed driver picks it up via the `find in -type f` cpio walk.
-MUSL_DIR=$STAGE/in/tmp/musl-1.2.5
-mkdir -p "$STAGE/in/tmp"
-tar xzf "$MUSL_TARBALL" -C "$STAGE/in/tmp/"
-cp -R "$MUSL_OVERRIDES/." "$MUSL_DIR/"
-while read -r p; do
-  [ -n "$p" ] && rm -rf "$MUSL_DIR/$p"
-done < "$MUSL_DELETES"
-
-# ── enumerate musl sources on the host (kaem-friendly: no for/while/
-# case/${%}/${#}/$((..)) inside the container) ───────────────────────
+# ── enumerate musl sources from the canonical tree ────────────────────
 # Mirrors musl's Makefile rule: a per-arch override (under
-# $d/$MUSL_ARCH/) replaces the same-stem base file (under $d/). We
-# subtract the calibration skip list so the run.scm never needs an
-# `if $TCC ...; then ok else skip fi` branch.
+# $d/$MUSL_ARCH/) replaces the same-stem base file (under $d/). The
+# canonical tree already had the per-arch skip filter applied by
+# prep-musl.sh, so no skip subtraction is needed here.
 SRC_TOP="src/aio src/conf src/crypt src/ctype src/dirent
 src/env src/errno src/exit src/fcntl src/fenv src/internal src/ipc src/legacy src/linux src/locale src/malloc
@@ -133,7 +100,7 @@ SRC_TOP="src/aio src/conf src/crypt src/ctype src/dirent
 ) > "$STAGE/_host/arch.txt"
 # REPLACED: bases that have arch-specific overrides (drop them from
-# BASE). KEEP = (BASE - REPLACED) ∪ ARCH, then minus calibration skips.
+# BASE). KEEP = (BASE - REPLACED) ∪ ARCH.
 awk -v ARCH="$MUSL_ARCH" '
 {
   sub(/\.[^.]*$/, "") # strip extension
@@ -155,17 +122,10 @@ awk -v REPF="$STAGE/_host/replaced.txt" '
 }
 ' "$STAGE/_host/base.txt" > "$STAGE/_host/keep_base.txt"
-cat "$STAGE/_host/keep_base.txt" "$STAGE/_host/arch.txt" | sort -u > "$STAGE/_host/keep.txt"
-
-# Subtract the calibration skip list. Lines without a / are bogus; the
-# skip file is one path per line, comments allowed via leading '#'.
-awk -v SKIPF="$MUSL_SKIP" '
-  BEGIN { while ((getline l < SKIPF) > 0) if (l !~ /^#/ && l != "") skip[l] = 1 }
-  { if (!($0 in skip)) print }
-' "$STAGE/_host/keep.txt" > "$STAGE/_host/build-srcs.txt"
+cat "$STAGE/_host/keep_base.txt" "$STAGE/_host/arch.txt" | sort -u > "$STAGE/_host/build-srcs.txt"
 n_src=$(wc -l < "$STAGE/_host/build-srcs.txt")
-n_skip=$(wc -l < "$MUSL_SKIP")
+n_skip=$(grep -cv '^[[:space:]]*\(#\|$\)' "$MUSL_DIR/skip.txt" || true)
 echo "[$BOOT_TAG] keep=$n_src skip=$n_skip (calibrated)"
 # Record CRT mode (asm vs c) so the gen-runscm step picks the right
@@ -176,8 +136,8 @@ else
   echo c > "$STAGE/_host/crt-mode"
 fi
-# Pre-create per-source obj/ directories under $STAGE/out/obj/musl-1.2.5/
-# so scheme1's (run "in/tcc" -c …) doesn't need to mkdir at runtime (tcc
+# Pre-create per-source obj/ directories under $STAGE/out/obj/musl/ so
+# scheme1's (run "in/tcc" -c …) doesn't need to mkdir at runtime (tcc
 # errors out if the parent dir is missing, and scheme1 has no mkdir
 # primitive).
 awk '
@@ -186,36 +146,30 @@ awk '
   if (match($0, /\/[^\/]*$/)) print substr($0, 1, RSTART - 1)
 }
 ' "$STAGE/_host/build-srcs.txt" | sort -u > "$STAGE/_host/build-objdirs.txt"
-COBJ=$STAGE/out/obj/musl-1.2.5
+COBJ=$STAGE/out/obj/musl
 mkdir -p "$COBJ/crt"
 while read -r d; do mkdir -p "$COBJ/$d"; done < "$STAGE/_host/build-objdirs.txt"
-# Pre-generated alltypes.h + syscall.h for $MUSL_ARCH; live under in/
-# (read at compile time via -I$CIN/obj/include and -I$CIN/obj/src/internal).
-mkdir -p "$MUSL_DIR/obj/include/bits" "$MUSL_DIR/obj/src/internal"
-cp "$MUSL_GENERATED/alltypes.h" "$MUSL_DIR/obj/include/bits/alltypes.h"
-cp "$MUSL_GENERATED/syscall.h" "$MUSL_DIR/obj/include/bits/syscall.h"
-echo '#define VERSION "1.2.5-tcc-boot5"' > "$MUSL_DIR/obj/src/internal/version.h"
-
 # ── generate run.scm and stage chain binaries ─────────────────────────
 RUNSCM=$STAGE/run.scm
 scripts/boot5-gen-runscm.sh "$MUSL_ARCH" "$STAGE/_host" "$RUNSCM"
 echo "[$BOOT_TAG] generated run.scm: $(wc -l <"$RUNSCM") lines, $(wc -c <"$RUNSCM") bytes"
 runscm_scheme1 "$BOOT2/scheme1"
-runscm_prelude scheme1/prelude.scm
+runscm_prelude "$SRC/src/scheme1/prelude.scm"
 runscm_runscm "$RUNSCM"
 # Chain binaries staged at flat in/ root (cwd-relative names in run.scm).
 runscm_input tcc "$BOOT4/tcc3"
 runscm_input libtcc1.a "$BOOT4/libtcc1.a"
 runscm_input catm "$BOOT2/catm"
-runscm_input tcc-stdarg-bridge.h "$BRIDGE_FILE"
-runscm_input hello.c scripts/boot-hello.c
+runscm_input_from_src src tcc/stdarg-bridge.h tcc-stdarg-bridge.h
+runscm_input_from_src src test-fixtures/boot-hello.c hello.c
-# Musl tree is already laid out under $STAGE/in/tmp/musl-1.2.5/ above;
-# both drivers pick it up automatically (podman bind-mounts $STAGE/in;
-# seed packs `find in -type f` into the cpio).
+# Stage the canonical musl tree under in/musl/. Both drivers pick it
+# up automatically (podman bind-mounts $STAGE/in; seed packs
+# `find in -type f` into the cpio).
+runscm_input_tree_from_src musl src musl
 runscm_export libc.a
 runscm_export crt1.o
diff --git a/scripts/boot6.sh b/scripts/boot6.sh
@@ -9,16 +9,14 @@
 ## build/$ARCH/$DRIVER/boot4/tcc3 — boot4's verified self-host tcc
 ## (compiler + linker)
 ## build/$ARCH/$DRIVER/boot2/scheme1 — driver runtime
-## seed-kernel/arch/aarch64/kernel.S
-## — boot stub, vector table, asm thunks,
-## trailing 64 KB stack reserved as
-## plain `.bss` (kstack_top is the end
-## label of that reservation)
-## seed-kernel/kernel.c — DTB parse, MMU bring-up, syscalls,
+## build/$ARCH/src/src/kernel/arch/$ARCH/{kernel.S,mmu.c,arch.h}
+## — per-arch boot stub, MMU setup, header
+## build/$ARCH/src/src/kernel/kernel.c
+## — DTB parse, MMU bring-up, syscalls,
 ## virtio-blk, tmpfs, ELF loader
-## seed-kernel/arch/aarch64/mmu.c
-## — arm64 page-table setup and pool swap
-## tcc-cc/mem.c — memcpy/memset/memmove/memcmp
+## build/$ARCH/src/src/tcc-cc/mem.c memcpy/memset/memmove/memcmp
+## build/$ARCH/src/src/kernel/scripts/elf-pvh-note.c
+## — amd64-only post-link PT_NOTE fixup
 ##
 ## ─── Tools ────────────────────────────────────────────────────────────
 ## In container: scratch + busybox (boot2-empty:$ARCH).
@@ -48,14 +46,16 @@ driver_init empty
 OUT_FILE=$KERNEL_NAME
 BOOT2=build/$ARCH/$DRIVER/boot2
 BOOT4=build/$ARCH/$DRIVER/boot4
+SRC=build/$ARCH/src
 OUT=build/$ARCH/$DRIVER/boot6
 STAGE=build/$ARCH/$DRIVER/.boot6-stage
 # ── prerequisites ─────────────────────────────────────────────────────
 require_prev "$BOOT4" tcc3
 require_prev "$BOOT2" scheme1
-for f in seed-kernel/arch/$ARCH/kernel.S seed-kernel/arch/$ARCH/mmu.c seed-kernel/arch/$ARCH/arch.h seed-kernel/kernel.c tcc-cc/mem.c; do
-  [ -f "$f" ] || { echo "[$BOOT_TAG] missing $f" >&2; exit 1; }
+[ -d "$SRC" ] || { echo "[$BOOT_TAG] missing $SRC — run scripts/prep-src.sh $ARCH" >&2; exit 1; }
+for f in kernel/arch/$ARCH/kernel.S kernel/arch/$ARCH/mmu.c kernel/arch/$ARCH/arch.h kernel/kernel.c tcc-cc/mem.c; do
+  [ -f "$SRC/src/$f" ] || { echo "[$BOOT_TAG] missing $SRC/src/$f" >&2; exit 1; }
 done
 # ── stage inputs and run scheme1 + run.scm under $DRIVER ──────────────
@@ -67,28 +67,28 @@ scripts/boot6-gen-runscm.sh "$ARCH" "$RUNSCM"
 echo "[$BOOT_TAG] generated run.scm: $(wc -l <"$RUNSCM") lines"
 runscm_scheme1 "$BOOT2/scheme1"
-runscm_prelude scheme1/prelude.scm
+runscm_prelude "$SRC/src/scheme1/prelude.scm"
 runscm_runscm "$RUNSCM"
 runscm_input tcc3 "$BOOT4/tcc3"
-runscm_input kernel.S seed-kernel/arch/$ARCH/kernel.S
-runscm_input kernel.c seed-kernel/kernel.c
-runscm_input arch.h seed-kernel/arch/$ARCH/arch.h
-runscm_input mmu.c seed-kernel/arch/$ARCH/mmu.c
-runscm_input mem.c tcc-cc/mem.c
+runscm_input_from_src src "kernel/arch/$ARCH/kernel.S"
+runscm_input_from_src src kernel/kernel.c
+runscm_input_from_src src "kernel/arch/$ARCH/arch.h"
+runscm_input_from_src src "kernel/arch/$ARCH/mmu.c"
+runscm_input_from_src src tcc-cc/mem.c
 # amd64 needs a post-link fixup — tcc3 doesn't emit PT_NOTE phdrs, so
 # QEMU's PVH `-kernel` path can't find the Xen 18 note that names the
 # 32-bit entry. The fixup is a hosted C tool we build inside the same
 # run.scm with tcc3 + boot4's libc/crt1/libtcc1, then run on kernel.elf.
 if [ "$ARCH" = amd64 ]; then
-  runscm_input elf-pvh-note.c seed-kernel/scripts/elf-pvh-note.c
+  runscm_input_from_src src kernel/scripts/elf-pvh-note.c
   runscm_input crt1.o "$BOOT4/crt1.o"
   runscm_input libc.a "$BOOT4/libc.a"
   runscm_input libtcc1.a "$BOOT4/libtcc1.a"
 fi
 runscm_export "$OUT_FILE"
-runscm_run 1200
+runscm_run "${BOOT6_TIMEOUT:-1200}"
 echo "[$BOOT_TAG] OK -> $OUT/$OUT_FILE ($(wc -c <"$OUT/$OUT_FILE") bytes)"
diff --git a/scripts/lib-pipeline.sh b/scripts/lib-pipeline.sh
@@ -82,6 +82,17 @@ pipeline_input() {
   P_INPUT_NAMES="$P_INPUT_NAMES $name"
 }
+# pipeline_input_from_src — pull a file from the canonical generated
+# source tree at build/$ARCH/src/{bin,src}/<subpath>. Stages under
+# in/<name> where <name> defaults to basename(subpath); pass an
+# override as the optional third argument when the staged name must
+# differ (e.g. P1.M1 vs P1-aarch64.M1).
+pipeline_input_from_src() {
+  _kind=$1; _subpath=$2; _name=${3:-}
+  [ -n "$_name" ] || _name=$(basename "$_subpath")
+  pipeline_input "$_name" "build/$ARCH/src/$_kind/$_subpath"
+}
+
 # Look up a token: if it names an input, prefix `in/`; if it names a
 # previously produced output, prefix `out/`; else leave unchanged.
 _p_lookup() {
diff --git a/scripts/lib-runscm.sh b/scripts/lib-runscm.sh
@@ -70,6 +70,22 @@ runscm_input_tree() {
   done
 }
+# runscm_input_from_src — pull a file from the canonical generated
+# source tree at build/$ARCH/src/{bin,src}/<subpath>. Stages under
+# in/<name> where <name> defaults to basename(subpath).
+runscm_input_from_src() {
+  _kind=$1; _subpath=$2; _name=${3:-}
+  [ -n "$_name" ] || _name=$(basename "$_subpath")
+  runscm_input "$_name" "build/$ARCH/src/$_kind/$_subpath"
+}
+
+# runscm_input_tree_from_src — same as runscm_input_tree, but the
+# source root is build/$ARCH/src/{bin,src}/<subpath>.
+runscm_input_tree_from_src() {
+  _prefix=$1; _kind=$2; _subpath=$3
+  runscm_input_tree "$_prefix" "build/$ARCH/src/$_kind/$_subpath"
+}
+
 runscm_export() {
   S_EXPORTS="$S_EXPORTS $1"
 }
diff --git a/scripts/prep-src.sh b/scripts/prep-src.sh
@@ -21,6 +21,8 @@
 ## cc/ cc.scm, main.scm
 ## tcc/ tcc.flat.c, stdarg-bridge.h, plus
 ## tcc-0.9.26-1147-gee75a10c/{include,lib}
+## tcc-libc/$ARCH/ start.S, sys_stubs.S
+## tcc-cc/ mem.c (memcpy/memmove/memset/memcmp)
 ## libc/ libc.flat.c (mes-libc flattened)
 ## musl/ filtered musl-1.2.5 tree (overrides
 ## merged, deletes applied, generated
@@ -28,6 +30,7 @@
 ## prep-musl.sh applies the per-arch
 ## skip filter on top.
 ## kernel/ seed-kernel sources for this arch
+## test-fixtures/ boot-hello.c smoke binary
 ##
 ## A0 is split: prep-src.sh runs before boot0 and produces everything
 ## that doesn't need a working compiler. prep-musl.sh runs after boot4
@@ -91,6 +94,19 @@ mkdir -p "$DST_SRC/cc"
 cp cc/cc.scm "$DST_SRC/cc/cc.scm"
 cp cc/main.scm "$DST_SRC/cc/main.scm"
+# tcc-libc: per-arch _start + sys_* wrappers consumed by boot4.
+mkdir -p "$DST_SRC/tcc-libc/$ARCH"
+cp "tcc-libc/$ARCH/start.S" "$DST_SRC/tcc-libc/$ARCH/start.S"
+cp "tcc-libc/$ARCH/sys_stubs.S" "$DST_SRC/tcc-libc/$ARCH/sys_stubs.S"
+
+# tcc-cc: tiny mem helpers consumed by boot4 + boot6.
+mkdir -p "$DST_SRC/tcc-cc"
+cp tcc-cc/mem.c "$DST_SRC/tcc-cc/mem.c"
+
+# Smoke binary linked by boot4 + boot5.
+mkdir -p "$DST_SRC/test-fixtures"
+cp scripts/boot-hello.c "$DST_SRC/test-fixtures/boot-hello.c"
+
 # ── (3) seed-kernel sources for this arch ─────────────────────────────
 mkdir -p "$DST_SRC/kernel/arch/$ARCH" "$DST_SRC/kernel/user" "$DST_SRC/kernel/scripts"
 cp seed-kernel/kernel.c "$DST_SRC/kernel/kernel.c"