commit fb0a4e33c70a5965ed86b9c838e2b83df3827e93
parent 6c7c1475733065ae0d98ff556718651667d1891d
Author: Ryan Sepassi <rsepassi@gmail.com>
Date: Wed, 6 May 2026 12:26:32 -0700
A0: bootN.sh read sources from canonical build/<arch>/src/ tree

Wires the canonical generated source tree (built by prep-src.sh +
prep-musl.sh in the prior commit) into the boot pipeline:

- lib-pipeline.sh + lib-runscm.sh: new *_input_from_src and
  runscm_input_tree_from_src helpers that pull from
  build/$ARCH/src/{bin,src}/...
- boot.sh: invokes prep-src before boot0 and prep-musl before boot5.
- boot{0..6}.sh: every source input switches to the canonical-tree
  helper. Inputs are now declarative.
- boot3 + boot4: drop the auto-invoke stage1-flatten/libc-flatten
  blocks; the canonical tree is the single source of flattened TUs.
- boot5: reads the canonical musl tree directly. The in-stage tarball
  unpack + overrides + deletes + alltypes/syscall header generation
  + skip-list subtraction are gone (prep-src + prep-musl own all of
  that); only the per-arch override-vs-base enumeration stays.
- boot5-gen-runscm.sh: emits in/musl/<rel> + out/obj/musl/<rel>
  instead of in/tmp/musl-1.2.5/... + out/obj/musl-1.2.5/...
- prep-src.sh: extends the canonical tree with src/tcc-libc/$ARCH/
  (start.S, sys_stubs.S), src/tcc-cc/ (mem.c), and src/test-fixtures/
  (boot-hello.c) so boot4/5/6 read every source from one place.
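
For orientation, the call shape used throughout the boot scripts
(`pipeline_input_from_src <bin|src> <rel> [name]`) could be implemented
roughly as follows. This is a minimal sketch inferred from the call sites
only: the real helper lives in scripts/lib-pipeline.sh and may differ, the
`pipeline_input` stub here merely prints what it would stage, and the
default of "basename of <rel>" for the staged name is an assumption.

```sh
#!/bin/sh
set -eu

ARCH=${ARCH:-aarch64}
SRC=${SRC:-build/$ARCH/src}

# Stand-in for the real pipeline_input: it only reports the staged name
# and the host path instead of copying anything into a staging dir.
pipeline_input() {
    printf '%s <- %s\n' "$1" "$2"
}

# Hypothetical sketch of the canonical-tree helper.
#   $1  kind: "bin" or "src" (subtree of $SRC)
#   $2  path relative to $SRC/<kind>/
#   $3  optional staged name (assumed default: basename of $2)
pipeline_input_from_src() {
    kind=$1
    rel=$2
    name=${3:-${rel##*/}}
    path=$SRC/$kind/$rel
    [ -e "$path" ] || { echo "missing input: $path" >&2; return 1; }
    pipeline_input "$name" "$path"
}
```

With this shape, `pipeline_input_from_src src "P1/P1-$ARCH.M1" P1.M1`
stages the per-arch file under the arch-neutral name `P1.M1`, while
`pipeline_input_from_src bin hex0-seed` keeps the file's own name.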

Validation: full scripts/boot.sh aarch64 (DRIVER=podman) green.
boot{0..4,6}/ are byte-identical to pre-A0 outputs.
boot5/{libc.a,crti.o,crtn.o} differ only in the embedded source-path
string tcc stamps into each .o (in/musl/... vs the legacy
in/tmp/musl-1.2.5/...).

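
A byte-identity check of this kind can be scripted along the lines below.
This is only a sketch, not the procedure actually used for this commit:
`compare_stage` is a hypothetical helper, it looks at top-level regular
files only, and the directory holding the pre-A0 outputs is whatever
snapshot the reader saved beforehand.

```sh
#!/bin/sh
set -eu

# compare_stage <old-dir> <new-dir>
# Reports every regular file in <old-dir> that is missing from, or not
# byte-identical to, its counterpart in <new-dir>. Returns 0 only when
# all files match.
compare_stage() {
    old=$1
    new=$2
    rc=0
    for f in "$old"/*; do
        [ -f "$f" ] || continue
        name=${f##*/}
        if [ ! -e "$new/$name" ]; then
            echo "missing: $name"
            rc=1
        elif ! cmp -s "$f" "$new/$name"; then
            echo "differs: $name"
            rc=1
        fi
    done
    return "$rc"
}
```

Called once per stage directory (boot0 through boot4 and boot6), a zero
exit status corresponds to the "byte-identical" claim above; boot5 would
be expected to report differences because of the stamped source paths.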
Diffstat:
13 files changed, 326 insertions(+), 281 deletions(-)
diff --git a/docs/PLAN.md b/docs/PLAN.md
@@ -52,7 +52,7 @@ are actually stage-specific.
---
-### A3. Per-driver output trees
+### [DONE] A3. Per-driver output trees
**Goal.** `build/<arch>/<driver>/...` everywhere. Kill the
`build/.seed-bootstrap/` shuffle in `boot.sh`. Two drivers can coexist on
@@ -78,7 +78,7 @@ disk; nothing gets clobbered when you switch.
---
-### A0. Canonical generated source tree (host source-prep)
+### [DONE] A0. Canonical generated source tree (host source-prep)
**Goal.** All host-side source preparation happens once, up front, into a
single canonical tree at `build/<arch>/src/`. This tree is the audit
@@ -211,54 +211,129 @@ unchanged from pre-A4.
0.9.26 is incomplete. After AT, every arch is on the same footing in
seed-kernel and the build pipeline.
+The four subitems are independent — none gates another. AT.3 was
+previously claimed to gate AT.4 (per the "once asm-support patches
+land" wording); audit found that wrong (AT.4 is a pure refactor).
+
**In scope.**
-- **amd64 `.quad` truncation in `gen_le64`.** Currently `arch/amd64/mmu.c`
- encodes GDT entries as `.long lo, hi` pairs. Patch tcc's amd64
- assembler to accept full 64-bit immediates; revert the workaround.
- (Existing memory note: `project_tcc_arm64_svcul_truncation.md`
- describes a related arm64 issue — track separately or in the same
- pass.)
-- **`.note.*` SHT_NOTE.** tcc emits `.note.*` sections as PROGBITS,
- forcing the post-link `seed-kernel/scripts/elf-pvh-note.c` tool to
- rewrite the ELF for amd64 PVH boot. Patch tcc to recognize `.note.*`
- → SHT_NOTE and emit a PT_NOTE phdr. Delete `elf-pvh-note.c` and the
- amd64 boot6 fixup.
-- **riscv64 inline-asm constraints / register-asm silent drop.** Existing
- memory `project_tcc_inline_asm_silent_drop.md` documents tcc 0.9.26
- silently dropping register-asm constraints. If this affects any
- riscv64 codepath in the kernel or musl build, fix it here.
-- **Audit `seed-kernel/simple-patches/` (if any) and per-arch `arch.h`
- externs vs inlines.** amd64 and riscv64 declare arch helpers as
- externs (workaround for tcc inline-asm gaps) while aarch64 inlines
- them. Once the asm-support patches above land, normalize to the
- aarch64 style.
-
-**Out of scope (still tracked separately).**
-- riscv64 boot0 stage-4 user-trap bug (`docs/SEED-RISCV64-TODO.md`) —
- may or may not be tcc; investigate after AT.
-- amd64 boot3+ validation under DRIVER=seed
- (`docs/SEED-AMD64-TODO.md`) — compute-bound, not a tcc issue.
+
+- **AT.1. amd64 `.quad` 64-bit-literal truncation.** `seed-kernel/arch/amd64/kernel.S:406–413`
+ encodes the GDT as `.long lo, hi` pairs because `.quad 0x00af9a000000ffff`
+ truncates. Root cause is **not** in tcc's `gen_le64` (that path is 64-bit
+ clean); it's in mes-libc's `vendor/mes-libc/mes/abtol.c:29`, which
+ accumulates `strtoull` into an `int`, so any 64-bit hex literal in
+ `tccasm.c:asm_expr_unary` loses its high bits at parse time. Two fixes:
+ - **(preferred)** one-line edit in `abtol.c` (`int i` → `long long i`).
+ Repairs every `strto*` caller, not just asm.
+ - **(alternative)** new patch `tccasm-asm-expr-parse-number.{before,after}`
+ that inlines a 64-bit hex/oct/dec accumulator in `asm_expr_unary`,
+ bypassing `strtoull`.
+ Then revert `kernel.S:406–413` to plain `.quad` GDT entries. (`mmu.c`
+ has no workaround — uses C-side `u64` constants — and needs no change.)
+ Existing memory `project_tcc_arm64_svcul_truncation.md` describes an
+ unrelated arm64 codegen-side truncation already patched in
+ `scripts/simple-patches/tcc-0.9.26/arm64-svcul-no-truncate*`; track
+ separately.
+
+- **AT.2. `.note.*` SHT_NOTE / PT_NOTE.** tcc emits `.note.*` sections
+ as `SHT_PROGBITS` and never writes a `PT_NOTE` phdr, forcing the
+ post-link `seed-kernel/scripts/elf-pvh-note.c` tool to rewrite the
+ ELF for amd64 PVH boot. Two patches:
+ - `note-section-sht-note.{before,after}` — in `tccelf.c:251`
+ `find_section()`, name-prefix-check `.note*` → create as `SHT_NOTE`.
+ - `pt-note-phdr.{before,after}` — in `tccelf.c:elf_output_file`
+ (~2202), gate `phnum++` on "any `SHT_NOTE+SHF_ALLOC` section
+ exists" (else aarch64/riscv64 phnum perturbs and breaks the A5
+ reproducibility target); compute `min(sh_offset)` /
+ `max(sh_offset+sh_size)` over those sections; emit one `PT_NOTE`.
+ Then delete `seed-kernel/scripts/elf-pvh-note.c`,
+ `scripts/prep-src.sh:121–123`, `scripts/boot6.sh:80–89`,
+ `scripts/boot6-gen-runscm.sh:99–117`. `seed-kernel/arch/amd64/kernel.lds`
+ is unchanged (consumed by clang+ld.lld, not tcc).
+
+- **AT.3. riscv64 inline-asm — audit + memory update only, no patch.**
+ Audit found `riscv64-asm.c:801–819` ships only stubs for
+ `subst_asm_operand` (loud error), `asm_gen_code` (empty), and
+ `asm_compute_constraints` (empty) — same silent-drop class as arm64
+ per memory `project_tcc_inline_asm_silent_drop.md`. But every
+ register-asm callsite in the repo is already worked around: musl
+ syscall/tp/atomics overrides under
+ `vendor/upstream/musl-1.2.5-overrides/arch/riscv64/`, and 13 externs
+ in `seed-kernel/arch/riscv64/arch.h` defined in `kernel.S`. There
+ is no live miscompile to fix. A real Phase-3 inline-asm port (~400+
+ LOC, parallel to the unstarted arm64 Phase 3) is well out of scope.
+ AT.3 work:
+ - Update `project_tcc_inline_asm_silent_drop.md` to add a riscv64
+ section.
+ - Document in `docs/TCC.md` that riscv64 and arm64 share the
+ silent-drop bug class until Phase 3 lands.
+ Three small wins that *don't* need any tcc work and can land here:
+ inline `arch_mmio_ptr` (pure pointer arithmetic), `arch_system_off`
+ (one MMIO write), and the `saved_user_sp` accessors in pure C — see
+ AT.4 for context.
+
+- **AT.4. arch.h API uniformity (refactor, not inline-asm).** The
+ previous framing — "amd64/riscv64 use externs, aarch64 inlines" —
+ was wrong. aarch64 also externs every primitive (arch.h:51–60); the
+ difference is one layer up: aarch64 exposes a small primitive set
+ (`sysreg_read/write`, `arm64_barrier(kind)`, `cpu_pause(kind)`,
+ `arm64_psci_call`) and synthesizes the `arch_*()` API as macros
+ (arch.h:62–72). amd64/riscv64 spell each `arch_*` as its own
+ dedicated extern. There is **zero** `__asm__`/inline-asm in any
+ seed-kernel C file, so AT.4 has no tcc dependency. Refactor:
+ - amd64: introduce `amd64_fence(kind)`, port-io and msr dispatcher
+ helpers; collapse the per-API externs in
+ `seed-kernel/arch/amd64/kernel.S:219–378` into the primitive set;
+ rewrite `arch_*()` in `arch.h` as macros.
+ - riscv64: introduce `riscv_csr_read/write(id)`,
+ `riscv_fence(kind)` (parallel to `sysreg_read`/`arm64_barrier`);
+ same macro rewrite for `arch_*()`.
+ - `arch_idle_forever` and `arch_system_off` may stay as their own
+ externs (don't compress into a `(kind)` dispatcher cleanly) —
+ aarch64 keeps `arm64_psci_call` in the same shape.
**Touch list.**
-- `vendor/tcc/…` (or wherever the tcc tree lives): the patches.
-- `scripts/stage1-flatten.sh`: re-flatten with the new patches; commit
- the resulting `tcc.flat.c` if it's vendored, otherwise just regenerate.
-- `seed-kernel/arch/amd64/mmu.c`: revert `.quad` workarounds.
-- `seed-kernel/arch/amd64/kernel.S`: reinstate native `.note.*` if the
- workaround required emitting them differently.
-- `seed-kernel/arch/{amd64,riscv64}/arch.h`: inline what was extern.
-- `seed-kernel/arch/{amd64,riscv64}/{kernel.S,mmu.c}`: drop quirks.
-- `seed-kernel/scripts/elf-pvh-note.c`: delete.
-- `scripts/boot6.sh`, `scripts/boot6-gen-runscm.sh`: delete the
- amd64-only post-link fixup block.
-- `docs/TCC.md`: document the new tcc patch set; retire any
- workaround-explanation sections that are now moot.
-- Update memory entries `project_tcc_*` as patches eliminate the
- recorded bugs.
-
-**Validation.** seed-kernel builds on all three arches with no
-arch-conditional shell logic in boot6. `make build/<arch>/podman/boot6/Image`
-produces a clean ELF without post-link fixup. All test suites still pass.
+- `vendor/mes-libc/mes/abtol.c` (AT.1 preferred fix) **or** new patch
+ `scripts/simple-patches/tcc-0.9.26/tccasm-asm-expr-parse-number.*`
+ (AT.1 alternative).
+- `scripts/simple-patches/tcc-0.9.26/note-section-sht-note.*` and
+ `pt-note-phdr.*` (AT.2). The patch directory is
+ `scripts/simple-patches/tcc-0.9.26/` (not `seed-kernel/simple-patches/`,
+ which doesn't exist).
+- `scripts/stage1-flatten.sh`: list any newly-added patches in
+  `apply_our_patch` (around line 223); regenerate the flattened tree.
+- `seed-kernel/arch/amd64/kernel.S`: lines 406–413 (`.quad` GDT
+ revert, AT.1); lines 219–378 (refactor to primitive set, AT.4).
+- `seed-kernel/arch/amd64/arch.h`: rewrite `arch_*()` as macros (AT.4).
+- `seed-kernel/arch/riscv64/{kernel.S,arch.h}`: same refactor (AT.4).
+- `seed-kernel/arch/riscv64/arch.h`: inline `arch_mmio_ptr`,
+ `arch_system_off`, `arch_read_user_sp`, `arch_write_user_sp` (AT.3
+ side wins).
+- `seed-kernel/scripts/elf-pvh-note.c`: delete (AT.2).
+- `scripts/prep-src.sh`: drop lines 121–123 staging elf-pvh-note.c
+ (AT.2).
+- `scripts/boot6.sh`: drop lines 80–89 (AT.2).
+- `scripts/boot6-gen-runscm.sh`: drop lines 99–117 (AT.2).
+- `docs/TCC.md`: document the AT patch set; add the riscv64+arm64
+ silent-drop note (AT.3).
+- Update `project_tcc_inline_asm_silent_drop.md` to cover riscv64
+ (AT.3). Other `project_tcc_*` memories stay as historical record.
+
+**Validation.**
+- AT.1: `readelf -x .data build/amd64/$DRIVER/boot1/M1pp` (or any
+ amd64 binary using a 64-bit hex literal) shows correct upper bits.
+ amd64 boots after `.quad` revert.
+- AT.2: `readelf -S build/amd64/$DRIVER/boot6/Image` shows `.note.Xen`
+ type `NOTE`; `readelf -l` shows one `PT_NOTE` covering it. `cmp`
+ vs the pre-patch post-fixup ELF byte-identical at the affected
+ offsets. amd64 boots through QEMU PVH `-kernel`.
+- AT.2 reproducibility: aarch64/riscv64 `readelf -l` phdr counts
+ unchanged from pre-patch (proves the SHT_NOTE-presence gate works).
+- AT.3: docs+memory updated; no functional change expected.
+- AT.4: seed-kernel builds on all three arches with the new macro
+ layer; amd64/riscv64 byte output unchanged (refactor only).
+- All test suites still pass. `boot6` has no amd64-conditional shell
+ logic.
---
diff --git a/scripts/boot.sh b/scripts/boot.sh
@@ -43,11 +43,22 @@ stage() {
echo "[$BOOT_TAG] $name: $((e - s))s (cum $((e - T0))s)"
}
+# A0a: build the canonical generated source tree at build/$ARCH/src/.
+# Boot stages read source from there exclusively (no flatten/unpack/
+# patch inside boot{N}.sh).
+stage prep-src ./scripts/prep-src.sh $ARCH
+
stage boot0 ./scripts/boot0.sh $ARCH
stage boot1 ./scripts/boot1.sh $ARCH
stage boot2 ./scripts/boot2.sh $ARCH
stage boot3 ./scripts/boot3.sh $ARCH
stage boot4 ./scripts/boot4.sh $ARCH
+
+# A0b: apply the per-arch musl skip filter (needs tcc3 from boot4 if
+# the calibration list is missing; the committed list is the common
+# case and runs without a compiler).
+stage prep-musl ./scripts/prep-musl.sh $ARCH
+
stage boot5 ./scripts/boot5.sh $ARCH
# boot6 builds the seed-kernel ELF/Image with boot4's tcc3 (no `ld -T`,
diff --git a/scripts/boot0.sh b/scripts/boot0.sh
@@ -5,9 +5,11 @@
## brings up: hex0 -> hex1 -> hex2 -> catm -> M0. Three of those (hex2,
## catm, M0) are the binaries every later stage depends on.
##
-## ─── Inputs (sources) ─────────────────────────────────────────────────
-## vendor/seed/$ARCH/{hex0-seed, hex0.hex0, hex1.hex0, hex2.hex1,
-## catm.hex2, M0.hex2, ELF.hex2}
+## ─── Inputs (sources, from canonical tree) ───────────────────────────
+## build/$ARCH/src/bin/hex0-seed
+## build/$ARCH/src/src/vendor-seed/{hex0.hex0, hex1.hex0, hex2.hex1,
+## catm.hex2, M0.hex2, ELF.hex2}
+## (populated by scripts/prep-src.sh from vendor/seed/$ARCH/.)
##
## ─── Outputs ──────────────────────────────────────────────────────────
## build/$ARCH/$DRIVER/boot0/{hex2, catm, M0}
@@ -21,17 +23,19 @@ set -eu
bootlib_init boot0 "${1:-}"
driver_init scratch
-SEED=vendor/seed/$ARCH
+SRC=build/$ARCH/src
OUT=build/$ARCH/$DRIVER/boot0
STAGE=build/$ARCH/$DRIVER/.boot0-stage
+[ -d "$SRC" ] || { echo "[$BOOT_TAG] missing $SRC — run scripts/prep-src.sh $ARCH" >&2; exit 1; }
+
. scripts/lib-pipeline.sh
pipeline_init "$STAGE" "$OUT" "$DRIVER"
-# ─── inputs ───────────────────────────────────────────────────────────
-for f in hex0-seed hex0.hex0 hex1.hex0 hex2.hex1 catm.hex2 M0.hex2 ELF.hex2; do
- [ -e "$SEED/$f" ] || { echo "[$BOOT_TAG] missing input: $SEED/$f" >&2; exit 1; }
- pipeline_input "$f" "$SEED/$f"
+# ─── inputs (from canonical src tree) ─────────────────────────────────
+pipeline_input_from_src bin hex0-seed
+for f in hex0.hex0 hex1.hex0 hex2.hex1 catm.hex2 M0.hex2 ELF.hex2; do
+ pipeline_input_from_src src "vendor-seed/$f"
done
# ─── pipeline ─────────────────────────────────────────────────────────
diff --git a/scripts/boot1.sh b/scripts/boot1.sh
@@ -5,9 +5,11 @@
## hex2pp pair, built from their .P1 sources via the seed M0 + hex2
## chain. catm is rebuilt from catm.P1pp in boot2.
##
-## ─── Inputs (sources) ─────────────────────────────────────────────────
-## M1pp/M1pp.P1, hex2pp/hex2pp.P1
-## P1/P1-$ARCH.M1, vendor/seed/$ARCH/ELF.hex2
+## ─── Inputs (sources, from canonical tree) ───────────────────────────
+## build/$ARCH/src/src/M1pp/M1pp.P1
+## build/$ARCH/src/src/hex2pp/hex2pp.P1
+## build/$ARCH/src/src/P1/P1-$ARCH.M1
+## build/$ARCH/src/src/vendor-seed/ELF.hex2
##
## ─── Inputs (binaries from prior stages) ──────────────────────────────
## build/$ARCH/$DRIVER/boot0/{hex2, M0, catm}
@@ -25,10 +27,12 @@ bootlib_init boot1 "${1:-}"
driver_init scratch
BOOT0=build/$ARCH/$DRIVER/boot0
+SRC=build/$ARCH/src
OUT=build/$ARCH/$DRIVER/boot1
STAGE=build/$ARCH/$DRIVER/.boot1-stage
require_prev "$BOOT0" hex2 M0 catm
+[ -d "$SRC" ] || { echo "[$BOOT_TAG] missing $SRC — run scripts/prep-src.sh $ARCH" >&2; exit 1; }
. scripts/lib-pipeline.sh
pipeline_init "$STAGE" "$OUT" "$DRIVER"
@@ -37,10 +41,10 @@ pipeline_init "$STAGE" "$OUT" "$DRIVER"
pipeline_input hex2 "$BOOT0/hex2"
pipeline_input M0 "$BOOT0/M0"
pipeline_input catm "$BOOT0/catm"
-pipeline_input P1.M1 "P1/P1-$ARCH.M1"
-pipeline_input ELF.hex2 "vendor/seed/$ARCH/ELF.hex2"
-pipeline_input M1pp.P1 "M1pp/M1pp.P1"
-pipeline_input hex2pp.P1 "hex2pp/hex2pp.P1"
+pipeline_input_from_src src "P1/P1-$ARCH.M1" P1.M1
+pipeline_input_from_src src vendor-seed/ELF.hex2
+pipeline_input_from_src src M1pp/M1pp.P1
+pipeline_input_from_src src hex2pp/hex2pp.P1
# ─── pipeline ─────────────────────────────────────────────────────────
echo "[$BOOT_TAG] M1pp.P1 + hex2pp.P1 -> M1pp + hex2pp"
diff --git a/scripts/boot2.sh b/scripts/boot2.sh
@@ -6,10 +6,11 @@
## boot0 catm so later stages run with zero boot0 dependencies); then
## builds the scheme1 interpreter from scheme1.P1pp using the new catm.
##
-## ─── Inputs (sources) ─────────────────────────────────────────────────
-## catm/catm.P1pp, scheme1/scheme1.P1pp
-## P1/P1-$ARCH.M1pp, P1/P1.M1pp, P1/P1pp.P1pp
-## vendor/seed/$ARCH/ELF.hex2
+## ─── Inputs (sources, from canonical tree) ───────────────────────────
+## build/$ARCH/src/src/catm/catm.P1pp
+## build/$ARCH/src/src/scheme1/scheme1.P1pp
+## build/$ARCH/src/src/P1/{P1-$ARCH.M1pp, P1.M1pp, P1pp.P1pp}
+## build/$ARCH/src/src/vendor-seed/ELF.hex2
##
## ─── Inputs (binaries from prior stages) ──────────────────────────────
## build/$ARCH/$DRIVER/boot0/catm (only to bootstrap catm.P1pp build)
@@ -29,11 +30,13 @@ driver_init scratch
BOOT0=build/$ARCH/$DRIVER/boot0
BOOT1=build/$ARCH/$DRIVER/boot1
+SRC=build/$ARCH/src
OUT=build/$ARCH/$DRIVER/boot2
STAGE=build/$ARCH/$DRIVER/.boot2-stage
require_prev "$BOOT0" catm
require_prev "$BOOT1" M1pp hex2pp
+[ -d "$SRC" ] || { echo "[$BOOT_TAG] missing $SRC — run scripts/prep-src.sh $ARCH" >&2; exit 1; }
. scripts/lib-pipeline.sh
pipeline_init "$STAGE" "$OUT" "$DRIVER"
@@ -42,12 +45,12 @@ pipeline_init "$STAGE" "$OUT" "$DRIVER"
pipeline_input catm0 "$BOOT0/catm" # bootstrap; replaced by output 'catm'
pipeline_input M1pp "$BOOT1/M1pp"
pipeline_input hex2pp "$BOOT1/hex2pp"
-pipeline_input backend.M1pp "P1/P1-$ARCH.M1pp"
-pipeline_input frontend.M1pp "P1/P1.M1pp"
-pipeline_input libp1pp.P1pp "P1/P1pp.P1pp"
-pipeline_input ELF.hex2 "vendor/seed/$ARCH/ELF.hex2"
-pipeline_input catm.P1pp "catm/catm.P1pp"
-pipeline_input scheme1.P1pp "scheme1/scheme1.P1pp"
+pipeline_input_from_src src "P1/P1-$ARCH.M1pp" backend.M1pp
+pipeline_input_from_src src P1/P1.M1pp frontend.M1pp
+pipeline_input_from_src src P1/P1pp.P1pp libp1pp.P1pp
+pipeline_input_from_src src vendor-seed/ELF.hex2
+pipeline_input_from_src src catm/catm.P1pp
+pipeline_input_from_src src scheme1/scheme1.P1pp
# ─── pipeline ─────────────────────────────────────────────────────────
echo "[$BOOT_TAG] catm.P1pp -> catm; scheme1.P1pp -> scheme1"
diff --git a/scripts/boot3.sh b/scripts/boot3.sh
@@ -12,22 +12,16 @@
## tcc2 = tcc-source compiled by tcc1 ← boot4
## tcc3 = tcc-source compiled by tcc2 ← boot4
##
-## ─── Inputs (host-side, auto-built if missing) ────────────────────────
-## build/$ARCH/vendor/tcc/tcc.flat.c
-## build/$ARCH/vendor/tcc/tcc-0.9.26-1147-gee75a10c/{include,lib}
-## — flattened tcc TU + unpacked tree; built
-## via scripts/stage1-flatten.sh --arch
-## $ARCH (host cc -E, no container)
-## build/$ARCH/vendor/mes-libc/libc.flat.c
-## — flattened mes-libc TU; built via
-## scripts/libc-flatten.sh --arch $ARCH
-## (host cc -E, no container)
-##
-## ─── Inputs (sources, copied into staging) ────────────────────────────
-## scheme1/prelude.scm cc/cc.scm cc/main.scm — catm'd to cc.scm bundle
-## P1/P1-$ARCH.M1pp P1/P1.M1pp P1/P1pp.P1pp — M1pp pipeline
-## P1/entry-libc.P1pp P1/elf-end.P1pp — link-time framing
-## vendor/seed/$ARCH/ELF.hex2 — ELF header fragment
+## ─── Inputs (sources, from canonical tree) ───────────────────────────
+## build/$ARCH/src/src/scheme1/{prelude.scm} scheme bundle
+## build/$ARCH/src/src/cc/{cc.scm,main.scm} scheme bundle
+## build/$ARCH/src/src/P1/{P1-$ARCH.M1pp,P1.M1pp,P1pp.P1pp} M1pp pipeline
+## build/$ARCH/src/src/P1/{entry-libc.P1pp,elf-end.P1pp} link framing
+## build/$ARCH/src/src/vendor-seed/ELF.hex2 ELF header
+## build/$ARCH/src/src/tcc/tcc.flat.c flattened tcc TU
+## build/$ARCH/src/src/libc/libc.flat.c flattened mes-libc TU
+## (populated up-front by scripts/prep-src.sh; this stage does
+## no flatten/unpack/patch.)
##
## ─── Inputs (binaries from prior stages) ──────────────────────────────
## build/$ARCH/$DRIVER/boot1/{M1pp, hex2pp} — built by scripts/boot1.sh
@@ -56,43 +50,20 @@ driver_init empty
BOOT1=build/$ARCH/$DRIVER/boot1
BOOT2=build/$ARCH/$DRIVER/boot2
+SRC=build/$ARCH/src
OUT=build/$ARCH/$DRIVER/boot3
STAGE=build/$ARCH/$DRIVER/.boot3-stage
-TCC_VENDOR=build/$ARCH/vendor/tcc
-TCC_DIR=$TCC_VENDOR/tcc-0.9.26-1147-gee75a10c
-TCC_FLAT=$TCC_VENDOR/tcc.flat.c
-LIBC_FLAT=build/$ARCH/vendor/mes-libc/libc.flat.c
-
-# ── prerequisite: prior-stage binaries ────────────────────────────────
+# ── prerequisites ─────────────────────────────────────────────────────
require_prev "$BOOT1" M1pp hex2pp
require_prev "$BOOT2" catm scheme1
-
-# ── prerequisite: host-flattened sources + unpacked tcc tree ──────────
-# tcc.flat.c + the unpacked $TCC_DIR/{include,lib} tree are produced
-# together by stage1-flatten.sh; libc.flat.c by libc-flatten.sh. Both
-# run on the host (cc -E), no container — auto-invoke if missing.
-if [ ! -e "$TCC_FLAT" ] || [ ! -d "$TCC_DIR/include" ] || [ ! -e "$TCC_VENDOR/stdarg-bridge.h" ]; then
- echo "[$BOOT_TAG] flatten tcc.flat.c (host)"
- scripts/stage1-flatten.sh --arch "$ARCH"
-fi
-if [ ! -e "$LIBC_FLAT" ]; then
- echo "[$BOOT_TAG] flatten libc.flat.c (host)"
- scripts/libc-flatten.sh --arch "$ARCH"
-fi
-
-BACKEND_M1PP=P1/P1-$ARCH.M1pp
-FRONTEND_M1PP=P1/P1.M1pp
-LIBP1PP=P1/P1pp.P1pp
-ENTRY_LIBC=P1/entry-libc.P1pp
-ELF_END=P1/elf-end.P1pp
-ELF_HEX2=vendor/seed/$ARCH/ELF.hex2
+[ -d "$SRC" ] || { echo "[$BOOT_TAG] missing $SRC — run scripts/prep-src.sh $ARCH" >&2; exit 1; }
# ── stage inputs and run scheme1 + boot3-run.scm under $DRIVER ────────
. scripts/lib-runscm.sh
runscm_init "$STAGE" "$OUT"
runscm_scheme1 "$BOOT2/scheme1"
-runscm_prelude scheme1/prelude.scm
+runscm_prelude "$SRC/src/scheme1/prelude.scm"
runscm_runscm scripts/boot3-run.scm
runscm_input catm "$BOOT2/catm"
@@ -101,19 +72,19 @@ runscm_input hex2pp "$BOOT1/hex2pp"
# scheme1 binary itself is staged by runscm_run (so a `(run "scheme1" …)`
# inside boot3-run.scm finds it at cwd-relative ./scheme1).
-runscm_input prelude.scm scheme1/prelude.scm
-runscm_input cc.scm cc/cc.scm
-runscm_input main.scm cc/main.scm
+runscm_input_from_src src scheme1/prelude.scm
+runscm_input_from_src src cc/cc.scm
+runscm_input_from_src src cc/main.scm
-runscm_input backend.M1pp "$BACKEND_M1PP"
-runscm_input frontend.M1pp "$FRONTEND_M1PP"
-runscm_input libp1pp.P1pp "$LIBP1PP"
-runscm_input entry-libc.P1pp "$ENTRY_LIBC"
-runscm_input elf-end.P1pp "$ELF_END"
-runscm_input ELF.hex2 "$ELF_HEX2"
+runscm_input_from_src src "P1/P1-$ARCH.M1pp" backend.M1pp
+runscm_input_from_src src P1/P1.M1pp frontend.M1pp
+runscm_input_from_src src P1/P1pp.P1pp libp1pp.P1pp
+runscm_input_from_src src P1/entry-libc.P1pp
+runscm_input_from_src src P1/elf-end.P1pp
+runscm_input_from_src src vendor-seed/ELF.hex2
-runscm_input tcc.flat.c "$TCC_FLAT"
-runscm_input libc.flat.c "$LIBC_FLAT"
+runscm_input_from_src src tcc/tcc.flat.c
+runscm_input_from_src src libc/libc.flat.c
runscm_export tcc0
runscm_run "${BOOT3_TIMEOUT:-1800}"
diff --git a/scripts/boot4.sh b/scripts/boot4.sh
@@ -19,32 +19,23 @@
## tcc2 = tcc-source compiled by tcc1 ← produced here
## tcc3 = tcc-source compiled by tcc2 ← produced here
##
-## ─── Inputs (host-side, auto-built if missing) ────────────────────────
-## build/$ARCH/vendor/tcc/tcc.flat.c
-## build/$ARCH/vendor/tcc/tcc-0.9.26-1147-gee75a10c/{include,lib}
-## — flattened tcc TU + unpacked tree; built
-## via scripts/stage1-flatten.sh --arch
-## $ARCH (host cc -E, no container)
-## build/$ARCH/vendor/mes-libc/libc.flat.c
-## — flattened mes-libc TU; built via
-## scripts/libc-flatten.sh --arch $ARCH
-## (host cc -E, no container)
-##
-## ─── Inputs (sources, copied into staging) ────────────────────────────
-## tcc-libc/$ARCH/start.S — _start, calls __libc_init+main
-## tcc-libc/$ARCH/sys_stubs.S — sys_* syscall wrappers
-## tcc-cc/mem.c — memcpy/memmove/memset/memcmp
-## build/$ARCH/vendor/tcc/tcc-0.9.26-1147-gee75a10c/lib/libtcc1.c
+## ─── Inputs (sources, from canonical tree) ───────────────────────────
+## build/$ARCH/src/src/tcc-libc/$ARCH/{start.S,sys_stubs.S}
+## — _start, sys_* syscall wrappers
+## build/$ARCH/src/src/tcc-cc/mem.c memcpy/memmove/memset/memcmp
+## build/$ARCH/src/src/tcc/tcc-0.9.26-1147-gee75a10c/lib/libtcc1.c
## (amd64: generic compiler helper runtime)
-## build/$ARCH/vendor/tcc/tcc-0.9.26-1147-gee75a10c/lib/lib-arm64.c
+## build/$ARCH/src/src/tcc/tcc-0.9.26-1147-gee75a10c/lib/lib-arm64.c
## (aarch64 + riscv64: TFmode soft-float)
-## build/$ARCH/vendor/tcc/tcc-0.9.26-1147-gee75a10c/lib/va_list.c
+## build/$ARCH/src/src/tcc/tcc-0.9.26-1147-gee75a10c/lib/va_list.c
## (amd64: __va_start / __va_arg)
-## build/$ARCH/vendor/tcc/tcc-0.9.26-1147-gee75a10c/lib/alloca86_64*.S
+## build/$ARCH/src/src/tcc/tcc-0.9.26-1147-gee75a10c/lib/alloca86_64*.S
## (amd64: alloca helpers)
-## build/$ARCH/vendor/tcc/tcc.flat.c — flattened tcc TU
-## build/$ARCH/vendor/mes-libc/libc.flat.c — flattened mes-libc TU
-## scripts/boot-hello.c — smoke binary
+## build/$ARCH/src/src/tcc/tcc.flat.c flattened tcc TU
+## build/$ARCH/src/src/libc/libc.flat.c flattened mes-libc TU
+## build/$ARCH/src/src/test-fixtures/boot-hello.c smoke binary
+## (populated up-front by scripts/prep-src.sh; this stage does
+## no flatten/unpack/patch.)
##
## ─── Inputs (binaries from prior stages) ──────────────────────────────
## build/$ARCH/$DRIVER/boot3/tcc0 — built by scripts/boot3.sh
@@ -92,32 +83,19 @@ esac
BOOT2=build/$ARCH/$DRIVER/boot2
BOOT3=build/$ARCH/$DRIVER/boot3
+SRC=build/$ARCH/src
OUT=build/$ARCH/$DRIVER/boot4
STAGE=build/$ARCH/$DRIVER/.boot4-stage
-TCC_VENDOR=build/$ARCH/vendor/tcc
-TCC_DIR=$TCC_VENDOR/tcc-0.9.26-1147-gee75a10c
-TCC_FLAT=$TCC_VENDOR/tcc.flat.c
-LIBC_FLAT=build/$ARCH/vendor/mes-libc/libc.flat.c
+TCC_PKG=tcc-0.9.26-1147-gee75a10c
+TCC_LIB_REL=tcc/$TCC_PKG/lib
-# ── prerequisite: prior-stage binaries ────────────────────────────────
+# ── prerequisites ─────────────────────────────────────────────────────
require_prev "$BOOT3" tcc0
require_prev "$BOOT2" catm scheme1
-
-# ── prerequisite: host-flattened sources + unpacked tcc tree ──────────
-# Normally these were produced by boot3 (auto-invoked by stage1-flatten
-# / libc-flatten there). Re-check here so boot4 runs standalone if a
-# user has tcc0 but blew away build/$ARCH/vendor/tcc/.
-if [ ! -e "$TCC_FLAT" ] || [ ! -d "$TCC_DIR/include" ] || [ ! -e "$TCC_DIR/lib/lib-arm64.c" ] || [ ! -e "$TCC_VENDOR/stdarg-bridge.h" ]; then
- echo "[$BOOT_TAG] flatten tcc.flat.c (host)"
- scripts/stage1-flatten.sh --arch "$ARCH"
-fi
-if [ ! -e "$LIBC_FLAT" ]; then
- echo "[$BOOT_TAG] flatten libc.flat.c (host)"
- scripts/libc-flatten.sh --arch "$ARCH"
-fi
+[ -d "$SRC" ] || { echo "[$BOOT_TAG] missing $SRC — run scripts/prep-src.sh $ARCH" >&2; exit 1; }
for f in $LIBTCC1_C_SRCS $LIBTCC1_ASM_SRCS; do
- [ -e "$TCC_DIR/lib/$f" ] || { echo "[$BOOT_TAG] missing $TCC_DIR/lib/$f" >&2; exit 1; }
+ [ -e "$SRC/src/$TCC_LIB_REL/$f" ] || { echo "[$BOOT_TAG] missing $SRC/src/$TCC_LIB_REL/$f" >&2; exit 1; }
done
# ── stage inputs and run scheme1 + boot4 run.scm under $DRIVER ────────
@@ -129,22 +107,22 @@ scripts/boot4-gen-runscm.sh "$ARCH" "$RUNSCM"
echo "[$BOOT_TAG] generated run.scm: $(wc -l <"$RUNSCM") lines"
runscm_scheme1 "$BOOT2/scheme1"
-runscm_prelude scheme1/prelude.scm
+runscm_prelude "$SRC/src/scheme1/prelude.scm"
runscm_runscm "$RUNSCM"
runscm_input tcc0 "$BOOT3/tcc0"
runscm_input catm "$BOOT2/catm"
-runscm_input start.S "tcc-libc/$ARCH/start.S"
-runscm_input sys_stubs.S "tcc-libc/$ARCH/sys_stubs.S"
-runscm_input mem.c tcc-cc/mem.c
+runscm_input_from_src src "tcc-libc/$ARCH/start.S"
+runscm_input_from_src src "tcc-libc/$ARCH/sys_stubs.S"
+runscm_input_from_src src tcc-cc/mem.c
for f in $LIBTCC1_C_SRCS $LIBTCC1_ASM_SRCS; do
- runscm_input "$f" "$TCC_DIR/lib/$f"
+ runscm_input_from_src src "$TCC_LIB_REL/$f"
done
-runscm_input tcc.flat.c "$TCC_FLAT"
-runscm_input libc.flat.c "$LIBC_FLAT"
-runscm_input hello.c scripts/boot-hello.c
+runscm_input_from_src src tcc/tcc.flat.c
+runscm_input_from_src src libc/libc.flat.c
+runscm_input_from_src src test-fixtures/boot-hello.c hello.c
runscm_export tcc1
runscm_export tcc2
diff --git a/scripts/boot5-gen-runscm.sh b/scripts/boot5-gen-runscm.sh
@@ -14,10 +14,12 @@
##
## Conventions (cwd-relative; resolves to / under seed init, /work under
## podman bind-mount):
-## musl tree in/tmp/musl-1.2.5/<rel-path> (read-only)
-## pre-gen hdrs in/tmp/musl-1.2.5/obj/include/bits/{alltypes,syscall}.h,
-## in/tmp/musl-1.2.5/obj/src/internal/version.h
-## .o outputs out/obj/musl-1.2.5/<src-with-.o> (rw; pre-mkdir'd by host)
+## musl tree in/musl/<rel-path> (read-only; canonical
+## tree from prep-src/
+## prep-musl)
+## pre-gen hdrs in/musl/obj/include/bits/{alltypes,syscall}.h,
+## in/musl/obj/src/internal/version.h
+## .o outputs out/obj/musl/<src-with-.o> (rw; pre-mkdir'd by host)
## tcc binary in/tcc (input)
## libtcc1.a in/libtcc1.a (input)
## stdarg bridge in/tcc-stdarg-bridge.h
@@ -33,8 +35,8 @@ SRCS=$STAGE_HOST/build-srcs.txt
CRT_MODE=$(cat "$STAGE_HOST/crt-mode")
[ -e "$SRCS" ] || { echo "missing $SRCS" >&2; exit 1; }
-CIN=in/tmp/musl-1.2.5
-COUT=out/obj/musl-1.2.5
+CIN=in/musl
+COUT=out/obj/musl
# Mirrors boot5.sh's CFLAGS_BASE exactly; the only difference is that
# every per-arg token is quoted as its own scheme bytevector. The leading
diff --git a/scripts/boot5.sh b/scripts/boot5.sh
@@ -12,26 +12,15 @@
## — boot4's verified self-host tcc
## build/$ARCH/$DRIVER/boot4/libtcc1.a
## — boot4's tcc runtime archive
-## vendor/upstream/musl-1.2.5.tar.gz
-## — pristine musl source
-## vendor/upstream/musl-1.2.5-overrides/
-## — tree of files that replace upstream
-## ones (tcc-compat patches; the post-
-## patch state vendored directly so the
-## build needs no `patch` binary). See
-## docs/MUSL.md.
-## vendor/upstream/musl-1.2.5-deletes.txt
-## — list of upstream files removed by the
-## same patch set (one path per line,
-## relative to musl-1.2.5/).
-## build/$ARCH/vendor/tcc/stdarg-bridge.h
-## — per-arch __builtin_va_list bridge,
-## generated by scripts/stage1-flatten.sh
-## (shared with boot3/boot4; the file is
-## byte-identical across arches but a
-## per-arch copy is written so every
-## artifact under build/$ARCH/ comes from
-## a single boot.sh $ARCH invocation)
+## build/$ARCH/src/src/musl/ — canonical musl tree (overrides merged,
+## deletes applied, alltypes.h/syscall.h
+## generated, per-arch skip filter
+## applied). Built by prep-src.sh +
+## prep-musl.sh.
+## build/$ARCH/src/src/tcc/stdarg-bridge.h
+## — per-arch __builtin_va_list bridge.
+## build/$ARCH/src/src/test-fixtures/boot-hello.c
+## — smoke binary linked at the end.
##
## ─── Tools ────────────────────────────────────────────────────────────
## In container: scratch + busybox (no libc, no /etc, no resolver).
@@ -56,55 +45,33 @@ driver_init empty
BOOT2=build/$ARCH/$DRIVER/boot2
BOOT4=build/$ARCH/$DRIVER/boot4
+SRC=build/$ARCH/src
OUT=build/$ARCH/$DRIVER/boot5
STAGE=build/$ARCH/$DRIVER/.boot5-stage
-MUSL_TARBALL=vendor/upstream/musl-1.2.5.tar.gz
-MUSL_OVERRIDES=vendor/upstream/musl-1.2.5-overrides
-MUSL_DELETES=vendor/upstream/musl-1.2.5-deletes.txt
-MUSL_GENERATED=vendor/upstream/musl-1.2.5-generated/$MUSL_ARCH
-MUSL_SKIP=vendor/upstream/musl-1.2.5-skip-$ARCH.txt
-BRIDGE_FILE=build/$ARCH/vendor/tcc/stdarg-bridge.h
+MUSL_DIR=$SRC/src/musl
# ── prerequisites ─────────────────────────────────────────────────────
require_prev "$BOOT4" tcc3
require_prev "$BOOT2" catm scheme1
[ -e "$BOOT4/libtcc1.a" ] || { echo "[$BOOT_TAG] missing $BOOT4/libtcc1.a (run scripts/boot4.sh $ARCH)" >&2; exit 1; }
-[ -e "$MUSL_TARBALL" ] || { echo "[$BOOT_TAG] missing $MUSL_TARBALL" >&2; exit 1; }
-[ -d "$MUSL_OVERRIDES" ] || { echo "[$BOOT_TAG] missing $MUSL_OVERRIDES" >&2; exit 1; }
-[ -e "$MUSL_DELETES" ] || { echo "[$BOOT_TAG] missing $MUSL_DELETES" >&2; exit 1; }
-[ -d "$MUSL_GENERATED" ] || { echo "[$BOOT_TAG] missing $MUSL_GENERATED (run scripts/musl-vendor.sh)" >&2; exit 1; }
-[ -e "$MUSL_SKIP" ] || { echo "[$BOOT_TAG] missing $MUSL_SKIP (run scripts/boot5-calibrate.sh $ARCH)" >&2; exit 1; }
-[ -e "$BRIDGE_FILE" ] || { echo "[$BOOT_TAG] missing $BRIDGE_FILE (run scripts/stage1-flatten.sh)" >&2; exit 1; }
-
-# ── prepare staging dirs and musl tree on host ────────────────────────
+[ -d "$MUSL_DIR" ] || { echo "[$BOOT_TAG] missing $MUSL_DIR — run scripts/prep-src.sh $ARCH and scripts/prep-musl.sh $ARCH" >&2; exit 1; }
+[ -e "$MUSL_DIR/skip.txt" ] || { echo "[$BOOT_TAG] missing $MUSL_DIR/skip.txt — run scripts/prep-musl.sh $ARCH" >&2; exit 1; }
+[ -e "$SRC/src/tcc/stdarg-bridge.h" ] || { echo "[$BOOT_TAG] missing $SRC/src/tcc/stdarg-bridge.h — run scripts/prep-src.sh $ARCH" >&2; exit 1; }
+
+# ── prepare staging dirs ──────────────────────────────────────────────
# $STAGE/in/ — read-only inputs (becomes /work/in or in/ in tmpfs)
# $STAGE/out/ — writable outputs (becomes /work/out or out/ in tmpfs)
-# $STAGE/_host/ — host-side scratch (enumeration outputs, intermediates);
-# never visible to the container/kernel
-# runscm_init wipes $STAGE then mkdirs in/ and out/. Do that first so
-# we control the layout below.
+# $STAGE/_host/ — host-side scratch (enumeration outputs); never
+# visible to the container/kernel.
. scripts/lib-runscm.sh
runscm_init "$STAGE" "$OUT"
mkdir -p "$STAGE/_host"
-# Extract musl directly into in/tmp/musl-1.2.5/, then apply overrides +
-# deletes — gives us a fully-prepared tree we can enumerate to drive the
-# (kaem-friendly) flat run.scm. The podman bind mount reads it in place;
-# the seed driver picks it up via the `find in -type f` cpio walk.
-MUSL_DIR=$STAGE/in/tmp/musl-1.2.5
-mkdir -p "$STAGE/in/tmp"
-tar xzf "$MUSL_TARBALL" -C "$STAGE/in/tmp/"
-cp -R "$MUSL_OVERRIDES/." "$MUSL_DIR/"
-while read -r p; do
- [ -n "$p" ] && rm -rf "$MUSL_DIR/$p"
-done < "$MUSL_DELETES"
-
-# ── enumerate musl sources on the host (kaem-friendly: no for/while/
-# case/${%}/${#}/$((..)) inside the container) ───────────────────────
+# ── enumerate musl sources from the canonical tree ────────────────────
# Mirrors musl's Makefile rule: a per-arch override (under
-# $d/$MUSL_ARCH/) replaces the same-stem base file (under $d/). We
-# subtract the calibration skip list so the run.scm never needs an
-# `if $TCC ...; then ok else skip fi` branch.
+# $d/$MUSL_ARCH/) replaces the same-stem base file (under $d/). The
+# canonical tree already has the per-arch skip filter applied by
+# prep-musl.sh, so no skip subtraction is needed here.
SRC_TOP="src/aio src/conf src/crypt src/ctype src/dirent
src/env src/errno src/exit src/fcntl src/fenv src/internal
src/ipc src/legacy src/linux src/locale src/malloc
@@ -133,7 +100,7 @@ SRC_TOP="src/aio src/conf src/crypt src/ctype src/dirent
) > "$STAGE/_host/arch.txt"
# REPLACED: bases that have arch-specific overrides (drop them from
-# BASE). KEEP = (BASE - REPLACED) ∪ ARCH, then minus calibration skips.
+# BASE). KEEP = (BASE - REPLACED) ∪ ARCH.
awk -v ARCH="$MUSL_ARCH" '
{
sub(/\.[^.]*$/, "") # strip extension
@@ -155,17 +122,10 @@ awk -v REPF="$STAGE/_host/replaced.txt" '
}
' "$STAGE/_host/base.txt" > "$STAGE/_host/keep_base.txt"
-cat "$STAGE/_host/keep_base.txt" "$STAGE/_host/arch.txt" | sort -u > "$STAGE/_host/keep.txt"
-
-# Subtract the calibration skip list. Lines without a / are bogus; the
-# skip file is one path per line, comments allowed via leading '#'.
-awk -v SKIPF="$MUSL_SKIP" '
- BEGIN { while ((getline l < SKIPF) > 0) if (l !~ /^#/ && l != "") skip[l] = 1 }
- { if (!($0 in skip)) print }
-' "$STAGE/_host/keep.txt" > "$STAGE/_host/build-srcs.txt"
+cat "$STAGE/_host/keep_base.txt" "$STAGE/_host/arch.txt" | sort -u > "$STAGE/_host/build-srcs.txt"
n_src=$(wc -l < "$STAGE/_host/build-srcs.txt")
-n_skip=$(wc -l < "$MUSL_SKIP")
+n_skip=$(grep -cvE '^[[:space:]]*(#|$)' "$MUSL_DIR/skip.txt" || true)
echo "[$BOOT_TAG] keep=$n_src skip=$n_skip (calibrated)"
# Record CRT mode (asm vs c) so the gen-runscm step picks the right
@@ -176,8 +136,8 @@ else
echo c > "$STAGE/_host/crt-mode"
fi
-# Pre-create per-source obj/ directories under $STAGE/out/obj/musl-1.2.5/
-# so scheme1's (run "in/tcc" -c …) doesn't need to mkdir at runtime (tcc
+# Pre-create per-source obj/ directories under $STAGE/out/obj/musl/ so
+# scheme1's (run "in/tcc" -c …) doesn't need to mkdir at runtime (tcc
# errors out if the parent dir is missing, and scheme1 has no mkdir
# primitive).
awk '
@@ -186,36 +146,30 @@ awk '
if (match($0, /\/[^\/]*$/)) print substr($0, 1, RSTART - 1)
}
' "$STAGE/_host/build-srcs.txt" | sort -u > "$STAGE/_host/build-objdirs.txt"
-COBJ=$STAGE/out/obj/musl-1.2.5
+COBJ=$STAGE/out/obj/musl
mkdir -p "$COBJ/crt"
while read -r d; do mkdir -p "$COBJ/$d"; done < "$STAGE/_host/build-objdirs.txt"
-# Pre-generated alltypes.h + syscall.h for $MUSL_ARCH; live under in/
-# (read at compile time via -I$CIN/obj/include and -I$CIN/obj/src/internal).
-mkdir -p "$MUSL_DIR/obj/include/bits" "$MUSL_DIR/obj/src/internal"
-cp "$MUSL_GENERATED/alltypes.h" "$MUSL_DIR/obj/include/bits/alltypes.h"
-cp "$MUSL_GENERATED/syscall.h" "$MUSL_DIR/obj/include/bits/syscall.h"
-echo '#define VERSION "1.2.5-tcc-boot5"' > "$MUSL_DIR/obj/src/internal/version.h"
-
# ── generate run.scm and stage chain binaries ─────────────────────────
RUNSCM=$STAGE/run.scm
scripts/boot5-gen-runscm.sh "$MUSL_ARCH" "$STAGE/_host" "$RUNSCM"
echo "[$BOOT_TAG] generated run.scm: $(wc -l <"$RUNSCM") lines, $(wc -c <"$RUNSCM") bytes"
runscm_scheme1 "$BOOT2/scheme1"
-runscm_prelude scheme1/prelude.scm
+runscm_prelude "$SRC/src/scheme1/prelude.scm"
runscm_runscm "$RUNSCM"
# Chain binaries staged at flat in/ root (cwd-relative names in run.scm).
runscm_input tcc "$BOOT4/tcc3"
runscm_input libtcc1.a "$BOOT4/libtcc1.a"
runscm_input catm "$BOOT2/catm"
-runscm_input tcc-stdarg-bridge.h "$BRIDGE_FILE"
-runscm_input hello.c scripts/boot-hello.c
+runscm_input_from_src src tcc/stdarg-bridge.h tcc-stdarg-bridge.h
+runscm_input_from_src src test-fixtures/boot-hello.c hello.c
-# Musl tree is already laid out under $STAGE/in/tmp/musl-1.2.5/ above;
-# both drivers pick it up automatically (podman bind-mounts $STAGE/in;
-# seed packs `find in -type f` into the cpio).
+# Stage the canonical musl tree under in/musl/. Both drivers pick it
+# up automatically (podman bind-mounts $STAGE/in; seed packs
+# `find in -type f` into the cpio).
+runscm_input_tree_from_src musl src musl
runscm_export libc.a
runscm_export crt1.o
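The enumeration rule described in the boot5.sh comments, KEEP = (BASE - REPLACED) ∪ ARCH, can be illustrated in isolation. This is a minimal sketch, not the awk code in boot5.sh (which is split across hunks above and only partly visible). The function name `keep_sources` and the sample paths are hypothetical, and it assumes an arch override lives at `<dir>/$MUSL_ARCH/<stem>.<ext>` next to its base file `<dir>/<stem>.<ext>`.

```shell
#!/bin/sh
# Sketch only: keep_sources is a hypothetical helper, not part of boot5.sh.
MUSL_ARCH=aarch64

# keep_sources BASE_LIST ARCH_LIST
# Prints every arch override, plus every base file whose stem has no
# arch override, as one sorted, de-duplicated list.
keep_sources() {
    LC_ALL=C awk -v ARCH="$MUSL_ARCH" -v ARCHF="$2" '
        # Strip the extension: src/string/memcpy.c -> src/string/memcpy
        function stem(p) { sub(/\.[^.]*$/, "", p); return p }
        BEGIN {
            while ((getline line < ARCHF) > 0) {
                print line                  # ARCH entries are always kept
                s = stem(line)
                sub("/" ARCH "/", "/", s)   # map override stem to base stem
                replaced[s] = 1
            }
        }
        # Base entries survive only when no override replaced them.
        !(stem($0) in replaced)
    ' "$1" | LC_ALL=C sort -u
}
```

Given a base list containing `src/string/memcpy.c` and `src/string/strlen.c` and an arch list containing `src/string/aarch64/memcpy.S`, the function keeps the `.S` override and `strlen.c` and drops the base `memcpy.c`.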
diff --git a/scripts/boot6.sh b/scripts/boot6.sh
@@ -9,16 +9,14 @@
## build/$ARCH/$DRIVER/boot4/tcc3 — boot4's verified self-host tcc
## (compiler + linker)
## build/$ARCH/$DRIVER/boot2/scheme1 — driver runtime
-## seed-kernel/arch/aarch64/kernel.S
-## — boot stub, vector table, asm thunks,
-## trailing 64 KB stack reserved as
-## plain `.bss` (kstack_top is the end
-## label of that reservation)
-## seed-kernel/kernel.c — DTB parse, MMU bring-up, syscalls,
+## build/$ARCH/src/src/kernel/arch/$ARCH/{kernel.S,mmu.c,arch.h}
+## — per-arch boot stub, MMU setup, header
+## build/$ARCH/src/src/kernel/kernel.c
+## — DTB parse, MMU bring-up, syscalls,
## virtio-blk, tmpfs, ELF loader
-## seed-kernel/arch/aarch64/mmu.c
-## — arm64 page-table setup and pool swap
-## tcc-cc/mem.c — memcpy/memset/memmove/memcmp
+## build/$ARCH/src/src/tcc-cc/mem.c — memcpy/memset/memmove/memcmp
+## build/$ARCH/src/src/kernel/scripts/elf-pvh-note.c
+## — amd64-only post-link PT_NOTE fixup
##
## ─── Tools ────────────────────────────────────────────────────────────
## In container: scratch + busybox (boot2-empty:$ARCH).
@@ -48,14 +46,16 @@ driver_init empty
OUT_FILE=$KERNEL_NAME
BOOT2=build/$ARCH/$DRIVER/boot2
BOOT4=build/$ARCH/$DRIVER/boot4
+SRC=build/$ARCH/src
OUT=build/$ARCH/$DRIVER/boot6
STAGE=build/$ARCH/$DRIVER/.boot6-stage
# ── prerequisites ─────────────────────────────────────────────────────
require_prev "$BOOT4" tcc3
require_prev "$BOOT2" scheme1
-for f in seed-kernel/arch/$ARCH/kernel.S seed-kernel/arch/$ARCH/mmu.c seed-kernel/arch/$ARCH/arch.h seed-kernel/kernel.c tcc-cc/mem.c; do
- [ -f "$f" ] || { echo "[$BOOT_TAG] missing $f" >&2; exit 1; }
+[ -d "$SRC" ] || { echo "[$BOOT_TAG] missing $SRC — run scripts/prep-src.sh $ARCH" >&2; exit 1; }
+for f in kernel/arch/$ARCH/kernel.S kernel/arch/$ARCH/mmu.c kernel/arch/$ARCH/arch.h kernel/kernel.c tcc-cc/mem.c; do
+ [ -f "$SRC/src/$f" ] || { echo "[$BOOT_TAG] missing $SRC/src/$f" >&2; exit 1; }
done
# ── stage inputs and run scheme1 + run.scm under $DRIVER ──────────────
@@ -67,28 +67,28 @@ scripts/boot6-gen-runscm.sh "$ARCH" "$RUNSCM"
echo "[$BOOT_TAG] generated run.scm: $(wc -l <"$RUNSCM") lines"
runscm_scheme1 "$BOOT2/scheme1"
-runscm_prelude scheme1/prelude.scm
+runscm_prelude "$SRC/src/scheme1/prelude.scm"
runscm_runscm "$RUNSCM"
runscm_input tcc3 "$BOOT4/tcc3"
-runscm_input kernel.S seed-kernel/arch/$ARCH/kernel.S
-runscm_input kernel.c seed-kernel/kernel.c
-runscm_input arch.h seed-kernel/arch/$ARCH/arch.h
-runscm_input mmu.c seed-kernel/arch/$ARCH/mmu.c
-runscm_input mem.c tcc-cc/mem.c
+runscm_input_from_src src "kernel/arch/$ARCH/kernel.S"
+runscm_input_from_src src kernel/kernel.c
+runscm_input_from_src src "kernel/arch/$ARCH/arch.h"
+runscm_input_from_src src "kernel/arch/$ARCH/mmu.c"
+runscm_input_from_src src tcc-cc/mem.c
# amd64 needs a post-link fixup — tcc3 doesn't emit PT_NOTE phdrs, so
# QEMU's PVH `-kernel` path can't find the Xen 18 note that names the
# 32-bit entry. The fixup is a hosted C tool we build inside the same
# run.scm with tcc3 + boot4's libc/crt1/libtcc1, then run on kernel.elf.
if [ "$ARCH" = amd64 ]; then
- runscm_input elf-pvh-note.c seed-kernel/scripts/elf-pvh-note.c
+ runscm_input_from_src src kernel/scripts/elf-pvh-note.c
runscm_input crt1.o "$BOOT4/crt1.o"
runscm_input libc.a "$BOOT4/libc.a"
runscm_input libtcc1.a "$BOOT4/libtcc1.a"
fi
runscm_export "$OUT_FILE"
-runscm_run 1200
+runscm_run "${BOOT6_TIMEOUT:-1200}"
echo "[$BOOT_TAG] OK -> $OUT/$OUT_FILE ($(wc -c <"$OUT/$OUT_FILE") bytes)"
diff --git a/scripts/lib-pipeline.sh b/scripts/lib-pipeline.sh
@@ -82,6 +82,17 @@ pipeline_input() {
P_INPUT_NAMES="$P_INPUT_NAMES $name"
}
+# pipeline_input_from_src — pull a file from the canonical generated
+# source tree at build/$ARCH/src/{bin,src}/<subpath>. Stages under
+# in/<name> where <name> defaults to basename(subpath); pass an
+# override as the optional third argument when the staged name must
+# differ (e.g. P1.M1 vs P1-aarch64.M1).
+pipeline_input_from_src() {
+ _kind=$1; _subpath=$2; _name=${3:-}
+ [ -n "$_name" ] || _name=$(basename "$_subpath")
+ pipeline_input "$_name" "build/$ARCH/src/$_kind/$_subpath"
+}
+
# Look up a token: if it names an input, prefix `in/`; if it names a
# previously produced output, prefix `out/`; else leave unchanged.
_p_lookup() {
diff --git a/scripts/lib-runscm.sh b/scripts/lib-runscm.sh
@@ -70,6 +70,22 @@ runscm_input_tree() {
done
}
+# runscm_input_from_src — pull a file from the canonical generated
+# source tree at build/$ARCH/src/{bin,src}/<subpath>. Stages under
+# in/<name> where <name> defaults to basename(subpath).
+runscm_input_from_src() {
+ _kind=$1; _subpath=$2; _name=${3:-}
+ [ -n "$_name" ] || _name=$(basename "$_subpath")
+ runscm_input "$_name" "build/$ARCH/src/$_kind/$_subpath"
+}
+
+# runscm_input_tree_from_src — same as runscm_input_tree, but the
+# source root is build/$ARCH/src/{bin,src}/<subpath>.
+runscm_input_tree_from_src() {
+ _prefix=$1; _kind=$2; _subpath=$3
+ runscm_input_tree "$_prefix" "build/$ARCH/src/$_kind/$_subpath"
+}
+
runscm_export() {
S_EXPORTS="$S_EXPORTS $1"
}
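To see how the new helper resolves staged names, the sketch below pairs `runscm_input_from_src` (copied from the hunk above) with a stand-in `runscm_input`. The stand-in is an assumption for illustration: it only prints the mapping, whereas the real function in lib-runscm.sh stages the file.

```shell
#!/bin/sh
# Hypothetical stub: prints the staging mapping instead of staging it.
runscm_input() {
    printf 'in/%s <- %s\n' "$1" "$2"
}

# Helper as added in this commit.
runscm_input_from_src() {
    _kind=$1; _subpath=$2; _name=${3:-}
    [ -n "$_name" ] || _name=$(basename "$_subpath")
    runscm_input "$_name" "build/$ARCH/src/$_kind/$_subpath"
}

ARCH=aarch64
# Two arguments: the staged name defaults to the basename of the subpath.
runscm_input_from_src src tcc-cc/mem.c
# Three arguments: the staged name is overridden explicitly.
runscm_input_from_src src tcc/stdarg-bridge.h tcc-stdarg-bridge.h
```

The first call stages `in/mem.c` and the second stages `in/tcc-stdarg-bridge.h`, matching how boot5.sh and boot6.sh call the helper.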
diff --git a/scripts/prep-src.sh b/scripts/prep-src.sh
@@ -21,6 +21,8 @@
## cc/ cc.scm, main.scm
## tcc/ tcc.flat.c, stdarg-bridge.h, plus
## tcc-0.9.26-1147-gee75a10c/{include,lib}
+## tcc-libc/$ARCH/ start.S, sys_stubs.S
+## tcc-cc/ mem.c (memcpy/memmove/memset/memcmp)
## libc/ libc.flat.c (mes-libc flattened)
## musl/ filtered musl-1.2.5 tree (overrides
## merged, deletes applied, generated
@@ -28,6 +30,7 @@
## prep-musl.sh applies the per-arch
## skip filter on top.
## kernel/ seed-kernel sources for this arch
+## test-fixtures/ boot-hello.c smoke binary
##
## A0 is split: prep-src.sh runs before boot0 and produces everything
## that doesn't need a working compiler. prep-musl.sh runs after boot4
@@ -91,6 +94,19 @@ mkdir -p "$DST_SRC/cc"
cp cc/cc.scm "$DST_SRC/cc/cc.scm"
cp cc/main.scm "$DST_SRC/cc/main.scm"
+# tcc-libc: per-arch _start + sys_* wrappers consumed by boot4.
+mkdir -p "$DST_SRC/tcc-libc/$ARCH"
+cp "tcc-libc/$ARCH/start.S" "$DST_SRC/tcc-libc/$ARCH/start.S"
+cp "tcc-libc/$ARCH/sys_stubs.S" "$DST_SRC/tcc-libc/$ARCH/sys_stubs.S"
+
+# tcc-cc: tiny mem helpers consumed by boot4 + boot6.
+mkdir -p "$DST_SRC/tcc-cc"
+cp tcc-cc/mem.c "$DST_SRC/tcc-cc/mem.c"
+
+# Smoke binary linked by boot4 + boot5.
+mkdir -p "$DST_SRC/test-fixtures"
+cp scripts/boot-hello.c "$DST_SRC/test-fixtures/boot-hello.c"
+
# ── (3) seed-kernel sources for this arch ─────────────────────────────
mkdir -p "$DST_SRC/kernel/arch/$ARCH" "$DST_SRC/kernel/user" "$DST_SRC/kernel/scripts"
cp seed-kernel/kernel.c "$DST_SRC/kernel/kernel.c"