boot2

Playing with the bootstrap
git clone https://git.ryansepassi.com/git/boot2.git

commit e4bfcde162d44394a1bbd0ee55becafb57087502
parent 6077ca5c6ba45b34582013d0569f8869f5d34b63
Author: Ryan Sepassi <rsepassi@gmail.com>
Date:   Wed,  6 May 2026 07:26:16 -0700

seed-kernel: boot6 builds amd64 + riscv64; DRIVER=seed extended to riscv64

boot6 now drives tcc3 to compile + link the seed kernel for amd64 and
riscv64 in addition to aarch64. The riscv64 kernel boots under
QEMU+OpenSBI and runs simple user-mode binaries (boot0 stages 1–3 pass
under DRIVER=seed). Stage 4 still hits a user-mode null deref I haven't
pinned down — see docs/SEED-RISCV64-TODO.md for the open work.

tcc / mes-libc patches needed for riscv64:
- abtol-long-accumulator: int → long accumulator in mes-libc abtol so
  strtoull doesn't sign-extend 0x80200000 to 0xffffffff80200000 on
  tcc3's -Wl,-Ttext= parse.
- riscv64-cvt-int-zext + riscv64-gen-cvt-sxtw: gen_cvt_sxtw branches on
  signedness (addiw for signed, slli;srli for unsigned) and is always
  invoked on riscv64. Without this, (u64)be32(p) sign-extends DTB cells.
- riscv64-load-ptr-zext: widen the existing VT_LLONG zext at constant
  load to also cover VT_PTR / VT_FUNC. Without this, (u8 *)0x8b000000UL
  loads as 0xffffffff8b000000 (lui sign-extends).
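
A minimal C sketch of the arithmetic behind the three patches above, for
readers who want to check the constants by hand. It is not the patched
tcc or mes-libc code: parse_hex, sext32 and zext32 are hypothetical
helper names, and the narrow-accumulator mode only models an int-sized
accumulator by sign-extending after every digit.

```c
#include <stdint.h>

/* Sign-extend the low 32 bits of v, which is what `addiw r, r, 0`
 * does on RV64. Done in unsigned arithmetic to stay portable. */
uint64_t sext32(uint64_t v)
{
    uint64_t low = v & 0xffffffffu;
    return (low ^ 0x80000000u) - 0x80000000u;
}

/* Zero-extend the low 32 bits of v, which is what the pair
 * `slli r, r, 32 ; srli r, r, 32` does on RV64. */
uint64_t zext32(uint64_t v)
{
    return (v << 32) >> 32;
}

/* Value of one hex digit, or 16 if c is not a hex digit. */
unsigned hex_digit(char c)
{
    if (c >= '0' && c <= '9') return (unsigned)(c - '0');
    if (c >= 'a' && c <= 'f') return (unsigned)(c - 'a' + 10);
    if (c >= 'A' && c <= 'F') return (unsigned)(c - 'A' + 10);
    return 16;
}

/* Hypothetical hex parser. With narrow_accumulator != 0 the running
 * value is squeezed through a 32-bit signed width after each digit,
 * modelling an `int` accumulator; with 0 it behaves like a 64-bit one. */
uint64_t parse_hex(const char *s, int narrow_accumulator)
{
    uint64_t acc = 0;
    if (s[0] == '0' && (s[1] == 'x' || s[1] == 'X'))
        s += 2;
    for (; hex_digit(*s) < 16; s++) {
        acc = (acc << 4) | hex_digit(*s);
        if (narrow_accumulator)
            acc = sext32(acc);
    }
    return acc;
}
```

Under these assumptions, parse_hex("0x80200000", 1) yields
0xffffffff80200000 while parse_hex("0x80200000", 0) yields 0x80200000,
and zext32 applied to a sign-extended 0x8b000000 recovers 0x8b000000.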

riscv64 kernel.S adjustments to match tcc 0.9.26's assembler:
- macros use .long (32-bit) not .word (tcc's .word is 16-bit).
- SD/SW emit base-first since tcc's riscv64 assembler parses three-comma
  stores as <rs1>,<rs2>,<imm>, not GAS's <src>,<imm>(<base>).
- bgeu offset 12 → 16 (off-by-one bug; 12 lands on the J(1b) inside
  the bss-zero loop instead of the next-stage label).
- trap_entry saves x5/x6 *before* using t0/t1 as scratch, so user code
  that holds state in t0/t1 across an ecall sees them preserved per the
  Linux RISC-V syscall ABI.
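
To make the .word/.long and bgeu points concrete, here is a small C
sketch. It is illustrative only: the choice of `csrr t0, sscratch` as the
encoded instruction, the helper names, and the assumption that the branch
sits at index 0 of a run of 4-byte instructions are mine, not taken from
kernel.S.

```c
#include <stdint.h>

/* Encode `csrr rd, csr`, i.e. `csrrs rd, csr, x0`:
 * csr in bits 31:20, rs1 = x0, funct3 = 010, opcode SYSTEM (0x73). */
uint32_t encode_csrr(uint32_t rd, uint32_t csr)
{
    return (csr << 20) | (0u << 15) | (2u << 12) | (rd << 7) | 0x73u;
}

/* What survives if a 32-bit encoding is emitted through a 16-bit
 * data directive: only the low half. */
uint16_t keep_low16(uint32_t insn)
{
    return (uint16_t)(insn & 0xffffu);
}

/* Index of the instruction a branch lands on, counted from the branch
 * itself, assuming only uncompressed 4-byte instructions. */
int branch_target_index(int byte_offset)
{
    return byte_offset / 4;
}
```

With t0 = x5 and sscratch = CSR 0x140, encode_csrr(5, 0x140) is
0x140022f3, and a 16-bit directive would keep only 0x22f3. For the
branch, an offset of 12 reaches the instruction at index 3 and 16 reaches
index 4, which matches the description of 12 hitting the J(1b) slot and
16 reaching the label after it.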

Boot driver wiring:
- boot[0-6].sh + lib-{runscm,pipeline}.sh accept SEED_ARCH=riscv64 and
  dispatch to qemu-system-riscv64 with the right kernel filename
  (kernel.elf vs aarch64's Image).
- boot6.sh + boot6-gen-runscm.sh produce ELF for amd64/riscv64 (no
  --oformat=binary) at each arch's link base.
- boot4.sh gains TCC_BOOTSTRAP_RELAX_FIXEDPOINT=1: codegen-altering tcc
  patches need a second bootstrap pass before tcc2==tcc3 settles; the
  follow-up run (started from this run's tcc3) hits the fixed point
  with no extra knob.
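
The fixed-point gate itself lives in boot4.sh and uses cmp; the C sketch
below only restates its decision logic for clarity. files_identical and
fixed_point_accept are hypothetical names, and the rule that the knob
counts as set only when its value is exactly "1" is my reading of the
script.

```c
#include <stdio.h>
#include <string.h>

/* Return 1 if both files can be opened and are byte-for-byte equal,
 * 0 otherwise (including when either file is missing). */
int files_identical(const char *path_a, const char *path_b)
{
    FILE *fa = fopen(path_a, "rb");
    FILE *fb = fopen(path_b, "rb");
    int same = (fa != NULL && fb != NULL);

    while (same) {
        int ca = fgetc(fa);
        int cb = fgetc(fb);
        if (ca != cb)
            same = 0;          /* differing byte, or one file ended early */
        else if (ca == EOF)
            break;             /* both ended together: identical */
    }
    if (fa) fclose(fa);
    if (fb) fclose(fb);
    return same;
}

/* Accept the build if tcc2 and tcc3 match, or if the relax knob
 * (the value of TCC_BOOTSTRAP_RELAX_FIXEDPOINT, may be NULL) is "1". */
int fixed_point_accept(int identical, const char *relax)
{
    return identical || (relax != NULL && strcmp(relax, "1") == 0);
}
```

A caller would pass files_identical("tcc2", "tcc3") and
getenv("TCC_BOOTSTRAP_RELAX_FIXEDPOINT") to fixed_point_accept; a
mismatch with the knob unset is the only rejecting combination.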

amd64 kernel.S already had .byte stubs for the 32-bit PVH boot stub
since tcc's x86_64 assembler can't emit .code32; that path still has
known runtime blockers (PVH note section type + PT_NOTE phdr) and is
not exercised by this commit.

Diffstat:
A docs/SEED-RISCV64-TODO.md | 196 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
M scripts/boot.sh | 21 +++++++++++++--------
M scripts/boot0.sh | 10 +++++++---
M scripts/boot1.sh | 10 +++++++---
M scripts/boot2.sh | 10 +++++++---
M scripts/boot3.sh | 10 +++++++---
M scripts/boot4.sh | 25 ++++++++++++++++++++-----
M scripts/boot5.sh | 13 ++++++++++---
M scripts/boot6-gen-runscm.sh | 24 ++++++++++++++++++------
M scripts/boot6.sh | 46 ++++++++++++++++++++++++++--------------------
M scripts/lib-pipeline.sh | 41 ++++++++++++++++++++++++++++++-----------
M scripts/lib-runscm.sh | 45 ++++++++++++++++++++++++++++++++++-----------
M scripts/libc-flatten.sh | 11 +++++++++++
A scripts/simple-patches/tcc-0.9.26/riscv64-cvt-int-zext.after | 28 ++++++++++++++++++++++++++++
A scripts/simple-patches/tcc-0.9.26/riscv64-cvt-int-zext.before | 18 ++++++++++++++++++
A scripts/simple-patches/tcc-0.9.26/riscv64-gen-cvt-sxtw.after | 19 +++++++++++++++++++
A scripts/simple-patches/tcc-0.9.26/riscv64-gen-cvt-sxtw.before | 5 +++++
A scripts/simple-patches/tcc-0.9.26/riscv64-load-ptr-zext.after | 11 +++++++++++
A scripts/simple-patches/tcc-0.9.26/riscv64-load-ptr-zext.before | 5 +++++
M scripts/stage1-flatten.sh | 12 ++++++++++++
M seed-kernel/arch/amd64/kernel.S | 132 +++++++++++++++++++++++++++++++++++++++----------------------------------------
M seed-kernel/arch/riscv64/kernel.S | 357 +++++++++++++++++++++++++++++++++++++++++++++++++------------------------------
A vendor/mes-libc/patches/abtol-long-accumulator.after | 9 +++++++++
A vendor/mes-libc/patches/abtol-long-accumulator.before | 3 +++
24 files changed, 782 insertions(+), 279 deletions(-)

diff --git a/docs/SEED-RISCV64-TODO.md b/docs/SEED-RISCV64-TODO.md
@@ -0,0 +1,196 @@
+# riscv64 seed-kernel TODO
+
+Working doc. Captures the open work needed to get
+`DRIVER=seed ./scripts/boot.sh riscv64` to a clean exit, mirroring the
+aarch64 path. Pairs with `docs/OS.md` (kernel contract) and
+`docs/TCC.md` (compiler).
+
+## Goal
+
+`DRIVER=seed ./scripts/boot.sh riscv64` should run the full
+boot0→boot6 chain entirely *inside* the tcc-built riscv64 seed kernel
+(the kernel is its own build driver, with podman only used to mint
+the very first kernel image). This is the non-negotiable end-to-end
+validation: it exercises every kernel path the chain depends on
+(ELF load, MMU, syscalls, virtio-blk DMA, fork/exec, exit) under
+real workloads — boot4 alone runs ~5000 user-mode tcc invocations
+inside the kernel.
+
+The aarch64 path already passes; this work brings riscv64 to parity.
+
+## What works (May 2026)
+
+- `boot6 riscv64` builds a clean ELF kernel from tcc3, located at
+  `build/riscv64/boot6/kernel.elf`. Loads under OpenSBI on
+  `qemu-system-riscv64 -machine virt`.
+- Kernel reaches `kmain`, parses DTB (mem 0x80000000), brings up
+  virtio-blk, parses cpio, lists tmpfs, loads ELF, erets to user.
+- DRIVER=seed wiring is in place: `boot.sh`, `boot[0-5].sh`,
+  `boot6.sh`, `lib-runscm.sh`, `lib-pipeline.sh` all dispatch to
+  `qemu-system-riscv64` and the right kernel filename
+  (`kernel.elf` vs aarch64's `Image`) when `SEED_ARCH=riscv64`.
+- Boot0 stages 1, 2, 3 run cleanly under DRIVER=seed: the kernel
+  runs `hex0-seed`, `hex0`, `hex1` in user mode and SEEDFS-extracts
+  the correct outputs. This exercises ELF load, eret_to_user,
+  openat/read/write/close/lseek/exit syscalls, and virtio-blk DMA.
+
+## Blocker: boot0 stage 4 user-mode panic
+
+Stage 4 runs `hex2 in/catm.hex2 out/catm` inside the seed kernel.
+Hex2 boots, processes the file, and panics partway through with:
+
+```
+PANIC: user sync, ESR=0x000000000000000d ELR=0x0000000000600730 FAR=0x0000000000000000
+```
+
+— a user-mode load page fault on a null pointer. Same `hex2` binary
+runs correctly under podman/Linux on the same input, so the bug is
+on the seed-kernel side, not in the assembled hex2.
+
+The fault is at `lbu t4, 0(t2)` after `ld t2, 16(t0)`; t2=0 means the
+ld read a zero from memory at `t0+16`. Disassembly says t0 should
+hold s1 (set by `addi t0, s1, 0` four instructions earlier), but the
+trapframe dump on the panic path shows t0=0x6007f0 (initial brk
+address), not s1=0x600905, with no instruction in between that
+modifies t0.
+
+### Investigation so far
+
+One real bug found and fixed: `trap_entry` was clobbering t0/t1
+(using them to read sscratch and to reach `saved_user_sp`) **before**
+saving them to the trapframe. Linux's RISC-V syscall ABI preserves
+all GPRs except a0; user code that holds state in t0/t1 across an
+ecall would otherwise see kernel garbage on return. The fix in
+`seed-kernel/arch/riscv64/kernel.S` reorders the saves: x5 and x6
+are now stashed before any kernel scratch use.
+
+After the fix the dump shows t1=0x6007e8 correctly (user value at
+trap time), but t0 still reads back as 0x6007f0 — so something else
+is going on. Candidates, in order of decreasing likelihood:
+
+1. **More tcc-riscv64 codegen bugs in the kernel itself.** We already
+   landed three: `abtol-long-accumulator` (mes-libc), `riscv64-cvt-int-zext`
+   + `riscv64-gen-cvt-sxtw` (u32→u64 didn't zero-extend), and
+   `riscv64-load-ptr-zext` (lui sign-extending pointer constants).
+   The kernel's trap entry/exit asm is the most exercised path; if
+   tcc miscompiles any C in trap_sync that touches the trapframe,
+   the dump reads garbage.
+2. **Hex2 internal layout I'm misreading without source.** Possible
+   but doesn't explain why podman/Linux works on the same bytes.
+3. **A trap_entry recursion I haven't identified** — e.g., a fault
+   inside trap_entry's saves that triggers a second pass through and
+   overwrites the original trapframe before C sees it.
+
+The dumps to confirm or rule out (1) are checked in but commented
+out; the trap_sync `[sc] nr=…` syscall trace also stays out by
+default to keep boot transcripts short.
+
+## How to repro
+
+```sh
+# One-time prereq (10–15 min):
+DRIVER=podman ./scripts/boot.sh riscv64
+
+# Stash the kernel so the wipe in DRIVER=seed below preserves it:
+mkdir -p build/.seed-bootstrap/riscv64
+cp build/riscv64/boot6/kernel.elf build/.seed-bootstrap/riscv64/
+
+# Reproduce the panic (~3 min into the run, in boot0 stage 4):
+DRIVER=seed ./scripts/boot.sh riscv64
+```
+
+The full per-stage QEMU transcripts land in
+`build/riscv64/.boot0-stage/s04/transcript.txt`.
+
+## Smaller-scope reproducer
+
+Once the chain has been built once, the failing stage can be replayed
+without re-running boot0 stages 1–3:
+
+```sh
+mkdir -p build/.qtest/s04/in
+cp build/riscv64/boot0/hex2 build/.qtest/s04/init
+cp vendor/seed/riscv64/catm.hex2 build/.qtest/s04/in/catm.hex2
+chmod +x build/.qtest/s04/init
+( cd build/.qtest/s04 && { echo init; find in -type f; } | sort -u | \
+    cpio -o -H newc 2>/dev/null ) > build/.qtest/s04/in.img
+truncate -s 256M build/.qtest/s04/out.img
+
+qemu-system-riscv64 -machine virt -m 2048M -nographic -no-reboot \
+  -global virtio-mmio.force-legacy=false \
+  -kernel build/riscv64/boot6/kernel.elf \
+  -drive file=build/.qtest/s04/in.img,if=none,format=raw,id=hd0,readonly=on \
+  -device virtio-blk-device,drive=hd0 \
+  -drive file=build/.qtest/s04/out.img,if=none,format=raw,id=hd1 \
+  -device virtio-blk-device,drive=hd1 \
+  -append "hex2 in/catm.hex2 out/catm"
+```
+
+To turn on the per-syscall trace and panic-time register dump that
+nailed down the trap_entry bug, restore the prints around
+`trap_sync()` in `seed-kernel/kernel.c` (see git history for the
+exact diagnostics — they were removed before commit to keep
+transcripts clean).
+
+## Rough work plan
+
+1. **Re-add the diagnostic prints under a compile-time flag** so
+   they aren't free-text deletes and don't pollute the boot logs by
+   default.
+2. **Identify the t0 mismatch.** Most likely path: write a tiny
+   trap_entry self-test that has the kernel do a deliberate ecall in
+   a known register state and assert tf->x[5] reads back the value
+   it was set to before the ecall. If it fails, the bug is in
+   trap_entry itself (asm-level); if it passes, the bug is somewhere
+   between hex2's PC 0x600720 and 0x600730.
+3. **Walk forward from there.** Each subsequent stage of the chain
+   may surface new tcc-riscv64 codegen issues — boot1, boot2's
+   scheme1, boot3's cc.scm-built tcc0, boot4's tcc1/2/3 self-host —
+   so expect this to be N rounds of *kernel runs → fault → tcc patch
+   or kernel asm fix → boot4 rebuild*. The TCC_BOOTSTRAP_RELAX_FIXEDPOINT
+   knob in `boot4.sh` is there exactly for this loop: each
+   codegen-altering tcc patch needs one extra bootstrap pass before
+   tcc2 == tcc3 settles.
+
+## Patches and source changes already landed
+
+- `vendor/mes-libc/patches/abtol-long-accumulator.{before,after}` —
+  `int i` → `long i` so `strtoull("0x80200000", …, 16)` returns
+  `0x80200000` instead of sign-extending to `0xffffffff80200000`.
+  Without this, tcc3 mishandles `-Wl,-Ttext=0x80200000` on the
+  riscv64 link line and the resulting ELF is unloadable.
+- `scripts/simple-patches/tcc-0.9.26/riscv64-cvt-int-zext.{before,after}`
+  + `riscv64-gen-cvt-sxtw.{before,after}` — make `gen_cvt_sxtw` emit
+  `addiw` for signed and `slli;srli` for unsigned, and remove the
+  call-site gate that skipped the unsigned case. Without this,
+  `(u64)be32(p)` in the seed kernel's DTB parser sign-extends
+  cells whose top bit is set, so `mem_start = 0x80000000` reads
+  back as `0xffffffff80000000`.
+- `scripts/simple-patches/tcc-0.9.26/riscv64-load-ptr-zext.{before,after}`
+  — widens the existing `bt == VT_LLONG` zext check at constant load
+  time to also cover `VT_PTR` and `VT_FUNC`. Without this,
+  `(u8 *)0x8b000000UL` (kheap_end constant) loads as `0xffffffff8b000000`
+  because `lui` always sign-extends bits 63:32.
+- `seed-kernel/arch/riscv64/kernel.S`:
+  - Macros now use `.long` (32-bit) not `.word` — tcc 0.9.26's `.word`
+    is 16-bit, so the encoded CSR-op constants would be truncated.
+  - `SD`/`SW` macros emit base-first (`sd base, src, off`), since
+    tcc's riscv64 assembler parses three-comma stores as
+    `<rs1>, <rs2>, <imm>` rather than GAS's `<src>, <imm>(<base>)`.
+  - `bgeu` offset in the bss-zero loop changed from 12 to 16
+    (off-by-one: 12 lands on the `J(1b)` instruction, not the next-stage
+    label).
+  - `trap_entry` saves x5 (t0) and x6 (t1) **before** any kernel
+    scratch use, instead of reading sscratch into t0 first.
+- `scripts/boot4.sh` gains a `TCC_BOOTSTRAP_RELAX_FIXEDPOINT=1`
+  escape: codegen-altering tcc patches need a second bootstrap pass
+  before `cmp tcc2 tcc3` agrees. The next boot4 run (started from
+  the relaxed run's tcc3) settles back to a real fixed point.
+- `scripts/boot6.sh` and `scripts/boot6-gen-runscm.sh` extended
+  for amd64 + riscv64; emit the right link base address and ELF
+  format per arch.
+- `scripts/lib-runscm.sh` and `scripts/lib-pipeline.sh` dispatch to
+  `qemu-system-riscv64` (TCG only — no hvf for riscv on Apple
+  Silicon, hence ~10× slower per stage than aarch64) when
+  `SEED_ARCH=riscv64`. All the per-stage `boot[0-5].sh` scripts
+  pick the correct kernel filename for the active arch.
diff --git a/scripts/boot.sh b/scripts/boot.sh @@ -21,8 +21,12 @@ DRIVER=${DRIVER:-podman} case "$DRIVER" in seed) - [ "$ARCH" = aarch64 ] || { echo "[boot] DRIVER=seed: aarch64 only" >&2; exit 2; } - KERNEL=build/$ARCH/boot6/Image + case "$ARCH" in + aarch64) KERNEL_NAME=Image ;; + riscv64) KERNEL_NAME=kernel.elf ;; + *) echo "[boot] DRIVER=seed: aarch64|riscv64 only (got $ARCH)" >&2; exit 2 ;; + esac + KERNEL=build/$ARCH/boot6/$KERNEL_NAME if [ ! -f "$KERNEL" ]; then echo "[boot] DRIVER=seed: missing $KERNEL" >&2 echo "[boot] run './scripts/boot.sh $ARCH' first (default DRIVER=podman) to produce it" >&2 @@ -32,7 +36,8 @@ case "$DRIVER" in # below; restored before stage 0 runs. STASH=build/.seed-bootstrap/$ARCH mkdir -p "$STASH" - cp "$KERNEL" "$STASH/Image" + cp "$KERNEL" "$STASH/$KERNEL_NAME" + export SEED_ARCH=$ARCH ;; podman) ;; *) echo "[boot] unknown DRIVER=$DRIVER (expected podman|seed)" >&2; exit 2 ;; @@ -43,7 +48,7 @@ rm -rf build/$ARCH if [ "$DRIVER" = seed ]; then mkdir -p build/$ARCH/boot6 - cp build/.seed-bootstrap/$ARCH/Image build/$ARCH/boot6/Image + cp build/.seed-bootstrap/$ARCH/$KERNEL_NAME build/$ARCH/boot6/$KERNEL_NAME fi T0=$(date +%s) @@ -65,7 +70,7 @@ stage boot4 ./scripts/boot4.sh $ARCH stage boot5 ./scripts/boot5.sh $ARCH # boot6 builds the seed-kernel ELF with boot4's tcc3 (no `ld -T`, -# no objcopy). Arm64-only — the seed kernel is arm64-specific. -if [ "$ARCH" = aarch64 ]; then - stage boot6 ./scripts/boot6.sh $ARCH -fi +# no objcopy). Currently aarch64 + riscv64. 
+case "$ARCH" in + aarch64|riscv64) stage boot6 ./scripts/boot6.sh $ARCH ;; +esac diff --git a/scripts/boot0.sh b/scripts/boot0.sh @@ -47,11 +47,15 @@ case "$DRIVER" in fi export PLATFORM IMAGE ;; seed) - [ "$ARCH" = "aarch64" ] || { echo "[boot0] DRIVER=seed: aarch64 only" >&2; exit 2; } - KERNEL_IMAGE=$ROOT/build/$ARCH/boot6/Image + case "$ARCH" in + aarch64) KIMG=Image ;; + riscv64) KIMG=kernel.elf ;; + *) echo "[boot0] DRIVER=seed: aarch64|riscv64 only (got $ARCH)" >&2; exit 2 ;; + esac + KERNEL_IMAGE=$ROOT/build/$ARCH/boot6/$KIMG EXTRACT=$ROOT/seed-kernel/scripts/extract-blk.sh [ -f "$KERNEL_IMAGE" ] || { echo "[boot0] missing $KERNEL_IMAGE — run ./scripts/boot.sh $ARCH (default DRIVER=podman) first" >&2; exit 1; } - export KERNEL_IMAGE EXTRACT ;; + export KERNEL_IMAGE EXTRACT SEED_ARCH=$ARCH ;; *) echo "[boot0] unknown DRIVER=$DRIVER" >&2; exit 2 ;; esac diff --git a/scripts/boot1.sh b/scripts/boot1.sh @@ -57,11 +57,15 @@ case "$DRIVER" in fi export PLATFORM IMAGE ;; seed) - [ "$ARCH" = "aarch64" ] || { echo "[boot1] DRIVER=seed: aarch64 only" >&2; exit 2; } - KERNEL_IMAGE=$ROOT/build/$ARCH/boot6/Image + case "$ARCH" in + aarch64) KIMG=Image ;; + riscv64) KIMG=kernel.elf ;; + *) echo "[boot1] DRIVER=seed: aarch64|riscv64 only (got $ARCH)" >&2; exit 2 ;; + esac + KERNEL_IMAGE=$ROOT/build/$ARCH/boot6/$KIMG EXTRACT=$ROOT/seed-kernel/scripts/extract-blk.sh [ -f "$KERNEL_IMAGE" ] || { echo "[boot1] missing $KERNEL_IMAGE — run ./scripts/boot.sh $ARCH (default DRIVER=podman) first" >&2; exit 1; } - export KERNEL_IMAGE EXTRACT ;; + export KERNEL_IMAGE EXTRACT SEED_ARCH=$ARCH ;; *) echo "[boot1] unknown DRIVER=$DRIVER" >&2; exit 2 ;; esac diff --git a/scripts/boot2.sh b/scripts/boot2.sh @@ -65,11 +65,15 @@ case "$DRIVER" in fi export PLATFORM IMAGE ;; seed) - [ "$ARCH" = "aarch64" ] || { echo "[boot2] DRIVER=seed: aarch64 only" >&2; exit 2; } - KERNEL_IMAGE=$ROOT/build/$ARCH/boot6/Image + case "$ARCH" in + aarch64) KIMG=Image ;; + riscv64) KIMG=kernel.elf ;; + *) echo 
"[boot2] DRIVER=seed: aarch64|riscv64 only (got $ARCH)" >&2; exit 2 ;; + esac + KERNEL_IMAGE=$ROOT/build/$ARCH/boot6/$KIMG EXTRACT=$ROOT/seed-kernel/scripts/extract-blk.sh [ -f "$KERNEL_IMAGE" ] || { echo "[boot2] missing $KERNEL_IMAGE — run ./scripts/boot.sh $ARCH (default DRIVER=podman) first" >&2; exit 1; } - export KERNEL_IMAGE EXTRACT ;; + export KERNEL_IMAGE EXTRACT SEED_ARCH=$ARCH ;; *) echo "[boot2] unknown DRIVER=$DRIVER" >&2; exit 2 ;; esac diff --git a/scripts/boot3.sh b/scripts/boot3.sh @@ -84,11 +84,15 @@ if [ "$DRIVER" = podman ] && ! podman image exists "$IMAGE"; then -f scripts/Containerfile.empty scripts/ fi if [ "$DRIVER" = seed ]; then - [ "$ARCH" = aarch64 ] || { echo "[boot3] DRIVER=seed: aarch64 only" >&2; exit 2; } - KERNEL_IMAGE=$ROOT/build/$ARCH/boot6/Image + case "$ARCH" in + aarch64) KIMG=Image ;; + riscv64) KIMG=kernel.elf ;; + *) echo "[boot3] DRIVER=seed: aarch64|riscv64 only (got $ARCH)" >&2; exit 2 ;; + esac + KERNEL_IMAGE=$ROOT/build/$ARCH/boot6/$KIMG EXTRACT=$ROOT/seed-kernel/scripts/extract-blk.sh [ -f "$KERNEL_IMAGE" ] || { echo "[boot3] missing $KERNEL_IMAGE — run ./scripts/boot.sh $ARCH (default DRIVER=podman) first" >&2; exit 1; } - export KERNEL_IMAGE EXTRACT + export KERNEL_IMAGE EXTRACT SEED_ARCH=$ARCH fi export IMAGE PLATFORM DRIVER diff --git a/scripts/boot4.sh b/scripts/boot4.sh @@ -120,11 +120,15 @@ if [ "$DRIVER" = podman ] && ! 
podman image exists "$IMAGE"; then -f scripts/Containerfile.empty scripts/ fi if [ "$DRIVER" = seed ]; then - [ "$ARCH" = aarch64 ] || { echo "[boot4] DRIVER=seed: aarch64 only" >&2; exit 2; } - KERNEL_IMAGE=$ROOT/build/$ARCH/boot6/Image + case "$ARCH" in + aarch64) KIMG=Image ;; + riscv64) KIMG=kernel.elf ;; + *) echo "[boot4] DRIVER=seed: aarch64|riscv64 only (got $ARCH)" >&2; exit 2 ;; + esac + KERNEL_IMAGE=$ROOT/build/$ARCH/boot6/$KIMG EXTRACT=$ROOT/seed-kernel/scripts/extract-blk.sh [ -f "$KERNEL_IMAGE" ] || { echo "[boot4] missing $KERNEL_IMAGE — run ./scripts/boot.sh $ARCH (default DRIVER=podman) first" >&2; exit 1; } - export KERNEL_IMAGE EXTRACT + export KERNEL_IMAGE EXTRACT SEED_ARCH=$ARCH fi export IMAGE PLATFORM DRIVER @@ -185,11 +189,22 @@ runscm_export hello runscm_run 5400 # ── fixed-point check (host-side) ───────────────────────────────────── +# After a codegen-altering tcc patch, tcc1 (built by tcc0 = pre-fix) and +# tcc2 (built by tcc1 = post-fix codegen but tcc0-emitted binary) won't +# agree byte-for-byte with tcc3 (built by tcc2 = post-fix throughout): +# the two-stage rule needs a third bounce to converge. Set +# TCC_BOOTSTRAP_RELAX_FIXEDPOINT=1 to skip the cmp; the very next boot4 +# run, started from this run's tcc3, will produce tcc2 == tcc3 with no +# extra knob. if ! 
cmp -s "$OUT/tcc2" "$OUT/tcc3"; then s2=$(wc -c <"$OUT/tcc2") s3=$(wc -c <"$OUT/tcc3") - echo "[boot4 $ARCH] FIXED-POINT FAIL: tcc2 ($s2) != tcc3 ($s3)" >&2 - exit 1 + if [ "${TCC_BOOTSTRAP_RELAX_FIXEDPOINT:-0}" = 1 ]; then + echo "[boot4 $ARCH] WARN: tcc2 ($s2) != tcc3 ($s3); TCC_BOOTSTRAP_RELAX_FIXEDPOINT=1, accepting tcc3" >&2 + else + echo "[boot4 $ARCH] FIXED-POINT FAIL: tcc2 ($s2) != tcc3 ($s3)" >&2 + exit 1 + fi fi # ── normalize output names (drop s3- prefix; remove intermediate tccN) ─ diff --git a/scripts/boot5.sh b/scripts/boot5.sh @@ -70,7 +70,10 @@ cd "$ROOT" DRIVER=${DRIVER:-podman} if [ "$DRIVER" = seed ]; then - [ "$ARCH" = aarch64 ] || { echo "[boot5] DRIVER=seed: aarch64 only" >&2; exit 2; } + case "$ARCH" in + aarch64|riscv64) ;; + *) echo "[boot5] DRIVER=seed: aarch64|riscv64 only (got $ARCH)" >&2; exit 2 ;; + esac fi IMAGE=boot2-empty:$ARCH @@ -103,10 +106,14 @@ if [ "$DRIVER" = podman ] && ! podman image exists "$IMAGE"; then -f scripts/Containerfile.empty scripts/ fi if [ "$DRIVER" = seed ]; then - KERNEL_IMAGE=$ROOT/build/$ARCH/boot6/Image + case "$ARCH" in + aarch64) KIMG=Image ;; + riscv64) KIMG=kernel.elf ;; + esac + KERNEL_IMAGE=$ROOT/build/$ARCH/boot6/$KIMG EXTRACT=$ROOT/seed-kernel/scripts/extract-blk.sh [ -f "$KERNEL_IMAGE" ] || { echo "[boot5] missing $KERNEL_IMAGE — run ./scripts/boot.sh $ARCH (default DRIVER=podman) first" >&2; exit 1; } - export KERNEL_IMAGE EXTRACT + export KERNEL_IMAGE EXTRACT SEED_ARCH=$ARCH fi export IMAGE PLATFORM DRIVER diff --git a/scripts/boot6-gen-runscm.sh b/scripts/boot6-gen-runscm.sh @@ -38,7 +38,19 @@ set -eu [ "$#" -eq 2 ] || { echo "usage: $0 <arch> <out.scm>" >&2; exit 2; } ARCH=$1; OUT=$2 -[ "$ARCH" = aarch64 ] || { echo "boot6-gen: only aarch64 supported" >&2; exit 2; } + +# Per-arch link parameters. 
aarch64 alone needs --oformat=binary because +# QEMU's `-kernel <ELF>` path skips the arm64 boot wrapper that puts DTB +# in x0 — only the flat-Image path honors it (detected via the `ARM\x64` +# magic at file offset 0x38 in kernel.S's Image header). amd64/riscv64 +# stay as ELF: QEMU's `-kernel` path on those arches consumes ELF +# directly (PVH note for amd64, OpenSBI for riscv64). +case "$ARCH" in + aarch64) TTEXT=0x40080000; OUT_FILE=Image; LINK_OFORMAT='"-Wl,--oformat=binary"' ;; + amd64) TTEXT=0x40000000; OUT_FILE=kernel.elf; LINK_OFORMAT= ;; + riscv64) TTEXT=0x80200000; OUT_FILE=kernel.elf; LINK_OFORMAT= ;; + *) echo "boot6-gen: unsupported arch '$ARCH'" >&2; exit 2 ;; +esac # Kernel CFLAGS — freestanding, static, no host startfiles. We omit gcc's # -mgeneral-regs-only (no tcc equivalent); kernel.S enables CPACR_EL1.FPEN @@ -75,13 +87,13 @@ cat > "$OUT" <<EOF (must (run "in/tcc3" $KCFLAGS "-c" "-o" "out/mem.o" "in/mem.c") "mem.c -> mem.o") -(write-string stdout "boot6: tcc3 link Image\n") +(write-string stdout "boot6: tcc3 link $OUT_FILE\n") (must (run "in/tcc3" "-nostdlib" "-static" - "-Wl,-Ttext=0x40080000" - "-Wl,--oformat=binary" - "-o" "out/Image" + "-Wl,-Ttext=$TTEXT" + $LINK_OFORMAT + "-o" "out/$OUT_FILE" "out/kernel-asm.o" "out/kernel.o" "out/mmu.o" "out/mem.o") - "link Image") + "link $OUT_FILE") (write-string stdout "boot6: ALL-OK\n") (exit 0) diff --git a/scripts/boot6.sh b/scripts/boot6.sh @@ -1,13 +1,9 @@ #!/bin/sh ## boot6.sh — build the seed-kernel ELF with boot4's tcc3. ## -## Drives tcc3 to compile + link the seed kernel directly into a -## QEMU-bootable arm64 ELF: no `ld -T kernel.lds`, no objcopy. The link -## step uses the minimal flag set the kernel actually needs (see -## scripts/boot6-gen-runscm.sh for the rationale of each flag), notably -## --section-bracket=.bss:__bss_start:__bss_end which replaces the -## kernel.lds bracket assignments — the only genuine link-time service -## the kernel still requires. 
+## Drives tcc3 to compile + link the seed kernel directly: no `ld -T +## kernel.lds`, no objcopy. aarch64 emits the flat Image QEMU expects; +## amd64/riscv64 emit the ELF consumed by QEMU's -kernel path. ## ## ─── Inputs ────────────────────────────────────────────────────────── ## build/$ARCH/boot4/tcc3 — boot4's verified self-host tcc @@ -41,21 +37,31 @@ ## before jumping to _start. ## ## Usage: scripts/boot6.sh <arch> -## <arch>: aarch64 only. boot6 has no amd64/riscv64 analogue — the seed -## kernel is arm64-specific. +## <arch> ∈ {amd64,aarch64,riscv64} for DRIVER=podman (default). +## DRIVER=seed remains aarch64-only because the seed transport boots the +## aarch64 seed kernel. set -eu -usage() { echo "usage: $0 <aarch64>" >&2; exit 2; } +usage() { echo "usage: $0 <amd64|aarch64|riscv64>" >&2; exit 2; } [ "$#" -eq 1 ] || usage ARCH=$1 -[ "$ARCH" = aarch64 ] || { echo "[boot6] only aarch64 supported (got $ARCH)" >&2; exit 2; } + +case "$ARCH" in + amd64) PLATFORM=linux/amd64; ARCHDIR=amd64; OUT_FILE=kernel.elf ;; + aarch64) PLATFORM=linux/arm64; ARCHDIR=aarch64; OUT_FILE=Image ;; + riscv64) PLATFORM=linux/riscv64; ARCHDIR=riscv64; OUT_FILE=kernel.elf ;; + *) usage ;; +esac ROOT=$(cd "$(dirname "$0")/.." 
&& pwd) cd "$ROOT" DRIVER=${DRIVER:-podman} -PLATFORM=linux/arm64 +case "$DRIVER:$ARCH" in + seed:aarch64|seed:riscv64) ;; + seed:*) echo "[boot6] DRIVER=seed: aarch64|riscv64 only (got $ARCH)" >&2; exit 2 ;; +esac IMAGE=boot2-empty:$ARCH BOOT2=build/$ARCH/boot2 BOOT4=build/$ARCH/boot4 @@ -65,7 +71,7 @@ STAGE=build/$ARCH/.boot6-stage # ── prerequisites ───────────────────────────────────────────────────── [ -x "$BOOT4/tcc3" ] || { echo "[boot6 $ARCH] missing $BOOT4/tcc3 (run scripts/boot4.sh $ARCH)" >&2; exit 1; } [ -x "$BOOT2/scheme1" ] || { echo "[boot6 $ARCH] missing $BOOT2/scheme1 (run scripts/boot2.sh $ARCH)" >&2; exit 1; } -for f in seed-kernel/arch/aarch64/kernel.S seed-kernel/arch/aarch64/mmu.c seed-kernel/arch/aarch64/arch.h seed-kernel/kernel.c tcc-cc/mem.c; do +for f in seed-kernel/arch/$ARCHDIR/kernel.S seed-kernel/arch/$ARCHDIR/mmu.c seed-kernel/arch/$ARCHDIR/arch.h seed-kernel/kernel.c tcc-cc/mem.c; do [ -f "$f" ] || { echo "[boot6 $ARCH] missing $f" >&2; exit 1; } done @@ -76,10 +82,10 @@ if [ "$DRIVER" = podman ] && ! 
podman image exists "$IMAGE"; then -f scripts/Containerfile.empty scripts/ fi if [ "$DRIVER" = seed ]; then - KERNEL_IMAGE=$ROOT/build/$ARCH/boot6/Image + KERNEL_IMAGE=$ROOT/build/$ARCH/boot6/$OUT_FILE EXTRACT=$ROOT/seed-kernel/scripts/extract-blk.sh [ -f "$KERNEL_IMAGE" ] || { echo "[boot6] missing $KERNEL_IMAGE — run ./scripts/boot.sh $ARCH (default DRIVER=podman) first" >&2; exit 1; } - export KERNEL_IMAGE EXTRACT + export KERNEL_IMAGE EXTRACT SEED_ARCH=$ARCH fi export IMAGE PLATFORM DRIVER @@ -96,13 +102,13 @@ runscm_prelude scheme1/prelude.scm runscm_runscm "$RUNSCM" runscm_input tcc3 "$BOOT4/tcc3" -runscm_input kernel.S seed-kernel/arch/aarch64/kernel.S +runscm_input kernel.S seed-kernel/arch/$ARCHDIR/kernel.S runscm_input kernel.c seed-kernel/kernel.c -runscm_input arch.h seed-kernel/arch/aarch64/arch.h -runscm_input mmu.c seed-kernel/arch/aarch64/mmu.c +runscm_input arch.h seed-kernel/arch/$ARCHDIR/arch.h +runscm_input mmu.c seed-kernel/arch/$ARCHDIR/mmu.c runscm_input mem.c tcc-cc/mem.c -runscm_export Image +runscm_export "$OUT_FILE" runscm_run 1200 -echo "[boot6 $ARCH/$DRIVER] OK -> $OUT/Image ($(wc -c <"$OUT/Image") bytes)" +echo "[boot6 $ARCH/$DRIVER] OK -> $OUT/$OUT_FILE ($(wc -c <"$OUT/$OUT_FILE") bytes)" diff --git a/scripts/lib-pipeline.sh b/scripts/lib-pipeline.sh @@ -177,17 +177,36 @@ in/$inp" APPEND="$bin$P_HEAD" TRANSCRIPT=$cpio_dir/transcript.txt echo "[lib-pipeline:seed] stage $P_IDX:$P_HEAD (bin=$bin)" >&2 - qemu-system-aarch64 \ - -machine virt,gic-version=3,accel=hvf -cpu host -m 2048M \ - -nographic -no-reboot \ - -global virtio-mmio.force-legacy=false \ - -kernel "$KERNEL_IMAGE" \ - -drive file="$cpio_dir/in.img",if=none,format=raw,id=hd0,readonly=on \ - -device virtio-blk-device,drive=hd0 \ - -drive file="$cpio_dir/out.img",if=none,format=raw,id=hd1 \ - -device virtio-blk-device,drive=hd1 \ - -append "$APPEND" \ - > "$TRANSCRIPT" 2>&1 & + seed_arch=${SEED_ARCH:-aarch64} + case "$seed_arch" in + aarch64) + qemu-system-aarch64 \ + -machine 
virt,gic-version=3,accel=hvf -cpu host -m 2048M \ + -nographic -no-reboot \ + -global virtio-mmio.force-legacy=false \ + -kernel "$KERNEL_IMAGE" \ + -drive file="$cpio_dir/in.img",if=none,format=raw,id=hd0,readonly=on \ + -device virtio-blk-device,drive=hd0 \ + -drive file="$cpio_dir/out.img",if=none,format=raw,id=hd1 \ + -device virtio-blk-device,drive=hd1 \ + -append "$APPEND" \ + > "$TRANSCRIPT" 2>&1 & + ;; + riscv64) + qemu-system-riscv64 \ + -machine virt -m 2048M \ + -nographic -no-reboot \ + -global virtio-mmio.force-legacy=false \ + -kernel "$KERNEL_IMAGE" \ + -drive file="$cpio_dir/in.img",if=none,format=raw,id=hd0,readonly=on \ + -device virtio-blk-device,drive=hd0 \ + -drive file="$cpio_dir/out.img",if=none,format=raw,id=hd1 \ + -device virtio-blk-device,drive=hd1 \ + -append "$APPEND" \ + > "$TRANSCRIPT" 2>&1 & + ;; + *) echo "[lib-pipeline:seed] unsupported SEED_ARCH=$seed_arch" >&2; exit 2 ;; + esac QPID=$! ( sleep 240; kill -9 $QPID 2>/dev/null ) </dev/null >/dev/null 2>&1 & WATCHER=$! 
diff --git a/scripts/lib-runscm.sh b/scripts/lib-runscm.sh @@ -144,17 +144,40 @@ _runscm_run_seed() { TRANSCRIPT=$S_STAGE_DIR/transcript.txt echo "[runscm/seed] booting scheme1 + run.scm (timeout ${timeout}s)" >&2 - qemu-system-aarch64 \ - -machine virt,gic-version=3,accel=hvf -cpu host -m "$mem" \ - -nographic -no-reboot \ - -global virtio-mmio.force-legacy=false \ - -kernel "$KERNEL_IMAGE" \ - -drive file="$S_STAGE_DIR/in.img",if=none,format=raw,id=hd0,readonly=on \ - -device virtio-blk-device,drive=hd0 \ - -drive file="$S_STAGE_DIR/out.img",if=none,format=raw,id=hd1 \ - -device virtio-blk-device,drive=hd1 \ - -append "init in/combined.scm" \ - > "$TRANSCRIPT" 2>&1 & + seed_arch=${SEED_ARCH:-aarch64} + case "$seed_arch" in + aarch64) + qemu-system-aarch64 \ + -machine virt,gic-version=3,accel=hvf -cpu host -m "$mem" \ + -nographic -no-reboot \ + -global virtio-mmio.force-legacy=false \ + -kernel "$KERNEL_IMAGE" \ + -drive file="$S_STAGE_DIR/in.img",if=none,format=raw,id=hd0,readonly=on \ + -device virtio-blk-device,drive=hd0 \ + -drive file="$S_STAGE_DIR/out.img",if=none,format=raw,id=hd1 \ + -device virtio-blk-device,drive=hd1 \ + -append "init in/combined.scm" \ + > "$TRANSCRIPT" 2>&1 & + ;; + riscv64) + # No hvf accel on Apple Silicon for riscv64 — TCG only. + qemu-system-riscv64 \ + -machine virt -m "$mem" \ + -nographic -no-reboot \ + -global virtio-mmio.force-legacy=false \ + -kernel "$KERNEL_IMAGE" \ + -drive file="$S_STAGE_DIR/in.img",if=none,format=raw,id=hd0,readonly=on \ + -device virtio-blk-device,drive=hd0 \ + -drive file="$S_STAGE_DIR/out.img",if=none,format=raw,id=hd1 \ + -device virtio-blk-device,drive=hd1 \ + -append "init in/combined.scm" \ + > "$TRANSCRIPT" 2>&1 & + ;; + *) + echo "[runscm/seed] unsupported SEED_ARCH=$seed_arch" >&2 + exit 2 + ;; + esac QPID=$! ( sleep "$timeout"; kill -9 $QPID 2>/dev/null ) </dev/null >/dev/null 2>&1 & WATCHER=$! 
diff --git a/scripts/libc-flatten.sh b/scripts/libc-flatten.sh
@@ -158,6 +158,17 @@ apply_simple_patch \
     "$STAGE/stdio/vsnprintf.c" \
     "$PATCHES/vsnprintf-int-promo.before" \
     "$PATCHES/vsnprintf-int-promo.after"
+# abtol uses an `int` accumulator, which overflows for values that don't
+# fit in 32-bit signed (e.g. 0x80200000 — riscv64 OpenSBI kernel base).
+# strtol/strtoul/strtoull all bottom out here, so the overflow propagates
+# through everywhere mes-libc's number parsers are called. Concretely,
+# tcc3 (linked against mes-libc) mishandles `-Wl,-Ttext=0x80200000` and
+# emits an ELF with sign-extended vaddr=0xffffffff80200000 that QEMU
+# rejects. Switching the accumulator to `long` fixes the parse.
+apply_simple_patch \
+    "$STAGE/mes/abtol.c" \
+    "$PATCHES/abtol-long-accumulator.before" \
+    "$PATCHES/abtol-long-accumulator.after"
 # --- (3) flatten via host preprocessor --------------------------------
 HOST_CC=${HOST_CC:-cc}
diff --git a/scripts/simple-patches/tcc-0.9.26/riscv64-cvt-int-zext.after b/scripts/simple-patches/tcc-0.9.26/riscv64-cvt-int-zext.after
@@ -0,0 +1,28 @@
+        if ((sbt & VT_BTYPE) != VT_LLONG &&
+            (sbt & VT_BTYPE) != VT_PTR &&
+            (sbt & VT_BTYPE) != VT_FUNC) {
+            /* need to convert from 32bit to 64bit */
+            gv(RC_INT);
+#if defined(TCC_TARGET_RISCV64)
+            /* riscv64: gen_cvt_sxtw handles both directions
+             * (sign- or zero-extend) based on vtop->type. Don't
+             * skip the unsigned case — RV64 32-bit ops sign-extend
+             * their results, so an `unsigned int` register value
+             * may have garbage upper bits that must be cleared
+             * before widening. See riscv64-gen.c. */
+            gen_cvt_sxtw();
+#else
+            if (sbt != (VT_INT | VT_UNSIGNED)) {
+#if defined(TCC_TARGET_ARM64)
+                gen_cvt_sxtw();
+#elif defined(TCC_TARGET_X86_64)
+                int r = gv(RC_INT);
+                /* x86_64 specific: movslq */
+                o(0x6348);
+                o(0xc0 + (REG_VALUE(r) << 3) + REG_VALUE(r));
+#else
+#error
+#endif
+            }
+#endif
+        }
diff --git a/scripts/simple-patches/tcc-0.9.26/riscv64-cvt-int-zext.before b/scripts/simple-patches/tcc-0.9.26/riscv64-cvt-int-zext.before
@@ -0,0 +1,18 @@
+        if ((sbt & VT_BTYPE) != VT_LLONG &&
+            (sbt & VT_BTYPE) != VT_PTR &&
+            (sbt & VT_BTYPE) != VT_FUNC) {
+            /* need to convert from 32bit to 64bit */
+            gv(RC_INT);
+            if (sbt != (VT_INT | VT_UNSIGNED)) {
+#if defined(TCC_TARGET_ARM64) || defined(TCC_TARGET_RISCV64)
+                gen_cvt_sxtw();
+#elif defined(TCC_TARGET_X86_64)
+                int r = gv(RC_INT);
+                /* x86_64 specific: movslq */
+                o(0x6348);
+                o(0xc0 + (REG_VALUE(r) << 3) + REG_VALUE(r));
+#else
+#error
+#endif
+            }
+        }
diff --git a/scripts/simple-patches/tcc-0.9.26/riscv64-gen-cvt-sxtw.after b/scripts/simple-patches/tcc-0.9.26/riscv64-gen-cvt-sxtw.after
@@ -0,0 +1,19 @@
+ST_FUNC void gen_cvt_sxtw(void)
+{
+    /* int -> long widening on riscv64. RV64 32-bit ops sign-extend their
+       results into bits 63:32, so for *signed* int -> long the register is
+       already correctly extended (a no-op suffices, or a defensive
+       addiw r,r,0). For *unsigned* int -> ulong we must explicitly zero
+       the upper 32 bits via slli;srli — otherwise an unsigned value with
+       bit 31 set (e.g. 0x80000000) is read back as 0xffffffff80000000.
+       Without this, kernel.c's `((u64)be32(p) << 32) | (u64)be32(p+4)`
+       produces sign-extended garbage for any DTB cell whose top bit is
+       set, which the seed kernel hits on the riscv64 virt machine. */
+    int r = ireg(gv(RC_INT));
+    if (vtop->type.t & VT_UNSIGNED) {
+        EI(0x13, 1, r, r, 32); // slli r, r, 32
+        EI(0x13, 5, r, r, 32); // srli r, r, 32
+    } else {
+        EI(0x1b, 0, r, r, 0); // addiw r, r, 0 (sign-extend low 32)
+    }
+}
diff --git a/scripts/simple-patches/tcc-0.9.26/riscv64-gen-cvt-sxtw.before b/scripts/simple-patches/tcc-0.9.26/riscv64-gen-cvt-sxtw.before
@@ -0,0 +1,5 @@
+ST_FUNC void gen_cvt_sxtw(void)
+{
+    /* XXX on risc-v the registers are usually sign-extended already.
+       Let's try to not do anything here. */
+}
diff --git a/scripts/simple-patches/tcc-0.9.26/riscv64-load-ptr-zext.after b/scripts/simple-patches/tcc-0.9.26/riscv64-load-ptr-zext.after
@@ -0,0 +1,11 @@
+        } else if (bt == VT_LLONG || bt == VT_PTR || bt == VT_FUNC) {
+            /* A 32bit unsigned constant being loaded into a 64-bit
+               type slot (long long, pointer, or function pointer).
+               `lui` always sign-extends bits 63:32, so for any value
+               with bit 31 set (e.g. seed-kernel ARCH_KERNEL_HEAP_END
+               = 0x8b000000UL cast to u8*) the upper bits would come
+               out as 0xffffffff. Stock tcc 0.9.26 only handles the
+               VT_LLONG case here; widen the check to cover pointers
+               and function pointers too. */
+            zext = 1;
+        }
diff --git a/scripts/simple-patches/tcc-0.9.26/riscv64-load-ptr-zext.before b/scripts/simple-patches/tcc-0.9.26/riscv64-load-ptr-zext.before
@@ -0,0 +1,5 @@
+        } else if (bt == VT_LLONG) {
+            /* A 32bit unsigned constant for a 64bit type.
+               lui always sign extends, so we need to do an explicit zext.*/
+            zext = 1;
+        }
diff --git a/scripts/stage1-flatten.sh b/scripts/stage1-flatten.sh
@@ -246,6 +246,18 @@ awk '{ sub(/\t#.*$/, ""); print }' "$SRC/lib/alloca86_64-bt.S" \
     > "$SRC/lib/alloca86_64-bt.S.tmp"
 mv "$SRC/lib/alloca86_64-bt.S.tmp" "$SRC/lib/alloca86_64-bt.S"
+# riscv64 int->llong cast: stock tcc 0.9.26 leaves unsigned int values
+# in their native register width, but RV64 32-bit ops sign-extend bits
+# 63:32, so widening an `unsigned int` to `unsigned long` reads garbage
+# upper bits. Make gen_cvt_sxtw do the right thing for both signs, and
+# always invoke it on riscv64. Hits be64() in the seed kernel's DTB
+# parser; without the fix the kernel sees mem_start sign-extended to
+# 0xffffffff80000000 and the boot panics during MMU bring-up. Patch is
+# gated by the call-site / function name so it no-ops on other arches.
+apply_our_patch riscv64-cvt-int-zext "$SRC/tccgen.c"
+apply_our_patch riscv64-gen-cvt-sxtw "$SRC/riscv64-gen.c"
+apply_our_patch riscv64-load-ptr-zext "$SRC/riscv64-gen.c"
+
 # riscv64 stdarg.h order fix — the upstream `#elif __riscv` branch
 # uses `__builtin_va_list` before it's typedef'd. Stock tcc treats
 # `__builtin_va_list` as a built-in keyword and forgives the forward
diff --git a/seed-kernel/arch/amd64/kernel.S b/seed-kernel/arch/amd64/kernel.S
@@ -7,43 +7,70 @@
     .long _start
 .section .text, "ax"
-.code32
 .globl _start
 _start:
-    cli
-    movl $boot_stack_top, %esp
-    lgdt boot_gdt64_ptr
-
-    movl %cr4, %eax
-    orl $0x20, %eax
-    movl %eax, %cr4
-
-    movl $boot_pml4, %eax
-    movl %eax, %cr3
-
-    movl $0xc0000080, %ecx
-    rdmsr
-    orl $0x100, %eax
-    wrmsr
-
-    movl %cr0, %eax
-    orl $0x80000001, %eax
-    movl %eax, %cr0
-
-    ljmp $0x08, $long_mode
-
-.code64
+    /* QEMU's PVH path enters this address in 32-bit protected mode.
+     * Keep this transition stub as raw encodings so the same source can
+     * be assembled by TCC's x86_64-only assembler.
*/ + .byte 0xfa /* cli */ + .byte 0xbc; .long boot_stack_top /* movl $boot_stack_top,%esp */ + .byte 0x0f,0x01,0x15; .long boot_gdt64_ptr + /* lgdt boot_gdt64_ptr */ + .byte 0x0f,0x20,0xe0 /* movl %cr4,%eax */ + .byte 0x83,0xc8,0x20 /* orl $0x20,%eax */ + .byte 0x0f,0x22,0xe0 /* movl %eax,%cr4 */ + + .byte 0xbf; .long boot_pml4 /* movl $boot_pml4,%edi */ + .byte 0x31,0xc0 /* xorl %eax,%eax */ + .byte 0xb9; .long 0x1800 /* movl $(6*4096/4),%ecx */ + .byte 0xf3,0xab /* rep stosl */ + + .byte 0xc7,0x05; .long boot_pml4; .long boot_pdpt + 0x003 + .byte 0xc7,0x05; .long boot_pml4 + 4; .long 0 + .byte 0xc7,0x05; .long boot_pml4 + 2048; .long boot_pdpt + 0x003 + .byte 0xc7,0x05; .long boot_pml4 + 2052; .long 0 + + .byte 0xc7,0x05; .long boot_pdpt; .long boot_pd0 + 0x003 + .byte 0xc7,0x05; .long boot_pdpt + 4; .long 0 + .byte 0xc7,0x05; .long boot_pdpt + 8; .long boot_pd1 + 0x003 + .byte 0xc7,0x05; .long boot_pdpt + 12; .long 0 + .byte 0xc7,0x05; .long boot_pdpt + 16; .long boot_pd2 + 0x003 + .byte 0xc7,0x05; .long boot_pdpt + 20; .long 0 + .byte 0xc7,0x05; .long boot_pdpt + 24; .long boot_pd3 + 0x003 + .byte 0xc7,0x05; .long boot_pdpt + 28; .long 0 + + .byte 0xbf; .long boot_pd0 /* movl $boot_pd0,%edi */ + .byte 0xb8; .long 0x083 /* movl $0x83,%eax */ + .byte 0xb9; .long 2048 /* movl $2048,%ecx */ +1: + .byte 0x89,0x07 /* movl %eax,(%edi) */ + .byte 0xc7,0x47,0x04; .long 0 /* movl $0,4(%edi) */ + .byte 0x05; .long 0x200000 /* addl $0x200000,%eax */ + .byte 0x83,0xc7,0x08 /* addl $8,%edi */ + .byte 0xe2,0xed /* loop 1b */ + + .byte 0xb8; .long boot_pml4 /* movl $boot_pml4,%eax */ + .byte 0x0f,0x22,0xd8 /* movl %eax,%cr3 */ + .byte 0xb9; .long 0xc0000080 /* movl $0xc0000080,%ecx */ + .byte 0x0f,0x32 /* rdmsr */ + .byte 0x0d; .long 0x100 /* orl $0x100,%eax */ + .byte 0x0f,0x30 /* wrmsr */ + .byte 0x0f,0x20,0xc0 /* movl %cr0,%eax */ + .byte 0x0d; .long 0x80000001 /* orl $0x80000001,%eax */ + .byte 0x0f,0x22,0xc0 /* movl %eax,%cr0 */ + .byte 0xea; .long long_mode; .word 0x08 + 
/* ljmp $0x08,$long_mode */ long_mode: movw $0x10, %ax movw %ax, %ds movw %ax, %es movw %ax, %ss - movabsq $kstack_top, %rsp + movq $kstack_top, %rsp call amd64_serial_init - movabsq $__bss_start, %rdi - movabsq $_end, %rsi + movq $__bss_start, %rdi + movq $_end, %rsi xorl %eax, %eax 1: cmpq %rsi, %rdi @@ -112,7 +139,7 @@ amd64_int80: movq 104(%rsp), %r15 movq 112(%rsp), %rbp addq $216, %rsp - iretq + .byte 0x48,0xcf /* iretq */ .globl amd64_unhandled amd64_unhandled: @@ -131,7 +158,7 @@ amd64_unhandled: .globl eret_to_user eret_to_user: movq %rsi, saved_user_sp(%rip) - movabsq $kstack_top, %rsp + movq $kstack_top, %rsp pushq $0x1b pushq %rsi pushq $0x202 @@ -152,7 +179,7 @@ eret_to_user: xorq %r13, %r13 xorq %r14, %r14 xorq %r15, %r15 - iretq + .byte 0x48,0xcf /* iretq */ .globl arch_read_user_sp arch_read_user_sp: @@ -182,7 +209,7 @@ arch_idle_forever: .globl arch_mmio_ptr arch_mmio_ptr: - movabsq $0xffff800000000000, %rax + movq $0xffff800000000000, %rax addq %rdi, %rax ret @@ -242,7 +269,7 @@ amd64_lgdt: pushq $0x08 leaq 1f(%rip), %rax pushq %rax - lretq + .byte 0x48,0xcb /* lretq */ 1: ret @@ -293,53 +320,24 @@ boot_gdt64_ptr: .section .data, "aw" .align 4096 boot_pml4: - .quad boot_pdpt + 0x003 - .rept 255 - .quad 0 - .endr - .quad boot_pdpt + 0x003 - .rept 255 - .quad 0 - .endr + .skip 4096 .align 4096 boot_pdpt: - .quad boot_pd0 + 0x003 - .quad boot_pd1 + 0x003 - .quad boot_pd2 + 0x003 - .quad boot_pd3 + 0x003 - .rept 508 - .quad 0 - .endr + .skip 4096 .align 4096 boot_pd0: - .set i, 0 - .rept 512 - .quad (i * 0x200000) + 0x083 - .set i, i + 1 - .endr + .skip 4096 .align 4096 boot_pd1: - .set i, 512 - .rept 512 - .quad (i * 0x200000) + 0x083 - .set i, i + 1 - .endr + .skip 4096 .align 4096 boot_pd2: - .set i, 1024 - .rept 512 - .quad (i * 0x200000) + 0x083 - .set i, i + 1 - .endr + .skip 4096 .align 4096 boot_pd3: - .set i, 1536 - .rept 512 - .quad (i * 0x200000) + 0x083 - .set i, i + 1 - .endr + .skip 4096 .section .bss, "aw", @nobits .align 16 diff --git 
a/seed-kernel/arch/riscv64/kernel.S b/seed-kernel/arch/riscv64/kernel.S @@ -1,135 +1,210 @@ .section .text, "ax" + +#ifdef __TINYC__ +/* tcc 0.9.26's riscv64 assembler doesn't accept GAS's `off(reg)` form + * for ld/sd/sw and parses three-comma operands as `<rs1>, <rs2>, <imm>`, + * where rs1 is the base for both loads (rd ← [rs1+imm]) and stores + * ([rs1+imm] ← rs2). So `SD(rs, base, off)` must emit base FIRST. */ +#define LA(rd, sym) lla rd, sym +#define LD(rd, base, off) ld rd, base, off +#define SD(rs, base, off) sd base, rs, off +#define SW(rs, base, off) sw base, rs, off +#define J(label) jal zero, label +#define RET jalr zero, ra, 0 +#define CSRW_STVEC_T0 .long 0x10529073 +#define CSRRW_SP_SSCRATCH_SP .long 0x14011173 +#define CSRR_T0_SSCRATCH .long 0x140022f3 +#define CSRR_T0_SEPC .long 0x141022f3 +#define CSRR_A0_SCAUSE .long 0x14202573 +#define CSRR_T0_SSTATUS .long 0x100022f3 +#define CSRW_SEPC_T0 .long 0x14129073 +#define CSRW_SSTATUS_T0 .long 0x10029073 +#define CSRW_SSCRATCH_T1 .long 0x14031073 +#define CSRW_SSCRATCH_T0 .long 0x14029073 +#define CSRW_SEPC_A0 .long 0x14151073 +#define CSRR_A0_STVAL .long 0x14302573 +#define CSRW_SATP_A0 .long 0x18051073 +#define SFENCE_VMA .long 0x12000073 +#define FENCE_I .long 0x0000100f +#define FENCE_W_W .long 0x0110000f +#define FENCE_R_R .long 0x0220000f +#define SRET .long 0x10200073 +#define NOP .long 0x00000013 +#else +#define LA(rd, sym) la rd, sym +#define LD(rd, base, off) ld rd, off(base) +#define SD(rs, base, off) sd rs, off(base) +#define SW(rs, base, off) sw rs, off(base) +#define J(label) j label +#define RET ret +#define CSRW_STVEC_T0 csrw stvec, t0 +#define CSRRW_SP_SSCRATCH_SP csrrw sp, sscratch, sp +#define CSRR_T0_SSCRATCH csrr t0, sscratch +#define CSRR_T0_SEPC csrr t0, sepc +#define CSRR_A0_SCAUSE csrr a0, scause +#define CSRR_T0_SSTATUS csrr t0, sstatus +#define CSRW_SEPC_T0 csrw sepc, t0 +#define CSRW_SSTATUS_T0 csrw sstatus, t0 +#define CSRW_SSCRATCH_T1 csrw sscratch, t1 +#define 
CSRW_SSCRATCH_T0 csrw sscratch, t0 +#define CSRW_SEPC_A0 csrw sepc, a0 +#define CSRR_A0_STVAL csrr a0, stval +#define CSRW_SATP_A0 csrw satp, a0 +#define SFENCE_VMA sfence.vma +#define FENCE_I fence.i +#define FENCE_W_W fence w, w +#define FENCE_R_R fence r, r +#define SRET sret +#define NOP nop +#endif + .globl _start _start: mv s0, a1 - la sp, kstack_top - la t0, trap_entry - csrw stvec, t0 + LA(sp, kstack_top) + LA(t0, trap_entry) + CSRW_STVEC_T0 - la t0, __bss_start - la t1, _end + LA(t0, __bss_start) + LA(t1, _end) 1: +#ifdef __TINYC__ + bgeu t0, t1, 16 +#else bgeu t0, t1, 2f - sd zero, 0(t0) +#endif + SD(zero, t0, 0) addi t0, t0, 8 - j 1b + J(1b) 2: mv a0, s0 - call kmain + jal ra, kmain 3: wfi - j 3b + J(3b) .align 2 .globl trap_entry trap_entry: - csrrw sp, sscratch, sp + CSRRW_SP_SSCRATCH_SP addi sp, sp, -272 - sd x1, 8(sp) - csrr t0, sscratch - sd t0, 16(sp) - la t1, saved_user_sp - sd t0, 0(t1) - sd x3, 24(sp) - sd x4, 32(sp) - sd x5, 40(sp) - sd x6, 48(sp) - sd x7, 56(sp) - sd x8, 64(sp) - sd x9, 72(sp) - sd x10, 80(sp) - sd x11, 88(sp) - sd x12, 96(sp) - sd x13, 104(sp) - sd x14, 112(sp) - sd x15, 120(sp) - sd x16, 128(sp) - sd x17, 136(sp) - sd x18, 144(sp) - sd x19, 152(sp) - sd x20, 160(sp) - sd x21, 168(sp) - sd x22, 176(sp) - sd x23, 184(sp) - sd x24, 192(sp) - sd x25, 200(sp) - sd x26, 208(sp) - sd x27, 216(sp) - sd x28, 224(sp) - sd x29, 232(sp) - sd x30, 240(sp) - sd x31, 248(sp) - - csrr t0, sepc - csrr a0, scause + /* Save user regs first — including t0 (x5) and t1 (x6) — *before* + * we clobber them for trap-entry bookkeeping. Linux's RISC-V + * syscall ABI preserves every GPR except a0; if t0/t1 carry caller + * state across an ecall, they must be restored. trap_entry's own + * scratch must therefore come out of a register the kernel has + * already preserved on the stack. */ + SD(x1, sp, 8) + SD(x3, sp, 24) + SD(x4, sp, 32) + SD(x5, sp, 40) + SD(x6, sp, 48) + /* t0 and t1 are now in tf->x[5]/tf->x[6]; safe to clobber. 
*/ + CSRR_T0_SSCRATCH + SD(t0, sp, 16) + LA(t1, saved_user_sp) + SD(t0, t1, 0) + SD(x7, sp, 56) + SD(x8, sp, 64) + SD(x9, sp, 72) + SD(x10, sp, 80) + SD(x11, sp, 88) + SD(x12, sp, 96) + SD(x13, sp, 104) + SD(x14, sp, 112) + SD(x15, sp, 120) + SD(x16, sp, 128) + SD(x17, sp, 136) + SD(x18, sp, 144) + SD(x19, sp, 152) + SD(x20, sp, 160) + SD(x21, sp, 168) + SD(x22, sp, 176) + SD(x23, sp, 184) + SD(x24, sp, 192) + SD(x25, sp, 200) + SD(x26, sp, 208) + SD(x27, sp, 216) + SD(x28, sp, 224) + SD(x29, sp, 232) + SD(x30, sp, 240) + SD(x31, sp, 248) + + CSRR_T0_SEPC + CSRR_A0_SCAUSE li t1, 8 +#ifdef __TINYC__ + bne a0, t1, 8 +#else bne a0, t1, 4f +#endif addi t0, t0, 4 4: - sd t0, 256(sp) - csrr t0, sstatus - sd t0, 264(sp) + SD(t0, sp, 256) + CSRR_T0_SSTATUS + SD(t0, sp, 264) mv a1, sp - call trap_sync - - ld t0, 256(sp) - csrw sepc, t0 - ld t0, 264(sp) - csrw sstatus, t0 - - ld x1, 8(sp) - ld x3, 24(sp) - ld x4, 32(sp) - ld x7, 56(sp) - ld x8, 64(sp) - ld x9, 72(sp) - ld x10, 80(sp) - ld x11, 88(sp) - ld x12, 96(sp) - ld x13, 104(sp) - ld x14, 112(sp) - ld x15, 120(sp) - ld x16, 128(sp) - ld x17, 136(sp) - ld x18, 144(sp) - ld x19, 152(sp) - ld x20, 160(sp) - ld x21, 168(sp) - ld x22, 176(sp) - ld x23, 184(sp) - ld x24, 192(sp) - ld x25, 200(sp) - ld x26, 208(sp) - ld x27, 216(sp) - ld x28, 224(sp) - ld x29, 232(sp) - ld x30, 240(sp) - ld x31, 248(sp) - - la t0, saved_user_sp - ld t1, 0(t0) - csrw sscratch, t1 - ld x5, 40(sp) - ld x6, 48(sp) + jal ra, trap_sync + + LD(t0, sp, 256) + CSRW_SEPC_T0 + LD(t0, sp, 264) + CSRW_SSTATUS_T0 + + LD(x1, sp, 8) + LD(x3, sp, 24) + LD(x4, sp, 32) + LD(x7, sp, 56) + LD(x8, sp, 64) + LD(x9, sp, 72) + LD(x10, sp, 80) + LD(x11, sp, 88) + LD(x12, sp, 96) + LD(x13, sp, 104) + LD(x14, sp, 112) + LD(x15, sp, 120) + LD(x16, sp, 128) + LD(x17, sp, 136) + LD(x18, sp, 144) + LD(x19, sp, 152) + LD(x20, sp, 160) + LD(x21, sp, 168) + LD(x22, sp, 176) + LD(x23, sp, 184) + LD(x24, sp, 192) + LD(x25, sp, 200) + LD(x26, sp, 208) + LD(x27, sp, 216) + LD(x28, 
sp, 224) + LD(x29, sp, 232) + LD(x30, sp, 240) + LD(x31, sp, 248) + + LA(t0, saved_user_sp) + LD(t1, t0, 0) + CSRW_SSCRATCH_T1 + LD(x5, sp, 40) + LD(x6, sp, 48) addi sp, sp, 272 - csrrw sp, sscratch, sp - sret + CSRRW_SP_SSCRATCH_SP + SRET .globl eret_to_user eret_to_user: - la t0, saved_user_sp - sd a1, 0(t0) - la t0, kstack_top - csrw sscratch, t0 - csrw sepc, a0 - csrr t0, sstatus - li t1, ~(1 << 8) + LA(t0, saved_user_sp) + SD(a1, t0, 0) + LA(t0, kstack_top) + CSRW_SSCRATCH_T0 + CSRW_SEPC_A0 + CSRR_T0_SSTATUS + li t1, -257 and t0, t0, t1 - li t1, (1 << 5) | (1 << 18) + lui t1, 0x40 + addi t1, t1, 32 or t0, t0, t1 - csrw sstatus, t0 + CSRW_SSTATUS_T0 mv sp, a1 li ra, 0 li gp, 0 @@ -161,92 +236,102 @@ eret_to_user: li t4, 0 li t5, 0 li t6, 0 - sret + SRET .globl arch_read_user_sp arch_read_user_sp: - la t0, saved_user_sp - ld a0, 0(t0) - ret + LA(t0, saved_user_sp) + LD(a0, t0, 0) + RET .globl arch_write_user_sp arch_write_user_sp: - la t0, saved_user_sp - sd a0, 0(t0) - ret + LA(t0, saved_user_sp) + SD(a0, t0, 0) + RET .globl arch_fault_addr arch_fault_addr: - csrr a0, stval - ret + CSRR_A0_STVAL + RET .globl arch_pause arch_pause: - nop - ret + NOP + RET .globl arch_idle_forever arch_idle_forever: 1: wfi - j 1b + J(1b) .globl arch_mmio_ptr arch_mmio_ptr: - li t0, 0x100000000 + li t0, 1 + slli t0, t0, 32 add a0, a0, t0 - ret + RET .globl arch_wmb arch_wmb: - fence w, w - ret + FENCE_W_W + RET .globl arch_rmb arch_rmb: - fence r, r - ret + FENCE_R_R + RET .globl arch_icache_sync arch_icache_sync: - fence.i - ret + FENCE_I + RET .globl arch_icache_context_sync arch_icache_context_sync: - sfence.vma - fence.i - ret + SFENCE_VMA + FENCE_I + RET .globl riscv_write_satp riscv_write_satp: - csrw satp, a0 - sfence.vma - ret + CSRW_SATP_A0 + SFENCE_VMA + RET .globl riscv_set_sum riscv_set_sum: - csrr t0, sstatus - li t1, (1 << 18) + CSRR_T0_SSTATUS + lui t1, 0x40 or t0, t0, t1 - csrw sstatus, t0 - ret + CSRW_SSTATUS_T0 + RET .globl arch_system_off arch_system_off: - li 
t0, 0x100000 - li t1, 0x5555 - sw t1, 0(t0) + lui t0, 0x100 + lui t1, 5 + addi t1, t1, 1365 + SW(t1, t0, 0) 1: wfi - j 1b + J(1b) .section .bss, "aw", @nobits +#ifdef __TINYC__ +.align 8 +#else .align 3 +#endif .globl saved_user_sp saved_user_sp: .skip 8 +#ifdef __TINYC__ +.align 16 +#else .align 4 +#endif .skip 0x10000 .globl kstack_top kstack_top: diff --git a/vendor/mes-libc/patches/abtol-long-accumulator.after b/vendor/mes-libc/patches/abtol-long-accumulator.after @@ -0,0 +1,9 @@ + char const *s = p[0]; + /* Use a `long` accumulator so values that don't fit in 32-bit signed + * (e.g. 0x80200000 — riscv64's OpenSBI kernel base) don't overflow + * to a sign-extended negative. Affects strtol/strtoul/strtoull, + * which all bottom out in this routine. Without this, tcc3 mishandles + * `-Wl,-Ttext=0x80200000` and emits an ELF with vaddr=0xffffffff80200000. + */ + long i = 0; + int sign_p = 0; diff --git a/vendor/mes-libc/patches/abtol-long-accumulator.before b/vendor/mes-libc/patches/abtol-long-accumulator.before @@ -0,0 +1,3 @@ + char const *s = p[0]; + int i = 0; + int sign_p = 0;
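Note on the sign-extension fixes in this commit: the block below is a minimal sketch in portable C, not code from this repository, of why 0x80200000 turns into 0xffffffff80200000 when a 32-bit value is widened as signed, and how the slli/srli pair undoes it. The helper names (parse_hex32, widen_signed, widen_unsigned) are hypothetical. The uint32_t accumulator is an assumption that only models the 32-bit wraparound of abtol's original `int` accumulator without relying on signed overflow.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 32-bit hex parser. It accumulates in
 * uint32_t, so anything above 32 bits wraps instead of overflowing. */
static uint32_t parse_hex32(const char *s)
{
    uint32_t acc = 0;
    if (s[0] == '0' && (s[1] == 'x' || s[1] == 'X'))
        s += 2;
    for (; *s; s++) {
        uint32_t d;
        if (*s >= '0' && *s <= '9')
            d = (uint32_t)(*s - '0');
        else if (*s >= 'a' && *s <= 'f')
            d = (uint32_t)(*s - 'a' + 10);
        else if (*s >= 'A' && *s <= 'F')
            d = (uint32_t)(*s - 'A' + 10);
        else
            break;                      /* stop at the first non-hex digit */
        acc = acc * 16u + d;
    }
    return acc;
}

/* Models `addiw r, r, 0`: bit 31 is copied into bits 63:32. */
static uint64_t widen_signed(uint32_t low32)
{
    uint64_t v = low32;
    return (v & 0x80000000u) ? (v | 0xffffffff00000000ULL) : v;
}

/* Models `slli r, r, 32; srli r, r, 32`: bits 63:32 are cleared. */
static uint64_t widen_unsigned(uint64_t reg)
{
    return (reg << 32) >> 32;
}
```

For instance, widen_signed(parse_hex32("0x80200000")) gives 0xffffffff80200000, the bogus vaddr described in the abtol patch comment, while widen_unsigned() applied to that value gives back 0x80200000.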