boot2

Playing with the bootstrap
git clone https://git.ryansepassi.com/git/boot2.git

commit afba887f0c618419dd2bd013b6326c27d03b9bd9
parent 9e1803cb83d77b9d2501e034d8be1631a25020b0
Author: Ryan Sepassi <rsepassi@gmail.com>
Date:   Wed,  6 May 2026 11:03:09 -0700

seed-kernel: riscv64 DRIVER=seed end-to-end fixes

Three blockers between boot0 and boot6 under DRIVER=seed riscv64:

- hex2 stage 4 panicked because brk_base = g_user_image_end (16-byte
  aligned past end-of-image) collided with hex2's PC-relative scratch
  buffer at 0x6007e8: write_string's 8-byte sd-zero terminator at the
  buffer tail landed on the first heap node's str pointer, which a
  later lookup walked into and null-deref'd. Round brk_base up to the
  next 4 KiB page in kmain and sys_spawn (Linux's convention).

- boot4 stage D faulted in mem_cpy storing to 0x10000 because tcc
  0.9.26's riscv64-link.c defaults ELF_START_ADDR=0x10000, below
  USER_VA_LO=0x200000. Add simple-patch riscv64-elf-start-addr to move
  the default to 0x600000, matching the rest of the chain
  (M0/hex2pp -B, scheme1, hex2, ...). Belt-and-suspenders:
  boot4-gen-runscm.sh and boot5-gen-runscm.sh also pass
  -Wl,-Ttext=0x600000 on riscv64 link lines so the chain is robust
  even if a future tcc.flat.c regen lands without the patch.

- boot5 (-m 3072M) couldn't read the DTB: OpenSBI placed it at PA
  0x13fe00000, beyond the kernel's L2 direct map. Extend
  arch/riscv64/mmu.c's kernel direct map to L2 slots 2-7
  (PA 0x80000000..0x1ffffffff, 6 GiB), and relocate the device alias
  to L2 slot 8 (ARCH_DEVICE_ALIAS_BASE = 1<<33). arch_mmio_ptr in
  kernel.S shifts by 33 to match.
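
The slot arithmetic behind the third fix can be sketched in C. This is
an illustration, not code from the tree: l2_slot, mmio_alias and
check_slots are hypothetical helpers, and the sketch assumes each
root-table ("L2") slot spans 1 GiB, as the mmu.c comment in this commit
states.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: one root-table slot covers 1 GiB, so slot = address >> 30. */
#define GIB_SHIFT 30
/* Value of ARCH_DEVICE_ALIAS_BASE after this commit (1 << 33). */
#define DEVICE_ALIAS_BASE (1ULL << 33)

/* Hypothetical helper: root-table slot that translates a given address. */
static inline unsigned l2_slot(uint64_t addr) {
    return (unsigned)(addr >> GIB_SHIFT);
}

/* Same arithmetic as arch_mmio_ptr: alias VA = device PA + alias base. */
static inline uint64_t mmio_alias(uint64_t pa) {
    return pa + DEVICE_ALIAS_BASE;
}

/* Checks the addresses quoted in the commit message. */
void check_slots(void) {
    /* DTB at PA 0x13fe00000 (-m 3072M) falls in slot 4, which the old
     * layout used for the device alias rather than for DRAM. */
    assert(l2_slot(0x13fe00000ULL) == 4);
    /* The direct map now spans slots 2..7. */
    assert(l2_slot(0x80000000ULL) == 2);
    assert(l2_slot(0x1ffffffffULL) == 7);
    /* UART0 (PA 0x10000000) is reached through slot 8. */
    assert(mmio_alias(0x10000000ULL) == 0x210000000ULL);
    assert(l2_slot(mmio_alias(0x10000000ULL)) == 8);
}
```

Calling check_slots() from a test program passes silently if the layout
is as described: the DTB address lands inside the extended direct map
and the relocated alias no longer shares a slot with DRAM.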

Per-stage validation: boot0..boot6 each pass under DRIVER=seed.
End-to-end scripts/boot.sh riscv64 not yet rerun after the fixes.

Drop the SEED-RISCV64-TODO.md working doc; the riscv64 path is at
parity with aarch64 at the per-stage level.

Diffstat:
D docs/SEED-RISCV64-TODO.md | 196 -------------------------------------------------------------------------------
M scripts/boot4-gen-runscm.sh | 22 +++++++++++++++++-----
M scripts/boot5-gen-runscm.sh | 11 ++++++++++-
A scripts/simple-patches/tcc-0.9.26/riscv64-elf-start-addr.after | 8 ++++++++
A scripts/simple-patches/tcc-0.9.26/riscv64-elf-start-addr.before | 1 +
M scripts/stage1-flatten.sh | 7 +++++++
M seed-kernel/arch/riscv64/arch.h | 2 +-
M seed-kernel/arch/riscv64/kernel.S | 4 +++-
M seed-kernel/arch/riscv64/mmu.c | 21 +++++++++++++++++----
M seed-kernel/kernel.c | 11 ++++++++++-
10 files changed, 74 insertions(+), 209 deletions(-)

diff --git a/docs/SEED-RISCV64-TODO.md b/docs/SEED-RISCV64-TODO.md
@@ -1,196 +0,0 @@
-# riscv64 seed-kernel TODO
-
-Working doc. Captures the open work needed to get
-`DRIVER=seed ./scripts/boot.sh riscv64` to a clean exit, mirroring the
-aarch64 path. Pairs with `docs/OS.md` (kernel contract) and
-`docs/TCC.md` (compiler).
-
-## Goal
-
-`DRIVER=seed ./scripts/boot.sh riscv64` should run the full
-boot0→boot6 chain entirely *inside* the tcc-built riscv64 seed kernel
-(the kernel is its own build driver, with podman only used to mint
-the very first kernel image). This is the non-negotiable end-to-end
-validation: it exercises every kernel path the chain depends on
-(ELF load, MMU, syscalls, virtio-blk DMA, fork/exec, exit) under
-real workloads — boot4 alone runs ~5000 user-mode tcc invocations
-inside the kernel.
-
-The aarch64 path already passes; this work brings riscv64 to parity.
-
-## What works (May 2026)
-
-- `boot6 riscv64` builds a clean ELF kernel from tcc3, located at
-  `build/riscv64/boot6/kernel.elf`. Loads under OpenSBI on
-  `qemu-system-riscv64 -machine virt`.
-- Kernel reaches `kmain`, parses DTB (mem 0x80000000), brings up
-  virtio-blk, parses cpio, lists tmpfs, loads ELF, erets to user.
-- DRIVER=seed wiring is in place: `boot.sh`, `boot[0-5].sh`,
-  `boot6.sh`, `lib-runscm.sh`, `lib-pipeline.sh` all dispatch to
-  `qemu-system-riscv64` and the right kernel filename
-  (`kernel.elf` vs aarch64's `Image`) when `SEED_ARCH=riscv64`.
-- Boot0 stages 1, 2, 3 run cleanly under DRIVER=seed: the kernel
-  runs `hex0-seed`, `hex0`, `hex1` in user mode and SEEDFS-extracts
-  the correct outputs. This exercises ELF load, eret_to_user,
-  openat/read/write/close/lseek/exit syscalls, and virtio-blk DMA.
-
-## Blocker: boot0 stage 4 user-mode panic
-
-Stage 4 runs `hex2 in/catm.hex2 out/catm` inside the seed kernel.
-Hex2 boots, processes the file, and panics partway through with:
-
-```
-PANIC: user sync, ESR=0x000000000000000d ELR=0x0000000000600730 FAR=0x0000000000000000
-```
-
-— a user-mode load page fault on a null pointer. Same `hex2` binary
-runs correctly under podman/Linux on the same input, so the bug is
-on the seed-kernel side, not in the assembled hex2.
-
-The fault is at `lbu t4, 0(t2)` after `ld t2, 16(t0)`; t2=0 means the
-ld read a zero from memory at `t0+16`. Disassembly says t0 should
-hold s1 (set by `addi t0, s1, 0` four instructions earlier), but the
-trapframe dump on the panic path shows t0=0x6007f0 (initial brk
-address), not s1=0x600905, with no instruction in between that
-modifies t0.
-
-### Investigation so far
-
-One real bug found and fixed: `trap_entry` was clobbering t0/t1
-(using them to read sscratch and to reach `saved_user_sp`) **before**
-saving them to the trapframe. Linux's RISC-V syscall ABI preserves
-all GPRs except a0; user code that holds state in t0/t1 across an
-ecall would otherwise see kernel garbage on return. The fix in
-`seed-kernel/arch/riscv64/kernel.S` reorders the saves: x5 and x6
-are now stashed before any kernel scratch use.
-
-After the fix the dump shows t1=0x6007e8 correctly (user value at
-trap time), but t0 still reads back as 0x6007f0 — so something else
-is going on. Candidates, in order of decreasing likelihood:
-
-1. **More tcc-riscv64 codegen bugs in the kernel itself.** We already
-   landed three: `abtol-long-accumulator` (mes-libc), `riscv64-cvt-int-zext`
-   + `riscv64-gen-cvt-sxtw` (u32→u64 didn't zero-extend), and
-   `riscv64-load-ptr-zext` (lui sign-extending pointer constants).
-   The kernel's trap entry/exit asm is the most exercised path; if
-   tcc miscompiles any C in trap_sync that touches the trapframe,
-   the dump reads garbage.
-2. **Hex2 internal layout I'm misreading without source.** Possible
-   but doesn't explain why podman/Linux works on the same bytes.
-3. **A trap_entry recursion I haven't identified** — e.g., a fault
-   inside trap_entry's saves that triggers a second pass through and
-   overwrites the original trapframe before C sees it.
-
-The dumps to confirm or rule out (1) are checked in but commented
-out; the trap_sync `[sc] nr=…` syscall trace also stays out by
-default to keep boot transcripts short.
-
-## How to repro
-
-```sh
-# One-time prereq (10–15 min):
-DRIVER=podman ./scripts/boot.sh riscv64
-
-# Stash the kernel so the wipe in DRIVER=seed below preserves it:
-mkdir -p build/.seed-bootstrap/riscv64
-cp build/riscv64/boot6/kernel.elf build/.seed-bootstrap/riscv64/
-
-# Reproduce the panic (~3 min into the run, in boot0 stage 4):
-DRIVER=seed ./scripts/boot.sh riscv64
-```
-
-The full per-stage QEMU transcripts land in
-`build/riscv64/.boot0-stage/s04/transcript.txt`.
-
-## Smaller-scope reproducer
-
-Once the chain has been built once, the failing stage can be replayed
-without re-running boot0 stages 1–3:
-
-```sh
-mkdir -p build/.qtest/s04/in
-cp build/riscv64/boot0/hex2 build/.qtest/s04/init
-cp vendor/seed/riscv64/catm.hex2 build/.qtest/s04/in/catm.hex2
-chmod +x build/.qtest/s04/init
-( cd build/.qtest/s04 && { echo init; find in -type f; } | sort -u | \
-    cpio -o -H newc 2>/dev/null ) > build/.qtest/s04/in.img
-truncate -s 256M build/.qtest/s04/out.img
-
-qemu-system-riscv64 -machine virt -m 2048M -nographic -no-reboot \
-    -global virtio-mmio.force-legacy=false \
-    -kernel build/riscv64/boot6/kernel.elf \
-    -drive file=build/.qtest/s04/in.img,if=none,format=raw,id=hd0,readonly=on \
-    -device virtio-blk-device,drive=hd0 \
-    -drive file=build/.qtest/s04/out.img,if=none,format=raw,id=hd1 \
-    -device virtio-blk-device,drive=hd1 \
-    -append "hex2 in/catm.hex2 out/catm"
-```
-
-To turn on the per-syscall trace and panic-time register dump that
-nailed down the trap_entry bug, restore the prints around
-`trap_sync()` in `seed-kernel/kernel.c` (see git history for the
-exact diagnostics — they were removed before commit to keep
-transcripts clean).
-
-## Rough work plan
-
-1. **Re-add the diagnostic prints under a compile-time flag** so
-   they aren't free-text deletes and don't pollute the boot logs by
-   default.
-2. **Identify the t0 mismatch.** Most likely path: write a tiny
-   trap_entry self-test that has the kernel do a deliberate ecall in
-   a known register state and assert tf->x[5] reads back the value
-   it was set to before the ecall. If it fails, the bug is in
-   trap_entry itself (asm-level); if it passes, the bug is somewhere
-   between hex2's PC 0x600720 and 0x600730.
-3. **Walk forward from there.** Each subsequent stage of the chain
-   may surface new tcc-riscv64 codegen issues — boot1, boot2's
-   scheme1, boot3's cc.scm-built tcc0, boot4's tcc1/2/3 self-host —
-   so expect this to be N rounds of *kernel runs → fault → tcc patch
-   or kernel asm fix → boot4 rebuild*. The TCC_BOOTSTRAP_RELAX_FIXEDPOINT
-   knob in `boot4.sh` is there exactly for this loop: each
-   codegen-altering tcc patch needs one extra bootstrap pass before
-   tcc2 == tcc3 settles.
-
-## Patches and source changes already landed
-
-- `vendor/mes-libc/patches/abtol-long-accumulator.{before,after}` —
-  `int i` → `long i` so `strtoull("0x80200000", …, 16)` returns
-  `0x80200000` instead of sign-extending to `0xffffffff80200000`.
-  Without this, tcc3 mishandles `-Wl,-Ttext=0x80200000` on the
-  riscv64 link line and the resulting ELF is unloadable.
-- `scripts/simple-patches/tcc-0.9.26/riscv64-cvt-int-zext.{before,after}`
-  + `riscv64-gen-cvt-sxtw.{before,after}` — make `gen_cvt_sxtw` emit
-  `addiw` for signed and `slli;srli` for unsigned, and remove the
-  call-site gate that skipped the unsigned case. Without this,
-  `(u64)be32(p)` in the seed kernel's DTB parser sign-extends
-  cells whose top bit is set, so `mem_start = 0x80000000` reads
-  back as `0xffffffff80000000`.
-- `scripts/simple-patches/tcc-0.9.26/riscv64-load-ptr-zext.{before,after}`
-  — widens the existing `bt == VT_LLONG` zext check at constant load
-  time to also cover `VT_PTR` and `VT_FUNC`. Without this,
-  `(u8 *)0x8b000000UL` (kheap_end constant) loads as `0xffffffff8b000000`
-  because `lui` always sign-extends bits 63:32.
-- `seed-kernel/arch/riscv64/kernel.S`:
-  - Macros now use `.long` (32-bit) not `.word` — tcc 0.9.26's `.word`
-    is 16-bit, so the encoded CSR-op constants would be truncated.
-  - `SD`/`SW` macros emit base-first (`sd base, src, off`), since
-    tcc's riscv64 assembler parses three-comma stores as
-    `<rs1>, <rs2>, <imm>` rather than GAS's `<src>, <imm>(<base>)`.
-  - `bgeu` offset in the bss-zero loop changed from 12 to 16
-    (off-by-one: 12 lands on the `J(1b)` instruction, not the next-stage
-    label).
-  - `trap_entry` saves x5 (t0) and x6 (t1) **before** any kernel
-    scratch use, instead of reading sscratch into t0 first.
-- `scripts/boot4.sh` gains a `TCC_BOOTSTRAP_RELAX_FIXEDPOINT=1`
-  escape: codegen-altering tcc patches need a second bootstrap pass
-  before `cmp tcc2 tcc3` agrees. The next boot4 run (started from
-  the relaxed run's tcc3) settles back to a real fixed point.
-- `scripts/boot6.sh` and `scripts/boot6-gen-runscm.sh` extended
-  for amd64 + riscv64; emit the right link base address and ELF
-  format per arch.
-- `scripts/lib-runscm.sh` and `scripts/lib-pipeline.sh` dispatch to
-  `qemu-system-riscv64` (TCG only — no hvf for riscv on Apple
-  Silicon, hence ~10× slower per stage than aarch64) when
-  `SEED_ARCH=riscv64`. All the per-stage `boot[0-5].sh` scripts
-  pick the correct kernel filename for the active arch.
diff --git a/scripts/boot4-gen-runscm.sh b/scripts/boot4-gen-runscm.sh
@@ -31,6 +31,18 @@ case "$ARCH" in
     *) echo "boot4-gen: unknown arch $ARCH" >&2; exit 2 ;;
 esac
 
+# Per-arch link base for user binaries. tcc 0.9.26's riscv64-link.c
+# defaults to ELF_START_ADDR=0x10000, which lives below the seed
+# kernel's USER_VA_LO (0x200000). amd64 (0x400000) and aarch64
+# (0x400000) defaults already sit inside the user window, so we leave
+# them alone. Everywhere else in the chain (M0/hex2pp -B, boot6
+# -Wl,-Ttext) we link riscv64 user binaries at 0x600000; do the same
+# here so tcc-built outputs are loadable inside the seed kernel.
+case "$ARCH" in
+    riscv64) LINK_TTEXT='"-Wl,-Ttext=0x600000"' ;;
+    *) LINK_TTEXT= ;;
+esac
+
 # emit_helpers — cc reads .S/.c sources from in/, writes .o to out/.
 # cc_path is the cwd-relative path to the spawned compiler binary (in/tcc0
 # for round B; out/tcc1, out/tcc2 in later rounds).
@@ -71,7 +83,7 @@ emit_archive() {
 
 emit_link_tcc() {
     cc_path=$1; tag=$2; pfx=$3; out=$4
-    echo "(must (run \"$cc_path\" \"-nostdlib\" \"out/${pfx}crt1.o\" \"in/tcc.flat.c\" \"out/${pfx}libc.a\" \"out/${pfx}libtcc1.a\" \"out/${pfx}libc.a\" \"-o\" \"out/$out\") \"$tag -> $out\")"
+    echo "(must (run \"$cc_path\" \"-nostdlib\" $LINK_TTEXT \"out/${pfx}crt1.o\" \"in/tcc.flat.c\" \"out/${pfx}libc.a\" \"out/${pfx}libtcc1.a\" \"out/${pfx}libc.a\" \"-o\" \"out/$out\") \"$tag -> $out\")"
 }
 
 {
@@ -100,7 +112,7 @@ emit_helpers in/tcc0 tcc0
 
 cat <<EOF
 (write-string stdout "boot4: stage C (tcc0 -> tcc1)\n")
-(must (run "in/tcc0" "-nostdlib" "out/start.o" "out/sys_stubs.o" "out/mem.o" "out/libc.o" "out/$LIB_HELPER_OBJ" "in/tcc.flat.c" "-o" "out/tcc1") "tcc0 -> tcc1")
+(must (run "in/tcc0" "-nostdlib" $LINK_TTEXT "out/start.o" "out/sys_stubs.o" "out/mem.o" "out/libc.o" "out/$LIB_HELPER_OBJ" "in/tcc.flat.c" "-o" "out/tcc1") "tcc0 -> tcc1")
 (write-string stdout "boot4: stage D (tcc1 -> tcc2)\n")
 EOF
 
@@ -120,11 +132,11 @@ emit_helpers out/tcc2 tcc2
 emit_archive out/tcc2 tcc2 "s3-"
 emit_link_tcc out/tcc2 tcc2 "s3-" tcc3
 
-cat <<'EPILOGUE'
+cat <<EOF
 (write-string stdout "boot4: linking hello\n")
-(must (run "out/tcc2" "-nostdlib" "out/s3-crt1.o" "in/hello.c" "out/s3-libc.a" "out/s3-libtcc1.a" "out/s3-libc.a" "-o" "out/hello") "tcc2 -> hello")
+(must (run "out/tcc2" "-nostdlib" $LINK_TTEXT "out/s3-crt1.o" "in/hello.c" "out/s3-libc.a" "out/s3-libtcc1.a" "out/s3-libc.a" "-o" "out/hello") "tcc2 -> hello")
 (write-string stdout "boot4: ALL-OK\n")
 (exit 0)
-EPILOGUE
+EOF
 } > "$OUT"
diff --git a/scripts/boot5-gen-runscm.sh b/scripts/boot5-gen-runscm.sh
@@ -47,6 +47,15 @@ CFLAGS_ASM_QUOTED="$CFLAGS_BASE_QUOTED"
 CRTFLAGS_C_QUOTED="$CFLAGS_C_QUOTED \"-fno-stack-protector\" \"-DCRT\""
 CRTFLAGS_ASM_QUOTED="$CFLAGS_ASM_QUOTED \"-fno-stack-protector\" \"-DCRT\""
 
+# tcc 0.9.26's riscv64-link.c default ELF_START_ADDR=0x10000 sits below
+# the seed kernel's USER_VA_LO (0x200000); land riscv64 user binaries
+# in the same 0x600000 window the rest of the chain uses. amd64
+# (0x400000) and aarch64 (0x400000) defaults already fit the window.
+case "$MUSL_ARCH" in
+    riscv64) LINK_TTEXT='"-Wl,-Ttext=0x600000"' ;;
+    *) LINK_TTEXT= ;;
+esac
+
 {
 cat <<'PROLOGUE'
 ;; boot5 run.scm — drive musl-1.2.5 (~500 TUs) + hello.
@@ -128,7 +137,7 @@ cat <<EOF
 
 (write-string stdout "boot5: stage D (link hello)\n")
 ;; -Lout pulls libc.a (just built); -Lin pulls libtcc1.a (input).
-(must (run "in/tcc" "-static" "-nostdinc" "-nostdlib" "-include" "in/tcc-stdarg-bridge.h"
+(must (run "in/tcc" "-static" "-nostdinc" "-nostdlib" "-include" "in/tcc-stdarg-bridge.h" $LINK_TTEXT
     "-I$CIN/include" "-I$CIN/arch/$MUSL_ARCH" "-I$CIN/arch/generic" "-I$CIN/obj/include"
     "out/crt1.o" "in/hello.c" "-Lout" "-lc" "-Lin" "-ltcc1" "-Lout" "-lc"
     "-o" "out/hello") "link hello")
diff --git a/scripts/simple-patches/tcc-0.9.26/riscv64-elf-start-addr.after b/scripts/simple-patches/tcc-0.9.26/riscv64-elf-start-addr.after
@@ -0,0 +1,8 @@
+/* Stock tcc 0.9.26 defaults statically-linked riscv64 binaries to
+   `addr = 0x00010000`, which sits below the seed kernel's user VA
+   window (USER_VA_LO = 0x00200000). Land tcc-emitted ELFs at the
+   same 0x600000 the rest of the chain (M0/hex2pp -B, scheme1, hex2,
+   ...) uses, so sys_spawn's load_elf can copy segments into the
+   user pool without falling through to the unmapped low-PA range.
+   amd64 (0x400000) and aarch64 (0x400000) defaults already fit. */
+#define ELF_START_ADDR 0x00600000
diff --git a/scripts/simple-patches/tcc-0.9.26/riscv64-elf-start-addr.before b/scripts/simple-patches/tcc-0.9.26/riscv64-elf-start-addr.before
@@ -0,0 +1 @@
+#define ELF_START_ADDR 0x00010000
diff --git a/scripts/stage1-flatten.sh b/scripts/stage1-flatten.sh
@@ -258,6 +258,13 @@ apply_our_patch riscv64-cvt-int-zext "$SRC/tccgen.c"
 apply_our_patch riscv64-gen-cvt-sxtw "$SRC/riscv64-gen.c"
 apply_our_patch riscv64-load-ptr-zext "$SRC/riscv64-gen.c"
 
+# riscv64 ELF default load address — stock tcc lands binaries at
+# 0x10000, below the seed kernel's USER_VA_LO=0x200000. Move the
+# default to 0x600000 so tcc-emitted ELFs slot into the user pool
+# without per-link `-Wl,-Ttext=` overrides. Patch is gated by the
+# stock literal in the before-block, so it no-ops elsewhere.
+apply_our_patch riscv64-elf-start-addr "$SRC/riscv64-link.c"
+
 # riscv64 stdarg.h order fix — the upstream `#elif __riscv` branch
 # uses `__builtin_va_list` before it's typedef'd. Stock tcc treats
 # `__builtin_va_list` as a built-in keyword and forgives the forward
diff --git a/seed-kernel/arch/riscv64/arch.h b/seed-kernel/arch/riscv64/arch.h
@@ -8,7 +8,7 @@
 #define ARCH_ELF_MACHINE 0xf3
 #define ARCH_ELF_MACHINE_NAME "riscv64"
 
-#define ARCH_DEVICE_ALIAS_BASE 0x100000000UL
+#define ARCH_DEVICE_ALIAS_BASE 0x200000000UL
 #define ARCH_UART0_PA 0x10000000UL
 #define ARCH_KERNEL_HEAP_END 0x8b000000UL
diff --git a/seed-kernel/arch/riscv64/kernel.S b/seed-kernel/arch/riscv64/kernel.S
@@ -268,8 +268,10 @@ arch_idle_forever:
 
     .globl arch_mmio_ptr
 arch_mmio_ptr:
+    /* Device alias offset = ARCH_DEVICE_ALIAS_BASE = 1 << 33.
+     * Must match the L2 slot picked in arch/riscv64/mmu.c. */
     li t0, 1
-    slli t0, t0, 32
+    slli t0, t0, 33
     add a0, a0, t0
     RET
 
diff --git a/seed-kernel/arch/riscv64/mmu.c b/seed-kernel/arch/riscv64/mmu.c
@@ -48,10 +48,23 @@ void arch_setup_mmu(void) {
     fill_user_l1(0);
 
     l2_root[0] = pte((u64)l1_user, PTE_V);
-    l2_root[1] = pte(0x40000000UL, DFLAGS);
-    l2_root[2] = pte(0x80000000UL, KFLAGS);
-    l2_root[3] = pte(0xc0000000UL, KFLAGS);
-    l2_root[4] = pte(0x00000000UL, DFLAGS);
+    l2_root[1] = pte(0x40000000UL,  DFLAGS);
+    /* VA == PA identity map for kernel-side DRAM, slot per 1 GiB.
+     * Covers PA 0x80000000 .. 0x1ffffffff (6 GiB), enough for QEMU
+     * virt configurations up to ~6 GiB of RAM. The DTB lives near
+     * the top of physical RAM (e.g. 0x13fe00000 with -m 3072M); we
+     * read it via VA = dtb_phys, so it must fall inside this map. */
+    l2_root[2] = pte(0x80000000UL,  KFLAGS);
+    l2_root[3] = pte(0xc0000000UL,  KFLAGS);
+    l2_root[4] = pte(0x100000000UL, KFLAGS);
+    l2_root[5] = pte(0x140000000UL, KFLAGS);
+    l2_root[6] = pte(0x180000000UL, KFLAGS);
+    l2_root[7] = pte(0x1c0000000UL, KFLAGS);
+    /* Device alias: VA ARCH_DEVICE_ALIAS_BASE + PA → PA, used by
+     * arch_mmio_ptr / arch_console_putc to reach UART + virtio-mmio
+     * regs whose low-PA addresses overlap the user pool L1 slots.
+     * Lives well above the direct-map kernel window. */
+    l2_root[8] = pte(0x00000000UL,  DFLAGS);
 
     riscv_set_sum();
     riscv_write_satp(((u64)8 << 60) | ((u64)l2_root >> 12));
diff --git a/seed-kernel/kernel.c b/seed-kernel/kernel.c
@@ -1123,8 +1123,15 @@ static i64 sys_spawn(struct trapframe *tf, const char *path, char **argv) {
         return -ENOEXEC;
     }
 
-    /* Reset brk above the new image's end-of-bss. */
+    /* Reset brk above the new image's end-of-bss, page-aligned up.
+     * Some seed binaries (e.g. riscv64 hex2) embed PC-relative scratch
+     * buffers in the bytes just past their loaded image and assume brk
+     * lives a full page beyond — Linux rounds brk to PAGE_SIZE. If we
+     * placed the heap immediately after the image (16-byte aligned), a
+     * write through the in-binary scratch overlaps the first heap node
+     * and silently corrupts the user's data structures. */
     brk_base = g_user_image_end ? g_user_image_end : USER_VA_LO;
+    brk_base = (brk_base + 0xfffUL) & ~0xfffUL;
     brk_cur = brk_base;
 
     /* Build new user stack at top of user VA window. */
@@ -1505,7 +1512,9 @@ void kmain(u64 dtb_phys) {
      * end-of-bss (g_user_image_end, set by load_elf). 16 MB reserved at
      * the top for the user stack. */
     u64 ustack_top = USER_VA_HI;
+    /* See sys_spawn for why brk_base is page-rounded above end-of-image. */
     brk_base = g_user_image_end ? g_user_image_end : USER_VA_LO;
+    brk_base = (brk_base + 0xfffUL) & ~0xfffUL;
     brk_cur = brk_base;
     brk_max = USER_VA_HI - 0x01000000UL;