boot2

Playing with the bootstrap
git clone https://git.ryansepassi.com/git/boot2.git

commit ba84296701ef636f54258c4a3b2e5850f63254e7
parent e9d50b023fe017da1965ab93090e14c169f589b8
Author: Ryan Sepassi <rsepassi@gmail.com>
Date:   Mon,  4 May 2026 11:00:58 -0700

OS.md update

Diffstat:
M docs/OS.md | 189 ++++++++++++++++++++++++++++++++++++++++++++++++-------------------------------
1 file changed, 114 insertions(+), 75 deletions(-)

diff --git a/docs/OS.md b/docs/OS.md
@@ -41,14 +41,43 @@ These are the native Linux ABIs; the per-arch shims in
 ~520–930) marshal P1 registers into them. Any kernel that implements
 these three ABIs verbatim can host the chain.
 
-Syscall numbers are the standard Linux-on-`uname-m` numbers used by
-those macros (e.g. `read=63` on aarch64, `read=0` on amd64). A
-fresh-write OS is free to renumber, but only at the cost of also
-rewriting the per-arch `p1_sys_*` macros.
+## Platform layers
 
-## Process image
+A compliant platform owes the chain four things:
 
-### ELF
+1. **ISA execution** — a CPU (or emulator) that runs the target
+   user-mode instruction stream `M0`/`hex2` emit.
+2. **Image loader** — reads a static ELF, maps `PT_LOAD` segments,
+   lays out the initial stack, transfers control to `e_entry`.
+3. **Address space and syscall trap** — a per-process virtual memory
+   with a movable program break, plus a trap handler that decodes the
+   per-arch syscall ABI from §Targets and dispatches.
+4. **Syscall implementations** — the 8 Tier-1 / +3 Tier-2 behaviors,
+   backed by a byte-addressable persistent store for the file-related
+   ones.
+
+The remaining sections specify each layer. "Implementing the
+contract" means all four; readers chasing only the syscall tables
+will miss layers 1–3.
+
+## Layer 1 — ISA execution
+
+The chain emits **integer-only, user-mode** code for the chosen arch:
+
+- Integer arithmetic, load/store, branches/calls.
+- The syscall trap instruction from §Targets.
+- **No FPU.** `HAVE_FLOAT` is off through libc; `cc.scm` rejects
+  `0.0` literals. The kernel needs no FP save/restore beyond what
+  the platform demands (single-process here, so moot).
+- **No SIMD, no atomics.** Single-threaded; no shared memory.
+- **One arch per image.** No multi-arch fat ELFs.
+
+A platform that can run static integer-only Linux user binaries on
+the named arch already satisfies this layer.
+
+## Layer 2 — Image loader
+
+### ELF format
 
 - **ET_EXEC, static.** No `PT_INTERP`, no dynamic linker. tcc-boot2's
   output and every host artefact are statically linked.
@@ -57,12 +86,11 @@ rewriting the per-arch `p1_sys_*` macros.
 - **Entry at `e_entry`.** No `_start` indirection required from the
   kernel; the loader's job is to transfer control to `e_entry` with
   the stack laid out below and to return execution to userspace.
-- **Single arch per image.** No multi-arch fat ELFs.
 
 The `ELF.hex2` file in this repo emits exactly this shape (one
 `PT_LOAD`, `e_entry` set, no PHDR self-reference).
 
-### Stack at entry
+### Initial stack
 
 Standard Linux SysV layout. The kernel must place at the initial stack
 pointer, low to high:
@@ -83,7 +111,9 @@ sp + 8 argv[0] (pointer)
 NULL to find `environ`. **auxv is not required** — nothing in the
 chain reads it.
 
-### Address space
+## Layer 3 — Address space and syscall trap
+
+### Memory model
 
 - **One contiguous heap, grown via `brk`.** The kernel exposes a
   per-process program break; `sys_brk(0)` returns it, `sys_brk(addr)`
@@ -95,55 +125,24 @@ chain reads it.
   request.** No W^X enforcement complications: tcc-boot2 doesn't JIT;
   every page is either RX (text) or RW (data/bss/stack/heap).
 
-## Process lifecycle
-
-- **Image swap via `execve`** (Tier 2). Replaces the calling process's
-  memory map; on success, control returns at the new image's
-  `e_entry`.
-- **Spawn via `clone`** with `fork()` semantics (Tier 2): new
-  address space (no `CLONE_VM`), new fd table, parent/child return
-  distinguished by return value (0 in child, child-pid in parent).
-  The scheme1 prelude calls `(sys-clone)` with no arguments — the
-  P1pp wrapper supplies `SIGCHLD` as the only flag. The `fork()`
-  syscall itself is not required.
-- **Reap via `waitid`** (Tier 2). Only `WEXITED` (=4) is used. Job
-  control flags are not needed.
-- **Termination via `exit_group`.** Exit status is the low byte of
-  the argument. No `atexit`, no destructors.
-
-No signal-handler installation is required. Default actions
-(SIGSEGV → terminate, SIGPIPE → terminate, etc.) are sufficient. The
-chain installs zero handlers; `boot2-syscall.c` stubs `raise` to
-ENOSYS.
-
-## Filesystem
-
-A flat, byte-addressable file abstraction with POSIX read/write
-semantics. Concretely:
-
-- Regular files have a length and an in-file byte offset per fd.
-- `O_RDONLY | O_WRONLY | O_RDWR | O_CREAT | O_TRUNC | O_APPEND` flags
-  honored; no `O_NONBLOCK`, no `O_DIRECT`.
-- Mode bits on `openat(O_CREAT)`: only the user-rwx bits need
-  honoring; group/other and setuid bits can be ignored.
-- `lseek` whences: `SEEK_SET=0`, `SEEK_CUR=1`, `SEEK_END=2`.
-- `unlinkat(AT_FDCWD, path, 0)` removes a regular file.
+### Syscall ABI
 
-**Not required:**
+Trap instruction, argument registers, syscall-number register, and
+return register are listed per arch in §Targets. Syscall numbers
+default to the standard Linux-on-`uname-m` values used by the per-arch
+P1 macros (e.g. `read=63` on aarch64, `read=0` on amd64). A
+fresh-write OS may renumber, but only at the cost of also rewriting
+the per-arch `p1_sys_*` macros in `P1/P1-{aarch64,amd64,riscv64}.M1pp`.
 
-- `stat`, `fstat`, directory iteration, symlinks, hard links, file
-  modes beyond a usable subset, mtime, ownership.
-- A hierarchical filesystem in any rich sense; flat directory plus
-  `/` separators is enough. tcc-boot2 reads files by literal path
-  strings the build emits.
+Error returns follow the standard Linux convention: a non-negative
+result on success or a negative errno value in the return register.
+See [§Error convention](#error-convention).
 
-The chain opens 3 fd kinds: source files (read), output files
-(write+create+trunc), and the inherited stdin/stdout/stderr (0/1/2).
-Pipes appear only at Tier 2.
+## Layer 4 — Syscalls
 
-## Tier 1 — toolchain syscalls
+### Tier 1 — toolchain (8 calls)
 
-Eight calls. Wired in `P1/P1pp.P1pp:986-1055`.
+Wired in `P1/P1pp.P1pp:986-1055`.
 
 | name      | linux nr (aa64 / amd64 / riscv64) | semantics                                            |
 |-----------|-----------------------------------|------------------------------------------------------|
@@ -156,21 +155,48 @@ Eight calls. Wired in `P1/P1pp.P1pp:986-1055`.
 | unlinkat  | 35 / 263 / 35                     | called as `unlinkat(AT_FDCWD=-100, path, 0)`         |
 | exit_group| 93 / 60 / 93                      | `void exit(status)`; never returns                   |
 
-Errors are returned as negative errno (`-EBADF`, `-ENOENT`, …) in the
-result register, per the standard Linux convention. The libc errno
-layer (`vendor/mes-libc/boot2-syscall.c`) negates and stores into a
-single global `errno` int.
-
 Everything in `docs/LIBC.txt`'s "syscall-using" column reduces to
 exactly these eight (`fopen → openat`, `fseek → lseek`, `malloc/
 realloc/free → brk`, `__assert_fail / abort / exit → exit_group`,
 etc.).
 
-## Tier 2 — driver syscalls
+#### Filesystem semantics
+
+A flat, byte-addressable file abstraction with POSIX read/write
+semantics:
+
+- Regular files have a length and an in-file byte offset per fd.
+- `O_RDONLY | O_WRONLY | O_RDWR | O_CREAT | O_TRUNC | O_APPEND` flags
+  honored; no `O_NONBLOCK`, no `O_DIRECT`.
+- Mode bits on `openat(O_CREAT)`: only the user-rwx bits need
+  honoring; group/other and setuid bits can be ignored.
+- `lseek` whences: `SEEK_SET=0`, `SEEK_CUR=1`, `SEEK_END=2`.
+- `unlinkat(AT_FDCWD, path, 0)` removes a regular file.
+
+Not required: `stat`, `fstat`, directory iteration, symlinks, hard
+links, file modes beyond a usable subset, mtime, ownership. A
+hierarchical filesystem in any rich sense is not required either —
+flat directory plus `/` separators is enough; tcc-boot2 reads files
+by literal path strings the build emits.
+
+The chain opens 3 fd kinds: source files (read), output files
+(write+create+trunc), and the inherited stdin/stdout/stderr (0/1/2).
+No pipes are used at any tier.
+
+#### Termination
+
+- **`exit_group`.** Exit status is the low byte of the argument. No
+  `atexit`, no destructors.
+- **No signal-handler installation required.** Default actions
+  (SIGSEGV → terminate, SIGPIPE → terminate, etc.) are sufficient.
+  The chain installs zero handlers; `boot2-syscall.c` stubs `raise`
+  to ENOSYS.
+
+### Tier 2 — driver (+3 calls)
 
-Adds three. Per-arch macros already exist in `P1/P1-*.M1pp`. The
-scheme1 prelude's `spawn` / `run` / `wait` / `exit` are built
-directly on these (`scheme1/prelude.scm:520-537`).
+Per-arch macros already exist in `P1/P1-*.M1pp`. The scheme1 prelude's
+`spawn` / `run` / `wait` / `exit` are built directly on these
+(`scheme1/prelude.scm:520-537`).
 
 | name    | linux nr (aa64 / amd64 / riscv64) | driver role                               |
 |---------|-----------------------------------|-------------------------------------------|
@@ -178,7 +204,20 @@ directly on these (`scheme1/prelude.scm:520-537`).
 | execve  | 221 / 59 / 221                    | image swap; takes `(prog, argv)` — no envp arg in the prelude wrapper, so the kernel-side execve must accept a NULL/empty envp without erroring |
 | waitid  | 95 / 247 / 95                     | reap child; called as `waitid(P_PID=1, pid, info, WEXITED=4)` — info[8]=si_code, info[24]=si_status (`scheme1/prelude.scm:497-506`) |
 
-**Notably not required:**
+#### Process lifecycle
+
+- **Image swap via `execve`.** Replaces the calling process's memory
+  map; on success, control returns at the new image's `e_entry`.
+- **Spawn via `clone`** with `fork()` semantics: new address space
+  (no `CLONE_VM`), new fd table, parent/child return distinguished by
+  return value (0 in child, child-pid in parent). The scheme1 prelude
+  calls `(sys-clone)` with no arguments — the P1pp wrapper supplies
+  `SIGCHLD` as the only flag. The `fork()` syscall itself is not
+  required.
+- **Reap via `waitid`.** Only `WEXITED` (=4) is used. Job control
+  flags are not needed.
+
+Notably **not** required at Tier 2:
 
 - `dup3` / `dup2`, `pipe` / `pipe2` — no fd plumbing between processes.
   Children inherit stdin/stdout/stderr (0/1/2) from the
@@ -190,18 +229,18 @@ directly on these (`scheme1/prelude.scm:520-537`).
 
 If a future driver needs redirection (say, capturing tcc-boot2's
 stderr into a file), the right move is to grow the prelude to use
-`dup3` and add the syscall here; until then it's not in the
-contract.
-
-## Errors
-
-- **Convention:** every syscall returns either a non-negative result
-  or a negative errno value in the result register. No errno TLS
-  variable in the kernel/userspace contract — the value lives in the
-  return register.
-- **Errno numbers:** standard Linux constants (`EBADF=9`,
-  `ENOENT=2`, `EFAULT=14`, …). The libc layer maps them through
-  `strerror` lookup tables vendored from mes.
+`dup3` and add the syscall here; until then it's not in the contract.
+
+### Error convention
+
+- Every syscall returns either a non-negative result or a negative
+  errno value in the return register. No errno TLS variable in the
+  kernel/userspace contract — the value lives in the return register.
+  The libc errno layer (`vendor/mes-libc/boot2-syscall.c`) negates
+  and stores into a single global `errno` int.
+- Errno numbers: standard Linux constants (`EBADF=9`, `ENOENT=2`,
+  `EFAULT=14`, …). The libc layer maps them through `strerror` lookup
+  tables vendored from mes.
 
 ## Out of scope