boot2

Playing with the bootstrap
git clone https://git.ryansepassi.com/git/boot2.git

commit 4675b390813456fd7840a5b1de21b90167752be9
parent 086f5f0fbe01a0a2d59d4158da12057d37cfe4e6
Author: Ryan Sepassi <rsepassi@gmail.com>
Date:   Mon, 27 Apr 2026 13:23:33 -0700

docs: cc parallel work plan

Coordination doc for the next batch of failing tests/cc/ features
(082, 087, 111-118). Defines four parallel work streams, the design
seams between them, and the sequencing where streams aren't fully
independent (struct-return ABI is the only sequential chain).

Diffstat:
A docs/CC-PARALLEL-PLAN.md | 233 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 233 insertions(+), 0 deletions(-)

diff --git a/docs/CC-PARALLEL-PLAN.md b/docs/CC-PARALLEL-PLAN.md
@@ -0,0 +1,233 @@
+# CC parallel work plan
+
+Coordination doc for the next batch of `tests/cc/` features. Defines
+four parallel work streams, the design seams between them, and the
+sequencing required where streams aren't fully independent.
+
+Companion to [CC-PUNCHLIST.md](CC-PUNCHLIST.md) (the per-item
+checklist) and [CC-INTERNALS.md](CC-INTERNALS.md) (module layering
+and the cg-fixture-first workflow). ABI choices below cite
+[P1.md §Arguments and return values](P1.md).
+
+## Tests in scope
+
+Currently failing on aarch64:
+
+| Test | Stream | Failure |
+|---|---|---|
+| cc/082-union-basic | D | exit 1 — union field offsets bumped like struct |
+| cc/087-sizeof-noeval | D | exit 2 — sizeof(x++) actually increments x |
+| cc/111-struct-ret-1word | A1 | compile-fail — return-by-value struct unhandled |
+| cc/112-struct-ret-2word | A1 | compile-fail — no two-word return path |
+| cc/113-struct-ret-3word | A2 | compile-fail — no indirect-result path |
+| cc/114-struct-ret-many-args | A3 | exit 2 — silently truncates |
+| cc/115-struct-ret-3word-many-args | A3 | exit 2 — silently truncates |
+| cc/116-struct-ret-vararg | A3 | exit 2 — silently truncates |
+| cc/117-compound-literal | B | compile-fail — `(T){…}` unparsed |
+| cc/118-const-expr | C | compile-fail — const-expr surface incomplete |
+
+## Stream layout
+
+```
+                         ┌─ A1: 111 + 112 (one-word + two-word direct)
+Stream A (sequential) ───┤
+struct-return ABI        └─ A2: 113 (indirect-result)
+                                │
+                                ▼
+                         ┌─ A3a: 114 (sret + stack-staged args)  ┐
+                         ├─ A3b: 115 (sret + 3-word + many args) │ parallel
+                         └─ A3c: 116 (sret + variadic save area) ┘
+
+Stream B (independent, t=0): 117 compound literals
+Stream C (independent, t=0): 118 const-expr evaluator
+Stream D (independent, t=0): 082 union offsets + 087 sizeof no-emit
+```
+
+`t=0`: A1, B, C, D fan out together. A2 starts when A1 lands. A3a/b/c
+fan out when A2 lands. Total wall time ≈ A1 + A2 + max(A3).
+
+## Stream A — Struct-return ABI
+
+Implements the three result conventions from P1.md §Arguments:
+
+| Width | Convention | a0 | a1 | Args 0..3 |
+|---|---|---|---|---|
+| ≤ 8B | one-word direct | result word 0 | (caller arg 1) | a0..a3 |
+| 9–16B | two-word direct | result word 0 | result word 1 | a0..a3 |
+| > 16B | indirect-result | result-buffer ptr | (callee arg 0) | a1..a3 (shifted) |
+
+In the indirect convention, incoming stack-arg slot 0 corresponds to
+explicit arg word 3 (not 4). `LDARG` indexing in the variadic save area
+must respect this shift.
+
+### A1 — one-word and two-word direct
+
+- **cg primitives:** extend `cg-fn-end` to load the function's return
+  slot into a0 (≤8B) or a0/a1 (9–16B) per the return ctype's size.
+  Extend `cg-call`'s receive side to allocate a fresh frame slot via
+  `cg-alloc-slot` (sized to the return ctype) and store back from
+  a0[/a1] before pushing a frame-lval.
+- **Uniform path:** ≤8B and 9–16B both go through a frame slot. No
+  register-only fast path. Simpler cg, identical parser surface.
+- **Receive-area lifetime:** fresh slot per call site, allocated by
+  cg-call. Required so chained `make_triple(...).c` (cc/113:32) and
+  `ret1(99).x` (cc/111:26) don't alias across consecutive calls.
+- **Parser:** `parse-return-stmt` accepts a struct-typed expression and
+  emits a per-byte copy from the source lval into the function's
+  return slot. `parse-postfix-rest` accepts struct-typed call results
+  as lvals so `.field` and `&` chain naturally.
+- **CC-CONTRACTS update:** §3.2 currently asserts a single 8-byte
+  return slot. Replace with a pointer to P1.md §Arguments and a note
+  that the cg lowers all three conventions.
+- **Fixtures:** `tests/cc-cg/70-struct-ret-1word.scm`,
+  `71-struct-ret-2word.scm` first; then unblock `tests/cc/111`, `112`.
+
+### A2 — indirect-result
+
+- **cg primitives:** when return ctype size > 16, `cg-fn-begin/v`
+  treats arg slot 0 as the sret pointer, parameter slots shift by one
+  register.
+  `cg-fn-end` does no a0 store (callee already wrote through
+  *a0 during the return-stmt copy); the convention's "a0 holds the
+  same pointer on return" is automatic since a0 hasn't been clobbered.
+- **Caller side:** `cg-call` detects sret-eligible return type; before
+  emitting the call, materializes the receive-slot's address into a0,
+  shifts ordinary args from a0..a3 → a1..a3 + stack, with stack slot 0
+  now holding arg word 3.
+- **Variadic interaction is deferred to A3c.** A2 itself targets only
+  cc/113 (no varargs, ≤4 named args).
+- **Fixtures:** `tests/cc-cg/72-struct-ret-3word.scm`; then `tests/cc/113`.
+
+### A3 — sret compositions (parallel)
+
+Each agent picks up one test, mostly composing infrastructure A1+A2
+already shipped:
+
+- **A3a — cc/114** (sret-pair + 8 stack-staged args): the two-word
+  return doesn't need sret; the 8 args do exercise stack staging
+  (P1.md §Incoming stack-argument area). Validates that A1's
+  two-word receive composes with the existing stack-stage path.
+- **A3b — cc/115** (sret-3word + 8 stack-staged args): with sret in
+  a0, args 0–2 live in a1–a3 and args 3–7 stage to stack slots 0–4.
+  Pure indexing test for A2's shift.
+- **A3c — cc/116** (sret + variadic save area): `cg-fn-begin/v`'s
+  16-slot save window indexes from incoming arg word 0. When the fn
+  uses indirect-result, slot 0 is arg word 3 (not 4) — A3c adjusts the
+  windowing accordingly. `__builtin_va_start` must skip the sret
+  pointer; in practice that's automatic if the named-arg count
+  threading is correct, since the sret pointer occupies a0 and the
+  named args start at a1.
+
+## Stream B — Compound literals (117)
+
+C99 §6.5.2.5: `(T){ init-list }` as a postfix expression. Block-scope
+only; file-scope literals `die` explicitly.
+
+- **Parser:** detect `(T){` lookahead in `parse-cast-or-unary`.
+  Parse the typename via `parse-decl-spec` + `parse-declarator`, then call
+  the existing `parse-init-local-aggregate` against a fresh slot from
+  `cg-alloc-slot`. Push a frame-lval typed as T (or T[N]).
+- **Lvalue contract:** the literal is an lvalue, so `&literal`,
+  `literal.field`, and `literal[i]` all work via existing
+  `cg-take-addr` / `cg-push-field` / `cg-decay-array` paths.
+- **Lifetime:** frame slot ⇒ enclosing block, automatic. Matches C99.
+- **Reuse:** no new cg primitives. The fixture surface (positional,
+  designated, partial-init zero-fill, trailing comma, array decay,
+  byval struct arg) is all already covered by Stream E in
+  CC-PUNCHLIST.
+
+## Stream C — Const-expr evaluator (118)
+
+Adds `parse-const-expr ps → (value . ctype)`, a self-contained walker
+that never touches cg.
+
+- **Operand surface:** integer/character literal, enum constant,
+  `sizeof(typename)`, unary `+ - ~ !`, binary `+ - * / % << >> & | ^`,
+  compare `< <= > >= == !=`, logical `&& ||` (short-circuit), ternary
+  `?:`, cast to integer type, parenthesization. Anything else `die`s.
+- **Width-aware return:** the `(int)(unsigned char)257 == 1` case
+  requires the cast to truncate at u8 width. Bare fixnum loses this;
+  the (value . ctype) tuple keeps it.
+- **Sizeof arm:** for now, only `sizeof(TYPENAME)` — the only form
+  exercised by 118. If a value-expression form surfaces later, grow a
+  scope-lookup arm; still no cg interaction.
+- **Wiring (replace existing `parse-const-int` call sites):**
+  `parse-enum-spec`, `parse-decl-suf-cont`'s `[]` arm,
+  `parse-init-global`'s scalar branch, `parse-switch-stmt`'s case
+  label, local array bound in `parse-stmt`.
+- **No interaction with Stream D.** See "Sizeof split" below.
+
+## Stream D — Surgical fixes (082 + 087)
+
+Single agent — both ~30-line changes.
+
+### 082 — union field offsets
+
+`parse-struct-fields` (cc.scm:3634) advances offset after every field
+regardless of kind. For unions all fields must stay at offset 0.
+
+- Thread `kind` from `parse-aggregate-spec` (cc.scm:3611, where it's
+  already in scope) through to `parse-struct-fields`.
+- Gate the `(+ oa (max sz 0))` bump on `(eq? kind 'struct)`. Unions
+  pass through with `oa` unchanged.
+- `complete-agg!` already sizes unions correctly; no change there.
+
+### 087 — sizeof no-emit
+
+The current sizeof arm at cc.scm:4898 calls `parse-unary` / `parse-expr`
+which emit code for the operand. `sizeof(x++)` therefore actually
+increments x.
+
+- Add cg primitives `cg-snapshot cg → tag` and `cg-rewind cg tag`.
+  Snapshot captures vstack depth and fn-buf chunk count; rewind
+  restores both. Internal-only.
+- In the sizeof arm, snapshot before parse-unary, read
+  `(opnd-type (cg-top))`, rewind, push `cg-push-imm %t-u64 size`.
+- Fixture lock-in: 087 covers `sizeof(x++)`. Add a cg-fixture if cg
+  primitives need direct validation.
+
+## Cross-stream contracts
+
+### Sizeof split (C and D stay independent)
+
+Two distinct callers, two independent mechanisms:
+
+- **Outside const-expr** (Stream D — 087): operand can be anything
+  (`x++`, calls, side-effects). Result lives at runtime. Use
+  `cg-snapshot` / `cg-rewind`.
+- **Inside const-expr** (Stream C — 118): operand grammar restricted;
+  result is a parse-time fixnum. Const-expr evaluator handles
+  `sizeof(TYPENAME)` directly via `parse-decl-spec` +
+  `parse-declarator` + `ctype-size`. Never calls cg.
+
+The two paths share the *concept* (don't evaluate the operand) but
+not the implementation. They can land in either order.
+
+### A1 ↔ B/C/D
+
+Independent. A1 only touches `cg-fn-end`, `cg-call`, and
+`parse-return-stmt`. B touches `parse-cast-or-unary`. C adds a new
+walker plus changes to four call sites that don't overlap with A1.
+D touches `parse-struct-fields` and adds two new cg primitives.
+
+### A2 ↔ A3
+
+A3 depends on A2's indirect-result implementation. A3a/b/c are
+independent of each other once A2 has landed.
+
+## Workflow per stream
+
+Per [CC-INTERNALS §Feature workflow](CC-INTERNALS.md#feature-workflow):
+
+1. cc-cg fixture (red) — drive the cg API directly.
+2. Implement cg primitives until cc-cg green.
+3. cc fixture (red) — full driver.
+4. Implement parser changes until cc green.
+
+Pick the next free `<n>` per suite. cc-cg currently goes up through
+69; cc goes up through 118 (with gaps).
+
+## Acceptance
+
+Per stream: `make test SUITE=cc ARCH=aarch64` shows the stream's
+target tests as PASS, no prior tests regress. Final acceptance: all
+ten currently-failing tests green on aarch64.
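
The Stream A width table and the argument shift under the indirect convention can be sketched in C as follows. This is a minimal illustration and not code from the repository: `classify_return`, `locate_arg`, `enum ret_conv`, and `struct arg_loc` are hypothetical names, and the sketch assumes only what the plan states (four argument registers a0..a3, with a0 reserved for the result-buffer pointer when the return type is wider than 16 bytes).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names; the three conventions mirror the plan's width table. */
enum ret_conv { RET_ONE_WORD, RET_TWO_WORD, RET_INDIRECT };

/* Pick the result convention from the return type's size in bytes. */
static enum ret_conv classify_return(size_t size)
{
    if (size <= 8)
        return RET_ONE_WORD;   /* result word 0 in a0 */
    if (size <= 16)
        return RET_TWO_WORD;   /* result words 0 and 1 in a0/a1 */
    return RET_INDIRECT;       /* a0 carries the result-buffer pointer */
}

/* Where explicit argument word n lives: a register aN or a stack slot. */
struct arg_loc {
    int in_reg;  /* 1: index is N in aN; 0: index is the stack-arg slot */
    int index;
};

static struct arg_loc locate_arg(enum ret_conv conv, int n)
{
    /* Under indirect-result a0 is taken, so explicit args start at a1. */
    int first_reg = (conv == RET_INDIRECT) ? 1 : 0;
    int nregs = 4 - first_reg;  /* registers left for explicit args */
    struct arg_loc loc;

    assert(n >= 0);
    if (n < nregs) {
        loc.in_reg = 1;
        loc.index = first_reg + n;
    } else {
        loc.in_reg = 0;
        loc.index = n - nregs;  /* stack slot 0 is the first spilled word */
    }
    return loc;
}
```

With these definitions, `locate_arg(RET_INDIRECT, 3)` gives stack slot 0 and `locate_arg(RET_INDIRECT, 7)` gives stack slot 4, matching the cc/115 description (args 0–2 in a1–a3, args 3–7 in stack slots 0–4), while `locate_arg(RET_TWO_WORD, 4)` gives stack slot 0 as in the direct conventions.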