commit 4675b390813456fd7840a5b1de21b90167752be9
parent 086f5f0fbe01a0a2d59d4158da12057d37cfe4e6
Author: Ryan Sepassi <rsepassi@gmail.com>
Date: Mon, 27 Apr 2026 13:23:33 -0700
docs: cc parallel work plan
Coordination doc for the next batch of failing tests/cc/ features
(082, 087, 111-118). Defines four parallel work streams, the design
seams between them, and the sequencing where streams aren't fully
independent (struct-return ABI is the only sequential chain).
Diffstat:
1 file changed, 233 insertions(+), 0 deletions(-)
diff --git a/docs/CC-PARALLEL-PLAN.md b/docs/CC-PARALLEL-PLAN.md
@@ -0,0 +1,233 @@
+# CC parallel work plan
+
+Coordination doc for the next batch of `tests/cc/` features. Defines
+four parallel work streams, the design seams between them, and the
+sequencing required where streams aren't fully independent.
+
+Companion to [CC-PUNCHLIST.md](CC-PUNCHLIST.md) (the per-item
+checklist) and [CC-INTERNALS.md](CC-INTERNALS.md) (module layering
+and the cg-fixture-first workflow). ABI choices below cite
+[P1.md §Arguments and return values](P1.md).
+
+## Tests in scope
+
+Currently failing on aarch64:
+
+| Test | Stream | Failure |
+|---|---|---|
+| cc/082-union-basic | D | exit 1 — union field offsets bumped like struct |
+| cc/087-sizeof-noeval | D | exit 2 — `sizeof(x++)` actually increments x |
+| cc/111-struct-ret-1word | A1 | compile-fail — return-by-value struct unhandled |
+| cc/112-struct-ret-2word | A1 | compile-fail — no two-word return path |
+| cc/113-struct-ret-3word | A2 | compile-fail — no indirect-result path |
+| cc/114-struct-ret-many-args | A3 | exit 2 — silently truncates |
+| cc/115-struct-ret-3word-many-args | A3 | exit 2 — silently truncates |
+| cc/116-struct-ret-vararg | A3 | exit 2 — silently truncates |
+| cc/117-compound-literal | B | compile-fail — `(T){…}` unparsed |
+| cc/118-const-expr | C | compile-fail — const-expr surface incomplete |
+
+## Stream layout
+
+```
+                         ┌─ A1: 111 + 112 (one-word + two-word direct)
+Stream A (sequential) ───┤
+struct-return ABI        └─ A2: 113 (indirect-result)
+                            │
+                            ▼
+                         ┌─ A3a: 114 (sret + stack-staged args)   ┐
+                         ├─ A3b: 115 (sret + 3-word + many args)  │ parallel
+                         └─ A3c: 116 (sret + variadic save area)  ┘
+
+Stream B (independent, t=0): 117 compound literals
+Stream C (independent, t=0): 118 const-expr evaluator
+Stream D (independent, t=0): 082 union offsets + 087 sizeof no-emit
+```
+
+`t=0`: A1, B, C, D fan out together. A2 starts when A1 lands. A3a/b/c
+fan out when A2 lands. Total wall time ≈ A1 + A2 + max(A3a, A3b, A3c),
+assuming B, C, and D each finish within that window.
+
+## Stream A — Struct-return ABI
+
+Implements the three result conventions from P1.md §Arguments:
+
+| Width | Convention | a0 | a1 | Args 0..3 |
+|---|---|---|---|---|
+| ≤ 8B | one-word direct | result word 0 | (caller arg 1) | a0..a3 |
+| 9–16B | two-word direct | result word 0 | result word 1 | a0..a3 |
+| > 16B | indirect-result | result-buffer ptr | (callee arg 0) | a1..a3, then stack slot 0 (shifted) |
+
+In the indirect convention, incoming stack-arg slot 0 corresponds to
+explicit arg word 3 (not 4). `LDARG` indexing in the variadic save area
+must respect this shift.
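+
+The table and the shift rule can be modelled as a small lookup. This is
+a sketch only: `classify_return`, `place_arg`, and the enum names are
+hypothetical illustrations, not cg identifiers, and the four-register
+budget (a0..a3) is taken from the table above.
+
+```c
+/* Hypothetical model of the convention table; not part of cg. */
+enum ret_conv { RET_ONE_WORD, RET_TWO_WORD, RET_INDIRECT };
+enum arg_kind { ARG_REG, ARG_STACK };
+
+struct arg_loc {
+    enum arg_kind kind;
+    int index;              /* n of register a<n>, or stack slot number */
+};
+
+/* The result convention depends only on the return type's byte size. */
+enum ret_conv classify_return(unsigned size_bytes)
+{
+    if (size_bytes <= 8)
+        return RET_ONE_WORD;
+    if (size_bytes <= 16)
+        return RET_TWO_WORD;
+    return RET_INDIRECT;
+}
+
+/* Where explicit arg word `i` lands under a given convention. */
+struct arg_loc place_arg(enum ret_conv conv, int i)
+{
+    /* Indirect-result reserves a0 for the result-buffer pointer, so
+     * only a1..a3 remain for explicit args. */
+    int first_reg = (conv == RET_INDIRECT) ? 1 : 0;
+    int reg_count = 4 - first_reg;
+    struct arg_loc loc;
+
+    if (i < reg_count) {
+        loc.kind = ARG_REG;
+        loc.index = first_reg + i;
+    } else {
+        loc.kind = ARG_STACK;
+        loc.index = i - reg_count;
+    }
+    return loc;
+}
+```
+
+Under this model, eight explicit args with an indirect result place
+args 0..2 in a1..a3 and args 3..7 in stack slots 0..4; with a direct
+result, stack slot 0 holds arg word 4 instead.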
+
+### A1 — one-word and two-word direct
+
+- **cg primitives:** extend `cg-fn-end` to load the function's return
+ slot into a0 (≤8B) or a0/a1 (9–16B) per the return ctype's size.
+ Extend `cg-call`'s receive side to allocate a fresh frame slot via
+ `cg-alloc-slot` (sized to the return ctype) and store back from
+ a0[/a1] before pushing a frame-lval.
+- **Uniform path:** ≤8B and 9–16B both go through a frame slot. No
+ register-only fast path. Simpler cg, identical parser surface.
+- **Receive-area lifetime:** fresh slot per call site, allocated by
+ cg-call. Required so chained `make_triple(...).c` (cc/113:32) and
+ `ret1(99).x` (cc/111:26) don't alias across consecutive calls.
+- **Parser:** `parse-return-stmt` accepts a struct-typed expression and
+ emits a per-byte copy from the source lval into the function's
+ return slot. `parse-postfix-rest` accepts struct-typed call results
+ as lvals so `.field` and `&` chain naturally.
+- **CC-CONTRACTS update:** §3.2 currently asserts a single 8-byte
+ return slot. Replace with a pointer to P1.md §Arguments and a note
+ that the cg lowers all three conventions.
+- **Fixtures:** `tests/cc-cg/70-struct-ret-1word.scm`,
+ `71-struct-ret-2word.scm` first; then unblock `tests/cc/111`, `112`.
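+
+For orientation, the source-level shapes A1 has to accept look roughly
+like the following. This is an illustrative sketch, not the contents of
+cc/111 or cc/112: the type names, `ret2`, and `check_direct_returns`
+are made up, and the byte sizes assume an LP64 target such as aarch64.
+
+```c
+/* Illustrative types; the real test sources may differ. */
+struct one  { long x; };            /* 8 bytes on LP64: one-word direct */
+struct pair { long a; long b; };    /* 16 bytes on LP64: two-word direct */
+
+struct one ret1(long v)
+{
+    struct one r = { v };
+    return r;                       /* struct-typed return expression */
+}
+
+struct pair ret2(long a, long b)
+{
+    struct pair r = { a, b };
+    return r;
+}
+
+/* Returns 0 on success, or a nonzero code naming the failed check. */
+int check_direct_returns(void)
+{
+    /* Field access directly on a call result. */
+    if (ret1(99).x != 99)
+        return 1;
+
+    /* Two results live in one expression: each needs its own storage. */
+    if (ret1(1).x + ret1(2).x != 3)
+        return 2;
+
+    struct pair p = ret2(3, 4);
+    if (p.a != 3 || p.b != 4)
+        return 3;
+    return 0;
+}
+```
+
+The second check is the aliasing case: if both calls shared one receive
+area, the second store could overwrite the first before the add.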
+
+### A2 — indirect-result
+
+- **cg primitives:** when return ctype size > 16, `cg-fn-begin/v`
+ treats arg slot 0 as the sret pointer, and parameter slots shift by
+ one register. `cg-fn-end` does no a0 store (the callee already wrote
+ through `*a0` during the return-stmt copy); the convention's "a0 holds
+ the same pointer on return" is automatic since a0 hasn't been clobbered.
+- **Caller side:** `cg-call` detects sret-eligible return type; before
+ emitting the call, materializes the receive-slot's address into a0,
+ shifts ordinary args from a0..a3 → a1..a3 + stack, with stack slot 0
+ now holding arg word 3.
+- **Variadic interaction is deferred to A3c.** A2 itself targets only
+ cc/113 (no varargs, ≤4 named args).
+- **Fixtures:** `tests/cc-cg/72-struct-ret-3word.scm`; then `tests/cc/113`.
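+
+One way to picture the indirect-result convention is to write the
+lowering by hand in C. The sketch below pairs a by-value function with
+an explicit-pointer equivalent; `make_triple_lowered` and
+`check_indirect_return` are hypothetical names, and the struct layout is
+an assumption rather than a copy of cc/113.
+
+```c
+struct triple { long a; long b; long c; };   /* 24 bytes on LP64 */
+
+/* Source-level form: returns the struct by value. */
+struct triple make_triple(long a, long b, long c)
+{
+    struct triple t = { a, b, c };
+    return t;
+}
+
+/* Hand-written equivalent of indirect-result: the caller passes the
+ * address of its receive slot as a hidden first argument, the callee
+ * writes through it and hands the same pointer back. */
+struct triple *make_triple_lowered(struct triple *result,
+                                   long a, long b, long c)
+{
+    result->a = a;
+    result->b = b;
+    result->c = c;
+    return result;
+}
+
+/* Returns 0 on success, or a nonzero code naming the failed check. */
+int check_indirect_return(void)
+{
+    if (make_triple(1, 2, 3).c != 3)
+        return 1;
+
+    struct triple slot;                      /* caller's receive slot */
+    struct triple *p = make_triple_lowered(&slot, 1, 2, 3);
+    if (p != &slot)                          /* same pointer comes back */
+        return 2;
+    if (slot.a != 1 || slot.b != 2 || slot.c != 3)
+        return 3;
+    return 0;
+}
+```
+
+In the lowered form the three explicit args occupy the second through
+fourth parameter positions, which mirrors the a1..a3 shift described
+above.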
+
+### A3 — sret compositions (parallel)
+
+Each agent picks up one test, mostly composing infrastructure A1+A2
+already shipped:
+
+- **A3a — cc/114** (two-word struct return + 8 stack-staged args): the
+ two-word return comes back directly in a0/a1 and needs no sret
+ pointer; the 8 args exercise stack staging
+ (P1.md §Incoming stack-argument area). Validates that A1's
+ two-word receive composes with the existing stack-stage path.
+- **A3b — cc/115** (sret-3word + 8 stack-staged args): with sret in
+ a0, args 0–2 live in a1–a3 and args 3–7 stage to stack slots 0–4.
+ Pure indexing test for A2's shift.
+- **A3c — cc/116** (sret + variadic save area): `cg-fn-begin/v`'s
+ 16-slot save window indexes from incoming arg word 0. When the fn
+ uses indirect-result, slot 0 is arg word 3 (not 4) — A3c adjusts the
+ windowing accordingly. `__builtin_va_start` must skip the sret
+ pointer; in practice that's automatic if the named-arg count
+ threading is correct, since the sret pointer occupies a0 and the
+ named args start at a1.
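+
+A minimal C shape for the A3c case is a variadic function that returns
+a struct wider than 16 bytes. This is a sketch, not the source of
+cc/116: `collect` and `check_variadic_struct_return` are hypothetical,
+and the remark about which args spill to the stack assumes the
+four-register convention described in this document.
+
+```c
+#include <stdarg.h>
+
+struct triple { long a; long b; long c; };   /* 24 bytes on LP64 */
+
+/* Reads `n` variadic ints; returns count, sum, and the last value. */
+struct triple collect(int n, ...)
+{
+    va_list ap;
+    long sum = 0;
+    long last = 0;
+
+    va_start(ap, n);            /* must begin after the one named arg */
+    for (int i = 0; i < n; i++) {
+        last = va_arg(ap, int);
+        sum += last;
+    }
+    va_end(ap);
+
+    struct triple t = { n, sum, last };
+    return t;
+}
+
+/* Returns 0 on success, or a nonzero code naming the failed check. */
+int check_variadic_struct_return(void)
+{
+    /* Under the convention above: result pointer in a0, n in a1, the
+     * first two variadic args in a2..a3, the remaining four on the
+     * stack. */
+    struct triple t = collect(6, 1, 2, 3, 4, 5, 6);
+    if (t.a != 6)
+        return 1;
+    if (t.b != 21)
+        return 2;
+    if (t.c != 6)
+        return 3;
+    return 0;
+}
+```
+
+An off-by-one in the save-area windowing would show up here as a wrong
+sum or a wrong last value rather than as a crash.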
+
+## Stream B — Compound literals (117)
+
+C99 §6.5.2.5: `(T){ init-list }` as a postfix expression. Block-scope
+only; file-scope literals `die` explicitly.
+
+- **Parser:** detect `(T){` lookahead in `parse-cast-or-unary`. Parse
+ the typename via `parse-decl-spec` + `parse-declarator`, then call
+ the existing `parse-init-local-aggregate` against a fresh slot from
+ `cg-alloc-slot`. Push a frame-lval typed as T (or T[N]).
+- **Lvalue contract:** the literal is an lvalue, so `&literal`,
+ `literal.field`, and `literal[i]` all work via existing
+ `cg-take-addr` / `cg-push-field` / `cg-decay-array` paths.
+- **Lifetime:** frame slot ⇒ enclosing block, automatic. Matches C99.
+- **Reuse:** no new cg primitives. The fixture surface (positional,
+ designated, partial-init zero-fill, trailing comma, array decay,
+ byval struct arg) is all already covered by Stream E in
+ CC-PUNCHLIST.
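+
+The surface listed above corresponds to C along these lines. The
+snippet is a sketch of standard C99 behavior, not the source of cc/117;
+`struct point` and the helper names are invented for illustration.
+
+```c
+struct point { int x; int y; int z; };
+
+static int sum_point(struct point p)        /* by-value struct argument */
+{
+    return p.x + p.y + p.z;
+}
+
+static int sum_array(const int *v, int n)
+{
+    int s = 0;
+    for (int i = 0; i < n; i++)
+        s += v[i];
+    return s;
+}
+
+/* Returns 0 on success, or a nonzero code naming the failed check. */
+int check_compound_literals(void)
+{
+    /* Positional initializers, passed by value. */
+    if (sum_point((struct point){ 1, 2, 3 }) != 6)
+        return 1;
+
+    /* Designated + partial init: unnamed members are zero-filled. */
+    if ((struct point){ .y = 5 }.x != 0)
+        return 2;
+    if ((struct point){ .y = 5, }.y != 5)   /* trailing comma */
+        return 3;
+
+    /* The literal is an lvalue: it can be addressed and written. */
+    struct point *pp = &(struct point){ 7, 8, 9 };
+    pp->x = 10;
+    if (pp->x + pp->z != 19)
+        return 4;
+
+    /* An array literal decays to a pointer to its first element. */
+    if (sum_array((int[]){ 1, 2, 3, 4 }, 4) != 10)
+        return 5;
+    if ((int[]){ 4, 5, 6 }[1] != 5)
+        return 6;
+    return 0;
+}
+```
+
+`pp` stays valid until the end of the function body because a
+block-scope literal lives as long as its enclosing block.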
+
+## Stream C — Const-expr evaluator (118)
+
+Adds `parse-const-expr ps → (value . ctype)`, a self-contained walker
+that never touches cg.
+
+- **Operand surface:** integer/character literal, enum constant,
+ `sizeof(typename)`, unary `+ - ~ !`, binary `+ - * / % << >> & | ^`,
+ compare `< <= > >= == !=`, logical `&& ||` (short-circuit), ternary
+ `?:`, cast to integer type, parenthesization. Anything else `die`s.
+- **Width-aware return:** the `(int)(unsigned char)257 == 1` case
+ requires the cast to truncate at u8 width (257 mod 256 = 1). A bare
+ fixnum loses the width; the `(value . ctype)` pair keeps it.
+- **Sizeof arm:** for now, only `sizeof(TYPENAME)` — the only form
+ exercised by 118. If a value-expression form surfaces later, grow a
+ scope-lookup arm; still no cg interaction.
+- **Wiring (replace existing `parse-const-int` call sites):**
+ `parse-enum-spec`, `parse-decl-suf-cont`'s `[]` arm,
+ `parse-init-global`'s scalar branch, `parse-switch-stmt`'s case
+ label, local array bound in `parse-stmt`.
+- **No interaction with Stream D.** See "Sizeof split" below.
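+
+As a concrete picture of the operand surface and the wiring sites, the
+following C uses constant expressions in an enum, an array bound, a
+global scalar initializer, and case labels. It is a sketch rather than
+the source of cc/118; every identifier is hypothetical, and the
+character-literal check assumes an ASCII execution character set.
+
+```c
+enum {
+    K_SHIFT = 1 << 4,                     /* 16 */
+    K_ARITH = (K_SHIFT + 2) * 3 % 7,      /* 54 % 7 = 5 */
+    K_TERN  = (K_ARITH > 4) ? 10 : 20,    /* 10 */
+    K_LOGIC = (1 || 0) && !0,             /* 1 */
+    K_TRUNC = (int)(unsigned char)257,    /* truncates to 8 bits: 1 */
+    K_SIZE  = sizeof(int) * 2             /* sizeof(typename) form */
+};
+
+static int table[K_SHIFT / 4];            /* array bound: 4 elements */
+static int g_scalar = -K_ARITH + (3 ^ 1); /* global scalar init: -3 */
+
+static int classify(int v)
+{
+    switch (v) {
+    case K_TERN - 9:                      /* case label: 1 */
+        return 100;
+    case 'A' + 1:                         /* character literal operand */
+        return 200;
+    default:
+        return 0;
+    }
+}
+
+/* Returns 0 on success, or a nonzero code naming the failed check. */
+int check_const_exprs(void)
+{
+    if (K_SHIFT != 16 || K_ARITH != 5 || K_TERN != 10)
+        return 1;
+    if (K_LOGIC != 1 || K_TRUNC != 1)
+        return 2;
+    if (K_SIZE != 2 * (int)sizeof(int))
+        return 3;
+    table[K_SHIFT / 4 - 1] = 7;           /* last valid index is 3 */
+    if (table[3] != 7)
+        return 4;
+    if (g_scalar != -3)
+        return 5;
+    if (classify(1) != 100 || classify('B') != 200)
+        return 6;
+    return 0;
+}
+```
+
+`K_TRUNC` is the width-sensitive case: an evaluator that tracked only
+the value 257 and not the `unsigned char` type would report 257.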
+
+## Stream D — Surgical fixes (082 + 087)
+
+Single agent — both ~30-line changes.
+
+### 082 — union field offsets
+
+`parse-struct-fields` (cc.scm:3634) advances offset after every field
+regardless of kind. For unions all fields must stay at offset 0.
+
+- Thread `kind` from `parse-aggregate-spec` (cc.scm:3611, where it's
+ already in scope) through to `parse-struct-fields`.
+- Gate the `(+ oa (max sz 0))` bump on `(eq? kind 'struct)`. Unions
+ pass through with `oa` unchanged.
+- `complete-agg!` already sizes unions correctly; no change there.
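+
+The layout rule being restored can be checked from C with `offsetof`.
+This is a sketch of the expected behavior, not the source of cc/082;
+the type and function names are hypothetical, and the size check
+assumes `long` is the widest and most strictly aligned member.
+
+```c
+#include <stddef.h>
+
+union u  { char c; int i; long l; };
+struct s { char c; int i; long l; };
+
+/* Returns 0 on success, or a nonzero code naming the failed check. */
+int check_union_layout(void)
+{
+    /* Every union member starts at offset 0. */
+    if (offsetof(union u, c) != 0)
+        return 1;
+    if (offsetof(union u, i) != 0)
+        return 1;
+    if (offsetof(union u, l) != 0)
+        return 1;
+
+    /* Struct members still advance. */
+    if (offsetof(struct s, i) <= offsetof(struct s, c))
+        return 2;
+    if (offsetof(struct s, l) <= offsetof(struct s, i))
+        return 2;
+
+    /* The union is as large as its largest member. */
+    if (sizeof(union u) != sizeof(long))
+        return 3;
+
+    /* Members overlap: a write to one is visible at the union's base. */
+    union u v;
+    v.l = 0;
+    v.c = 'x';
+    if (*(char *)&v != 'x')
+        return 4;
+    return 0;
+}
+```
+
+With the current bug, the first group of checks would be the one to
+fail, since `i` and `l` would be placed at struct-style offsets.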
+
+### 087 — sizeof no-emit
+
+The current sizeof arm at cc.scm:4898 calls `parse-unary` / `parse-expr`,
+which emit code for the operand. `sizeof(x++)` therefore actually
+increments x.
+
+- Add cg primitives `cg-snapshot cg → tag` and `cg-rewind cg tag`.
+ Snapshot captures vstack depth and fn-buf chunk count; rewind
+ restores both. Internal-only.
+- In the sizeof arm, snapshot before parse-unary, read
+ `(opnd-type (cg-top))`, rewind, push `cg-push-imm %t-u64 size`.
+- Fixture lock-in: 087 covers `sizeof(x++)`. Add a cg-fixture if cg
+ primitives need direct validation.
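+
+The required behavior, expressed in C: the operand of `sizeof` is typed
+but never evaluated. The sketch below is not the source of cc/087;
+`bump`, `calls`, and `check_sizeof_noeval` are hypothetical names.
+
+```c
+static int calls;
+
+static int bump(void)
+{
+    calls++;
+    return calls;
+}
+
+/* Returns 0 on success, or a nonzero code naming the failed check. */
+int check_sizeof_noeval(void)
+{
+    int x = 5;
+
+    /* Only the operand's type matters; x++ must not run. */
+    if (sizeof(x++) != sizeof(int))
+        return 1;
+    if (x != 5)
+        return 2;
+
+    /* Same rule for calls: bump() must not be invoked. */
+    calls = 0;
+    if (sizeof(bump()) != sizeof(int))
+        return 3;
+    if (calls != 0)
+        return 4;
+    return 0;
+}
+```
+
+The call case is a reminder that whatever was emitted for the operand
+(argument staging, the call itself) has to be discarded, not only the
+increment.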
+
+## Cross-stream contracts
+
+### Sizeof split (C and D stay independent)
+
+Two distinct callers, two independent mechanisms:
+
+- **Outside const-expr** (Stream D — 087): operand can be anything
+ (`x++`, calls, side-effects). Result lives at runtime. Use
+ `cg-snapshot` / `cg-rewind`.
+- **Inside const-expr** (Stream C — 118): operand grammar restricted;
+ result is a parse-time fixnum. Const-expr evaluator handles
+ `sizeof(TYPENAME)` directly via `parse-decl-spec` +
+ `parse-declarator` + `ctype-size`. Never calls cg.
+
+The two paths share the *concept* (don't evaluate the operand) but
+not the implementation. They can land in either order.
+
+### A1 ↔ B/C/D
+
+Independent. A1 only touches `cg-fn-end`, `cg-call`,
+`parse-return-stmt`, and `parse-postfix-rest`. B touches
+`parse-cast-or-unary`. C adds a new walker plus changes to five call
+sites that don't overlap with A1. D touches `parse-struct-fields` and
+the sizeof arm, and adds two new cg primitives.
+
+### A2 ↔ A3
+
+A3 depends on A2's indirect-result implementation. A3a/b/c are
+independent of each other once A2 has landed.
+
+## Workflow per stream
+
+Per [CC-INTERNALS §Feature workflow](CC-INTERNALS.md#feature-workflow):
+
+1. cc-cg fixture (red) — drive the cg API directly.
+2. Implement cg primitives until cc-cg green.
+3. cc fixture (red) — full driver.
+4. Implement parser changes until cc green.
+
+Pick the next free `<n>` per suite. cc-cg currently goes up through
+69; cc goes up through 118 (with gaps).
+
+## Acceptance
+
+Per stream: `make test SUITE=cc ARCH=aarch64` shows the stream's
+target tests as PASS, no prior tests regress. Final acceptance: all
+ten currently-failing tests green on aarch64.