CC parallel work plan
Coordination doc for the next batch of tests/cc/ features. Defines
four parallel work streams, the design seams between them, and the
sequencing required where streams aren't fully independent.
Companion to CC-PUNCHLIST.md (the per-item checklist) and CC-INTERNALS.md (module layering and the cg-fixture-first workflow). ABI choices below cite P1.md §Arguments and return values.
Tests in scope
Currently failing on aarch64:
| Test | Stream | Failure |
|---|---|---|
| cc/082-union-basic | D | exit 1 — union field offsets bumped like struct |
| cc/087-sizeof-noeval | D | exit 2 — sizeof(x++) actually increments x |
| cc/111-struct-ret-1word | A1 | compile-fail — return-by-value struct unhandled |
| cc/112-struct-ret-2word | A1 | compile-fail — no two-word return path |
| cc/113-struct-ret-3word | A2 | compile-fail — no indirect-result path |
| cc/114-struct-ret-many-args | A3 | exit 2 — silently truncates |
| cc/115-struct-ret-3word-many-args | A3 | exit 2 — silently truncates |
| cc/116-struct-ret-vararg | A3 | exit 2 — silently truncates |
| cc/117-compound-literal | B | compile-fail — (T){…} unparsed |
| cc/118-const-expr | C | compile-fail — const-expr surface incomplete |
Stream layout

                                 ┌─ A1: 111 + 112 (one-word + two-word direct)
    Stream A (sequential) ───┤
      struct-return ABI      └─ A2: 113 (indirect-result)
                                      │
                                      ▼
                                 ┌─ A3a: 114 (sret + stack-staged args)    ┐
                                 ├─ A3b: 115 (sret + 3-word + many args)   │ parallel
                                 └─ A3c: 116 (sret + variadic save area)   ┘

    Stream B (independent, t=0): 117 compound literals
    Stream C (independent, t=0): 118 const-expr evaluator
    Stream D (independent, t=0): 082 union offsets + 087 sizeof no-emit

t=0: A1, B, C, D fan out together. A2 starts when A1 lands. A3a/b/c
fan out when A2 lands. Total wall time ≈ A1 + A2 + max(A3a, A3b, A3c),
assuming B, C, and D each finish within that window.
Stream A — Struct-return ABI
Implements the three result conventions from P1.md §Arguments:
| Width | Convention | a0 | a1 | Args 0..3 |
|---|---|---|---|---|
| ≤ 8B | one-word direct | result word 0 | (caller arg 1) | a0..a3 |
| 9–16B | two-word direct | result word 0 | result word 1 | a0..a3 |
| > 16B | indirect-result | result-buffer ptr | (callee arg 0) | a1..a3 (shifted) |
In the indirect convention, incoming stack-arg slot 0 corresponds to
explicit arg word 3 (not 4). LDARG indexing in the variadic save area
must respect this shift.
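
The table and the shift rule can be sketched as two small C functions. This is an illustrative model only: the names `ret_conv`, `classify_return`, and `arg_location` are hypothetical and do not exist in the cg, and the sketch assumes every explicit argument occupies exactly one word.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for illustration; the real cg works on ctype records. */
enum ret_conv { RET_ONE_WORD, RET_TWO_WORD, RET_INDIRECT };

/* Pick the result convention from the return type's size in bytes. */
enum ret_conv classify_return(size_t size)
{
    assert(size > 0);
    if (size <= 8)  return RET_ONE_WORD;   /* result word 0 in a0 */
    if (size <= 16) return RET_TWO_WORD;   /* result words in a0 and a1 */
    return RET_INDIRECT;                   /* a0 carries the result-buffer ptr */
}

/* Where explicit argument word `word` lands (one word per arg assumed).
   Returns 0..3 for registers a0..a3, or -(slot + 1) for stack slot `slot`. */
int arg_location(enum ret_conv conv, int word)
{
    int first_reg = (conv == RET_INDIRECT) ? 1 : 0;  /* a0 taken by the sret ptr */
    int reg = first_reg + word;
    assert(word >= 0);
    if (reg <= 3)
        return reg;
    return -((reg - 4) + 1);
}

struct one   { long long a; };        /* 8 bytes  */
struct two   { long long a, b; };     /* 16 bytes */
struct three { long long a, b, c; };  /* 24 bytes */
```

Under this model, `arg_location(RET_INDIRECT, 3)` yields stack slot 0, while under either direct convention stack slot 0 is reached only at arg word 4.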
A1 — one-word and two-word direct
- cg primitives: extend cg-fn-end to load the function's return slot into a0 (≤8B) or a0/a1 (9–16B) per the return ctype's size. Extend cg-call's receive side to allocate a fresh frame slot via cg-alloc-slot (sized to the return ctype) and store back from a0[/a1] before pushing a frame-lval.
- Uniform path: ≤8B and 9–16B both go through a frame slot. No register-only fast path. Simpler cg, identical parser surface.
- Receive-area lifetime: fresh slot per call site, allocated by cg-call. Required so chained make_triple(...).c (cc/113:32) and ret1(99).x (cc/111:26) don't alias across consecutive calls.
- Parser: parse-return-stmt accepts a struct-typed expression and emits a per-byte copy from the source lval into the function's return slot. parse-postfix-rest accepts struct-typed call results as lvals so .field and & chain naturally.
- CC-CONTRACTS update: §3.2 currently asserts a single 8-byte return slot. Replace with a pointer to P1.md §Arguments and a note that the cg lowers all three conventions.
- Fixtures: tests/cc-cg/70-struct-ret-1word.scm, 71-struct-ret-2word.scm first; then unblock tests/cc/111, 112.
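
For orientation, the source-level behaviour A1 has to support looks roughly like the C below. It is not a copy of cc/111 or cc/112; the struct layouts and the `ret2` and `chained_calls` names are assumptions chosen to show why each call site needs its own receive slot.

```c
#include <assert.h>

struct one  { long long x; };        /* 8 bytes: one-word direct */
struct pair { long long lo, hi; };   /* 16 bytes: two-word direct */

struct one ret1(long long v)
{
    struct one r;
    r.x = v;
    return r;                        /* copied into the function's return slot */
}

struct pair ret2(long long a, long long b)
{
    struct pair r;
    r.lo = a;
    r.hi = b;
    return r;
}

/* Two struct-typed call results are live in one expression. If both calls
   shared a receive slot, the second result would overwrite the first. */
long long chained_calls(void)
{
    long long first = ret1(99).x;    /* field access directly on a call result */
    assert(first == 99);
    return ret2(1, 2).hi + ret2(first, 40).lo;
}
```

With independent receive slots, `chained_calls()` evaluates to 2 + 99.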
A2 — indirect-result
- cg primitives: when return ctype size > 16, cg-fn-begin/v treats arg slot 0 as the sret pointer, and parameter slots shift by one register. cg-fn-end does no a0 store (the callee already wrote through *a0 during the return-stmt copy); the convention's "a0 holds the same pointer on return" is automatic since a0 hasn't been clobbered.
- Caller side: cg-call detects an sret-eligible return type; before emitting the call, it materializes the receive slot's address into a0 and shifts ordinary args from a0..a3 → a1..a3 + stack, with stack slot 0 now holding arg word 3.
- Variadic interaction is deferred to A3c. A2 itself targets only cc/113 (no varargs, ≤4 named args).
- Fixtures: tests/cc-cg/72-struct-ret-3word.scm; then tests/cc/113.
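
One way to picture the indirect-result lowering is to write it by hand in C: the hidden result pointer becomes an explicit first parameter and the caller supplies a receive slot. The `make_triple` body and the `_lowered` names below are assumptions for illustration, not what cc/113 or the cg contain.

```c
#include <assert.h>

struct triple { long long a, b, c; };    /* 24 bytes: indirect-result */

/* Source-level form, as a test might write it. */
struct triple make_triple(long long a, long long b, long long c)
{
    struct triple t;
    t.a = a; t.b = b; t.c = c;
    return t;
}

/* Hand-lowered equivalent: result pointer first (the a0 position), explicit
   args shifted to the a1..a3 positions. Returning `out` mirrors the rule
   that a0 still holds the result-buffer pointer on return. */
struct triple *make_triple_lowered(struct triple *out,
                                   long long a, long long b, long long c)
{
    struct triple t;
    t.a = a; t.b = b; t.c = c;
    *out = t;                            /* the return-stmt copy writes via *a0 */
    return out;
}

/* Caller side: a receive slot is allocated and its address passed first. */
long long third_field_lowered(void)
{
    struct triple slot;                  /* stands in for the fresh frame slot */
    struct triple *p = make_triple_lowered(&slot, 1, 2, 3);
    assert(p == &slot);
    return p->c;
}
```

Both forms should agree: `make_triple(1, 2, 3).c` and `third_field_lowered()` each yield 3.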
A3 — sret compositions (parallel)
Each agent picks up one test, mostly composing infrastructure A1+A2 already shipped:
- A3a, cc/114 (two-word struct return + 8 args): the two-word return is direct and needs no sret pointer, but with 8 args the ones beyond a0..a3 exercise stack staging (P1.md §Incoming stack-argument area). Validates that A1's two-word receive composes with the existing stack-stage path.
- A3b, cc/115 (3-word sret + 8 args): with sret in a0, args 0–2 live in a1–a3 and args 3–7 stage to stack slots 0–4. Pure indexing test for A2's shift.
- A3c, cc/116 (sret + variadic save area): cg-fn-begin/v's 16-slot save window indexes from incoming arg word 0. When the fn uses indirect-result, incoming stack-arg slot 0 is arg word 3 (not 4), so A3c adjusts the windowing accordingly. __builtin_va_start must skip the sret pointer; in practice that's automatic if the named-arg count threading is correct, since the sret pointer occupies a0 and the named args start at a1.
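
The A3c case combines both mechanisms. A hedged C sketch of that shape follows; the function `sum3_va` is hypothetical and not taken from cc/116. It returns a 3-word struct and walks a variadic list, so the variadic walk has to start right after the one named argument rather than at the hidden result pointer.

```c
#include <assert.h>
#include <stdarg.h>

struct triple { long long a, b, c; };    /* 24 bytes: indirect-result */

/* `n` is the only named argument. The variadic values are distributed
   round-robin over the three fields and summed. */
struct triple sum3_va(int n, ...)
{
    struct triple t = { 0, 0, 0 };
    va_list ap;
    int i;

    assert(n >= 0);
    va_start(ap, n);
    for (i = 0; i < n; i++) {
        long long v = va_arg(ap, long long);
        if (i % 3 == 0)      t.a += v;
        else if (i % 3 == 1) t.b += v;
        else                 t.c += v;
    }
    va_end(ap);
    return t;
}
```

A call such as `sum3_va(6, 1LL, 2LL, 3LL, 10LL, 20LL, 30LL)` passes seven explicit argument words, so under the shifted layout some of them arrive through the stack-staged area. If the window were off by one, the field sums would come out wrong.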
Stream B — Compound literals (117)
C99 §6.5.2.5: (T){ init-list } as a postfix expression. Block-scope
only; file-scope literals die explicitly.
- Parser: detect the (T){ lookahead in parse-cast-or-unary. Parse the typename via parse-decl-spec + parse-declarator, then call the existing parse-init-local-aggregate against a fresh slot from cg-alloc-slot. Push a frame-lval typed as T (or T[N]).
- Lvalue contract: the literal is an lvalue, so &literal, literal.field, and literal[i] all work via existing cg-take-addr / cg-push-field / cg-decay-array paths.
- Lifetime: frame slot ⇒ enclosing block, automatic. Matches C99.
- Reuse: no new cg primitives. The fixture surface (positional, designated, partial-init zero-fill, trailing comma, array decay, byval struct arg) is all already covered by Stream E in CC-PUNCHLIST.
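
The fixture surface listed above corresponds to C along these lines. This is a sketch rather than the contents of cc/117; `struct point` and the function names are made up for illustration.

```c
#include <assert.h>

struct point { int x, y, z; };

int sum_point(struct point p) { return p.x + p.y + p.z; }

/* Positional initializer, then field access on the literal. */
int literal_field(void)      { return (struct point){ 1, 2, 3 }.y; }

/* Designated initializer: members not named are zero-filled. */
int literal_designated(void) { return (struct point){ .z = 7 }.x; }

/* Array literal with a trailing comma, indexed directly. */
int literal_array(void)      { return (int[]){ 10, 20, 30, }[2]; }

/* The literal is an lvalue, so its address can be taken and written through. */
int literal_addr(void)
{
    struct point *p = &(struct point){ .x = 4, .y = 5 };
    assert(p->z == 0);               /* partial init zero-fills z */
    p->z = p->x + p->y;
    return p->z;
}

/* Array literal decays to a pointer; the third element is zero-filled. */
int literal_decay(void)
{
    int *q = (int[3]){ 1, 2 };
    return q[0] + q[1] + q[2];
}

/* Literal passed by value as a struct argument. */
int literal_byval(void)      { return sum_point((struct point){ 1, 2, 3 }); }
```

Every literal here lives in block scope, which matches the block-scope-only restriction stated for this stream.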
Stream C — Const-expr evaluator (118)
Adds parse-const-expr ps → (value . ctype), a self-contained walker
that never touches cg.
- Operand surface: integer/character literal, enum constant, sizeof(typename), unary + - ~ !, binary + - * / % << >> & | ^, compare < <= > >= == !=, logical && || (short-circuit), ternary ?:, cast to integer type, parenthesization. Anything else dies.
- Width-aware return: the (int)(unsigned char)257 == 1 case requires the cast to truncate at u8 width. Bare fixnum loses this; the (value . ctype) tuple keeps it.
- Sizeof arm: for now, only sizeof(TYPENAME), the only form exercised by 118. If a value-expression form surfaces later, grow a scope-lookup arm; still no cg interaction.
- Wiring (replace existing parse-const-int call sites): parse-enum-spec, parse-decl-suf-cont's [] arm, parse-init-global's scalar branch, parse-switch-stmt's case label, local array bound in parse-stmt.
- No interaction with Stream D. See "Sizeof split" below.
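
To make the five wiring sites concrete, the C below places a constant expression in each of them: enum value, global array bound, global scalar initializer, case label, and local array bound. It is an assumed example, not the text of cc/118, and all identifiers in it are hypothetical.

```c
#include <assert.h>

enum {
    TRUNC   = (int)(unsigned char)257,       /* cast truncates at u8 width: 1 */
    SHIFTED = (1 << 4) | 3,                  /* 16 | 3 = 19 */
    PICK    = (sizeof(int) >= 2) ? 7 : 9     /* ternary over sizeof(typename) */
};

int table[TRUNC + 2];                        /* global array bound: 3 elements */
int global_scalar = -(~0) + SHIFTED % 5;     /* 1 + 4 = 5 */

int dispatch(int v)
{
    int local[(SHIFTED >> 2) + 1];           /* local array bound: 5 elements */
    assert(TRUNC == 1);
    local[4] = 10;
    switch (v) {
    case (3 < 4) + (5 != 5): return local[4];    /* label folds to 1 */
    case PICK:               return 70;          /* label folds to 7 */
    default:                 return -1;
    }
}
```

The `TRUNC` line is the width-sensitive case: an evaluator that carried only a bare integer through the `unsigned char` cast would produce 257 instead of 1.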
Stream D — Surgical fixes (082 + 087)
Single agent — both ~30-line changes.
082 — union field offsets
parse-struct-fields (cc.scm:3634) advances offset after every field
regardless of kind. For unions all fields must stay at offset 0.
- Thread kind from parse-aggregate-spec (cc.scm:3611, where it's already in scope) through to parse-struct-fields.
- Gate the (+ oa (max sz 0)) bump on (eq? kind 'struct). Unions pass through with oa unchanged. complete-agg! already sizes unions correctly; no change there.
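
The property being restored can be checked from C with `offsetof`. The sketch below uses a made-up union and struct pair (not the ones in cc/082) to contrast the two layouts.

```c
#include <assert.h>
#include <stddef.h>

union  u { char c; int i; long long q; };
struct s { char c; int i; long long q; };

/* Every union member starts at byte 0. */
int union_offsets_all_zero(void)
{
    return offsetof(union u, c) == 0
        && offsetof(union u, i) == 0
        && offsetof(union u, q) == 0;
}

/* Struct members advance; this is the behaviour that must stay struct-only. */
int struct_offsets_advance(void)
{
    return offsetof(struct s, c) == 0
        && offsetof(struct s, i) > 0
        && offsetof(struct s, q) > offsetof(struct s, i);
}

/* A write through one member is visible at the first byte of another. */
int union_overlay(void)
{
    union u v;
    v.q = 0;
    v.c = 'A';
    assert((char *)&v.c == (char *)&v.q);
    return *(char *)&v.q == 'A';
}
```

With the bug described above, the union members would receive struct-like offsets and `union_offsets_all_zero()` would return 0.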
087 — sizeof no-emit
The current sizeof arm at cc.scm:4898 calls parse-unary / parse-expr
which emit code for the operand. sizeof(x++) therefore actually
increments x.
- Add cg primitives cg-snapshot cg → tag and cg-rewind cg tag. Snapshot captures vstack depth and fn-buf chunk count; rewind restores both. Internal-only.
- In the sizeof arm, snapshot before parse-unary, read (opnd-type (cg-top)), rewind, then push the result with cg-push-imm %t-u64 size.
- Fixture lock-in: 087 covers sizeof(x++). Add a cg-fixture if cg primitives need direct validation.
Cross-stream contracts
Sizeof split (C and D stay independent)
Two distinct callers, two independent mechanisms:
- Outside const-expr (Stream D, 087): operand can be anything (x++, calls, side-effects). Result lives at runtime. Use cg-snapshot / cg-rewind.
- Inside const-expr (Stream C, 118): operand grammar restricted; result is a parse-time fixnum. Const-expr evaluator handles sizeof(TYPENAME) directly via parse-decl-spec + parse-declarator + ctype-size. Never calls cg.

The two paths share the concept (don't evaluate the operand) but not the implementation. They can land in either order.
A1 ↔ B/C/D
Independent. A1 only touches cg-fn-end, cg-call, parse-return-stmt, and
parse-postfix-rest. B touches parse-cast-or-unary. C adds a new
walker plus changes to five call sites that don't overlap with A1.
D touches parse-aggregate-spec, parse-struct-fields, and the sizeof arm,
and adds two new cg primitives.
A2 ↔ A3
A3 depends on A2's indirect-result implementation. A3a/b/c are independent of each other once A2 has landed.
Workflow per stream
Per CC-INTERNALS §Feature workflow:
- cc-cg fixture (red) — drive the cg API directly.
- Implement cg primitives until cc-cg green.
- cc fixture (red) — full driver.
- Implement parser changes until cc green.
Pick the next free <n> per suite. cc-cg currently goes up through
69; cc goes up through 118 (with gaps).
Acceptance
Per stream: make test SUITE=cc ARCH=aarch64 shows the stream's
target tests as PASS, no prior tests regress. Final acceptance: all
ten currently-failing tests green on aarch64.