kit

kit
git clone https://git.ryansepassi.com/git/kit.git
Log | Files | Refs | README

commit 97244390fc17b2a65d6b516201addd537d325ec7
parent 0f31a272680990b067b56657b85fb69e396d33df
Author: Ryan Sepassi <rsepassi@gmail.com>
Date:   Sun, 10 May 2026 07:20:00 -0700

cg: CG-driven spill/reload on a finite physical-register pool

Implement doc/REGALLOC.md. Replace the aarch64 backend's monotonic
scratch-cursor + statement-boundary reset with bitmask RegPools that
hand back free regs on demand. CG tracks per-SValue residency
(INHERENT/REG/SPILLED), spills the deepest unpinned RES_REG when the
pool exhausts, and reloads on consumption. cg_call publishes its
in-flight avs[] so the spill driver can re-spill an already-
materialized arg's reg (rewriting storage to OPK_LOCAL) when the
stack victim list runs dry — required for calls with more reg-class
args than the pool can hold.

Removes cg_reset_scratch (which silently corrupted opt's bytes when
opt was in the chain) and the aa_panic("spill_reg/reload_reg") stubs.
free_reg becomes load-bearing; every pop site routes through release().

test/parse: 306 pass → 322 pass (+16, no regressions). New
6_5_33_regalloc_spill row exercises 12-arg call spill paths.

Diffstat:
Adoc/REGALLOC.md | 346+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Msrc/arch/aarch64.c | 174+++++++++++++++++++++++++++++++++++++++++++++++++++++--------------------------
Msrc/cg/cg.c | 816+++++++++++++++++++++++++++++++++++++++++++++++++++++--------------------------
Msrc/parse/parse.c | 6------
Mtest/parse/CORPUS.md | 1+
Atest/parse/cases/6_5_33_regalloc_spill.c | 32++++++++++++++++++++++++++++++++
Atest/parse/cases/6_5_33_regalloc_spill.expected | 1+
7 files changed, 1048 insertions(+), 328 deletions(-)

diff --git a/doc/REGALLOC.md b/doc/REGALLOC.md @@ -0,0 +1,346 @@ +# REGALLOC — CG-driven spill/reload on a finite physical-register pool + +The single-pass codegen produces target machine code by streaming +`CGTarget` calls. Backends expose a finite scratch-register pool +(aarch64 today: 10 INT, 16 FP). CG must drive that pool correctly under +arbitrary register pressure: when the pool is empty and another reg is +needed, spill the deepest live value on the CG value stack to a frame +slot, free its register, and proceed. When a spilled value is consumed, +reload it first. + +This document defines the contract between CG and the backend and the +residency state machine on the value stack. Opt is out of scope — +`opt_cgtarget` panics on `spill_reg`/`reload_reg` today and that +remains the case until the deferred Phase 3 RA pass lands. + +--- + +## 1. What's broken today + +`aa_alloc_reg` panics with "out of INT scratch (no spill yet)" once 10 +integer scratch regs are live. To work around this, CG calls +`cg_reset_scratch` at every statement boundary +(`parse/parse.c:1618`), which calls `aa_reset_scratch(g->target)` +directly (`cg/cg.c:412`). That direct call is a layering violation: + +- It hardcodes the aarch64 backend by name. +- When opt is in the chain, `g->target` is an `OptImpl*`, not an + `AAImpl*`. `aa_reset_scratch` casts it as the latter and writes + `used_int=0; used_fp=0;` at AAImpl offsets inside opt's bytes — + silent memory corruption on every statement boundary. +- The reset hides the fact that CG never calls `free_reg` on consumed + SValues. The pool is recycled by fiat, not by tracking ownership. + +Both `cg_reset_scratch` and `aa_reset_scratch` are removed by this +design. `free_reg` becomes load-bearing. + +--- + +## 2. CGTarget contract changes + +### 2.1 `alloc_reg` returns `REG_NONE` on exhaustion + +```c +Reg (*alloc_reg)(CGTarget*, RegClass, const Type*); +``` + +Returns a fresh physical reg, or `REG_NONE` if the class's pool is +empty. Backends never panic on exhaustion — that decision belongs to +CG. Other failure modes (unknown `RegClass`, internal invariant +violation) still panic. + +### 2.2 `free_reg` is real + +```c +void (*free_reg)(CGTarget*, Reg); +``` + +Returns `r` to the backend's free-list. Idempotent calls and +double-frees are bugs and may panic. CG owes exactly one `free_reg` +per `alloc_reg` over the lifetime of every value. + +### 2.3 `spill_reg` implies `free_reg` + +```c +void (*spill_reg)(CGTarget*, Operand src_reg, FrameSlot, MemAccess); +``` + +`src_reg` must be `OPK_REG`. The call: + +1. Stores the register's contents to the frame slot per `MemAccess`. +2. Returns the register to the backend's free-list. + +After `spill_reg`, the caller must not use `src_reg` and must not call +`free_reg` on it. Coupling these two operations matches every CG +caller's intent and removes a forget-to-free foot-gun. + +### 2.4 `reload_reg` is independent + +```c +void (*reload_reg)(CGTarget*, Operand dst_reg, FrameSlot, MemAccess); +``` + +`dst_reg` must already have been allocated by the caller via +`alloc_reg`. Loads the slot's bytes into the register. The slot is not +released by this call — CG returns it to its own slot free-list. +Multiple reloads from the same slot are well-defined; CG does not rely +on this but the contract permits it. + +### 2.5 Backend pool implementation + +Each `RegClass` holds a `u32` free-mask: bit `i` set means the i-th +register in that class's contiguous range is free. The aarch64 backend +keeps two such masks alongside the per-class base register: + +```c +typedef struct RegPool { + u32 free; /* bit i set ⇔ regs[base + i] is free */ + u8 base; /* first physical reg in the class */ + u8 nregs; /* count; bits [nregs..32) are always 0 */ +} RegPool; +``` + +For aarch64: + +- INT pool: `base = 19`, `nregs = 10`, initial `free = 0x000003FFu` + (x19..x28). +- FP pool: `base = 8`, `nregs = 16`, initial `free = 0x0000FFFFu` + (v8..v23, callee-saves first then caller-saved scratch). + +The three pool ops are pure bit operations: + +```c +static Reg pool_alloc(RegPool* p) { + if (p->free == 0) return REG_NONE; + u32 idx = (u32)__builtin_ctz(p->free); + p->free &= ~(1u << idx); + return (Reg)(p->base + idx); +} + +static void pool_free(RegPool* p, Reg r) { + u32 idx = (u32)r - p->base; + /* Double-free is a CG bug — bit must currently be 0. */ + if (p->free & (1u << idx)) + compiler_panic(..., "free_reg: %u already free", (unsigned)r); + p->free |= (1u << idx); +} +``` + +`spill_reg` emits the store via the backend's existing store path, +then calls `pool_free` to release the bit. + +`__builtin_ctz` picks the lowest free bit, so allocation order is +deterministic and the same physical regs stay hot — matching today's +sequential allocation order for diff stability across the test corpus. +The 32-bit mask is sufficient: every architecture cfree targets has +fewer than 32 scratch regs per class. + +--- + +## 3. CG residency model + +### 3.1 SValue extension + +```c +typedef enum SResidency { + RES_INHERENT, /* IMM / LOCAL / GLOBAL — no reg owed */ + RES_REG, /* op holds a Reg this SValue owns */ + RES_SPILLED, /* register contents stored to spill_slot; + op.kind reflects the original value form + (REG or INDIRECT); op.v.reg is REG_NONE */ +} SResidency; + +typedef struct SValue { + Operand op; + const Type* type; + SResidency res; + FrameSlot spill_slot; /* valid iff res == RES_SPILLED */ + u8 pinned; /* 1 = ineligible spill victim */ +} SValue; +``` + +`OPK_INDIRECT` lvalues hold a base register and are tracked with +`RES_REG`. On spill the base reg is saved; on reload the SValue is +restored to `OPK_INDIRECT` with the freshly-reloaded base. The +deferred-load identity of the lvalue is preserved across spill/reload +— CG does not eagerly materialize an INDIRECT to a plain rvalue just +because it became a spill victim. + +`pinned` is set for the duration of one CG operation. Pinning prevents +a freshly-reloaded operand from being chosen as the spill victim while +CG is still arranging the other operand of a binop or call. + +### 3.2 Spill-slot pool + +CG maintains two free-lists of `FrameSlot`s, one per `RegClass`: + +- `RC_INT` slots are 8 bytes, 8-byte aligned. +- `RC_FP` slots are 16 bytes, 16-byte aligned (covers `double` and + the spilled portion of `long double`). + +A spill takes a slot from the free-list (allocating a fresh +`FrameSlot` from the backend if the list is empty). A reload returns +the slot to the free-list. Frame footprint is bounded by peak +concurrent spills per class, which on a stack machine of typical depth +is small. + +Slots are per-function. `cg_func_end` discards both free-lists. + +### 3.3 Invariants + +After every public `cg_*` call returns: + +1. Every `RES_REG` SValue on the stack owns its register; the sum of + `RES_REG` regs equals the backend's allocated-reg count. +2. No SValue is `pinned`. +3. Every `RES_SPILLED` SValue holds a slot that is *not* on the + slot free-list. + +`cg_func_end` asserts the stack is empty and both free-lists are well- +formed (every entry distinct, no aliasing with a live `RES_SPILLED` +slot). + +--- + +## 4. Spill / reload algorithm + +### 4.1 Allocation with fallback + +``` +alloc_reg_or_spill(g, cls, ty) -> Reg: + r = target->alloc_reg(cls, ty) + if r != REG_NONE: + return r + victim = pick_victim(g, cls) + if victim == NULL: + panic("regalloc: no spillable victim") // pinned-only is a CG bug + slot = take_spill_slot(g, cls) + target->spill_reg(victim.op, slot, mem_for_spill(victim, cls)) + victim.spill_slot = slot + victim.res = RES_SPILLED + victim.op.v.reg = REG_NONE + r = target->alloc_reg(cls, ty) + assert(r != REG_NONE) + return r +``` + +`pick_victim` walks the value stack from index 0 upward and returns +the first SValue with `res == RES_REG`, `pinned == 0`, and matching +`RegClass`. This is FIFO from the bottom — equivalent to "deepest +first" — and matches the intuition that the top of the stack is about +to be consumed. + +`mem_for_spill` constructs a `MemAccess` from the victim's type with +`alias.kind = ALIAS_LOCAL` rooted at the spill slot. + +### 4.2 Reload before consumption + +``` +ensure_reg(g, sv): + if sv.res != RES_SPILLED: + return + r = alloc_reg_or_spill(g, class_of(sv.type), sv.type) + target->reload_reg(op_reg(r, sv.type), sv.spill_slot, + mem_for_spill(sv, class_of(sv.type))) + return_spill_slot(g, sv.spill_slot, class_of(sv.type)) + sv.spill_slot = FRAME_SLOT_NONE + if sv.op.kind == OPK_INDIRECT: + sv.op.v.ind.base = r + else: + sv.op = op_reg(r, sv.type) + sv.res = RES_REG +``` + +`ensure_reg` is called at the start of every operation that consumes +a register-resident operand. The ensure-then-pin pattern is: + +``` +binop(g, op): + b = pop(g); ensure_reg(g, &b); b.pinned = 1 + a = pop(g); ensure_reg(g, &a); a.pinned = 1 + rd = alloc_reg_or_spill(g, class_of(result), result_ty) + target->binop(op, op_reg(rd, result_ty), a.op, b.op) + a.pinned = b.pinned = 0 + target->free_reg(reg_of(a)); target->free_reg(reg_of(b)) + push(g, svalue_reg(rd, result_ty)) +``` + +Pinning is symmetric within one CG call. Nothing leaks across calls. + +### 4.3 Termination + +Each `alloc_reg_or_spill` call either succeeds on first try or evicts +exactly one unpinned `RES_REG` SValue. The pinned set is bounded +(within one CG op, at most a handful: two source operands plus a +destination-in-progress). With at least `pinned + 1` registers in the +class, the second `alloc_reg` call in `alloc_reg_or_spill` is +guaranteed to succeed. Every backend's pool comfortably exceeds this +bound (aarch64: 10 INT, 16 FP; minimum needed: ~3). + +`cg_call` is the one exception to "victims live on the value stack." +While the pop loop materializes args, popped-but-not-yet-emitted regs +accumulate in the local `CGABIValue avs[]` array — off the value +stack and so invisible to `pick_victim`. For a call with more reg- +class args than the pool size can hold, `pick_victim` will eventually +return NULL while genuine victims remain in `avs[]`. + +To handle this, `cg_call` publishes its in-flight `avs` array via +`CG.avs_in_flight` for the duration of the pop+materialize loop. +`alloc_reg_or_spill` falls back to `spill_avs_victim` when the stack +victim list is empty: it picks an `OPK_REG` arg storage entry, +evicts it through `spill_reg`, and rewrites `avs[i].storage` to +`OPK_LOCAL` so the backend's call lowering loads from the slot. After +`T->call`, `cg_call` walks `avs` and returns each `OPK_LOCAL` slot to +the spill-slot free-list. Without this fallback, a 12-arg INT call on +aarch64 (10 INT pool) would be unsatisfiable. + +If a backend is added with a class smaller than the pinned bound for +some operation outside `cg_call`, that's a wiring bug; CG asserts +after the recursive `alloc_reg` call. + +### 4.4 Free-on-consume + +Every site that pops an SValue and is done with it must release any +register it owned: + +``` +release(g, sv): + if sv.res == RES_REG: + if sv.op.kind == OPK_REG: + target->free_reg(sv.op.v.reg) + else if sv.op.kind == OPK_INDIRECT: + target->free_reg(sv.op.v.ind.base) + else if sv.res == RES_SPILLED: + return_spill_slot(g, sv.spill_slot, class_of(sv.type)) + /* RES_INHERENT: nothing owed */ +``` + +`cg_drop`, the result-discard path of `cg_store`, the operand pops in +every `cg_binop`/`cg_unop`/`cg_cmp`/`cg_call`/`cg_load`/`cg_addr`/etc. +all flow through `release`. This is the audit that bulks up the diff — +CG has many `pop` sites and they currently leak. + +--- + +## 5. Removed mechanisms + +The following are deleted as part of this change: + +- `cg_reset_scratch` (`cg/cg.c`) — no replacement; `release` in §4.4 + makes it unnecessary. +- The `extern void cg_reset_scratch(CG*)` declaration and the five call + sites in `parse/parse.c`. +- `aa_reset_scratch` (`arch/aarch64.c`) and its forward declaration in + `cg/cg.c`. +- The "(no spill yet)" panics in `aa_alloc_reg` — replaced by + `return REG_NONE`. +- The `aa_panic("spill_reg")`/`aa_panic("reload_reg")` stubs — + replaced by real STR/LDR implementations that route through the + backend's existing store/load paths. + +The `used_int` / `used_fp` fields on `AAImpl` are repurposed: instead +of monotonically-incrementing allocation cursors (where free was a +no-op), they now track the highest-index-+1 ever allocated per class — +the high-water mark needed by the prologue/epilogue to size the +callee-save area. The actual free-list is the `int_free` / `fp_free` +bitmasks added alongside. diff --git a/src/arch/aarch64.c b/src/arch/aarch64.c @@ -16,10 +16,14 @@ * local slots (cum_off bytes) * x29, x30 save (16 bytes) -- x29 = sp + frame_size - 16 * - * Single-pass register allocator: alloc_reg(RC_INT) hands out x19..x28 in - * order; alloc_reg(RC_FP) hands out v8..v15. Both ranges are callee-saved - * and only the prefix actually used is saved by the prologue. Width - * derives from Operand.type via type_is_64. Spill/reload not implemented. + * Single-pass register allocator: a free-mask pool per class hands out + * the lowest free index. INT pool covers x19..x28 (10 callee-saves); + * FP pool covers v8..v23, with v8..v15 callee-saved and v16..v23 + * caller-saved scratch — lowest-bit-first allocation prefers callee- + * saves. Only the prefix actually used (high-water mark) is saved by + * the prologue. Width derives from Operand.type via type_is_64. CG + * drives spill/reload through alloc_reg returning REG_NONE on + * exhaustion plus the spill_reg/reload_reg vtable entries. * * Multi-function: each func_begin/func_end pair owns its own frame state * via the AAImpl fields, so the harness can build several functions in @@ -257,6 +261,51 @@ static inline u32 aa64_bfm(u32 sf, u32 Rd, u32 Rn, u32 immr, u32 imms) { } /* ============================================================ + * Per-class register pool (free-mask + high-water mark). + * + * The mask uses bit i for the i-th register in the class's contiguous + * range, so allocation is `__builtin_ctz` over the free mask and + * deallocation is bit-set. `hwm` records the highest-index-+1 ever + * allocated, which the prologue/epilogue uses to size the callee-save + * area. 32-bit masks suffice for every aarch64/x86_64/RISC-V class. + * ============================================================ */ + +typedef struct RegPool { + u32 free; /* bit i set ⇔ regs[base + i] is free */ + u32 hwm; /* highest-index-+1 ever allocated */ + u8 base; /* first physical reg in the class */ + u8 nregs; /* count; bits [nregs..32) are always 0 */ + u8 pad[2]; +} RegPool; + +static void regpool_init(RegPool* p, u8 base, u8 nregs) { + p->base = base; + p->nregs = nregs; + p->hwm = 0; + p->free = (nregs >= 32u) ? 0xFFFFFFFFu : ((1u << nregs) - 1u); +} + +static Reg regpool_alloc(RegPool* p) { + if (p->free == 0) return (Reg)REG_NONE; + u32 idx = (u32)__builtin_ctz(p->free); + p->free &= ~(1u << idx); + if (idx + 1u > p->hwm) p->hwm = idx + 1u; + return (Reg)(p->base + idx); +} + +/* Returns 1 on successful free, 0 if `r` is outside this pool's range, + * -1 on double-free (caller is expected to panic). */ +static int regpool_free(RegPool* p, Reg r) { + u32 rn = (u32)r; + if (rn < p->base || rn >= (u32)(p->base + p->nregs)) return 0; + u32 idx = rn - p->base; + u32 bit = 1u << idx; + if (p->free & bit) return -1; + p->free |= bit; + return 1; +} + +/* ============================================================ * AAImpl * ============================================================ */ @@ -305,9 +354,18 @@ typedef struct AAImpl { u8 has_sret; /* sret pointer arrived in x8 */ FrameSlot sret_ptr_slot; /* hidden slot holding incoming x8 */ - /* Reg allocator (callee-saved prefix). */ - u32 used_int; /* x19 + i, i in [0, used_int) */ - u32 used_fp; /* v8 + i, i in [0, used_fp ) */ + /* Reg allocator pools. Bit i set in `free` means the i-th register in + * the class's contiguous range (base..base+nregs-1) is available. The + * high-water mark `hwm` is the largest index+1 ever allocated for the + * class — used by the prologue to decide how many callee-saves to push. + * + * INT pool: base = 19, nregs = 10 (x19..x28). + * FP pool : base = 8, nregs = 16 (v8..v23). The first 8 (v8..v15) are + * AAPCS64 callee-saves; v16..v23 are caller-saved scratch handed out + * after the callee-saved range fills. Allocation is lowest-bit-first + * so callee-saves are still preferred. */ + RegPool int_pool; + RegPool fp_pool; /* Structured-scope stack. Entries are not popped — IDs returned to * the caller are stable indices into this array for the lifetime @@ -343,6 +401,9 @@ static AAImpl* impl_of(CGTarget* t) { return (AAImpl*)t; } static FrameSlot aa_frame_slot(CGTarget* t, const FrameSlotDesc* d); static AASlot* slot_get(AAImpl* a, FrameSlot fs); static u32 force_reg_int(CGTarget* t, Operand op, u32 sf, u32 scratch); +static void aa_load(CGTarget* t, Operand dst, Operand addr, MemAccess ma); +static void aa_store(CGTarget* t, Operand addr, Operand src, MemAccess ma); +static void aa_free_reg(CGTarget* t, Reg r); /* ---- helpers ---- */ @@ -535,8 +596,8 @@ static void aa_func_begin(CGTarget* t, const CGFuncDesc* fd) { a->has_sret = (fd->abi && fd->abi->has_sret) ? 1 : 0; a->cum_off = 0; a->max_outgoing = 0; - a->used_int = 0; - a->used_fp = 0; + regpool_init(&a->int_pool, /*base=*/19u, /*nregs=*/10u); /* x19..x28 */ + regpool_init(&a->fp_pool, /*base=*/8u, /*nregs=*/16u); /* v8..v23 */ a->nslots = 0; a->nscopes = 0; a->has_alloca = 0; @@ -614,8 +675,8 @@ static void aa_func_end(CGTarget* t) { /* Compute callee-save layout. Only v8..v15 are callee-saved; the * caller-saved v16..v23 are handed out by alloc_reg too but never * appear in prologue saves. */ - u32 n_int_pairs = (a->used_int + 1) / 2; /* round up */ - u32 used_fp_cs = a->used_fp > 8 ? 8u : a->used_fp; + u32 n_int_pairs = (a->int_pool.hwm + 1) / 2; /* round up */ + u32 used_fp_cs = a->fp_pool.hwm > 8 ? 8u : a->fp_pool.hwm; u32 n_fp_pairs = (used_fp_cs + 1) / 2; u32 outgoing_off = 0; @@ -750,45 +811,28 @@ static void aa_func_end(CGTarget* t) { static Reg aa_alloc_reg(CGTarget* t, RegClass cls, const Type* ty) { AAImpl* a = impl_of(t); (void)ty; - if (cls == RC_INT) { - if (a->used_int >= 10) { - compiler_panic(t->c, a->loc, - "aarch64 alloc_reg: out of INT scratch (no spill yet)"); - } - return (Reg)(19u + a->used_int++); - } - if (cls == RC_FP) { - /* v8..v15 are callee-saved (low 64 bits); v16..v23 are caller-saved - * scratch. Hand out callee-saved first, then fall back to scratch - * for short-lived materialization (e.g. j06 builds 9 FP arg regs - * with no intervening call). */ - if (a->used_fp >= 16) { - compiler_panic(t->c, a->loc, - "aarch64 alloc_reg: out of FP scratch (no spill yet)"); - } - u32 idx = a->used_fp++; - return (Reg)(idx < 8 ? 8u + idx : 16u + (idx - 8u)); - } + /* Lowest-bit-first allocation hands out callee-saves before caller- + * saves on the FP side (v8..v15 then v16..v23) — short-lived + * materializations (e.g. j06 building 9 FP arg regs with no + * intervening call) thus reach into the caller-saved range. */ + if (cls == RC_INT) return regpool_alloc(&a->int_pool); + if (cls == RC_FP) return regpool_alloc(&a->fp_pool); compiler_panic(t->c, a->loc, "aarch64 alloc_reg: class %d unimpl", (int)cls); } static void aa_free_reg(CGTarget* t, Reg r) { - (void)t; - (void)r; -} - -/* Reset the scratch-register cursors. The parser calls this between - * statements (when its value stack is known empty), letting the next - * statement reuse the entire scratch pool. Safe only when no live - * register-resident SValue is still expected — the parser asserts - * that precondition by checking cg's sp before forwarding. Without - * this, any function whose body contains more than ~10 sequential - * register-allocating operations exhausts the scratch pool. */ -void aa_reset_scratch(CGTarget* t); -void aa_reset_scratch(CGTarget* t) { AAImpl* a = impl_of(t); - a->used_int = 0; - a->used_fp = 0; + RegPool* pools[2] = {&a->int_pool, &a->fp_pool}; + for (u32 i = 0; i < 2; ++i) { + int rc = regpool_free(pools[i], r); + if (rc == 1) return; + if (rc == -1) { + compiler_panic(t->c, a->loc, "aarch64 free_reg: reg %u already free", + (unsigned)r); + } + } + compiler_panic(t->c, a->loc, "aarch64 free_reg: reg %u not a scratch reg", + (unsigned)r); } static FrameSlot aa_frame_slot(CGTarget* t, const FrameSlotDesc* d) { @@ -914,17 +958,35 @@ static const Reg* aa_clobbers(CGTarget* t, RegClass c, u32* n) { (void)n; aa_panic(t, "clobbers"); } -static void aa_spill_reg(CGTarget* t, Operand s, FrameSlot f, MemAccess m) { - (void)s; - (void)f; - (void)m; - aa_panic(t, "spill_reg"); -} -static void aa_reload_reg(CGTarget* t, Operand d, FrameSlot f, MemAccess m) { - (void)d; - (void)f; - (void)m; - aa_panic(t, "reload_reg"); +static void aa_spill_reg(CGTarget* t, Operand src, FrameSlot slot, + MemAccess ma) { + AAImpl* a = impl_of(t); + if (src.kind != OPK_REG) { + compiler_panic(t->c, a->loc, "aarch64 spill_reg: src is not OPK_REG"); + } + Operand addr; + memset(&addr, 0, sizeof addr); + addr.kind = OPK_LOCAL; + addr.cls = RC_INT; + addr.type = ma.type; + addr.v.frame_slot = slot; + aa_store(t, addr, src, ma); + aa_free_reg(t, src.v.reg); +} + +static void aa_reload_reg(CGTarget* t, Operand dst, FrameSlot slot, + MemAccess ma) { + AAImpl* a = impl_of(t); + if (dst.kind != OPK_REG) { + compiler_panic(t->c, a->loc, "aarch64 reload_reg: dst is not OPK_REG"); + } + Operand addr; + memset(&addr, 0, sizeof addr); + addr.kind = OPK_LOCAL; + addr.cls = RC_INT; + addr.type = ma.type; + addr.v.frame_slot = slot; + aa_load(t, dst, addr, ma); } /* ---- labels / control flow ---- diff --git a/src/cg/cg.c b/src/cg/cg.c @@ -11,11 +11,29 @@ * - OPK_LOCAL / OPK_GLOBAL / OPK_INDIRECT are lvalues. cg_load promotes * them to OPK_REG via target->load + a fresh scratch register. * - * This is the spine slice — enough for §6.5/§6.8 fixtures: scalar i32 - * locals, integer arithmetic, comparisons, control flow, and return. - * Aggregates, atomics, calls, and the asm/setjmp/intrinsic methods are - * placeholders pending their corpus rows. The interface in cg.h is the - * commitment; this file fills in the slice that's exercised today. */ + * Register pressure & spill (see doc/REGALLOC.md): + * - Each SValue carries an SResidency tag (INHERENT / REG / SPILLED). + * REG-residing SValues own a physical scratch register that must be + * released back to the pool when the value is consumed; SPILLED + * SValues own a frame slot instead and must be reloaded before use. + * - alloc_reg_or_spill is the single allocation entry point. On pool + * exhaustion it picks the deepest unpinned RES_REG SValue from the + * value stack as the spill victim, evicts its register through + * T->spill_reg, and retries. ensure_reg is the dual: it reloads a + * SPILLED SValue's register before consumption, possibly evicting + * another value to make room. + * - Pop sites (binop, cmp, store, branch, call, ...) call release() + * to return regs/slots after consumption — there is no statement- + * boundary scratch reset; ownership tracking is the discipline. + * - cg_call additionally exposes its in-flight CGABIValue array via + * CG.avs_in_flight so the spill driver can re-spill an already- + * materialized arg's register (rewriting storage to OPK_LOCAL) when + * the value stack runs out of victims; this lets calls with more + * reg-class args than the pool size can hold lower correctly. + * + * Aggregates, atomics, asm/setjmp/intrinsic methods are placeholders + * pending their corpus rows. The interface in cg.h is the commitment; + * this file fills in the slice that's exercised today. */ #include "cg/cg.h" @@ -35,12 +53,28 @@ * Value stack * ============================================================ */ +/* Residency: where the value's storage actually lives. INHERENT values + * (IMM, LOCAL, GLOBAL) carry no register obligation. REG values own a + * physical scratch register. SPILLED values had their register evicted + * to a frame slot under register pressure and must be reloaded before + * consumption. See doc/REGALLOC.md §3. */ +typedef enum SResidency { + RES_INHERENT, + RES_REG, + RES_SPILLED, +} SResidency; + typedef struct SValue { Operand op; /* IMM/REG (rvalue) or LOCAL/GLOBAL/INDIRECT (lvalue) */ const Type* type; /* C semantic type of the value (post-promotion) */ + u8 res; /* SResidency */ + u8 pinned; /* 1 = ineligible spill victim (cleared per CG op) */ + u8 pad[2]; + FrameSlot spill_slot; /* valid iff res == RES_SPILLED */ } SValue; #define CG_STACK_INITIAL 16u +#define CG_SPILL_FREE_INITIAL 4u struct CG { Compiler* c; @@ -64,6 +98,29 @@ struct CG { SValue* stack; u32 sp; u32 cap; + + /* Per-function spill-slot free-lists, one per RegClass. A spill takes a + * slot from the free-list (allocating fresh from the backend if empty); + * a reload returns the slot. Frame footprint is bounded by the peak + * concurrent spills per class. Reset at func_end. See doc/REGALLOC.md + * §3.2. */ + struct { + FrameSlot* free; + u32 n; + u32 cap; + } slot_pools[3]; /* indexed by RegClass; RC_VEC reserved */ + + /* Set during cg_call's pop+materialize loop to point at the call's + * in-flight CGABIValue array. When alloc_reg_or_spill exhausts the + * stack victim list, it falls back to spilling an OPK_REG arg + * storage entry from this array, rewriting it to OPK_LOCAL so the + * backend's call lowering loads it from the spill slot. NULL outside + * cg_call. Without this fallback, calls with more reg-class args + * than the pool can hold (e.g. 10+ INT args on aarch64) would have + * unreclaimable regs sitting in avs[] while the value stack runs + * out of victims. */ + CGABIValue* avs_in_flight; + u32 avs_in_flight_n; }; static void stack_grow(CG* g, u32 want) { @@ -93,11 +150,24 @@ static SValue pop(CG* g) { return g->stack[--g->sp]; } -static SValue* top(CG* g) { - if (g->sp == 0) { - compiler_panic(g->c, g->cur_loc, "cg: stack empty"); - } - return &g->stack[g->sp - 1]; + +/* Residency of an Operand as it should land on the stack at construction + * time. REG → owns a register; INDIRECT → owns its base register; the + * rest carry no register obligation. */ +static u8 residency_for(const Operand* o) { + if (o->kind == OPK_REG) return RES_REG; + if (o->kind == OPK_INDIRECT) return RES_REG; + return RES_INHERENT; +} + +static SValue make_sv(Operand op, const Type* ty) { + SValue sv; + memset(&sv, 0, sizeof sv); + sv.op = op; + sv.type = ty; + sv.res = residency_for(&op); + sv.spill_slot = FRAME_SLOT_NONE; + return sv; } /* ============================================================ @@ -187,6 +257,257 @@ static AliasKind alias_for_lvalue(const Operand* o) { } } +/* MemAccess for a load/store through an lvalue Operand. Routes through + * derive_mem with the alias root and any local-id taken from the lvalue + * itself. The Operand must be OPK_LOCAL/GLOBAL/INDIRECT. */ +static MemAccess mem_for_lvalue(CG* g, const Operand* lv, const Type* ty) { + AliasKind ak = alias_for_lvalue(lv); + i32 al = (ak == ALIAS_LOCAL) ? (i32)lv->v.frame_slot : 0; + return derive_mem(g, ty, ak, al); +} + +/* C type carried by an SValue, with a fallback to the Operand's own type + * field for SValues whose `type` was never set (notably the bare-slot + * cg_push_local path). The two should agree when both present. */ +static const Type* sv_type(const SValue* sv) { + return sv->type ? sv->type : sv->op.type; +} + +/* Build an OPK_INDIRECT Operand for [base + ofs] of type `ty`. The base + * register is always RC_INT (it holds a pointer). */ +static Operand op_indirect(Reg base, i32 ofs, const Type* ty) { + Operand o; + memset(&o, 0, sizeof o); + o.kind = OPK_INDIRECT; + o.cls = RC_INT; + o.type = ty; + o.v.ind.base = base; + o.v.ind.ofs = ofs; + return o; +} + +/* ============================================================ + * Register-pool & spill driver — see doc/REGALLOC.md + * ============================================================ */ + +/* Class an SValue's register lives in: RC_FP for float types, RC_INT for + * everything else (including OPK_INDIRECT base regs which always hold a + * pointer-sized integer). */ +static u8 class_of_sv(const SValue* sv) { + if (sv->op.kind == OPK_INDIRECT) return RC_INT; + return type_class(sv_type(sv)); +} + +/* The C type whose width matches the register an SValue currently owns. + * For OPK_INDIRECT, that's a pointer to the lvalue's type (the base reg + * holds an address); for everything else, the SValue's plain type. Used + * by the spill/reload machinery to size the slot store/load. */ +static const Type* sv_owned_reg_type(CG* g, const SValue* sv) { + if (sv->op.kind == OPK_INDIRECT) { + return type_ptr(g->pool, sv->type ? sv->type : type_void(g->pool)); + } + return sv_type(sv); +} + +static u8 reg_of_sv(const SValue* sv) { + if (sv->op.kind == OPK_REG) return (u8)sv->op.v.reg; + if (sv->op.kind == OPK_INDIRECT) return (u8)sv->op.v.ind.base; + return 0; +} + +/* Write the register an SValue owns. For OPK_INDIRECT the ind.base + * field; for OPK_REG the v.reg field. Used by ensure_reg to restore + * the reloaded reg, and by alloc_reg_or_spill to mark a freshly- + * spilled SValue's reg slot as REG_NONE so any stale read fails fast. */ +static void set_owned_reg(SValue* sv, Reg r) { + if (sv->op.kind == OPK_INDIRECT) { + sv->op.v.ind.base = r; + } else { + sv->op.v.reg = r; + } +} + +/* MemAccess for a spill/reload. Width is driven by the type of the value + * whose register is moving — pointer-sized for OPK_INDIRECT bases, + * otherwise the value's own type. The alias root is ALIAS_LOCAL with + * id 0: spill traffic is internal CG bookkeeping that opt's alias + * analysis never inspects, and the slot itself disambiguates accesses. */ +static MemAccess mem_for_spill(CG* g, const SValue* sv) { + return derive_mem(g, sv_owned_reg_type(g, sv), ALIAS_LOCAL, 0); +} + +/* Per-class spill-slot size in bytes. FP slots are 16 bytes to cover + * `double` and the spilled portion of `long double`; INT slots are + * pointer-width. Both values are also the slot's alignment. */ +static u32 spill_slot_size(u8 cls) { return (cls == RC_FP) ? 16u : 8u; } + +/* Take a spill slot from the per-class free-list, or allocate a fresh one. */ +static FrameSlot take_spill_slot(CG* g, u8 cls) { + if (g->slot_pools[cls].n > 0) { + return g->slot_pools[cls].free[--g->slot_pools[cls].n]; + } + FrameSlotDesc d; + memset(&d, 0, sizeof d); + d.size = spill_slot_size(cls); + d.align = d.size; + d.kind = FS_SPILL; + return g->target->frame_slot(g->target, &d); +} + +static void return_spill_slot(CG* g, FrameSlot s, u8 cls) { + if (s == FRAME_SLOT_NONE) return; + if (g->slot_pools[cls].n == g->slot_pools[cls].cap) { + Heap* h = g->c->env->heap; + u32 cap = g->slot_pools[cls].cap; + u32 nc = cap ? cap * 2u : CG_SPILL_FREE_INITIAL; + FrameSlot* nb = + (FrameSlot*)h->alloc(h, sizeof(FrameSlot) * nc, _Alignof(FrameSlot)); + if (g->slot_pools[cls].free) { + memcpy(nb, g->slot_pools[cls].free, + sizeof(FrameSlot) * g->slot_pools[cls].n); + h->free(h, g->slot_pools[cls].free, sizeof(FrameSlot) * cap); + } + g->slot_pools[cls].free = nb; + g->slot_pools[cls].cap = nc; + } + g->slot_pools[cls].free[g->slot_pools[cls].n++] = s; +} + +/* Walk the value stack from index 0 upward and return the first + * unpinned RES_REG SValue that matches `cls`. FIFO-from-bottom == the + * deepest live value, which matches the intuition that the top of + * stack is about to be consumed. Pinned entries are skipped — they're + * the operands of an in-flight CG op that mustn't be evicted from + * under itself (see cg_dup, which keeps its source on the stack while + * allocating the duplicate's destination register). Returns NULL if + * no eligible victim exists; that's a CG bug at this call site. */ +static SValue* pick_victim(CG* g, u8 cls) { + for (u32 i = 0; i < g->sp; ++i) { + SValue* sv = &g->stack[i]; + if (sv->res != RES_REG) continue; + if (sv->pinned) continue; + if (class_of_sv(sv) != cls) continue; + return sv; + } + return NULL; +} + +/* Release the resources owned by a single in-flight CGABIValue arg + * after the call has returned: REG storage goes back to the reg pool, + * LOCAL storage (only produced by spill_avs_victim) returns its slot + * to the per-class spill-slot pool. IMM and other kinds carry no + * runtime ownership and need nothing. */ +static void release_arg_storage(CG* g, const Operand* st) { + if (st->kind == OPK_REG) { + g->target->free_reg(g->target, st->v.reg); + } else if (st->kind == OPK_LOCAL) { + return_spill_slot(g, st->v.frame_slot, st->cls); + } +} + +/* Try to spill a CGABIValue arg from the in-flight set whose `storage` + * is an OPK_REG of `cls`. Mutates the entry: storage becomes OPK_LOCAL + * pointing at a freshly-taken spill slot, and the backend's call + * lowering will load from there when materializing the arg into its + * ABI register. Returns 1 if a victim was found and spilled, 0 if none + * was eligible. Used by alloc_reg_or_spill as a fallback when the + * value stack has no eligible victim — see CG.avs_in_flight. */ +static int spill_avs_victim(CG* g, u8 cls) { + CGTarget* T = g->target; + if (!g->avs_in_flight) return 0; + for (u32 i = 0; i < g->avs_in_flight_n; ++i) { + CGABIValue* av = &g->avs_in_flight[i]; + if (av->storage.kind != OPK_REG) continue; + if (av->storage.cls != cls) continue; + FrameSlot slot = take_spill_slot(g, cls); + SValue tmp = make_sv(av->storage, av->type); + MemAccess ma = mem_for_spill(g, &tmp); + T->spill_reg(T, av->storage, slot, ma); + /* Rewrite storage to OPK_LOCAL so the backend reads from the slot. + * The cls is preserved on the operand so the cleanup loop knows + * which spill-slot pool to return the slot to. */ + Operand local = op_local(slot, av->type); + local.cls = cls; + av->storage = local; + return 1; + } + return 0; +} + +/* Allocate a register; on pool exhaustion, spill the deepest live + * RES_REG value to a frame slot and try again. If the value stack has + * no eligible victim, fall back to spilling an in-flight cg_call arg + * (see avs_in_flight) — without that fallback, a call with more args + * than the pool size can hold becomes unsatisfiable since the popped- + * but-not-yet-emitted arg regs would sit unreclaimable in avs[]. */ +static Reg alloc_reg_or_spill(CG* g, u8 cls, const Type* ty) { + CGTarget* T = g->target; + Reg r = T->alloc_reg(T, cls, ty); + if (r != (Reg)REG_NONE) return r; + + SValue* victim = pick_victim(g, cls); + if (victim) { + FrameSlot slot = take_spill_slot(g, cls); + Operand victim_reg = op_reg((Reg)reg_of_sv(victim), victim->type); + T->spill_reg(T, victim_reg, slot, mem_for_spill(g, victim)); + + victim->spill_slot = slot; + victim->res = RES_SPILLED; + /* Mark the reg slot as REG_NONE so any stale read fails fast; the + * reload via ensure_reg will write the new reg back into the same + * field. */ + set_owned_reg(victim, (Reg)REG_NONE); + } else if (!spill_avs_victim(g, cls)) { + compiler_panic(g->c, g->cur_loc, + "cg: regalloc — no spillable victim (class %u)", + (unsigned)cls); + } + + r = T->alloc_reg(T, cls, ty); + if (r == (Reg)REG_NONE) { + compiler_panic(g->c, g->cur_loc, + "cg: regalloc — class %u still empty after spill", + (unsigned)cls); + } + return r; +} + +/* Reload a spilled SValue back into a register (possibly evicting another + * live value). After return, `sv->res == RES_REG` and the operand reflects + * the reloaded register. INDIRECT lvalues are restored to OPK_INDIRECT + * with a fresh base; the deferred-load identity is preserved. */ +static void ensure_reg(CG* g, SValue* sv) { + if (sv->res != RES_SPILLED) return; + CGTarget* T = g->target; + u8 cls = class_of_sv(sv); + const Type* ty = sv_owned_reg_type(g, sv); + Reg r = alloc_reg_or_spill(g, cls, ty); + T->reload_reg(T, op_reg(r, ty), sv->spill_slot, mem_for_spill(g, sv)); + return_spill_slot(g, sv->spill_slot, cls); + sv->spill_slot = FRAME_SLOT_NONE; + /* For INDIRECT, the lvalue's deferred-load identity is preserved: only + * the base reg changes. For everything else, refresh the whole REG + * operand so its `type` matches the SValue's current type tag. */ + if (sv->op.kind == OPK_INDIRECT) { + sv->op.v.ind.base = r; + } else { + sv->op = op_reg(r, sv_type(sv)); + } + sv->res = RES_REG; +} + +/* Release any register or spill slot owned by an SValue popped off the + * stack and not consumed by a downstream operation. */ +static void release(CG* g, SValue* sv) { + if (sv->res == RES_REG) { + g->target->free_reg(g->target, (Reg)reg_of_sv(sv)); + } else if (sv->res == RES_SPILLED) { + return_spill_slot(g, sv->spill_slot, class_of_sv(sv)); + sv->spill_slot = FRAME_SLOT_NONE; + } + sv->res = RES_INHERENT; +} + /* ============================================================ * Construction * ============================================================ */ @@ -212,6 +533,12 @@ void cg_free(CG* g) { if (!g) return; h = g->c->env->heap; if (g->stack) h->free(h, g->stack, sizeof(SValue) * g->cap); + for (u32 c = 0; c < 3; ++c) { + if (g->slot_pools[c].free) { + h->free(h, g->slot_pools[c].free, + sizeof(FrameSlot) * g->slot_pools[c].cap); + } + } h->free(h, g, sizeof *g); } @@ -227,6 +554,10 @@ void cg_func_begin(CG* g, const CGFuncDesc* fd) { g->fn_ret_type = fd->fn_type ? fd->fn_type->fn.ret : NULL; g->fn_abi = fd->abi; g->sp = 0; + /* Per-function spill-slot free-lists reset. The backing arrays are + * reused; only the counts go to zero since slot ids belong to the new + * function's frame. */ + for (u32 c = 0; c < 3; ++c) g->slot_pools[c].n = 0; /* Class-1 DWARF: a new subprogram opens. doc/DWARF.md §3.1 makes this * the parser's job; we forward through cg as a convenience hook. */ @@ -270,10 +601,7 @@ void cg_bind_decl(CG* g, DeclId id) { * ============================================================ */ void cg_push_int(CG* g, i64 v, const Type* ty) { - SValue sv; - sv.op = op_imm(v, ty); - sv.type = ty; - push(g, sv); + push(g, make_sv(op_imm(v, ty), ty)); } void cg_push_const(CG* g, ConstBytes cb) { @@ -281,15 +609,10 @@ void cg_push_const(CG* g, ConstBytes cb) { * stack value is plain rvalue REG. The constant pool / immediate-encoding * choice is the backend's. */ CGTarget* T = g->target; - Reg r = T->alloc_reg(T, type_class(cb.type), cb.type); + Reg r = alloc_reg_or_spill(g, type_class(cb.type), cb.type); Operand dst = op_reg(r, cb.type); T->load_const(T, dst, cb); - { - SValue sv; - sv.op = dst; - sv.type = cb.type; - push(g, sv); - } + push(g, make_sv(dst, cb.type)); } void cg_push_float(CG* g, double v, const Type* ty) { @@ -330,43 +653,28 @@ void cg_push_local(CG* g, FrameSlot s) { * scope record, not through cg, so the push uses NULL type and the * subsequent cg_load supplies the right type. The parser actually pushes * via the type-aware variant; this base entry is here for completeness. */ - SValue sv; - sv.op = op_local(s, NULL); - sv.type = NULL; - push(g, sv); + push(g, make_sv(op_local(s, NULL), NULL)); } /* Type-aware variants used by the parser. Not in the public header; the * parser calls these directly via a small extension below. */ void cg_push_local_typed(CG* g, FrameSlot s, const Type* ty); void cg_push_local_typed(CG* g, FrameSlot s, const Type* ty) { - SValue sv; - sv.op = op_local(s, ty); - sv.type = ty; - push(g, sv); + push(g, make_sv(op_local(s, ty), ty)); } /* Pop a pointer rvalue and push an OPK_INDIRECT lvalue for the pointee. * The parser uses this to implement unary `*`. The pointer is materialized * into a register; the resulting lvalue's MemAccess alias root is unknown * (not LOCAL/GLOBAL), which is the right conservative answer for *ptr. */ -static Operand force_reg(CG* g, SValue v, const Type* ty); +static Operand force_reg(CG* g, SValue* v, const Type* ty); void cg_deref(CG* g, const Type* pointee_ty); void cg_deref(CG* g, const Type* pointee_ty) { SValue v = pop(g); - const Type* pty = v.type ? v.type : v.op.type; - Operand src = force_reg(g, v, pty); - Operand ind; - SValue sv; - memset(&ind, 0, sizeof ind); - ind.kind = OPK_INDIRECT; - ind.cls = RC_INT; - ind.type = pointee_ty; - ind.v.ind.base = src.v.reg; - ind.v.ind.ofs = 0; - sv.op = ind; - sv.type = pointee_ty; - push(g, sv); + /* The pointer reg becomes the new lvalue's base — ownership transfers + * from `v` to the new INDIRECT SValue, so no release on `v`. */ + Operand src = force_reg(g, &v, sv_type(&v)); + push(g, make_sv(op_indirect(src.v.reg, 0, pointee_ty), pointee_ty)); } /* Read the type of the value currently on top of the stack without popping. @@ -398,25 +706,8 @@ void cg_retag_top(CG* g, const Type* ty) { g->stack[g->sp - 1].op.type = ty; } -/* Recycle the backend's scratch-register pool. Safe only when nothing on - * the value stack holds a register operand (sp == 0 in particular, but - * an all-IMM stack is also fine in principle). The parser calls this at - * statement boundaries so a function body with many sequential reg- - * allocating operations doesn't exhaust the fixed scratch window. */ -extern void aa_reset_scratch(CGTarget*); -void cg_reset_scratch(CG* g); -void cg_reset_scratch(CG* g) { - if (g->sp != 0) return; - /* For now we only know about the aarch64 backend; once a second arch - * lands we promote this to a CGTarget vtable entry. */ - aa_reset_scratch(g->target); -} - void cg_push_global(CG* g, ObjSymId sym, const Type* ty) { - SValue sv; - sv.op = op_global(sym, 0, ty); - sv.type = ty; - push(g, sv); + push(g, make_sv(op_global(sym, 0, ty), ty)); } /* ============================================================ @@ -424,8 +715,47 @@ void cg_push_global(CG* g, ObjSymId sym, const Type* ty) { * ============================================================ */ void cg_dup(CG* g) { - SValue v = *top(g); - push(g, v); + /* Duplicate the top SValue. INHERENT values (IMM/LOCAL/GLOBAL) carry + * no register ownership — copying the SValue is enough. REG-owning + * values must materialize into a second register so each side of the + * dup can be released independently; the contents come over via + * target->copy. + * + * The source stays on the stack while the destination register is + * allocated, which means alloc_reg_or_spill could in principle pick + * the source as the spill victim if pool pressure is high and there + * is no other eligible RES_REG on the stack. Pin the source for the + * duration to keep pick_victim away from it. */ + CGTarget* T = g->target; + if (g->sp == 0) compiler_panic(g->c, g->cur_loc, "cg_dup: stack empty"); + SValue* top_p = &g->stack[g->sp - 1]; + ensure_reg(g, top_p); + SValue v = *top_p; /* snapshot AFTER reload — v.op now reflects fresh reg */ + if (v.res != RES_REG) { + push(g, v); + return; + } + top_p->pinned = 1; + const Type* ty = sv_owned_reg_type(g, &v); + Reg r = alloc_reg_or_spill(g, class_of_sv(&v), ty); + T->copy(T, op_reg(r, ty), op_reg((Reg)reg_of_sv(&v), ty)); + /* Refresh the stack pointer: alloc_reg_or_spill above may have spilled + * a different stack entry (the top was pinned, so it stayed put), but + * the SValue array address could have been reallocated by a future + * grow. Today no path inside copy/spill grows the stack, but reading + * through the original `top_p` after a potential realloc would be UB + * regardless. */ + g->stack[g->sp - 1].pinned = 0; + /* The duplicate is `v` with its owned reg replaced by `r`. set_owned_reg + * writes into ind.base for INDIRECT lvalues or v.reg for REG rvalues — + * either way the duplicate ends up RES_REG with the freshly-copied + * value, independent of the source. */ + SValue dup = v; + set_owned_reg(&dup, r); + dup.res = RES_REG; + dup.pinned = 0; + dup.spill_slot = FRAME_SLOT_NONE; + push(g, dup); } void cg_swap(CG* g) { @@ -438,7 +768,10 @@ void cg_swap(CG* g) { g->stack[g->sp - 2] = a; } -void cg_drop(CG* g) { (void)pop(g); } +void cg_drop(CG* g) { + SValue v = pop(g); + release(g, &v); +} /* ============================================================ * load / store / addr @@ -451,49 +784,34 @@ static int is_lvalue(const Operand* o) { void cg_load(CG* g) { SValue v = pop(g); - CGTarget* T = g->target; + ensure_reg(g, &v); if (!is_lvalue(&v.op)) { /* Already an rvalue — passing-through is correct (cg_load is idempotent * on rvalues so the parser can call it eagerly). */ push(g, v); return; } - { - const Type* ty = v.type ? v.type : v.op.type; - Reg r = T->alloc_reg(T, type_class(ty), ty); - Operand dst = op_reg(r, ty); - MemAccess ma; - AliasKind ak = alias_for_lvalue(&v.op); - i32 alias_local = (ak == ALIAS_LOCAL) ? (i32)v.op.v.frame_slot : 0; - ma = derive_mem(g, ty, ak, alias_local); - T->load(T, dst, v.op, ma); - { - SValue rv; - rv.op = dst; - rv.type = ty; - push(g, rv); - } - } + /* force_reg's lvalue branch does exactly what cg_load wants: alloc a + * fresh value reg, T->load through the lvalue's MemAccess, free the + * old INDIRECT base if any, retag v as RES_REG. */ + const Type* ty = sv_type(&v); + Operand dst = force_reg(g, &v, ty); + push(g, make_sv(dst, ty)); } void cg_addr(CG* g) { SValue v = pop(g); CGTarget* T = g->target; + ensure_reg(g, &v); if (!is_lvalue(&v.op)) { compiler_panic(g->c, g->cur_loc, "cg_addr: operand is not an lvalue"); } - { - const Type* pty = type_ptr(g->pool, v.type ? v.type : v.op.type); - Reg r = T->alloc_reg(T, RC_INT, pty); - Operand dst = op_reg(r, pty); - T->addr_of(T, dst, v.op); - { - SValue rv; - rv.op = dst; - rv.type = pty; - push(g, rv); - } - } + const Type* pty = type_ptr(g->pool, sv_type(&v)); + Reg r = alloc_reg_or_spill(g, RC_INT, pty); + Operand dst = op_reg(r, pty); + T->addr_of(T, dst, v.op); + release(g, &v); + push(g, make_sv(dst, pty)); } void cg_store(CG* g) { @@ -506,34 +824,25 @@ void cg_store(CG* g) { SValue rv = pop(g); SValue lv = pop(g); CGTarget* T = g->target; + ensure_reg(g, &rv); + ensure_reg(g, &lv); if (!is_lvalue(&lv.op)) { compiler_panic(g->c, g->cur_loc, "cg_store: destination is not an lvalue"); } - { - const Type* ty = lv.type ? lv.type : lv.op.type; - AliasKind ak = alias_for_lvalue(&lv.op); - i32 alias_local = (ak == ALIAS_LOCAL) ? (i32)lv.op.v.frame_slot : 0; - MemAccess ma = derive_mem(g, ty, ak, alias_local); - /* IMM is a legal source for store; otherwise force into a reg. */ - Operand src = rv.op; - if (src.kind != OPK_REG && src.kind != OPK_IMM) { - Reg r = T->alloc_reg(T, type_class(ty), ty); - Operand dst = op_reg(r, ty); - MemAccess mr; - AliasKind sak = alias_for_lvalue(&src); - i32 saloc = (sak == ALIAS_LOCAL) ? (i32)src.v.frame_slot : 0; - mr = derive_mem(g, rv.type ? rv.type : ty, sak, saloc); - T->load(T, dst, src, mr); - src = dst; - } - T->store(T, lv.op, src, ma); - { - SValue out; - out.op = src; - out.type = ty; - push(g, out); - } + const Type* ty = sv_type(&lv); + /* IMM is a legal source for store; otherwise force the rvalue into a + * register. force_reg handles the lvalue → REG transition cleanly. */ + Operand src; + if (rv.op.kind == OPK_IMM || rv.op.kind == OPK_REG) { + src = rv.op; + } else { + src = force_reg(g, &rv, sv_type(&rv)); } + T->store(T, lv.op, src, mem_for_lvalue(g, &lv.op, ty)); + release(g, &lv); + /* Result of assignment expression: leave the stored rvalue on top. + * Ownership of any reg in `src` transfers to the new SValue. */ + push(g, make_sv(src, ty)); } /* ============================================================ @@ -561,26 +870,32 @@ void cg_bitfield_store(CG* g, BitFieldAccess b) { * Arithmetic / compare / convert * ============================================================ */ -/* Force an SValue into a register operand of the given type. */ -static Operand force_reg(CG* g, SValue v, const Type* ty) { +/* Force an SValue (already popped, by reference) into a register operand + * of the given type. Mutates `*v` so that v->op is OPK_REG and v->res is + * RES_REG; on lvalue inputs this means the original lvalue's base reg is + * freed and replaced by the freshly-loaded value reg. The caller can + * then release(g, v) to give the register back when the operation is + * done with it. */ +static Operand force_reg(CG* g, SValue* v, const Type* ty) { CGTarget* T = g->target; - if (v.op.kind == OPK_REG) return v.op; - if (v.op.kind == OPK_IMM) { - Reg r = T->alloc_reg(T, type_class(ty), ty); - Operand dst = op_reg(r, ty); - T->load_imm(T, dst, v.op.v.imm); - return dst; - } - if (is_lvalue(&v.op)) { - Reg r = T->alloc_reg(T, type_class(ty), ty); - Operand dst = op_reg(r, ty); - AliasKind ak = alias_for_lvalue(&v.op); - i32 al = (ak == ALIAS_LOCAL) ? (i32)v.op.v.frame_slot : 0; - MemAccess ma = derive_mem(g, ty, ak, al); - T->load(T, dst, v.op, ma); - return dst; + ensure_reg(g, v); + if (v->op.kind == OPK_REG) return v->op; + Reg r = alloc_reg_or_spill(g, type_class(ty), ty); + Operand dst = op_reg(r, ty); + if (v->op.kind == OPK_IMM) { + T->load_imm(T, dst, v->op.v.imm); + } else if (is_lvalue(&v->op)) { + T->load(T, dst, v->op, mem_for_lvalue(g, &v->op, ty)); + /* Old INDIRECT base reg is no longer referenced — release it. */ + if (v->op.kind == OPK_INDIRECT) { + T->free_reg(T, v->op.v.ind.base); + } + } else { + compiler_panic(g->c, g->cur_loc, "cg: cannot force operand to register"); } - compiler_panic(g->c, g->cur_loc, "cg: cannot force operand to register"); + v->op = dst; + v->res = RES_REG; + return dst; } void cg_binop(CG* g, BinOp op) { @@ -590,33 +905,26 @@ void cg_binop(CG* g, BinOp op) { CGTarget* T = g->target; /* Result type is `a`'s type at this slice (parser already coerced). */ const Type* ty = a.type ? a.type : b.type; - Operand ra = force_reg(g, a, ty); - Operand rb = force_reg(g, b, ty); - Reg rr = T->alloc_reg(T, type_class(ty), ty); + Operand ra = force_reg(g, &a, ty); + Operand rb = force_reg(g, &b, ty); + Reg rr = alloc_reg_or_spill(g, type_class(ty), ty); Operand dst = op_reg(rr, ty); T->binop(T, op, dst, ra, rb); - { - SValue sv; - sv.op = dst; - sv.type = ty; - push(g, sv); - } + release(g, &a); + release(g, &b); + push(g, make_sv(dst, ty)); } void cg_unop(CG* g, UnOp op) { SValue a = pop(g); CGTarget* T = g->target; const Type* ty = a.type ? a.type : a.op.type; - Operand ra = force_reg(g, a, ty); - Reg rr = T->alloc_reg(T, type_class(ty), ty); + Operand ra = force_reg(g, &a, ty); + Reg rr = alloc_reg_or_spill(g, type_class(ty), ty); Operand dst = op_reg(rr, ty); T->unop(T, op, dst, ra); - { - SValue sv; - sv.op = dst; - sv.type = ty; - push(g, sv); - } + release(g, &a); + push(g, make_sv(dst, ty)); } void cg_cmp(CG* g, CmpOp op) { @@ -626,17 +934,14 @@ void cg_cmp(CG* g, CmpOp op) { CGTarget* T = g->target; const Type* opty = a.type ? a.type : b.type; const Type* i32 = type_prim(g->pool, TY_INT); - Operand ra = force_reg(g, a, opty); - Operand rb = force_reg(g, b, opty); - Reg rr = T->alloc_reg(T, RC_INT, i32); + Operand ra = force_reg(g, &a, opty); + Operand rb = force_reg(g, &b, opty); + Reg rr = alloc_reg_or_spill(g, RC_INT, i32); Operand dst = op_reg(rr, i32); T->cmp(T, op, dst, ra, rb); - { - SValue sv; - sv.op = dst; - sv.type = i32; - push(g, sv); - } + release(g, &a); + release(g, &b); + push(g, make_sv(dst, i32)); } void cg_inc_dec(CG* g, BinOp op, int post) { @@ -644,45 +949,31 @@ void cg_inc_dec(CG* g, BinOp op, int post) { * because juggling lv + old + new through dup/swap from outside requires * a 3-element rotate the stack API doesn't expose. */ CGTarget* T = g->target; - SValue lv; - const Type* ty; - AliasKind ak; - i32 alias_local; - MemAccess ma; - Reg r_old; - Reg r_new; - Operand o_old; - Operand o_new; - Operand o_one; - - lv = pop(g); + SValue lv = pop(g); + ensure_reg(g, &lv); if (!is_lvalue(&lv.op)) { compiler_panic(g->c, g->cur_loc, "cg_inc_dec: target is not an lvalue"); } - ty = lv.type ? lv.type : lv.op.type; - ak = alias_for_lvalue(&lv.op); - alias_local = (ak == ALIAS_LOCAL) ? (i32)lv.op.v.frame_slot : 0; - ma = derive_mem(g, ty, ak, alias_local); + const Type* ty = sv_type(&lv); + MemAccess ma = mem_for_lvalue(g, &lv.op, ty); /* Load current value into r_old, compute r_new = r_old +/- 1, store back. */ - r_old = T->alloc_reg(T, type_class(ty), ty); - o_old = op_reg(r_old, ty); + Reg r_old = alloc_reg_or_spill(g, type_class(ty), ty); + Operand o_old = op_reg(r_old, ty); T->load(T, o_old, lv.op, ma); - r_new = T->alloc_reg(T, type_class(ty), ty); - o_new = op_reg(r_new, ty); - o_one = op_imm(1, ty); - T->binop(T, op, o_new, o_old, o_one); + Reg r_new = alloc_reg_or_spill(g, type_class(ty), ty); + Operand o_new = op_reg(r_new, ty); + T->binop(T, op, o_new, o_old, op_imm(1, ty)); T->store(T, lv.op, o_new, ma); - { - SValue sv; - sv.op = post ? o_old : o_new; - sv.type = ty; - push(g, sv); - } + /* Free whichever register is NOT being returned, plus any base reg the + * lvalue owned. */ + T->free_reg(T, post ? r_new : r_old); + release(g, &lv); + push(g, make_sv(post ? o_old : o_new, ty)); } void cg_convert(CG* g, const Type* dst_ty) { @@ -698,8 +989,8 @@ void cg_convert(CG* g, const Type* dst_ty) { push(g, v); return; } - src = force_reg(g, v, sty); - rr = T->alloc_reg(T, type_class(dst_ty), dst_ty); + src = force_reg(g, &v, sty); + rr = alloc_reg_or_spill(g, type_class(dst_ty), dst_ty); dst = op_reg(rr, dst_ty); /* Pick a ConvKind from src/dst kinds. v1 spine only sees integer↔integer * (sign/zero ext + trunc); float and bitcast follow the same dispatch. */ @@ -733,12 +1024,8 @@ void cg_convert(CG* g, const Type* dst_ty) { } } T->convert(T, ck, dst, src); - { - SValue sv; - sv.op = dst; - sv.type = dst_ty; - push(g, sv); - } + release(g, &v); + push(g, make_sv(dst, dst_ty)); } /* ============================================================ @@ -749,60 +1036,51 @@ void cg_call(CG* g, u32 nargs, const Type* fn_type) { /* stack: [..., callee, arg0..argN-1] → [result] (or nothing if void) */ CGTarget* T = g->target; const ABIFuncInfo* abi = abi_func_info(g->abi, fn_type); - CGABIValue* avs = NULL; - CGABIValue ret_v; - CGCallDesc desc; - Operand callee_op; - SValue callee; - u32 i; + const Type* ret_ty = fn_type->fn.ret; + int has_result = ret_ty && ret_ty->kind != TY_VOID; if (g->sp < (u32)nargs + 1u) { compiler_panic(g->c, g->cur_loc, "cg_call: stack underflow"); } + CGABIValue* avs = NULL; if (nargs) { avs = arena_array(g->c->tu, CGABIValue, nargs); memset(avs, 0, sizeof(CGABIValue) * nargs); } - /* Pop args in reverse so we can fill avs[i] in declaration order. */ - for (i = 0; i < nargs; ++i) { + /* Expose avs to the regalloc fallback. As we pop and materialize args + * one at a time, the popped regs accumulate in avs[] off the value + * stack, where pick_victim can't reach them. If pressure exhausts the + * pool while reloading a spilled arg later in the loop, spill_avs_victim + * picks an already-materialized avs entry, stores it to a frame slot, + * and rewrites avs[i].storage to OPK_LOCAL — the backend's call + * lowering loads from the slot. */ + g->avs_in_flight = avs; + g->avs_in_flight_n = nargs; + + /* Pop args in reverse so we can fill avs[i] in declaration order. + * Lvalues materialize into a register through force_reg (which also + * frees an old INDIRECT base); OPK_IMM and OPK_REG pass through so + * the call sees the same operand. */ + for (u32 i = 0; i < nargs; ++i) { u32 idx = nargs - 1u - i; SValue arg = pop(g); + ensure_reg(g, &arg); const Type* aty = fn_type->fn.params ? fn_type->fn.params[idx] : arg.type; - Operand src; - /* Materialize into an Operand the backend can route through ABI parts. - * For simple scalars REG/IMM is enough; aggregates would force LOCAL. */ - if (arg.op.kind == OPK_LOCAL || arg.op.kind == OPK_GLOBAL || - arg.op.kind == OPK_INDIRECT) { - /* lvalue: backend may need an address (byval/indirect) or a loaded - * value. Spine: scalars only — load to register. */ - Reg r = T->alloc_reg(T, type_class(aty), aty); - Operand dst = op_reg(r, aty); - AliasKind ak = alias_for_lvalue(&arg.op); - i32 al = (ak == ALIAS_LOCAL) ? (i32)arg.op.v.frame_slot : 0; - MemAccess ma = derive_mem(g, aty, ak, al); - T->load(T, dst, arg.op, ma); - src = dst; - } else if (arg.op.kind == OPK_IMM) { - src = arg.op; - } else { - src = arg.op; - } avs[idx].type = aty; avs[idx].abi = &abi->params[idx]; - avs[idx].storage = src; - avs[idx].parts = NULL; - avs[idx].nparts = 0; + avs[idx].storage = is_lvalue(&arg.op) ? force_reg(g, &arg, aty) : arg.op; } - callee = pop(g); - if (callee.op.kind == OPK_GLOBAL) { - callee_op = callee.op; - } else { - /* Indirect call — force into a register if necessary. */ - callee_op = force_reg(g, callee, fn_type); - } + SValue callee = pop(g); + ensure_reg(g, &callee); + /* Direct calls keep the OPK_GLOBAL operand; indirect calls force the + * function pointer into a register. */ + Operand callee_op = (callee.op.kind == OPK_GLOBAL) + ? callee.op + : force_reg(g, &callee, fn_type); + CGCallDesc desc; memset(&desc, 0, sizeof desc); desc.fn_type = fn_type; desc.abi = abi; @@ -810,24 +1088,29 @@ void cg_call(CG* g, u32 nargs, const Type* fn_type) { desc.args = avs; desc.nargs = nargs; desc.flags = CG_CALL_NONE; - /* Return storage: REG of the right class for scalar returns; struct - * returns would set parts/storage differently. */ - memset(&ret_v, 0, sizeof ret_v); - ret_v.type = fn_type->fn.ret; - ret_v.abi = &abi->ret; - if (ret_v.type && ret_v.type->kind != TY_VOID) { - Reg r = T->alloc_reg(T, type_class(ret_v.type), ret_v.type); - ret_v.storage = op_reg(r, ret_v.type); + desc.ret.type = ret_ty; + desc.ret.abi = &abi->ret; + if (has_result) { + Reg r = alloc_reg_or_spill(g, type_class(ret_ty), ret_ty); + desc.ret.storage = op_reg(r, ret_ty); } - desc.ret = ret_v; T->call(T, &desc); - if (ret_v.type && ret_v.type->kind != TY_VOID) { - SValue sv; - sv.op = ret_v.storage; - sv.type = ret_v.type; - push(g, sv); + /* Tear down the in-flight arg set: each entry's storage may be a REG + * (return to pool) or OPK_LOCAL (a spill slot, return to per-class + * free-list). IMMs carry no runtime ownership. */ + for (u32 i = 0; i < nargs; ++i) { + release_arg_storage(g, &avs[i].storage); + } + g->avs_in_flight = NULL; + g->avs_in_flight_n = 0; + + if (callee.op.kind != OPK_GLOBAL) { + T->free_reg(T, callee_op.v.reg); + } + if (has_result) { + push(g, make_sv(desc.ret.storage, ret_ty)); } } @@ -848,13 +1131,14 @@ void cg_ret(CG* g, int has_value) { { SValue v = pop(g); const Type* rty = g->fn_ret_type; - Operand ret_op = force_reg(g, v, rty); + Operand ret_op = force_reg(g, &v, rty); CGABIValue av; memset(&av, 0, sizeof av); av.type = rty; av.abi = &abi->ret; av.storage = ret_op; T->ret(T, &av); + release(g, &v); } } @@ -869,22 +1153,15 @@ void cg_alloca(CG* g) { * size up to a 16-byte multiple, which is what keeps SP aligned). */ CGTarget* T = g->target; SValue sz = pop(g); - Operand sz_op; const Type* void_ptr = type_ptr(g->pool, type_void(g->pool)); - Reg dst_r; - Operand dst; - SValue out; - if (sz.op.kind == OPK_IMM) { - sz_op = sz.op; - } else { - sz_op = force_reg(g, sz, sz.type ? sz.type : sz.op.type); - } - dst_r = T->alloc_reg(T, RC_INT, void_ptr); - dst = op_reg(dst_r, void_ptr); + ensure_reg(g, &sz); + Operand sz_op = + (sz.op.kind == OPK_IMM) ? sz.op : force_reg(g, &sz, sv_type(&sz)); + Reg dst_r = alloc_reg_or_spill(g, RC_INT, void_ptr); + Operand dst = op_reg(dst_r, void_ptr); T->alloca_(T, dst, sz_op, /*align=*/16); - out.op = dst; - out.type = void_ptr; - push(g, out); + release(g, &sz); + push(g, make_sv(dst, void_ptr)); } void cg_va_start_(CG* g) { compiler_panic(g->c, g->cur_loc, "cg_va_start: not in v1 slice"); @@ -945,9 +1222,10 @@ void cg_branch_true(CG* g, CGLabel l) { SValue v = pop(g); CGTarget* T = g->target; const Type* ty = v.type ? v.type : type_prim(g->pool, TY_INT); - Operand a = force_reg(g, v, ty); + Operand a = force_reg(g, &v, ty); Operand zero = op_imm(0, ty); T->cmp_branch(T, CMP_NE, a, zero, (Label)l); + release(g, &v); } void cg_branch_false(CG* g, CGLabel l) { @@ -962,12 +1240,14 @@ void cg_branch_false(CG* g, CGLabel l) { if (v.op.v.imm == 0) { T->jump(T, (Label)l); } + release(g, &v); return; } { - Operand a = force_reg(g, v, ty); + Operand a = force_reg(g, &v, ty); Operand zero = op_imm(0, ty); T->cmp_branch(T, CMP_EQ, a, zero, (Label)l); + release(g, &v); } } @@ -986,7 +1266,11 @@ CGScope cg_scope_begin(CG* g, CGScopeConfig cfg) { /* Pop the condition. */ SValue v = pop(g); const Type* ty = v.type ? v.type : type_prim(g->pool, TY_INT); - d.cond = force_reg(g, v, ty); + d.cond = force_reg(g, &v, ty); + /* The cond reg is consumed by the backend's scope_begin emit; once + * the comparison/branch is in flight there's no live use, so free + * it back to the pool now. */ + release(g, &v); } return (CGScope)g->target->scope_begin(g->target, &d); } diff --git a/src/parse/parse.c b/src/parse/parse.c @@ -53,7 +53,6 @@ extern void cg_retag_top(CG*, const Type*); /* Recycle the backend's scratch-register pool when no value-stack entry * holds a live register. Called at statement boundaries to avoid * exhausting the fixed scratch window over the course of a function. */ -extern void cg_reset_scratch(CG*); /* ============================================================ * Keywords @@ -1636,7 +1635,6 @@ static int parse_decl_suffix(Parser* p, DeclSuffix* out) { cg_swap(p->cg); cg_store(p->cg); cg_drop(p->cg); - cg_reset_scratch(p->cg); p->vla_pending = 1; p->vla_pending_count_slot = out->vla_count_slot; } @@ -1823,7 +1821,6 @@ static void zero_init_at(Parser* p, FrameSlot slot, const Type* arr_ty, cg_push_int(p->cg, 0, ty); cg_store(p->cg); cg_drop(p->cg); - cg_reset_scratch(p->cg); } /* Parse the initializer for the sub-object at `offset` of type `ty`. Arrays @@ -1862,7 +1859,6 @@ static void init_at(Parser* p, FrameSlot slot, const Type* arr_ty, u32 offset, to_rvalue(p); cg_store(p->cg); cg_drop(p->cg); - cg_reset_scratch(p->cg); if (had_brace) { accept_punct(p, ','); /* tolerate trailing comma inside `{ x, }` */ expect_punct(p, '}', "'}' after scalar initializer"); @@ -1900,7 +1896,6 @@ static void parse_init_declarator(Parser* p, const DeclSpecs* specs) { cg_swap(p->cg); cg_store(p->cg); cg_drop(p->cg); - cg_reset_scratch(p->cg); if (accept_punct(p, '=')) { perr(p, "VLA initializers are not allowed (§6.7.9 ¶3)"); } @@ -2092,7 +2087,6 @@ static void parse_stmt(Parser* p) { /* Each statement starts from an empty value stack; recycle scratch * registers so a function body with many sequential reg-allocating * operations isn't bounded by the backend's fixed scratch window. */ - cg_reset_scratch(p->cg); cg_set_loc(p->cg, tok_loc(&p->cur)); if (is_punct(&p->cur, '{')) { parse_compound_stmt(p); diff --git a/test/parse/CORPUS.md b/test/parse/CORPUS.md @@ -143,6 +143,7 @@ here for completeness once they're real cases. | `6_5_30_generic_selection`| · | `int x=42; return _Generic((x), int: x, default: 0);` | 42 | | `6_5_31_subscript_commute`| ★ | `int a[5]={0,0,42,0,0}; return 2[a];` | 42 | | `6_5_32_string_subscript` | ★ | `return "*"[0];` | 42 | +| `6_5_33_regalloc_spill` | ★ | 12-arg `sum12(x1+0, ..., x12+0)` — exceeds the 10-INT scratch pool, exercises `spill_reg`/`reload_reg` and the cg_call avs-in-flight fallback (see doc/REGALLOC.md) | 78 | ## §6.6 Constant expressions diff --git a/test/parse/cases/6_5_33_regalloc_spill.c b/test/parse/cases/6_5_33_regalloc_spill.c @@ -0,0 +1,32 @@ +/* Regalloc spill stress. + * + * Each `xN + 0` arg is a binop that leaves a register-resident rvalue on + * the cg value stack until cg_call materializes it. With 12 simultaneously + * accumulating arg values and a 10-INT scratch pool on aarch64, several + * code paths the spill driver must handle get exercised: + * + * - alloc_reg_or_spill returning REG_NONE on pool exhaustion and + * picking the deepest unpinned RES_REG via pick_victim, then routing + * it through spill_reg / take_spill_slot. + * - ensure_reg reloading SPILLED args inside cg_call's arg-pop loop, + * consuming pool slots as it goes. + * - The avs-in-flight fallback: once cg_call has popped enough RES_REG + * args into avs[] to drain the pool, alloc_reg_or_spill finds no + * stack victim (later args are SPILLED, earlier args are off-stack + * in avs[]) and falls back to spilling an OPK_REG avs entry through + * spill_avs_victim, rewriting it to OPK_LOCAL so the backend's call + * lowering loads from the spill slot. + * + * 1+2+...+12 = 78. */ + +int sum12(int a, int b, int c, int d, int e, int f, + int g, int h, int i, int j, int k, int l) { + return a + b + c + d + e + f + g + h + i + j + k + l; +} + +int test_main(void) { + int x1 = 1, x2 = 2, x3 = 3, x4 = 4, x5 = 5, x6 = 6; + int x7 = 7, x8 = 8, x9 = 9, x10 = 10, x11 = 11, x12 = 12; + return sum12(x1 + 0, x2 + 0, x3 + 0, x4 + 0, x5 + 0, x6 + 0, + x7 + 0, x8 + 0, x9 + 0, x10 + 0, x11 + 0, x12 + 0); +} diff --git a/test/parse/cases/6_5_33_regalloc_spill.expected b/test/parse/cases/6_5_33_regalloc_spill.expected @@ -0,0 +1 @@ +78