kit

git clone https://git.ryansepassi.com/git/kit.git

commit 889ad29eec3c9e6ab03bc7592c419335285b2463
parent 96469fa516e26b1e76409a026678a264cb06a928
Author: Ryan Sepassi <rsepassi@gmail.com>
Date:   Tue, 19 May 2026 15:29:41 -0700

cg: extract call-shape helpers and central storage-shape predicate

Preparatory refactor for an ABI-driven storage-shape decision (see
doc/CBACKEND.md). Adds api_arg_storage_must_be_addr as the single
predicate consulted by api_release_arg_storage and the new
api_alloc_call_ret_storage helper. Dedupes cfree_cg_call and
api_call_symbol_common via api_alloc_call_args, api_pack_call_arg,
api_alloc_call_ret_storage, api_release_call_args, and
api_push_call_result. Behavior preserving: test-cg-api (610), test-opt,
test-toy, test-smoke-x64 all green.

Diffstat:
A doc/CBACKEND.md | 298 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
M src/api/cg.c    | 239 ++++++++++++++++++++++++++++++++++++-------------------------------------------
2 files changed, 406 insertions(+), 131 deletions(-)

diff --git a/doc/CBACKEND.md b/doc/CBACKEND.md
@@ -0,0 +1,298 @@
+# C Source Backend and ABI Storage-Shape Refactor
+
+## Motivation
+
+cfree's no-deps posture rules out linking against LLVM or GCC's optimizer
+directly. The practical path to "industrial-strength" optimization for cfree
+users is to emit C from the CG layer and hand the result to gcc/clang. A C
+backend lives at the same layer as `arch_impl_aa64`, `arch_impl_x64`, etc.: a
+new `arch_impl_c` with its own `CGTarget` and ABI vtable. Frontends do not need
+to know it exists.
+
+GCC/clang-extension C covers what looked like blockers on first read:
+
+- inline asm — `IRAsmAux` is already GCC's `asm(tmpl : outs : ins : clobbers)` shape.
+- overflow/trap — `__builtin_{add,sub,mul}_overflow`, `__builtin_trap`,
+  `__builtin_unreachable`.
+- atomics — `_Atomic` + `<stdatomic.h>` with explicit `memory_order_*`.
+- TLS — `__thread` or `_Thread_local`.
+
+The real blocker is one layer up: the CG layer makes aggregate-passing
+decisions that bypass the ABI vtable. A trivial "C ABI" — every arg
+`ABI_ARG_DIRECT` with one full-coverage part — would still see the CG layer
+materialize aggregates as addresses and allocate sret slots.
+
+This document plans the refactor that makes those decisions ABI-driven, so a
+trivial C ABI vtable produces value-shaped storage suitable for emitting
+`ret = f(a, b, c)` C source.
+
+## Current State
+
+### ABI Vtable Selection
+
+Native ABIs already classify small aggregates as `ABI_ARG_DIRECT` with multiple
+parts (e.g. SysV-x64 splits a 16B struct into two `ABI_CLASS_INT` parts in
+`src/abi/abi_sysv_x64.c:53-71`). Large aggregates classify as
+`ABI_ARG_INDIRECT`. The ABI vtable is selected per-target via
+`ArchImpl.abi_vtable` (`src/arch/arch.h:899`) and dispatched through
+`abi_init` → `select_vtable` (`src/abi/abi.c:176`).
+
+### Preparatory Refactors (landed)
+
+Two preparatory passes shaped `src/api/cg.c` so the functional change can be
+small and confined to a couple of helper bodies:
+
+**Prep A — central predicate.** Added a single helper that today encodes the
+type-shape decision; future change rewrites only its body.
+
+```c
+/* src/api/cg.c:1323 */
+static int api_arg_storage_must_be_addr(Compiler *c, CfreeCgTypeId ty) {
+  return cg_type_is_aggregate(c, ty) || api_is_wide16_scalar_type(c, ty);
+}
+```
+
+Used by `api_release_arg_storage` (`src/api/cg.c:2129`) and
+`api_alloc_call_ret_storage` (`src/api/cg.c:6408`).
+
+**Prep B — call-shape helpers.** `cfree_cg_call` and `api_call_symbol_common`
+shared a ~80-line duplicated body. Extracted five helpers and reduced both
+public entry points to thin orchestration:
+
+| Helper                          | Location              | Role                                       |
+| ------------------------------- | --------------------- | ------------------------------------------ |
+| `api_alloc_call_args`           | `src/api/cg.c:6363`   | `avs` array + `avs_in_flight` setup        |
+| `api_pack_call_arg`             | `src/api/cg.c:6374`   | per-arg type resolution + 3-way packaging  |
+| `api_alloc_call_ret_storage`    | `src/api/cg.c:6406`   | return slot vs return register             |
+| `api_release_call_args`         | `src/api/cg.c:6424`   | post-call release loop                     |
+| `api_push_call_result`          | `src/api/cg.c:6432`   | lv/sv push based on storage kind           |
+
+After Prep A+B, the CG-side surface area that needs to change is reduced to
+two helper bodies and one as-yet-unextracted ret packaging function.
+
+### Remaining Predicate Sites
+
+After the prep refactors, the type-shape decisions that still need to become
+ABI-driven live in just three places:
+
+1. **`api_arg_storage_must_be_addr`** (`src/api/cg.c:1323`) — the central
+   predicate consulted by `api_release_arg_storage` and
+   `api_alloc_call_ret_storage`. Today: `is_aggregate || wide16`.
+2. **`api_pack_call_arg`** (`src/api/cg.c:6374`) — per-arg packaging still
+   has a three-way switch (`api_is_wide16_scalar_type` at line 6387,
+   `cg_type_is_aggregate` at 6392, scalar fall-through). The three branches
+   collapse to "address-shaped" vs "value-shaped" under ABI control.
+3. **`cfree_cg_ret`** (`src/api/cg.c:6636`) — ret packaging still has the
+   same three-way switch inline (`is_aggregate` at 6654, `wide16` at 6662).
+   Not extracted yet because Prep B's scope was call/call_symbol dedupe.
+
+Together these three are the entire functional surface for Phase 1.
+
+## Refactor Plan
+
+### Invariant to Introduce
+
+`CGABIValue.storage` shape is determined by an ABI helper, not by
+`cg_type_is_aggregate`:
+
+```c
+/* In src/abi/abi.h */
+typedef enum ABIStorageShape {
+  ABI_STORAGE_VALUE, /* storage is the value itself (REG/IMM/GLOBAL/LOCAL) */
+  ABI_STORAGE_ADDR,  /* storage is the address of a memory image */
+} ABIStorageShape;
+
+ABIStorageShape abi_arg_storage_shape(const ABIArgInfo*, u32 type_size);
+```
+
+Rule:
+
+- `ABI_STORAGE_ADDR` iff `kind == ABI_ARG_INDIRECT`, **or**
+  `kind == ABI_ARG_DIRECT && (nparts > 1 || parts[0].src_offset != 0 ||
+  parts[0].size != type_size)`.
+- Otherwise `ABI_STORAGE_VALUE`.
+
+This makes today's native behavior fall out unchanged: small structs
+(multi-part DIRECT) → ADDR; large structs (INDIRECT) → ADDR. Only a
+trivial DIRECT — one part, full coverage, zero offset — produces VALUE,
+which is exactly what the C target will register.
+
+### Touch List
+
+**`src/abi/abi.h` / `src/abi/abi.c`** — add `ABIStorageShape` enum and
+`abi_arg_storage_shape()`.
+
+**`src/api/cg.c`** — rewrite three function bodies:
+
+| Site                              | Location              | Change                                                                 |
+| --------------------------------- | --------------------- | ---------------------------------------------------------------------- |
+| `api_arg_storage_must_be_addr`    | `src/api/cg.c:1323`   | body becomes `abi_arg_storage_shape(abi, size) == ABI_STORAGE_ADDR`    |
+| `api_pack_call_arg`               | `src/api/cg.c:6374`   | collapse 3-way switch to `must_be_addr`-driven materialization         |
+| `cfree_cg_ret`                    | `src/api/cg.c:6636`   | same collapse, OR pre-extract a `api_pack_ret_value` helper first      |
+
+Optionally extract `api_pack_ret_value` from `cfree_cg_ret` as Prep C before
+the functional change, so the three-way collapse lives in helper bodies
+rather than mid-public-function. Small, mechanical, ~20 LOC.
+
+**`src/abi/abi_sysv_x64.c`, `abi_aapcs64.c`, `abi_apple_arm64.c`,
+`abi_rv64.c`** — extend `classify_one`/`classify_scalar` to classify wide16
+scalars (i128, long double) as `ABI_ARG_DIRECT` with multi-part shape. See
+Phase 2 below.
+
+**Untouched** — other `cg_type_is_aggregate` sites in `cg.c` (lines 1754,
+1782, 3823ff, 3945ff, 5094, 5103). Those handle assignments, lvalue
+conversion, and address-of. They are correctly aggregate-policy, not
+ABI-policy.
+
+**Native backends** — no expected change. They already consult
+`desc.args[i].abi` and the invariant preserves what they see today.
+
+## Phasing
+
+### Phase 0 (done) — preparatory refactors
+
+- Prep A: central `api_arg_storage_must_be_addr` predicate.
+- Prep B: extract `api_alloc_call_args`, `api_pack_call_arg`,
+  `api_alloc_call_ret_storage`, `api_release_call_args`,
+  `api_push_call_result`.
+- Verified by `test-cg-api` (610 pass), `test-opt`, `test-toy`,
+  `test-smoke-x64`.
+
+### Phase 1 — Helper bodies become ABI-driven
+
+- Add `abi_arg_storage_shape()` in `abi.h`/`abi.c`.
+- Rewrite `api_arg_storage_must_be_addr` body to delegate to the new helper
+  (needs `const ABIArgInfo*` and `type_size` — adjust the helper signature
+  accordingly, and pass them through from `api_pack_call_arg` /
+  `api_alloc_call_ret_storage` / `api_release_arg_storage`).
+- Collapse the three-way switches in `api_pack_call_arg` and `cfree_cg_ret`
+  (or extracted `api_pack_ret_value`) into a single `must_be_addr` branch.
+
+Acceptance: `make test-cg-api test-opt test-link test-elf test-toy
+test-smoke-x64 test-smoke-rv64 test-aa64-inline` pass; spot-check `.o`
+outputs on a representative corpus against the current state to confirm
+byte-identical codegen for native ABIs.
+
+### Phase 2 — Migrate wide16 to ABI classification
+
+Today `api_is_wide16_scalar_type` papers over incomplete ABI classifiers in
+some native targets (see Risks below). Phase 2 fixes the classifiers, then
+removes the wide16-specific code path from the predicate.
+
+- Fix SysV-x64 `classify_scalar` to emit DIRECT/2-INT-parts for
+  `ti.size == 16 && ti.scalar_kind == ABI_SC_INT` (the i128 case),
+  matching what RV64 and AAPCS64 already do.
+- Defer long-double-as-FP correctness — long double passes through
+  memory in current cfree even on native targets, and the existing
+  wide16 shortcut effectively forces that. Either retain the
+  `is_wide16` check just for long-double cases (narrow the branch),
+  or introduce a dedicated x87 / 16B-FP ABI class as a separate piece
+  of work. The C-backend refactor does not require this fix.
+- After classifiers are correct, drop the `api_is_wide16_scalar_type`
+  clause from `api_arg_storage_must_be_addr`.
+
+Acceptance: same as Phase 1, plus `test-libc` (long double through
+musl/glibc paths) and any i128 coverage.
+
+### Phase 3 — Negative-control fixture
+
+Add a unit test in `test/api/` that constructs a `Compiler` with a
+synthetic ABI vtable returning trivial DIRECT/one-full-part for everything,
+drives `cfree_cg_call` with an aggregate arg and aggregate return, and
+asserts `desc.args[0].storage.kind != OPK_INDIRECT` and that no sret frame
+slot was allocated.
+
+This fixture locks in the new invariant so future changes cannot
+accidentally regress to always-address-for-aggregate.
+
+### Phase 4 (out of scope here, but enabled by this work)
+
+- Add `arch_impl_c` with a `c_abi_vtable` whose `compute_func_info` returns
+  trivial DIRECT/one-full-part for every arg and return.
+- Stub `cgtarget_new` to a placeholder that records call/ret shapes for
+  inspection.
+- The actual C-source emitter is a separate piece of work, driven by the
+  recorded `CGCallDesc` shape that this refactor now makes value-typed for
+  aggregates.
+
+## Risks and Open Items
+
+Investigated post-plan and after the prep refactors:
+
+- **`api_release_arg_storage` (resolved by Prep A).**
+  Originally a fifth open site; now uses `api_arg_storage_must_be_addr`
+  directly (`src/api/cg.c:2129`). Resolution: same helper drives the
+  decision.
+
+- **`call_symbol` duplication (resolved by Prep B).**
+  Both `cfree_cg_call` and `api_call_symbol_common` now share the five
+  extracted helpers and contain only the call-shape orchestration. Drift
+  is no longer a maintenance concern.
+
+- **`fn_abi` is reliably non-null inside a function body.**
+  Set at `cfree_cg_func_begin` (`src/api/cg.c:3125`) and cleared at
+  `func_end` (line 3149). `cfree_cg_ret` only runs within that window.
+  No null-safe fallback needed.
+
+- **CGCallPlan / backends are already fully ABI-driven.**
+  Grep across `src/arch/` finds zero `cg_type_is_aggregate` references.
+  Every site branches on `ai->kind == ABI_ARG_INDIRECT` or iterates
+  `ai->parts`. Examples: `arch/aa64/ops.c:904`, `arch/x64/alloc.c:54`,
+  `arch/rv64/opt_coord.c:178`. The new invariant preserves what native
+  backends observe (multi-part DIRECT aggregates still produce ADDR
+  storage), so backends do not change.
+
+- **Wide16 classification is incomplete in some native ABIs — this is
+  the biggest finding and Phase 2's largest hidden cost.**
+  Today the wide16 check in `api_arg_storage_must_be_addr` papers over
+  bugs in the underlying ABI classifiers. Per-target status:
+
+  - **RV64** (`src/abi/abi_rv64.c:23-43`): correctly classifies 16B
+    INT or FLOAT scalars as DIRECT with two 8B INT parts. ✓
+  - **AAPCS64** (`src/abi/abi_aapcs64.c:23-39`): correctly classifies
+    16B INT scalars (i128) as DIRECT/2-parts. **Missing**: 16B FP
+    (long double on ARM64) should be DIRECT with one or two FP parts
+    in Q-registers, not fall through to single 16B INT part.
+  - **Apple ARM64** (`src/abi/abi_apple_arm64.c`): delegates to AAPCS64;
+    inherits the same long-double gap.
+  - **SysV-x64** (`src/abi/abi_sysv_x64.c:28-44`): **no 16B branch
+    at all**. i128 currently falls through to a single 16B INT part —
+    malformed because no GPR can hold it. Long double is 80-bit x87
+    with 16B alignment and needs a target-specific class entirely.
+
+  The wide16 clause in `api_arg_storage_must_be_addr` hides both bugs
+  by always routing wide16 through a memory image.
+
+  **Consequence**: if Phase 2 drops the wide16 clause before fixing the
+  SysV-x64 and AAPCS64 classifiers, native codegen breaks. The new
+  `abi_arg_storage_shape` would compute VALUE for a malformed single-part
+  DIRECT (one part, `src_offset==0, size==type_size==16`), but no Operand
+  kind can hold a 16B value.
+
+- **HFA / HVA in AAPCS64**: the existing classifier explicitly defers
+  HFA refinement (see comment at `src/abi/abi_aapcs64.c:9` and `:68-69`).
+  Small aggregates today classify uniformly as DIRECT/INT-parts. Wide16
+  classification (i128) does not collide with HFA logic because the two
+  enter `classify_one` through disjoint type kinds (RECORD vs scalar).
+  Confirmed safe.
+
+- **`tail` interaction**: the tail-call path
+  (`src/api/cg.c:6497-6498`) calls `api_regalloc_finish` before
+  `T->call`, which can mutate live storage state. The storage-shape
+  helper is queried per-arg during pre-call packaging, before this
+  finish call, so the decision sequencing is unchanged. No additional
+  risk.
+
+## Estimated Size
+
+- Phase 0 (done): Prep A (~25 LOC) + Prep B (~95 LOC of helpers, ~120 LOC
+  of duplicate body deleted from `cfree_cg_call` / `api_call_symbol_common`).
+- Phase 1: ~20 LOC for `abi_arg_storage_shape` + rewriting three function
+  bodies in `cg.c` (signature changes to thread `ABIArgInfo*` + size into
+  the helpers).
+- Optional Prep C (extract `api_pack_ret_value` from `cfree_cg_ret`): ~20 LOC.
+- Phase 2a (i128 classification fix): ~30 LOC in `abi_sysv_x64.c` +
+  removing the i128 path from the wide16 clause. ~50 LOC total.
+- Phase 2b (long-double, optional / deferable): not required for the C
+  backend. Treat as separate work.
+- Phase 3 (negative-control fixture): one ~150 LOC test file.
+- Total remaining for C-backend prerequisite: ~250 LOC, no public API change.
diff --git a/src/api/cg.c b/src/api/cg.c
@@ -1314,6 +1314,16 @@ static int api_is_wide16_scalar_type(Compiler *c, CfreeCgTypeId ty) {
   return api_is_f128_type(c, ty) || api_is_i128_type(c, ty);
 }
 
+/* Whether a CGABIValue.storage for `ty` must be an address operand (pointing
+ * to a memory image of the value) rather than a value operand. Today this is
+ * driven by the type shape — aggregates and wide16 scalars cannot fit in a
+ * single Operand. A future refactor will key this off ABIArgInfo so a
+ * trivial-DIRECT ABI (e.g. for a C-source backend) can keep aggregates as
+ * value operands. See doc/CBACKEND.md. */
+static int api_arg_storage_must_be_addr(Compiler *c, CfreeCgTypeId ty) {
+  return cg_type_is_aggregate(c, ty) || api_is_wide16_scalar_type(c, ty);
+}
+
 static Operand api_op_imm(i64 v, CfreeCgTypeId ty) {
   Operand o;
   memset(&o, 0, sizeof o);
@@ -2116,7 +2126,7 @@ static void api_release_arg_storage(CfreeCg *g, Operand *storage) {
     api_free_reg(g, storage->v.reg, storage->cls);
   } else if (storage->kind == OPK_LOCAL && storage->cls < 3) {
     CfreeCgTypeId ty = storage->type;
-    if (cg_type_is_aggregate(g->c, ty) || api_is_wide16_scalar_type(g->c, ty))
+    if (api_arg_storage_must_be_addr(g->c, ty))
       return;
     api_return_spill_slot(g, storage->v.frame_slot, storage->cls);
   } else if (storage->kind == OPK_INDIRECT) {
@@ -6342,6 +6352,93 @@ void cfree_cg_field(CfreeCg *g, uint32_t field_index) {
  * Calls / return
  * ============================================================ */
 
+/* Shared scaffolding for cfree_cg_call / cfree_cg_call_symbol. The two
+ * public entry points differ only in how the callee is obtained and in
+ * their pre-call stack-depth check; everything else (arg packaging, return
+ * storage allocation, post-call release, result push) is identical. These
+ * helpers carry the common shape and are the natural targets for any future
+ * change that wants to vary call-shape policy (e.g. an ABI-driven storage
+ * decision). */
+
+static CGABIValue *api_alloc_call_args(CfreeCg *g, u32 nargs) {
+  CGABIValue *avs = NULL;
+  if (nargs) {
+    avs = arena_array(g->c->tu, CGABIValue, nargs);
+    memset(avs, 0, sizeof(CGABIValue) * nargs);
+  }
+  g->avs_in_flight = avs;
+  g->avs_in_flight_n = nargs;
+  return avs;
+}
+
+static void api_pack_call_arg(CfreeCg *g, CGABIValue *av, CfreeCgTypeId fty,
+                              const ABIFuncInfo *abi, u32 idx) {
+  ApiSValue arg = api_pop(g);
+  int is_vararg = (idx >= abi->nparams);
+  CfreeCgTypeId aty = is_vararg
+                          ? (arg.type ? arg.type : api_sv_type(&arg))
+                          : cg_type_func_param_id(g->c, fty, idx);
+  if (!aty)
+    aty = arg.type;
+
+  av->type = aty;
+  av->abi = is_vararg ? NULL : &abi->params[idx];
+
+  if (api_is_wide16_scalar_type(g->c, aty)) {
+    ApiSValue lv = api_wide16_materialize_lvalue(g, &arg, aty);
+    av->storage = lv.op;
+    av->storage.type = aty;
+    av->size = 16;
+  } else if (cg_type_is_aggregate(g->c, aty)) {
+    api_ensure_reg(g, &arg);
+    Operand st = arg.op;
+    st.type = aty;
+    av->storage = st;
+    av->size = abi_cg_sizeof(g->c->abi, aty);
+  } else {
+    api_ensure_reg(g, &arg);
+    av->storage = (api_is_lvalue_sv(&arg) || arg.op.kind == OPK_GLOBAL)
+                      ? api_force_reg(g, &arg, aty)
+                      : arg.op;
+  }
+}
+
+static void api_alloc_call_ret_storage(CfreeCg *g, CGTarget *T,
+                                       CfreeCgTypeId ret_ty, Operand *out) {
+  if (api_arg_storage_must_be_addr(g->c, ret_ty)) {
+    FrameSlotDesc fsd;
+    memset(&fsd, 0, sizeof fsd);
+    fsd.type = ret_ty;
+    fsd.size = abi_cg_sizeof(g->c->abi, ret_ty);
+    fsd.align = abi_cg_alignof(g->c->abi, ret_ty);
+    fsd.kind = FS_LOCAL;
+    fsd.flags = FSF_ADDR_TAKEN;
+    FrameSlot slot = T->frame_slot(T, &fsd);
+    *out = api_op_local(slot, ret_ty);
+  } else {
+    Reg r = api_alloc_reg_or_spill(g, api_type_class(ret_ty), ret_ty);
+    *out = api_op_reg(r, ret_ty);
+  }
+}
+
+static void api_release_call_args(CfreeCg *g, CGABIValue *avs, u32 nargs) {
+  for (u32 i = 0; i < nargs; ++i) {
+    api_release_arg_storage(g, &avs[i].storage);
+  }
+  g->avs_in_flight = NULL;
+  g->avs_in_flight_n = 0;
+}
+
+static void api_push_call_result(CfreeCg *g, Operand ret_storage,
+                                 CfreeCgTypeId ret_ty) {
+  if (ret_storage.kind == OPK_LOCAL || ret_storage.kind == OPK_GLOBAL ||
+      ret_storage.kind == OPK_INDIRECT) {
+    api_push(g, api_make_lv(ret_storage, ret_ty));
+  } else {
+    api_push(g, api_make_sv(ret_storage, ret_ty));
+  }
+}
+
 void cfree_cg_call(CfreeCg *g, uint32_t nargs, CfreeCgTypeId fn_type,
                    CfreeCgCallAttrs attrs) {
   CGTarget *T;
@@ -6371,48 +6468,10 @@ void cfree_cg_call(CfreeCg *g, uint32_t nargs, CfreeCgTypeId fn_type,
     return;
   }
 
-  avs = NULL;
-  if (nargs) {
-    avs = arena_array(g->c->tu, CGABIValue, nargs);
-    memset(avs, 0, sizeof(CGABIValue) * nargs);
-  }
-
-  g->avs_in_flight = avs;
-  g->avs_in_flight_n = nargs;
-
+  avs = api_alloc_call_args(g, nargs);
   for (u32 i = 0; i < nargs; ++i) {
     u32 idx = nargs - 1u - i;
-    ApiSValue arg = api_pop(g);
-    int is_vararg = (idx >= abi->nparams);
-    CfreeCgTypeId aty;
-    if (is_vararg) {
-      aty = arg.type ? arg.type : api_sv_type(&arg);
-    } else {
-      aty = cg_type_func_param_id(g->c, fty, idx);
-      if (!aty)
-        aty = arg.type;
-    }
-    avs[idx].type = aty;
-    avs[idx].abi = is_vararg ? NULL : &abi->params[idx];
-    int is_aggregate = cg_type_is_aggregate(g->c, aty);
-    if (api_is_wide16_scalar_type(g->c, aty)) {
-      ApiSValue lv = api_wide16_materialize_lvalue(g, &arg, aty);
-      avs[idx].storage = lv.op;
-      avs[idx].storage.type = aty;
-      avs[idx].size = 16;
-    } else if (is_aggregate) {
-      api_ensure_reg(g, &arg);
-      Operand st = arg.op;
-      st.type = aty;
-      avs[idx].storage = st;
-      avs[idx].size = abi_cg_sizeof(g->c->abi, aty);
-    } else {
-      api_ensure_reg(g, &arg);
-      avs[idx].storage =
-          (api_is_lvalue_sv(&arg) || arg.op.kind == OPK_GLOBAL)
-              ? api_force_reg(g, &arg, aty)
-              : arg.op;
-    }
+    api_pack_call_arg(g, &avs[idx], fty, abi, idx);
   }
 
   callee = api_pop(g);
@@ -6432,22 +6491,7 @@ void cfree_cg_call(CfreeCg *g, uint32_t nargs, CfreeCgTypeId fn_type,
   desc.ret.abi = &abi->ret;
 
   if (has_result) {
-    int ret_is_aggregate = cg_type_is_aggregate(g->c, ret_ty);
-    if (ret_is_aggregate || api_is_wide16_scalar_type(g->c, ret_ty)) {
-      FrameSlotDesc fsd;
-      memset(&fsd, 0, sizeof fsd);
-      fsd.type = ret_ty;
-      fsd.size = abi_cg_sizeof(g->c->abi, ret_ty);
-      fsd.align = abi_cg_alignof(g->c->abi, ret_ty);
-      fsd.kind = FS_LOCAL;
-      if (ret_is_aggregate || api_is_wide16_scalar_type(g->c, ret_ty))
-        fsd.flags = FSF_ADDR_TAKEN;
-      FrameSlot ret_slot = T->frame_slot(T, &fsd);
-      desc.ret.storage = api_op_local(ret_slot, ret_ty);
-    } else {
-      Reg r = api_alloc_reg_or_spill(g, api_type_class(ret_ty), ret_ty);
-      desc.ret.storage = api_op_reg(r, ret_ty);
-    }
+    api_alloc_call_ret_storage(g, T, ret_ty, &desc.ret.storage);
   } else {
     desc.ret.storage = api_op_imm(0, builtin_id(CFREE_CG_BUILTIN_VOID));
   }
@@ -6456,24 +6500,14 @@ void cfree_cg_call(CfreeCg *g, uint32_t nargs, CfreeCgTypeId fn_type,
     api_regalloc_finish(g);
   T->call(T, &desc);
 
-  for (u32 i = 0; i < nargs; ++i) {
-    api_release_arg_storage(g, &avs[i].storage);
-  }
-  g->avs_in_flight = NULL;
-  g->avs_in_flight_n = 0;
+  api_release_call_args(g, avs, nargs);
 
   if (callee.op.kind != OPK_GLOBAL) {
     api_free_reg(g, callee_op.v.reg, RC_INT);
   }
 
   if (has_result) {
-    if (desc.ret.storage.kind == OPK_LOCAL ||
-        desc.ret.storage.kind == OPK_GLOBAL ||
-        desc.ret.storage.kind == OPK_INDIRECT) {
-      api_push(g, api_make_lv(desc.ret.storage, ret_ty));
-    } else {
-      api_push(g, api_make_sv(desc.ret.storage, ret_ty));
-    }
+    api_push_call_result(g, desc.ret.storage, ret_ty);
   }
 }
 
@@ -6565,42 +6599,10 @@ static void api_call_symbol_common(CfreeCg *g, CfreeCgSym sym, uint32_t nargs,
     compiler_panic(g->c, g->cur_loc, "CfreeCg: call stack underflow");
     return;
   }
-  avs = NULL;
-  if (nargs) {
-    avs = arena_array(g->c->tu, CGABIValue, nargs);
-    memset(avs, 0, sizeof(CGABIValue) * nargs);
-  }
-  g->avs_in_flight = avs;
-  g->avs_in_flight_n = nargs;
+  avs = api_alloc_call_args(g, nargs);
   for (u32 i = 0; i < nargs; ++i) {
     u32 idx = nargs - 1u - i;
-    ApiSValue arg = api_pop(g);
-    int is_vararg = (idx >= abi->nparams);
-    CfreeCgTypeId aty;
-    aty = is_vararg ? (arg.type ? arg.type : api_sv_type(&arg))
-                    : cg_type_func_param_id(g->c, fty, idx);
-    if (!aty)
-      aty = arg.type;
-    avs[idx].type = aty;
-    avs[idx].abi = is_vararg ? NULL : &abi->params[idx];
-    if (api_is_wide16_scalar_type(g->c, aty)) {
-      ApiSValue lv = api_wide16_materialize_lvalue(g, &arg, aty);
-      avs[idx].storage = lv.op;
-      avs[idx].storage.type = aty;
-      avs[idx].size = 16;
-    } else if (cg_type_is_aggregate(g->c, aty)) {
-      api_ensure_reg(g, &arg);
-      Operand st = arg.op;
-      st.type = aty;
-      avs[idx].storage = st;
-      avs[idx].size = abi_cg_sizeof(g->c->abi, aty);
-    } else {
-      api_ensure_reg(g, &arg);
-      avs[idx].storage =
-          (api_is_lvalue_sv(&arg) || arg.op.kind == OPK_GLOBAL)
-              ? api_force_reg(g, &arg, aty)
-              : arg.op;
-    }
+    api_pack_call_arg(g, &avs[idx], fty, abi, idx);
   }
   callee_op = api_op_global((ObjSymId)sym, 0, cg_type_ptr_to(g->c, fty));
   memset(&desc, 0, sizeof desc);
@@ -6613,41 +6615,16 @@ static void api_call_symbol_common(CfreeCg *g, CfreeCgSym sym, uint32_t nargs,
   desc.ret.type = ret_ty;
   desc.ret.abi = &abi->ret;
   if (has_result) {
-    if (cg_type_is_aggregate(g->c, ret_ty) ||
-        api_is_wide16_scalar_type(g->c, ret_ty)) {
-      FrameSlotDesc fsd;
-      FrameSlot ret_slot;
-      memset(&fsd, 0, sizeof fsd);
-      fsd.type = ret_ty;
-      fsd.size = abi_cg_sizeof(g->c->abi, ret_ty);
-      fsd.align = abi_cg_alignof(g->c->abi, ret_ty);
-      fsd.kind = FS_LOCAL;
-      fsd.flags = FSF_ADDR_TAKEN;
-      ret_slot = T->frame_slot(T, &fsd);
-      desc.ret.storage = api_op_local(ret_slot, ret_ty);
-    } else {
-      Reg r = api_alloc_reg_or_spill(g, api_type_class(ret_ty), ret_ty);
-      desc.ret.storage = api_op_reg(r, ret_ty);
-    }
+    api_alloc_call_ret_storage(g, T, ret_ty, &desc.ret.storage);
   } else {
     desc.ret.storage = api_op_imm(0, builtin_id(CFREE_CG_BUILTIN_VOID));
   }
   if (tail)
     api_regalloc_finish(g);
   T->call(T, &desc);
-  for (u32 i = 0; i < nargs; ++i) {
-    api_release_arg_storage(g, &avs[i].storage);
-  }
-  g->avs_in_flight = NULL;
-  g->avs_in_flight_n = 0;
+  api_release_call_args(g, avs, nargs);
   if (has_result) {
-    if (desc.ret.storage.kind == OPK_LOCAL ||
-        desc.ret.storage.kind == OPK_GLOBAL ||
-        desc.ret.storage.kind == OPK_INDIRECT) {
-      api_push(g, api_make_lv(desc.ret.storage, ret_ty));
-    } else {
-      api_push(g, api_make_sv(desc.ret.storage, ret_ty));
-    }
+    api_push_call_result(g, desc.ret.storage, ret_ty);
   }
 }
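
Note on the planned Phase 1 helper: this commit only documents `abi_arg_storage_shape`; it does not implement it. The rule stated in doc/CBACKEND.md could be sketched roughly as below. This is an illustration under assumptions, not the project's code: the `ABIArgKind`, `ABIArgPart`, and `ABIArgInfo` definitions are pared-down hypothetical stand-ins (the real ones in `src/abi/abi.h` are not shown in this commit), the two-element `parts` array is an arbitrary limit chosen for the sketch, and the `example_*` constructors are invented for demonstration only.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t u32;

/* Hypothetical stand-ins for cfree's ABI types; field names follow the
 * rule text in doc/CBACKEND.md, the layout is assumed. */
typedef enum ABIArgKind { ABI_ARG_DIRECT, ABI_ARG_INDIRECT } ABIArgKind;

typedef struct ABIArgPart {
  u32 src_offset; /* byte offset of this part within the value */
  u32 size;       /* number of bytes this part covers */
} ABIArgPart;

typedef struct ABIArgInfo {
  ABIArgKind kind;
  u32 nparts;
  ABIArgPart parts[2]; /* assumption: two parts are enough for this sketch */
} ABIArgInfo;

typedef enum ABIStorageShape {
  ABI_STORAGE_VALUE, /* storage is the value itself */
  ABI_STORAGE_ADDR,  /* storage is the address of a memory image */
} ABIStorageShape;

/* ADDR for INDIRECT, or for any DIRECT that is not exactly one part
 * covering the whole type at offset zero; VALUE otherwise. */
ABIStorageShape abi_arg_storage_shape(const ABIArgInfo *ai, u32 type_size) {
  if (ai->kind == ABI_ARG_INDIRECT)
    return ABI_STORAGE_ADDR;
  if (ai->nparts > 1 || ai->parts[0].src_offset != 0 ||
      ai->parts[0].size != type_size)
    return ABI_STORAGE_ADDR;
  return ABI_STORAGE_VALUE;
}

/* Illustrative classifications (hypothetical helpers, not cfree API). */

/* One part, full coverage, zero offset: what a trivial C ABI would report. */
ABIArgInfo example_trivial_direct(u32 type_size) {
  ABIArgInfo ai = {ABI_ARG_DIRECT, 1, {{0, type_size}, {0, 0}}};
  return ai;
}

/* A 16B struct split into two 8B parts, as described for SysV-x64. */
ABIArgInfo example_two_part_direct(void) {
  ABIArgInfo ai = {ABI_ARG_DIRECT, 2, {{0, 8}, {8, 8}}};
  return ai;
}

/* A large aggregate passed by reference. */
ABIArgInfo example_indirect(void) {
  ABIArgInfo ai = {ABI_ARG_INDIRECT, 0, {{0, 0}, {0, 0}}};
  return ai;
}
```

Under these assumptions, `example_two_part_direct()` and `example_indirect()` both map to `ABI_STORAGE_ADDR`, matching the claim that native behavior is unchanged, while `example_trivial_direct(n)` maps to `ABI_STORAGE_VALUE`. The last case with `n == 16` is the malformed single-part i128 classification that the Risks section warns about, which is why the plan keeps the wide16 clause until the classifiers are fixed.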