commit f9d61faa02ff59f3d347478852d1959247a8e133
parent 37ad8f0e64ce819ad11ede23f6f2ff7e97a696fe
Author: Ryan Sepassi <rsepassi@gmail.com>
Date: Wed, 20 May 2026 09:59:57 -0700
c_target: land Phase 0/1 of the C-source backend
Adds src/arch/c_target/, a CGTarget that writes portable target-locked C
source instead of object bytes. Plumbed via CodeOptions.emit_c_source +
c_source_writer; selected in cfree_cg_new (session.c) so MCEmitter and
Debug are bypassed and opt is forced to 0. Driver gains --emit=c. C frontend
and toy/wasm source-emit paths both skip emit_object_bytes when the flag
is set, so non-C frontends drive the C target too.
Phase 1 emission covers func_begin/end, param (via frame slots), ret,
binop, load_imm, copy, load, store, call, set_loc, finalize, destroy,
plus minimal int/void/pointer types. Per-function decls and TU-wide
forwards/body buffers are heap-grown CBufs; a new optional CGTarget.alias
hook lets the C target emit __attribute__((alias)) (ELF/PE) or a thunk
fallback (Mach-O, where clang rejects the attribute).
Test surface: extend test/parse/, test/toy/, and test/wasm/ runners with
a new path C that runs cfree --emit=c, compiles the result with the host
cc, and runs it. Phased-rollout panics ("C target: ... not implemented |
not yet supported") report as SKIP rather than FAIL. A new make test-cbackend
target drives all three runners with CFREE_TEST_PATHS=C; it is left out of
the default test aggregate to avoid SKIP noise.
Diffstat:
16 files changed, 1859 insertions(+), 206 deletions(-)
diff --git a/doc/CBACKEND.md b/doc/CBACKEND.md
@@ -81,57 +81,20 @@ sizes and struct layouts match the downstream gcc invocation.
### Selection
-Add a new `CfreeObjFmt` variant or a `CodeOptions` flag — `emit_c_source`.
-When set, `cfree_cg_new` constructs the C `CGTarget` instead of dispatching
-through `arch_impl_*.cgtarget_new`. Concretely: branch in
-`src/arch/cgtarget.c:cgtarget_new` (currently the only call site) on the
-new flag and return a C-source `CGTarget`. The `MCEmitter` is still
-constructed (the CG holds a pointer) but receives no calls from the C target.
-
-The downstream driver workflow: `cfree --emit=c foo.c -o foo.c.cfree.c`, then
-the user runs `cc -O2 foo.c.cfree.c`. No object format coupling at the
-cfree boundary.
-
-## Why the prior plan's framing was wrong
-
-The previous version of this doc proposed an "ABI Storage-Shape Refactor":
-add a `ABIStorageShape` enum, make `api_arg_storage_must_be_addr` consult an
-ABI helper, then write a "trivial C ABI" vtable that classifies everything
-as `DIRECT/1-full-part`. Reasons that's the wrong tool:
-
-1. **The CG vtable surface is the real work, not the predicate.** Even if
- `api_pack_call_arg` produced a value-shaped storage for an aggregate, the
- C target still needs implementations for ~50 `CGTarget` methods. The
- predicate refactor would save *one* address-shaped path at the call site
- and gain nothing for the rest.
-
-2. **`Operand` cannot hold an aggregate by value.** `OpKind` is
- `IMM/REG/LOCAL/GLOBAL/INDIRECT`. None of those carry a struct value. So
- the proposed `ABI_STORAGE_VALUE` for a single-part DIRECT aggregate would
- yield a malformed `Operand` (e.g. `OPK_LOCAL` with frame-slot id but no
- actual register class for the struct). Native backends — which the prior
- plan promised would see byte-identical output — would actually break,
- because their `T->call` path reads `desc.args[i].storage.kind` and is not
- prepared for "REG that's actually 256 bytes wide".
-
-3. **For C output, the aggregate-via-address shape is fine.** Given an
- aggregate arg as `OPK_LOCAL slot_3` (address of a frame slot), the C
- target emits `f(*(struct T*)slot_3)` or, better, `f(slot_3)` where
- `slot_3` is already typed as `struct T` in the emitted code. No new
- Operand kind, no ABI invariant change, no native-backend regression risk.
-
-4. **The wide16 / SysV-x64 i128 discussion is unrelated.** GCC accepts
- `__int128` and `long double` natively. The C backend emits the source
- type, gcc does the rest. Fixing native ABI classifiers for i128 is real
- work but is not on the C-backend critical path.
-
-5. **The line numbers in the prior plan are dead.** The CG layer was split
- from `src/api/cg.c` (gone) into `src/cg/{call,value,memory,...}.c`. Every
- `src/api/cg.c:NNNN` reference in the old doc points to nothing. The "Prep
- A" / "Prep B" landings did happen but the helpers now live in
- `src/cg/call.c` and `src/cg/value.c`.
-
-So we discard the storage-shape framing and plan the actual work directly.
+`CfreeCodeOptions` gains two fields:
+
+- `bool emit_c_source` — opt into the C target.
+- `CfreeWriter* c_source_writer` — destination for emitted source.
+
+When `emit_c_source` is set, `cfree_cg_new` (in `src/cg/session.c`)
+constructs `c_cgtarget_new(c, ob, writer)` instead of dispatching through
+`cgtarget_new`. `MCEmitter` and `Debug` are skipped entirely in this mode
+(no machine bytes, no DWARF). `cgtarget_new` itself is unchanged: it
+doesn't see `CodeOptions`, and the branch is cleaner one level up.
+
+The downstream driver workflow: `cfree --emit=c foo.c -o foo.c.cfree.c`,
+then the user runs `cc -O2 foo.c.cfree.c`. No object format coupling at
+the cfree boundary.
## Architecture sketch
@@ -348,29 +311,46 @@ source-mapped debug info back to the original cfree input. The cfree
New files (additions only):
-- `src/arch/c_target/` — directory for the C `CGTarget` implementation.
- - `target.c` — vtable construction; `c_cgtarget_new()` entry point.
- - `emit.c` — per-method emission bodies.
- - `types.c` — type worklist and typedef ordering.
- - `data.c` — data-definition buffering and emission.
- - `names.c` — Reg/FrameSlot/Local/Label/Sym → C identifier mapping.
- - `internal.h` — local types and helpers.
+- `src/arch/c_target/` — the C `CGTarget` implementation.
+ - `target.c` — vtable construction; `c_cgtarget_new()` entry point;
+ panic stubs for unimplemented methods.
+ - `emit.c` — per-method emission bodies (includes name mapping; no
+ separate `names.c`).
+ - `cbuf.c` — heap-backed growable byte buffer for per-function
+ declarations and TU-wide body/forwards text.
+ - `internal.h` — `CTarget` struct and local prototypes.
-Existing files touched (minimal, additive):
+Type emission, data emission, and name mapping all live in `emit.c` for
+now — the `types.c`/`data.c`/`names.c` split anticipated in the original
+plan turned out to be overkill at Phase 1 size; revisit if a file grows
+unwieldy.
-- `src/arch/cgtarget.c` — branch on the C-output mode and call
- `c_cgtarget_new()` instead of dispatching through `ArchImpl.cgtarget_new`.
-- `include/cfree/compile.h` / `include/cfree/core.h` — add `emit_c_source`
- to `CodeOptions` (or add a `CFREE_OBJ_C_SOURCE` value to `CfreeObjFmt` —
- decision deferred to v0 scoping, but `CodeOptions` flag is the lighter
- change since the file is conceptually still a translation unit, just a
- different surface format).
-- `src/cg/session.c` — when the flag is set, force `opt_level = 0`.
-- `driver/cc/` — accept `--emit=c` and wire it through to `CodeOptions`.
+Existing files touched (minimal, additive):
-Nothing in `src/cg/*.c`, `src/abi/*`, `src/arch/{aa64,x64,rv64}/`,
-`src/arch/regalloc.c`, or `src/arch/mc.c` needs to change. The C backend is
-strictly additive.
+- `include/cfree/core.h` — `CodeOptions` gains `emit_c_source` and
+ `c_source_writer`.
+- `src/cg/session.c` — `cfree_cg_new` branches on `emit_c_source`: skips
+ `mc_new`/`debug_new`, calls `c_cgtarget_new`, forces `opt_level = 0`.
+- `src/cg/asm.c` — `cfree_cg_file_scope_asm` panics with a "C target: …
+ not yet supported" message when `g->mc == NULL`, so file-scope asm
+ surfaces as a phased-rollout SKIP in the harness rather than a
+ segfault.
+- `src/api/compile.c` — both `cfree_compile_c_obj_emit` and
+ `cfree_compile_source_obj_emit` skip `emit_object_bytes` when
+ `emit_c_source` is set (the C target already wrote source to the
+ writer). The skip in `source_obj_emit` is what lets non-C frontends
+ (toy, wasm) drive the C target too.
+- `src/arch/arch.h` / `src/cg/session.c` — new optional `CGTarget.alias`
+ hook. Native backends leave it NULL; the C target implements it to
+ emit `__attribute__((alias))` (ELF/PE) or a thunk (Mach-O), because
+ the alias relationship doesn't survive serialization to text the way
+ it does in an obj-level `(section, value)` pair.
+- `driver/cc.c` — accepts `--emit=c`, sets `compile_only`, routes the
+ output writer to `copts.code.c_source_writer`.
+
+Nothing in `src/abi/*`, `src/arch/{aa64,x64,rv64}/`,
+`src/arch/regalloc.c`, `src/arch/mc.c`, or `src/arch/cgtarget.c` needs to
+change. The C backend is strictly additive at the CGTarget seam.
## Things to **not** do
@@ -386,57 +366,75 @@ strictly additive.
## Test surface
-Add a new test directory `test/cbackend/`. Tests compile a cfree CG fixture
-(or run the C frontend on a `.c` corpus from `test/parse/` and similar),
-emit C source via the new target, then compile that C source with the host
-`cc` and run the resulting binary, asserting the same exit code or stdout
-as a reference run.
-
-Test tiers, in priority order:
-
-1. **v0 sanity** — one fixture per CG primitive family (int arith, fp arith,
- load/store, branches, switch, calls, returns, const data, scalar params).
- Pass criterion: emitted C compiles with `cc -Werror -std=c11` and
- produces the expected exit code.
-2. **Coverage** — aggregates by value, sret, varargs, bitfields,
- computed-goto, inline asm, atomics, TLS, weak/visibility, alloca,
- intrinsics (overflow, trap, popcount, etc.), setjmp/longjmp,
- wide16 (i128, long double, f128). Each its own fixture file.
-3. **Frontend integration** — run the existing `test/toy/` and `test/parse/`
- corpora through the C backend and require the resulting binary to match
- the native-backend binary's behavior.
-4. **Self-hosting smoke** — eventually compile libcfree through libcfree-via-C
- and check that the bootstrapped artifact still passes its test suite. This
- is a separate effort; flagged here only to note the long-term shape.
+Instead of a standalone `test/cbackend/`, we added a new path `C` to the
+existing `test/parse/`, `test/toy/`, and `test/wasm/` runners. Together the
+frontends show the CGTarget seam is frontend-agnostic. The aggregate `make
+test-cbackend` invokes all three runners with `CFREE_TEST_PATHS=C`.
+
+Per case, path `C` runs:
+1. `cfree cc --emit=c <src> -o <work>/<name>.cfree.c` (parse uses the
+ `parse-runner --emit-c` harness with the cross-target overridden to
+ the host's obj format; toy uses the driver directly).
+2. Host `cc` compiles the emitted source (`-std=gnu99
+ -Wno-main-return-type` for toy because `fn main(): i64` emits as
+ `int64_t main(void)`; parse uses `-Werror -std=c11` because its
+ wrapper provides `int main()`).
+3. Native exec; exit code compared against `<name>.expected`.
+
+Phased-rollout handling: stderr matching `"C target: … not
+implemented|not yet supported"` is reported as SKIP rather than FAIL, so
+the harness signal reflects the implemented surface. Cases that need a
+later phase for non-panic reasons (e.g. require multi-TU LTO) can opt out
+of path `C` only via a `<name>.cbackend.skip` sidecar without affecting
+the other paths.
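The SKIP rule above can be sketched as a small stderr classifier. This is a minimal illustration, assuming the harness has already captured the compiler's stderr as a string; `classify_c_path_stderr` and the `CaseResult` enum are hypothetical names and are not taken from the actual runners, which may implement the match differently (for example in shell).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result codes for one failed test case on path C. */
typedef enum { CASE_FAIL = 0, CASE_SKIP = 1 } CaseResult;

/* Classify a failed `cfree --emit=c` run from its captured stderr.
 * A phased-rollout panic contains "C target: " followed somewhere later
 * by "not implemented" or "not yet supported"; anything else, including
 * other "C target: " panics such as out-of-memory, is a real failure. */
CaseResult classify_c_path_stderr(const char* stderr_text) {
  assert(stderr_text != NULL);
  const char* p = strstr(stderr_text, "C target: ");
  if (!p) return CASE_FAIL;
  if (strstr(p, "not implemented") || strstr(p, "not yet supported"))
    return CASE_SKIP;
  return CASE_FAIL;
}
```

Matching on the `C target: ` prefix as well as the phrase keeps unrelated diagnostics that happen to say "not implemented" from being silently skipped.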
+
+Path `C` requires `is_native_target == 1` (the emitted C is target-locked
+and must be compiled by a host cc with a matching triple).
+
+Future tiers (still TODO):
+
+- **Self-hosting smoke** — compile libcfree through libcfree-via-C and
+ check that the bootstrapped artifact still passes its test suite.
+ Separate effort; flagged here only to note the long-term shape.
## Phasing
-### Phase 0 — scaffolding
-
-- Add `CodeOptions.emit_c_source`.
-- Branch in `cgtarget_new` and `session.c`.
-- Stub `c_cgtarget_new` returning a vtable where every method is
- `compiler_panic("C target: <method> not implemented")`.
-- Wire `--emit=c` in `driver/cc/`.
-- Acceptance: `cfree --emit=c empty.c -o /tmp/x.c` panics with a *specific*
- unimplemented-method message (not a crash, not silent success).
-
-### Phase 1 — minimal viable: scalar arithmetic and calls
-
-Implement, in roughly this order, only the methods needed for:
-
-```c
-int add(int a, int b) { return a + b; }
-int main(void) { return add(2, 3); }
-```
-
-- `func_begin`/`func_end`, `param`, `ret`, `binop`, `load_imm`, `copy`,
- `call`, plus minimal type emission for `int`/`void`.
-- Identifier-mapping helper (Reg, FrameSlot, Label, Sym → C name).
-- Writer plumbing (the C target owns a `CfreeWriter` set at construction).
-
-Acceptance: the example above round-trips through the C target, `cc` it,
-run it, exit code 5.
+### Phase 0 — scaffolding ✅ landed
+
+- `CodeOptions.{emit_c_source, c_source_writer}` added.
+- `cfree_cg_new` branches on the flag; `opt_level` forced to 0.
+- `c_cgtarget_new` returns a vtable where unimplemented methods
+ `compiler_panic("C target: method <name> not implemented")`.
+- `--emit=c` wired through `driver/cc.c`.
+- Acceptance met: an empty C source emits a clean prologue;
+ unimplemented methods surface their name in the panic message.
+
+### Phase 1 — minimal viable: scalar arithmetic and calls ✅ landed
+
+Implemented: `func_begin`/`func_end`, `param`, `frame_slot`, `ret`,
+`binop`, `load_imm`, `copy`, `load`, `store`, `call`, `set_loc`,
+`finalize`, `destroy`, plus minimal int/void/pointer type emission.
+
+Other implementation choices that landed in Phase 1:
+
+- **Params land in frame slots, not regs.** `c_param` allocates a fresh
+ frame slot via `c_frame_slot`, declares `T slot_N;`, and emits
+ `slot_N = pN;` at function entry; CG then references the param as
+ `OPK_LOCAL`. Simpler than coordinating fresh reg ids with the regalloc
+ from outside CG.
+- **Per-TU forwards buffer.** `c_func_begin` and `c_call` both register
+ forward declarations via `c_ensure_forward_decl` (dedup'd by
+ `ObjSymId`). This handles out-of-order callees and references to
+ external symbols.
+- **`alias` hook on CGTarget.** `c_alias` emits
+ `__attribute__((alias("target")))` on ELF/PE and a wrapper thunk on
+ Mach-O (clang on Darwin rejects the attribute outright).
+- **Mach-O linker-symbol underscore.** `c_sym_name` strips the leading
+ `_` on Mach-O so the emitted C uses the source-level name (the host
+ cc re-adds the underscore at link time).
+
+Acceptance met: `int add(int a,int b){return a+b;} int main(){return add(2,3);}`
+round-trips through the C target and exits 5.
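For orientation, the emitted source for the acceptance case might look roughly like the following. This is a hand-written approximation based on the choices listed above (TU-wide forward declarations, `pN` parameters copied into `slot_N` frame slots, lazily declared `vN` registers, typed immediates), not captured compiler output. The exact statement forms are assumptions, and `main` is renamed to the hypothetical `run_example` so the snippet can be embedded in a test harness.

```c
/* Approximation of --emit=c output; not captured from the compiler. */
#include <stdint.h>

/* TU-wide forward declarations, one per symbol. */
int32_t add(int32_t p0, int32_t p1);
int32_t run_example(void);

int32_t add(int32_t p0, int32_t p1) {
  /* Declarations are spliced in right after the opening brace. */
  int32_t slot_0;
  int32_t slot_1;
  int32_t v0;
  int32_t v1;
  int32_t v2;
  /* Params are copied into frame slots at function entry. */
  slot_0 = p0;
  slot_1 = p1;
  v0 = slot_0;
  v1 = slot_1;
  v2 = v0 + v1;
  return v2;
}

int32_t run_example(void) {
  int32_t v0;
  /* Immediates carry an explicit cast to their operand type. */
  v0 = add(((int32_t)2), ((int32_t)3));
  return v0;
}
```

The redundant copies are expected: `opt_level` is forced to 0, and cleaning them up is left to the downstream `cc -O2`.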
### Phase 2 — control flow and memory
@@ -494,26 +492,25 @@ modifications to CG, ABI, regalloc, or existing arch backends.
## Open questions
-- **Aggregate returns in C source**: should the C target emit
- `slot_R = f(args)` for aggregate returns (relying on gcc to handle the
- ABI) or pre-lower to `f_into_buf(&slot_R, args)`? The first is simpler
- and is what gcc would do anyway. Default to the simple form; revisit if
- it produces bad codegen.
-- **Output format flag location**: `CodeOptions.emit_c_source` (boolean)
- vs `CFREE_OBJ_C_SOURCE` (enum extension). The latter forces every code
- path that switches on `obj_fmt` to know about C-source; the former is a
- narrower addition. Lean toward the boolean.
-- **Multi-TU emission**: cfree compiles one TU at a time today. The C
- backend follows the same model — one `.c` source out per `.c` source in.
- Cross-TU LTO is gcc's job downstream.
+- **Output format flag location**: resolved — `CodeOptions.emit_c_source`
+ (bool) plus `c_source_writer` (CfreeWriter*). A `CFREE_OBJ_C_SOURCE`
+ enum extension would have forced every `obj_fmt` switch to handle
+ it; the boolean keeps the change additive.
+- **Aggregate returns in C source**: still pending Phase 3. Plan: emit
+ `slot_R = f(args)` and let gcc handle the ABI. Revisit if codegen is
+ poor.
+- **Multi-TU emission**: one `.c` source out per `.c` source in. Cross-TU
+ LTO is gcc's job downstream.
- **Floating-point reproducibility**: cfree's FP-flag enum
- (`CfreeCgFpFlag.REASSOC`/`APPROX`/…) maps to gcc's
- `-ffast-math`-style behavior, but per-operation. C doesn't have a per-op
- syntax for these. Options: ignore the flags (correct but pessimistic),
- wrap in `#pragma STDC FP_CONTRACT off` blocks, or emit
- `__attribute__((optimize(...)))` on the enclosing function when any
- flag fires. Probably ignore in v0/v1, document the gap.
-- **i128 division/modulo and f128 ops** are emitted by cfree CG today via
- calls to `__divti3` / `__multf3` / `__addtf3` etc. The C target can
- prefer native `__int128` operators and `__float128`/`long double` so
- gcc inlines them. Detail to validate in Phase 1.
+ (`CfreeCgFpFlag.REASSOC`/`APPROX`/…) maps to gcc's `-ffast-math`-style
+ behavior, but per-operation. C doesn't have a per-op syntax. Options:
+ ignore the flags (correct but pessimistic), wrap in `#pragma STDC
+ FP_CONTRACT off`, or emit `__attribute__((optimize(...)))` on the
+ enclosing function. Probably ignore in v0/v1, document the gap.
+- **i128 division/modulo and f128 ops** today get lowered to `__divti3` /
+ `__multf3` / `__addtf3` calls. The C target could prefer native
+ `__int128` / `long double` so gcc inlines them. Defer to Phase 3 along
+ with the rest of wide-scalar work.
+- **Mach-O aliases lose `&alias == &target` identity** because we emit a
+ thunk. No fixture depends on this; document the gap and revisit only
+ if a real consumer hits it.
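The two alias strategies, and the identity gap noted in the last bullet, can be illustrated with a hand-written sketch. It assumes a GCC- or Clang-compatible host compiler and uses `__APPLE__` as a stand-in for "Mach-O target"; the symbol names `target_fn` and `alias_fn` are hypothetical, and the real `c_alias` output may differ in detail.

```c
#include <stdint.h>

int32_t target_fn(int32_t p0) { return p0 + 1; }

#if defined(__APPLE__)
/* Mach-O: clang rejects __attribute__((alias)), so fall back to a thunk.
 * The thunk is a distinct function, so &alias_fn != &target_fn. */
int32_t alias_fn(int32_t p0) { return target_fn(p0); }
#else
/* ELF/PE: both names resolve to the same address. */
int32_t alias_fn(int32_t p0) __attribute__((alias("target_fn")));
#endif
```

Calls through either name behave the same on both paths; only code that compares the two function addresses can observe the difference.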
diff --git a/driver/cc.c b/driver/cc.c
@@ -98,6 +98,7 @@ typedef struct CcOptions {
int compile_only; /* -c */
int preprocess_only; /* -E */
int dump_tokens; /* --dump-tokens */
+ int emit_c_source; /* --emit=c */
int opt_level; /* -O0/-O1/-O2 */
int debug_info; /* -g */
int warnings_are_errors; /* -Werror */
@@ -829,6 +830,13 @@ static int cc_parse(int argc, char** argv, CcOptions* o) {
o->dump_tokens = 1;
continue;
}
+ if (driver_streq(a, "--emit=c")) {
+ /* C-source output instead of object bytes. Forces -c-style single-input
+ * compile (no link). See doc/CBACKEND.md. */
+ o->emit_c_source = 1;
+ o->compile_only = 1;
+ continue;
+ }
if (driver_streq(a, "-g")) {
o->debug_info = 1;
continue;
@@ -1738,6 +1746,7 @@ static void cc_fill_c_opts(const CcOptions* o, const CfreePreprocessOptions* pp,
* direct codegen until the optimizer path is robust for large C libraries. */
copts->code.opt_level = 0;
copts->code.debug_info = o->debug_info;
+ copts->code.emit_c_source = o->emit_c_source ? true : false;
copts->code.epoch = o->epoch;
copts->code.path_map = o->npath_map ? o->path_map : NULL;
copts->code.npath_map = o->npath_map;
@@ -1818,6 +1827,12 @@ static int cc_run_compile_one(DriverEnv* env, const CcOptions* o,
}
cc_fill_c_opts(o, pp, &copts);
+ if (copts.code.emit_c_source) {
+ /* --emit=c routes the output writer to the C-source CGTarget instead of
+ * the object emitter. The downstream `cfree_compile_*_emit` path will
+ * skip the object-serialize step when this is set. */
+ copts.code.c_source_writer = obj_w;
+ }
{
CfreeLanguage lang =
is_memory ? o->source_memory[index].lang : o->source_langs[index];
diff --git a/include/cfree/core.h b/include/cfree/core.h
@@ -134,9 +134,17 @@ typedef struct CfreePathPrefixMap {
typedef struct CfreeCodeOptions {
int opt_level; /* 0 direct, 1 minimal, 2 full */
bool debug_info; /* emit source/debug records when supported */
+ /* When set, CG emits portable C source instead of machine-code bytes.
+ * The TU is still target-locked: the emitted source uses the configured
+ * triple's struct layouts and pointer width. Forces opt_level=0.
+ * Output is written to c_source_writer (set on the emit-side compile
+ * entry point); object emission is bypassed. See doc/CBACKEND.md. */
+ bool emit_c_source;
uint64_t epoch; /* reproducible timestamp seed; 0 means no timestamp */
const CfreePathPrefixMap *path_map;
uint32_t npath_map;
+ /* Destination for emit_c_source mode. Ignored when emit_c_source is false. */
+ struct CfreeWriter *c_source_writer;
} CfreeCodeOptions;
typedef struct CfreeHeap CfreeHeap;
diff --git a/src/api/compile.c b/src/api/compile.c
@@ -228,7 +228,11 @@ CfreeStatus cfree_compile_c_obj_emit(CfreeCompiler* c,
metrics_scope_begin(c, "compile.tu");
metrics_count(c, "compile.input_bytes", (u64)input->len);
compile_c_into(c, opts, input, ob);
- emit_object_bytes(c, ob, out);
+ /* In emit_c_source mode the CGTarget wrote portable C source to
+ * opts->code.c_source_writer (the same writer `out` points at); the
+ * object builder still got symbols and decls but no machine code. Skip
+ * object serialization in that case. */
+ if (!opts->code.emit_c_source) emit_object_bytes(c, ob, out);
metrics_scope_end(c, "compile.tu");
obj_free(ob);
compiler_panic_restore(c, &saved);
@@ -361,7 +365,10 @@ CfreeStatus cfree_compile_source_obj_emit(
metrics_scope_begin(c, "compile.tu");
metrics_count(c, "compile.input_bytes", (u64)input->bytes.len);
compile_source_into(c, opts, input, ob);
- emit_object_bytes(c, ob, out);
+ /* See cfree_compile_c_obj_emit: in emit_c_source mode the CGTarget wrote
+ * portable C source to opts->code.c_source_writer (same destination as
+ * `out`); skip object serialization. */
+ if (!opts->code.emit_c_source) emit_object_bytes(c, ob, out);
metrics_scope_end(c, "compile.tu");
obj_free(ob);
compiler_panic_restore(c, &saved);
diff --git a/src/arch/arch.h b/src/arch/arch.h
@@ -610,6 +610,19 @@ struct CGTarget {
void (*func_begin)(CGTarget*, const CGFuncDesc*);
void (*func_end)(CGTarget*);
+ /* Symbol-aliasing hook. Optional (may be NULL). cg invokes this from
+ * cfree_cg_alias after the obj symbol-table mirror is wired so the
+ * backend can emit any out-of-band representation it needs — e.g. the
+ * C-source target writes
+ * `T alias_sym(...) __attribute__((alias("target")));`
+ * because the alias relationship isn't expressible by sharing a
+ * (section, value) pair the way a relocatable object can. Native
+ * machine-code backends don't need this hook because obj_symbol_define
+ * already aliases the bytes. `type` is the alias's CG type (function
+ * or object), needed by the C target to render the prototype. */
+ void (*alias)(CGTarget*, ObjSymId alias_sym, ObjSymId target_sym,
+ CfreeCgTypeId type);
+
/* Optional fast path for optimized emitters that know all frame slots and
* outgoing call area needs before body emission. `out_slots`, when non-NULL,
* has `frame->nslots` entries and receives target FrameSlot ids in order. */
diff --git a/src/arch/c_target/cbuf.c b/src/arch/c_target/cbuf.c
@@ -0,0 +1,79 @@
+/* Heap-backed growable byte buffer used by the C target for per-function
+ * declarations and body text. The TU writer is bytewise but CG hands us
+ * declarations and body emission interleaved; we accumulate both and flush
+ * at func_end. */
+
+#include "arch/c_target/internal.h"
+
+#include "core/heap.h"
+
+enum { CBUF_MIN_CAP = 256 };
+
+void cbuf_init(CBuf* b, Heap* h) {
+ b->heap = h;
+ b->data = NULL;
+ b->len = 0;
+ b->cap = 0;
+}
+
+void cbuf_fini(CBuf* b) {
+ if (b->data) b->heap->free(b->heap, b->data, b->cap);
+ b->data = NULL;
+ b->len = 0;
+ b->cap = 0;
+}
+
+void cbuf_reset(CBuf* b) { b->len = 0; }
+
+static void cbuf_grow(CBuf* b, size_t want) {
+ size_t newcap = b->cap ? b->cap : CBUF_MIN_CAP;
+ while (newcap < want) newcap *= 2;
+ u8* nd = (u8*)b->heap->realloc(b->heap, b->data, b->cap, newcap, 1);
+ if (!nd) {
+ /* Out of memory: leave the buffer as-is. Writers re-check capacity, so
+ * output is truncated silently but there is no UB. */
+ return;
+ }
+ b->data = nd;
+ b->cap = newcap;
+}
+
+void cbuf_putc(CBuf* b, char c) {
+ if (b->len + 1 > b->cap) cbuf_grow(b, b->len + 1);
+ if (b->len < b->cap) b->data[b->len++] = (u8)c;
+}
+
+void cbuf_puts(CBuf* b, const char* s) {
+ while (*s) cbuf_putc(b, *s++);
+}
+
+void cbuf_putn(CBuf* b, const char* s, size_t n) {
+ if (b->len + n > b->cap) cbuf_grow(b, b->len + n);
+ for (size_t i = 0; i < n && b->len < b->cap; ++i) b->data[b->len++] = (u8)s[i];
+}
+
+void cbuf_put_u64(CBuf* b, u64 v) {
+ char tmp[24];
+ size_t n = 0;
+ if (v == 0) {
+ cbuf_putc(b, '0');
+ return;
+ }
+ while (v) {
+ tmp[n++] = (char)('0' + (v % 10));
+ v /= 10;
+ }
+ while (n--) cbuf_putc(b, tmp[n]);
+}
+
+void cbuf_put_i64(CBuf* b, i64 v) {
+ u64 u;
+ if (v < 0) {
+ cbuf_putc(b, '-');
+ /* careful with INT64_MIN */
+ u = (u64)(-(v + 1)) + 1u;
+ } else {
+ u = (u64)v;
+ }
+ cbuf_put_u64(b, u);
+}
diff --git a/src/arch/c_target/emit.c b/src/arch/c_target/emit.c
@@ -0,0 +1,698 @@
+/* Phase 1 C-source emission for the CGTarget vtable. See doc/CBACKEND.md.
+ *
+ * Output strategy
+ * ---------------
+ * Two CBufs are live while CG walks a function body:
+ * decls — per-function variable declarations: " int64_t v3;\n"
+ * body — TU-wide running output; we accumulate signature/body/closing-brace
+ * across all functions; func_end splices decls in after the open
+ * brace using the recorded fn_body_start bookmark.
+ *
+ * c_finalize flushes a tiny prologue + body to the writer.
+ *
+ * Register declaration is lazy: every operand emit goes through c_ensure_reg,
+ * which the first time it sees a Reg id appends a declaration to decls keyed
+ * on the Operand's source type. Frame slots are declared eagerly when CG
+ * calls c_frame_slot. */
+
+#include "arch/c_target/internal.h"
+
+#include "cg/type.h"
+#include "core/core.h"
+#include "core/heap.h"
+#include "core/pool.h"
+#include "obj/obj.h"
+
+/* === Writer helpers === */
+
+void c_writer_write(CTarget* t, const void* data, size_t n) {
+ CfreeStatus st = cfree_writer_write(t->w, data, n);
+ if (st != CFREE_OK) {
+ SrcLoc loc = t->cur_fn ? t->cur_fn->loc : (SrcLoc){0, 0, 0};
+ compiler_panic(t->c, loc, "C target: writer error %d", (int)st);
+ }
+}
+
+void c_writer_puts(CTarget* t, const char* s) {
+ size_t n = 0;
+ while (s[n]) ++n;
+ c_writer_write(t, s, n);
+}
+
+/* === Reg / type emission === */
+
+static const char* c_int_type_name_for_width(u32 width, int signed_) {
+ switch (width) {
+ case 1:
+ case 8:
+ return signed_ ? "int8_t" : "uint8_t";
+ case 16:
+ return signed_ ? "int16_t" : "uint16_t";
+ case 32:
+ return signed_ ? "int32_t" : "uint32_t";
+ case 64:
+ return signed_ ? "int64_t" : "uint64_t";
+ default:
+ return NULL;
+ }
+}
+
+/* Phase 1: void / bool / sized int / pointer. Aggregates and floats panic. */
+static const char* c_typename(CTarget* t, CfreeCgTypeId type) {
+ CfreeCgTypeId resolved = api_unalias_type(t->c, type);
+ const CgType* ty = cg_type_get(t->c, resolved);
+ SrcLoc loc = t->cur_fn ? t->cur_fn->loc : (SrcLoc){0, 0, 0};
+ if (!ty) {
+ compiler_panic(t->c, loc, "C target: unknown type id %u", (unsigned)type);
+ }
+ switch (ty->kind) {
+ case CFREE_CG_TYPE_VOID:
+ return "void";
+ case CFREE_CG_TYPE_BOOL:
+ return "int32_t";
+ case CFREE_CG_TYPE_INT: {
+ const char* s = c_int_type_name_for_width(ty->integer.width, 1);
+ if (!s) {
+ compiler_panic(t->c, loc, "C target: int width %u not yet supported",
+ (unsigned)ty->integer.width);
+ }
+ return s;
+ }
+ case CFREE_CG_TYPE_PTR:
+ return "void*";
+ default:
+ compiler_panic(t->c, loc, "C target: type kind %d not yet supported",
+ (int)ty->kind);
+ }
+}
+
+void c_emit_type(CTarget* t, CBuf* b, CfreeCgTypeId type) {
+ cbuf_puts(b, c_typename(t, type));
+}
+
+void c_reg_name(Reg r, char* out, size_t cap) {
+ size_t i = 0;
+ if (cap == 0) return;
+ if (cap > 1) out[i++] = 'v';
+ char tmp[16];
+ size_t n = 0;
+ u32 v = (u32)r;
+ if (v == 0) {
+ tmp[n++] = '0';
+ } else {
+ while (v) {
+ tmp[n++] = (char)('0' + (v % 10));
+ v /= 10;
+ }
+ }
+ while (n && i + 1 < cap) out[i++] = tmp[--n];
+ out[i] = '\0';
+}
+
+static void c_slot_name(FrameSlot s, char* out, size_t cap) {
+ size_t i = 0;
+ if (cap == 0) return;
+ const char* prefix = "slot_";
+ while (*prefix && i + 1 < cap) out[i++] = *prefix++;
+ char tmp[16];
+ size_t n = 0;
+ u32 v = (u32)s;
+ if (v == 0) {
+ tmp[n++] = '0';
+ } else {
+ while (v) {
+ tmp[n++] = (char)('0' + (v % 10));
+ v /= 10;
+ }
+ }
+ while (n && i + 1 < cap) out[i++] = tmp[--n];
+ out[i] = '\0';
+}
+
+static void c_grow_reg_table(CTarget* t, u32 needed) {
+ Heap* h = t->c->ctx->heap;
+ u32 newcap = t->reg_cap ? t->reg_cap : 16;
+ while (newcap < needed) newcap *= 2;
+ u8* nd = (u8*)h->realloc(h, t->reg_declared, t->reg_cap, newcap, 1);
+ CfreeCgTypeId* nt = (CfreeCgTypeId*)h->realloc(
+ h, t->reg_type, t->reg_cap * sizeof(CfreeCgTypeId),
+ newcap * sizeof(CfreeCgTypeId), _Alignof(CfreeCgTypeId));
+ if ((!nd && newcap) || (!nt && newcap)) {
+ compiler_panic(t->c, (SrcLoc){0, 0, 0}, "C target: out of memory");
+ }
+ for (u32 i = t->reg_cap; i < newcap; ++i) {
+ nd[i] = 0;
+ nt[i] = 0;
+ }
+ t->reg_declared = nd;
+ t->reg_type = nt;
+ t->reg_cap = newcap;
+}
+
+static void c_grow_slot_table(CTarget* t, u32 needed) {
+ Heap* h = t->c->ctx->heap;
+ u32 newcap = t->slot_cap ? t->slot_cap : 8;
+ while (newcap < needed) newcap *= 2;
+ CfreeCgTypeId* nt = (CfreeCgTypeId*)h->realloc(
+ h, t->slot_type, t->slot_cap * sizeof(CfreeCgTypeId),
+ newcap * sizeof(CfreeCgTypeId), _Alignof(CfreeCgTypeId));
+ if (!nt && newcap) {
+ compiler_panic(t->c, (SrcLoc){0, 0, 0}, "C target: out of memory");
+ }
+ for (u32 i = t->slot_cap; i < newcap; ++i) nt[i] = 0;
+ t->slot_type = nt;
+ t->slot_cap = newcap;
+}
+
+void c_ensure_reg(CTarget* t, Reg r, CfreeCgTypeId type, RegClass cls) {
+ (void)cls;
+ if (r == (Reg)REG_NONE) {
+ compiler_panic(t->c, (SrcLoc){0, 0, 0},
+ "C target: REG_NONE reached emission");
+ }
+ if ((u32)r >= t->reg_cap) c_grow_reg_table(t, (u32)r + 1u);
+ if (t->reg_declared[r]) return;
+ t->reg_declared[r] = 1;
+ t->reg_type[r] = type;
+ cbuf_puts(&t->decls, " ");
+ c_emit_type(t, &t->decls, type);
+ cbuf_puts(&t->decls, " ");
+ char buf[24];
+ c_reg_name(r, buf, sizeof buf);
+ cbuf_puts(&t->decls, buf);
+ cbuf_puts(&t->decls, ";\n");
+}
+
+void c_emit_operand(CTarget* t, Operand op) {
+ char buf[24];
+ switch (op.kind) {
+ case OPK_REG:
+ c_ensure_reg(t, op.v.reg, op.type, (RegClass)op.cls);
+ c_reg_name(op.v.reg, buf, sizeof buf);
+ cbuf_puts(&t->body, buf);
+ return;
+ case OPK_IMM:
+ cbuf_puts(&t->body, "((");
+ c_emit_type(t, &t->body, op.type);
+ cbuf_puts(&t->body, ")");
+ cbuf_put_i64(&t->body, op.v.imm);
+ cbuf_puts(&t->body, ")");
+ return;
+ case OPK_LOCAL:
+ c_slot_name(op.v.frame_slot, buf, sizeof buf);
+ cbuf_puts(&t->body, buf);
+ return;
+ default: {
+ SrcLoc loc = t->cur_fn ? t->cur_fn->loc : (SrcLoc){0, 0, 0};
+ compiler_panic(t->c, loc, "C target: operand kind %d not yet supported",
+ (int)op.kind);
+ }
+ }
+}
+
+/* === Symbol name lookup === */
+
+const char* c_sym_name(CTarget* t, ObjSymId sym) {
+ const ObjSym* os = obj_symbol_get(t->obj, sym);
+ if (!os) {
+ compiler_panic(t->c, (SrcLoc){0, 0, 0}, "C target: unknown ObjSymId %u",
+ (unsigned)sym);
+ }
+ const char* s = pool_str(t->c->global, os->name, NULL);
+ /* Mach-O linker symbols are mangled with a leading underscore; the host
+ * C compiler will re-add it on its own, so strip when re-emitting source. */
+ if (t->c->target.obj == CFREE_OBJ_MACHO && s && s[0] == '_') s += 1;
+ return s;
+}
+
+/* === Prologue / finalize === */
+
+void c_emit_prologue(CTarget* t) {
+ if (t->prologue_emitted) return;
+ t->prologue_emitted = 1;
+ c_writer_puts(t,
+ "/* generated by cfree --emit=c */\n"
+ "#include <stdint.h>\n"
+ "\n");
+}
+
+/* === func_begin / func_end === */
+
+/* Write `RetT name(P0, P1, ...)` (without trailing `;` or `{`) to `b`. */
+static void c_emit_func_signature(CTarget* t, CBuf* b, const char* name,
+ CfreeCgTypeId fn_type) {
+ CfreeCgTypeId ret_type = cg_type_func_ret_id(t->c, fn_type);
+ const CgType* fty = cg_type_get(t->c, api_unalias_type(t->c, fn_type));
+ SrcLoc loc = t->cur_fn ? t->cur_fn->loc : (SrcLoc){0, 0, 0};
+ if (!fty || fty->kind != CFREE_CG_TYPE_FUNC) {
+ compiler_panic(t->c, loc, "C target: fn_type is not a function type");
+ }
+ c_emit_type(t, b, ret_type);
+ cbuf_puts(b, " ");
+ cbuf_puts(b, name);
+ cbuf_puts(b, "(");
+ if (fty->func.nparams == 0) {
+ cbuf_puts(b, "void");
+ } else {
+ for (u32 i = 0; i < fty->func.nparams; ++i) {
+ if (i > 0) cbuf_puts(b, ", ");
+ c_emit_type(t, b, fty->func.params[i].type);
+ cbuf_puts(b, " p");
+ cbuf_put_u64(b, (u64)i);
+ }
+ }
+ cbuf_puts(b, ")");
+}
+
+void c_func_begin(CGTarget* T, const CGFuncDesc* fd) {
+ CTarget* t = (CTarget*)T;
+
+ c_emit_prologue(t);
+
+ t->cur_fn = fd;
+ cbuf_reset(&t->decls);
+ for (u32 i = 0; i < t->reg_cap; ++i) {
+ t->reg_declared[i] = 0;
+ t->reg_type[i] = 0;
+ }
+ t->nslots = 0;
+
+ const char* name = c_sym_name(t, fd->sym);
+
+ /* Forward-declare so out-of-order callers and same-TU references find the
+ * prototype regardless of definition order. */
+ c_ensure_forward_decl(t, fd->sym, fd->fn_type);
+
+ c_emit_func_signature(t, &t->body, name, fd->fn_type);
+ cbuf_puts(&t->body, " {\n");
+ t->fn_body_start = t->body.len;
+}
+
+void c_ensure_forward_decl(CTarget* t, ObjSymId sym, CfreeCgTypeId fn_type) {
+ Heap* h = t->c->ctx->heap;
+ if ((u32)sym >= t->sym_forwarded_cap) {
+ u32 newcap = t->sym_forwarded_cap ? t->sym_forwarded_cap : 16;
+ while (newcap <= (u32)sym) newcap *= 2;
+ u8* nd = (u8*)h->realloc(h, t->sym_forwarded, t->sym_forwarded_cap, newcap,
+ 1);
+ if (!nd && newcap) {
+ compiler_panic(t->c, (SrcLoc){0, 0, 0}, "C target: out of memory");
+ }
+ for (u32 i = t->sym_forwarded_cap; i < newcap; ++i) nd[i] = 0;
+ t->sym_forwarded = nd;
+ t->sym_forwarded_cap = newcap;
+ }
+ if (t->sym_forwarded[sym]) return;
+ t->sym_forwarded[sym] = 1;
+ const char* name = c_sym_name(t, sym);
+ c_emit_func_signature(t, &t->forwards, name, fn_type);
+ cbuf_puts(&t->forwards, ";\n");
+}
+
+void c_func_end(CGTarget* T) {
+ CTarget* t = (CTarget*)T;
+ size_t splice_at = t->fn_body_start;
+ size_t body_after = t->body.len;
+ size_t fn_body_len = body_after - splice_at;
+ Heap* h = t->c->ctx->heap;
+
+ u8* tmp = NULL;
+ if (fn_body_len) {
+ tmp = (u8*)h->alloc(h, fn_body_len, 1);
+ if (!tmp) {
+ compiler_panic(t->c, t->cur_fn->loc, "C target: out of memory");
+ }
+ for (size_t i = 0; i < fn_body_len; ++i) {
+ tmp[i] = t->body.data[splice_at + i];
+ }
+ }
+
+ t->body.len = splice_at;
+ if (t->decls.len) cbuf_putn(&t->body, (const char*)t->decls.data, t->decls.len);
+ if (tmp) {
+ cbuf_putn(&t->body, (const char*)tmp, fn_body_len);
+ h->free(h, tmp, fn_body_len);
+ }
+ cbuf_puts(&t->body, "}\n\n");
+
+ t->cur_fn = NULL;
+}
+
+/* === frame_slot, param === */
+
+FrameSlot c_frame_slot(CGTarget* T, const FrameSlotDesc* fsd) {
+ CTarget* t = (CTarget*)T;
+ if (t->nslots + 1u >= t->slot_cap) c_grow_slot_table(t, t->nslots + 2u);
+ /* Slot ids start at 1 (FRAME_SLOT_NONE == 0). */
+ FrameSlot id = (FrameSlot)(t->nslots + 1u);
+ t->slot_type[t->nslots] = fsd->type;
+ t->nslots += 1u;
+
+ cbuf_puts(&t->decls, " ");
+ c_emit_type(t, &t->decls, fsd->type);
+ cbuf_puts(&t->decls, " ");
+ char buf[24];
+ c_slot_name(id, buf, sizeof buf);
+ cbuf_puts(&t->decls, buf);
+ cbuf_puts(&t->decls, ";\n");
+ return id;
+}
+
+CGLocalStorage c_param(CGTarget* T, const CGParamDesc* pd) {
+ CTarget* t = (CTarget*)T;
+ CGLocalStorage st = pd->storage;
+ /* Allocate a frame slot for the param, then emit "slot_N = pN;". */
+ FrameSlotDesc fsd;
+ fsd.type = pd->type;
+ fsd.name = pd->name;
+ fsd.loc = pd->loc;
+ fsd.size = pd->size;
+ fsd.align = pd->align;
+ fsd.kind = FS_PARAM;
+ fsd.pad = 0;
+ fsd.flags = 0;
+ if (pd->flags & CG_LOCAL_ADDR_TAKEN) fsd.flags |= FSF_ADDR_TAKEN;
+ FrameSlot slot = c_frame_slot(T, &fsd);
+
+ char buf[24];
+ c_slot_name(slot, buf, sizeof buf);
+ cbuf_puts(&t->body, " ");
+ cbuf_puts(&t->body, buf);
+ cbuf_puts(&t->body, " = p");
+ cbuf_put_u64(&t->body, (u64)pd->index);
+ cbuf_puts(&t->body, ";\n");
+
+ st.kind = CG_LOCAL_STORAGE_FRAME;
+ st.v.frame_slot = slot;
+ return st;
+}
+
+/* === load_imm, copy, binop === */
+
+void c_load_imm(CGTarget* T, Operand dst, i64 imm) {
+ CTarget* t = (CTarget*)T;
+ SrcLoc loc = t->cur_fn ? t->cur_fn->loc : (SrcLoc){0, 0, 0};
+ if (dst.kind != OPK_REG) {
+ compiler_panic(t->c, loc, "C target: load_imm dst must be REG");
+ }
+ c_ensure_reg(t, dst.v.reg, dst.type, (RegClass)dst.cls);
+ char buf[24];
+ c_reg_name(dst.v.reg, buf, sizeof buf);
+ cbuf_puts(&t->body, " ");
+ cbuf_puts(&t->body, buf);
+ cbuf_puts(&t->body, " = (");
+ c_emit_type(t, &t->body, dst.type);
+ cbuf_puts(&t->body, ")");
+ cbuf_put_i64(&t->body, imm);
+ cbuf_puts(&t->body, ";\n");
+}
+
+void c_copy(CGTarget* T, Operand dst, Operand src) {
+ CTarget* t = (CTarget*)T;
+ SrcLoc loc = t->cur_fn ? t->cur_fn->loc : (SrcLoc){0, 0, 0};
+ if (dst.kind != OPK_REG) {
+ compiler_panic(t->c, loc, "C target: copy dst must be REG");
+ }
+ c_ensure_reg(t, dst.v.reg, dst.type, (RegClass)dst.cls);
+ char buf[24];
+ c_reg_name(dst.v.reg, buf, sizeof buf);
+ cbuf_puts(&t->body, " ");
+ cbuf_puts(&t->body, buf);
+ cbuf_puts(&t->body, " = ");
+ c_emit_operand(t, src);
+ cbuf_puts(&t->body, ";\n");
+}
+
+static const char* binop_to_c(BinOp op) {
+ switch (op) {
+ case BO_IADD:
+ case BO_FADD:
+ return "+";
+ case BO_ISUB:
+ case BO_FSUB:
+ return "-";
+ case BO_IMUL:
+ case BO_FMUL:
+ return "*";
+ case BO_SDIV:
+ case BO_UDIV:
+ case BO_FDIV:
+ return "/";
+ case BO_SREM:
+ case BO_UREM:
+ return "%";
+ case BO_AND:
+ return "&";
+ case BO_OR:
+ return "|";
+ case BO_XOR:
+ return "^";
+ case BO_SHL:
+ return "<<";
+ case BO_SHR_S:
+ case BO_SHR_U:
+ return ">>";
+ }
+ return NULL;
+}
+
+void c_binop(CGTarget* T, BinOp op, Operand dst, Operand a, Operand b) {
+ CTarget* t = (CTarget*)T;
+ SrcLoc loc = t->cur_fn ? t->cur_fn->loc : (SrcLoc){0, 0, 0};
+ const char* sym = binop_to_c(op);
+ if (!sym) {
+ compiler_panic(t->c, loc, "C target: unknown binop %d", (int)op);
+ }
+ if (dst.kind != OPK_REG) {
+ compiler_panic(t->c, loc, "C target: binop dst must be REG");
+ }
+ c_ensure_reg(t, dst.v.reg, dst.type, (RegClass)dst.cls);
+ char buf[24];
+ c_reg_name(dst.v.reg, buf, sizeof buf);
+ cbuf_puts(&t->body, " ");
+ cbuf_puts(&t->body, buf);
+ cbuf_puts(&t->body, " = ");
+ c_emit_operand(t, a);
+ cbuf_puts(&t->body, " ");
+ cbuf_puts(&t->body, sym);
+ cbuf_puts(&t->body, " ");
+ c_emit_operand(t, b);
+ cbuf_puts(&t->body, ";\n");
+}
+
+/* === call === */
+
+void c_call(CGTarget* T, const CGCallDesc* d) {
+ CTarget* t = (CTarget*)T;
+ SrcLoc loc = t->cur_fn ? t->cur_fn->loc : (SrcLoc){0, 0, 0};
+
+ if (d->callee.kind != OPK_GLOBAL) {
+ compiler_panic(t->c, loc, "C target: indirect call not yet supported");
+ }
+
+ const CgType* fty = cg_type_get(t->c, api_unalias_type(t->c, d->fn_type));
+ if (!fty || fty->kind != CFREE_CG_TYPE_FUNC) {
+ compiler_panic(t->c, loc, "C target: call: bad fn_type");
+ }
+ CfreeCgTypeId ret_type = fty->func.ret;
+ int has_ret = !cg_type_is_void(t->c, ret_type);
+
+ cbuf_puts(&t->body, " ");
+ if (has_ret) {
+ if (d->ret.storage.kind != OPK_REG) {
+ compiler_panic(t->c, loc,
+ "C target: aggregate return not yet supported");
+ }
+ c_ensure_reg(t, d->ret.storage.v.reg, ret_type,
+ (RegClass)d->ret.storage.cls);
+ char buf[24];
+ c_reg_name(d->ret.storage.v.reg, buf, sizeof buf);
+ cbuf_puts(&t->body, buf);
+ cbuf_puts(&t->body, " = ");
+ }
+ /* Emit a forward declaration so calls to symbols defined later or not at
+ * all (external) compile against a known prototype. */
+ c_ensure_forward_decl(t, d->callee.v.global.sym, d->fn_type);
+ cbuf_puts(&t->body, c_sym_name(t, d->callee.v.global.sym));
+ cbuf_puts(&t->body, "(");
+ for (u32 i = 0; i < d->nargs; ++i) {
+ if (i > 0) cbuf_puts(&t->body, ", ");
+ c_emit_operand(t, d->args[i].storage);
+ }
+ cbuf_puts(&t->body, ");\n");
+}
+
+/* === load / store === */
+
+void c_load(CGTarget* T, Operand dst, Operand addr, MemAccess m) {
+ CTarget* t = (CTarget*)T;
+ (void)m;
+ SrcLoc loc = t->cur_fn ? t->cur_fn->loc : (SrcLoc){0, 0, 0};
+ if (dst.kind != OPK_REG) {
+ compiler_panic(t->c, loc, "C target: load dst must be REG");
+ }
+ c_ensure_reg(t, dst.v.reg, dst.type, (RegClass)dst.cls);
+ char buf[24];
+ c_reg_name(dst.v.reg, buf, sizeof buf);
+ cbuf_puts(&t->body, " ");
+ cbuf_puts(&t->body, buf);
+ cbuf_puts(&t->body, " = ");
+ switch (addr.kind) {
+ case OPK_LOCAL:
+ /* slot_N is already a typed C variable; direct read. */
+ c_slot_name(addr.v.frame_slot, buf, sizeof buf);
+ cbuf_puts(&t->body, buf);
+ break;
+ default:
+ compiler_panic(t->c, loc,
+ "C target: load from operand kind %d not yet supported",
+ (int)addr.kind);
+ }
+ cbuf_puts(&t->body, ";\n");
+}
+
+void c_store(CGTarget* T, Operand addr, Operand src, MemAccess m) {
+ CTarget* t = (CTarget*)T;
+ (void)m;
+ SrcLoc loc = t->cur_fn ? t->cur_fn->loc : (SrcLoc){0, 0, 0};
+ cbuf_puts(&t->body, " ");
+ switch (addr.kind) {
+ case OPK_LOCAL: {
+ char buf[24];
+ c_slot_name(addr.v.frame_slot, buf, sizeof buf);
+ cbuf_puts(&t->body, buf);
+ break;
+ }
+ default:
+ compiler_panic(t->c, loc,
+ "C target: store to operand kind %d not yet supported",
+ (int)addr.kind);
+ }
+ cbuf_puts(&t->body, " = ");
+ c_emit_operand(t, src);
+ cbuf_puts(&t->body, ";\n");
+}
+
+void c_ret(CGTarget* T, const CGABIValue* val) {
+ CTarget* t = (CTarget*)T;
+ cbuf_puts(&t->body, " return");
+ if (val) {
+ cbuf_puts(&t->body, " ");
+ c_emit_operand(t, val->storage);
+ }
+ cbuf_puts(&t->body, ";\n");
+}
+
+/* === alias ===
+ * `cfree_cg_alias` makes alias_sym refer to target_sym's body. In obj-file
+ * land that's two ObjSyms sharing a (section_id, value); in C source we
+ * have to spell it out:
+ *
+ * ELF/PE → `Ret alias(args) __attribute__((alias("target")));`
+ * Single definition, true aliasing, &alias == &target.
+ * Mach-O → emit a thunk `Ret alias(args) { return target(args); }`.
+ * Clang on Darwin rejects __attribute__((alias)) outright,
+ * so we fall back to a wrapper. Loses the `&alias==&target`
+ * identity but preserves call-through semantics, which is
+ * all the cfree-emitted code path needs.
+ *
+ * The emitted decl serves as the alias definition AND a forward prototype
+ * for callers, so we mark sym_forwarded to dedup against a later c_call. */
+void c_alias(CGTarget* T, ObjSymId alias_sym, ObjSymId target_sym,
+ CfreeCgTypeId type) {
+ CTarget* t = (CTarget*)T;
+ Heap* h = t->c->ctx->heap;
+ if ((u32)alias_sym >= t->sym_forwarded_cap) {
+ u32 newcap = t->sym_forwarded_cap ? t->sym_forwarded_cap : 16;
+ while (newcap <= (u32)alias_sym) newcap *= 2;
+ u8* nd = (u8*)h->realloc(h, t->sym_forwarded, t->sym_forwarded_cap, newcap,
+ 1);
+ if (!nd && newcap) {
+ compiler_panic(t->c, (SrcLoc){0, 0, 0}, "C target: out of memory");
+ }
+ for (u32 i = t->sym_forwarded_cap; i < newcap; ++i) nd[i] = 0;
+ t->sym_forwarded = nd;
+ t->sym_forwarded_cap = newcap;
+ }
+ if (t->sym_forwarded[alias_sym]) return;
+ t->sym_forwarded[alias_sym] = 1;
+ const char* alias_name = c_sym_name(t, alias_sym);
+ const char* target_name = c_sym_name(t, target_sym);
+ const CgType* fty = cg_type_get(t->c, api_unalias_type(t->c, type));
+ int is_func = fty && fty->kind == CFREE_CG_TYPE_FUNC;
+
+ if (t->c->target.obj != CFREE_OBJ_MACHO) {
+ /* Attribute form (ELF, PE/COFF). Function aliases only for now:
+ * c_emit_func_signature panics when `type` is not a function type. */
+ c_emit_func_signature(t, &t->forwards, alias_name, type);
+ cbuf_puts(&t->forwards, " __attribute__((alias(\"");
+ cbuf_puts(&t->forwards, target_name);
+ cbuf_puts(&t->forwards, "\")));\n");
+ return;
+ }
+
+ /* Mach-O thunk fallback. Functions only for v1 — object aliases on
+ * Darwin would need a more elaborate scheme (see doc/CBACKEND.md). */
+ if (!is_func) {
+ compiler_panic(t->c, (SrcLoc){0, 0, 0},
+ "C target: object alias on Mach-O not yet supported");
+ }
+ /* Forward prototype for the target (its full definition lands separately
+ * via c_func_begin); c_ensure_forward_decl dedups it per symbol. */
+ c_ensure_forward_decl(t, target_sym, type);
+ /* `static`? No — alias must be externally visible. */
+ c_emit_func_signature(t, &t->forwards, alias_name, type);
+ cbuf_puts(&t->forwards, " { ");
+ CfreeCgTypeId ret_type = cg_type_func_ret_id(t->c, type);
+ if (!cg_type_is_void(t->c, ret_type)) cbuf_puts(&t->forwards, "return ");
+ cbuf_puts(&t->forwards, target_name);
+ cbuf_puts(&t->forwards, "(");
+ for (u32 i = 0; i < fty->func.nparams; ++i) {
+ if (i > 0) cbuf_puts(&t->forwards, ", ");
+ cbuf_puts(&t->forwards, "p");
+ cbuf_put_u64(&t->forwards, (u64)i);
+ }
+ cbuf_puts(&t->forwards, "); }\n");
+}
+
+/* === set_loc === */
+
+void c_set_loc(CGTarget* T, SrcLoc l) {
+ (void)T;
+ (void)l;
+}
+
+/* === finalize / destroy === */
+
+void c_finalize(CGTarget* T) {
+ CTarget* t = (CTarget*)T;
+ if (t->finalized) return;
+ t->finalized = 1;
+ c_emit_prologue(t);
+ if (t->forwards.len) {
+ c_writer_write(t, t->forwards.data, t->forwards.len);
+ c_writer_puts(t, "\n");
+ }
+ if (t->body.len) c_writer_write(t, t->body.data, t->body.len);
+}
+
+void c_destroy(CGTarget* T) {
+ CTarget* t = (CTarget*)T;
+ Heap* h = t->c->ctx->heap;
+ cbuf_fini(&t->forwards);
+ cbuf_fini(&t->decls);
+ cbuf_fini(&t->body);
+ if (t->sym_forwarded) h->free(h, t->sym_forwarded, t->sym_forwarded_cap);
+ t->sym_forwarded = NULL;
+ t->sym_forwarded_cap = 0;
+ if (t->reg_declared) h->free(h, t->reg_declared, t->reg_cap);
+ if (t->reg_type)
+ h->free(h, t->reg_type, t->reg_cap * sizeof(CfreeCgTypeId));
+ if (t->slot_type)
+ h->free(h, t->slot_type, t->slot_cap * sizeof(CfreeCgTypeId));
+ t->reg_declared = NULL;
+ t->reg_type = NULL;
+ t->slot_type = NULL;
+ t->reg_cap = 0;
+ t->slot_cap = 0;
+}
diff --git a/src/arch/c_target/internal.h b/src/arch/c_target/internal.h
@@ -0,0 +1,109 @@
+#ifndef CFREE_C_TARGET_INTERNAL_H
+#define CFREE_C_TARGET_INTERNAL_H
+
+/* C-source emission CGTarget. See doc/CBACKEND.md.
+ *
+ * This target replaces the machine-code CGTarget when CodeOptions.emit_c_source
+ * is set. It writes target-locked C source to CodeOptions.c_source_writer
+ * instead of object bytes via MCEmitter. Operates with virtual_regs=1, so CG
+ * mints fresh Reg ids and never spills. */
+
+#include <cfree/core.h>
+
+#include "arch/arch.h"
+#include "core/core.h"
+
+/* Heap-backed growable byte buffer. Backs the per-function decls buffer and
+ * the TU-wide forwards/body buffers. Decls must sit at function top but are
+ * only discovered during body emission, so func_end splices them in. */
+typedef struct CBuf {
+ Heap* heap;
+ u8* data;
+ size_t len;
+ size_t cap;
+} CBuf;
+
+void cbuf_init(CBuf* b, Heap* h);
+void cbuf_fini(CBuf* b);
+void cbuf_reset(CBuf* b);
+void cbuf_putc(CBuf* b, char c);
+void cbuf_puts(CBuf* b, const char* s);
+void cbuf_putn(CBuf* b, const char* s, size_t n);
+void cbuf_put_i64(CBuf* b, i64 v);
+void cbuf_put_u64(CBuf* b, u64 v);
+
+typedef struct CTarget {
+ CGTarget base;
+
+ Compiler* c;
+ ObjBuilder* obj;
+ CfreeWriter* w;
+
+ /* TU prologue (e.g. #include <stdint.h>) emitted once, on the first
+ * function or on finalize, whichever comes first. */
+ u8 prologue_emitted;
+ u8 finalized;
+ u8 pad[2];
+
+ /* TU-wide forward declarations: one `RetT name(params);` line per function
+ * we've seen a definition or call for. Emitted just after the prologue so
+ * callers defined earlier in the TU still see the prototype, and calls to
+ * undefined-in-TU externs get a declaration too. */
+ CBuf forwards;
+ /* Forward-decl dedup: which ObjSymIds have we already declared. Lazily
+ * grown byte map (one u8 per symbol) indexed by ObjSymId. */
+ u8* sym_forwarded;
+ u32 sym_forwarded_cap;
+
+ /* decls is per-function (reset in func_begin); body accumulates TU-wide. */
+ CBuf decls;
+ CBuf body;
+
+ /* Per-function regdecl tracking: for each Reg id seen, mark whether we
+ * have already emitted a declaration into `decls`. Sized by reg_cap.
+ * Grown lazily as new reg ids appear. */
+ u8* reg_declared;
+ CfreeCgTypeId* reg_type; /* type each reg was first seen with */
+ u32 reg_cap;
+
+ /* Per-function frame-slot table. The C target invents its own slot ids;
+ * each slot becomes a `T slot_N;` declaration. slot_type[i] is the CG type
+ * the slot was declared with. */
+ CfreeCgTypeId* slot_type;
+ u32 slot_cap;
+ u32 nslots; /* count of slots in current function */
+
+ /* Splice bookmark: byte offset into body where the current function's body
+ * region starts (right after the open brace). func_end uses this to insert
+ * the per-function declarations between the signature and the body. */
+ size_t fn_body_start;
+
+ const CGFuncDesc* cur_fn;
+} CTarget;
+
+CGTarget* c_cgtarget_new(Compiler* c, ObjBuilder* o, CfreeWriter* w);
+
+/* Helpers shared across emit.c. */
+void c_emit_prologue(CTarget* t);
+/* Ensure reg `r` (typed `type`, class `cls`) has been declared. */
+void c_ensure_reg(CTarget* t, Reg r, CfreeCgTypeId type, RegClass cls);
+/* Get a stable C identifier for reg r. Writes into caller-supplied buf. */
+void c_reg_name(Reg r, char* out, size_t cap);
+/* Write the C type for a CG int/float/ptr type to `b`. */
+void c_emit_type(CTarget* t, CBuf* b, CfreeCgTypeId type);
+/* Write operand expression to body (e.g. "v3", "(int32_t)42"). Phase 1
+ * supports at least OPK_REG, OPK_IMM, and OPK_LOCAL; other kinds panic. */
+void c_emit_operand(CTarget* t, Operand op);
+
+/* Look up the interned C name for an ObjSymId (Mach-O '_' prefix stripped). */
+const char* c_sym_name(CTarget* t, ObjSymId sym);
+
+/* Emit a forward declaration for `sym` (of function type `fn_type`) into
+ * the TU forwards buffer if not already done. Idempotent per sym. */
+void c_ensure_forward_decl(CTarget* t, ObjSymId sym, CfreeCgTypeId fn_type);
+
+/* Write `n` bytes to t->w; panic on error. */
+void c_writer_write(CTarget* t, const void* data, size_t n);
+void c_writer_puts(CTarget* t, const char* s);
+
+#endif
diff --git a/src/arch/c_target/target.c b/src/arch/c_target/target.c
@@ -0,0 +1,395 @@
+/* C-source CGTarget construction and vtable wiring.
+ *
+ * See doc/CBACKEND.md. The C target writes portable, target-locked C source
+ * text to the CfreeWriter passed via CodeOptions.c_source_writer. CG operates
+ * with virtual_regs=1, so we never run register allocation or spilling. */
+
+#include "arch/c_target/internal.h"
+
+#include <string.h>
+
+#include "core/arena.h"
+#include "core/core.h"
+#include "core/heap.h"
+
+/* Forward declarations for all CGTarget methods. Implementations either land
+ * in emit.c (working) or are stubs that compiler_panic. */
+void c_func_begin(CGTarget*, const CGFuncDesc*);
+void c_func_end(CGTarget*);
+void c_alias(CGTarget*, ObjSymId, ObjSymId, CfreeCgTypeId);
+void c_ret(CGTarget*, const CGABIValue*);
+void c_load_imm(CGTarget*, Operand, i64);
+void c_copy(CGTarget*, Operand, Operand);
+void c_binop(CGTarget*, BinOp, Operand, Operand, Operand);
+void c_call(CGTarget*, const CGCallDesc*);
+void c_load(CGTarget*, Operand, Operand, MemAccess);
+void c_store(CGTarget*, Operand, Operand, MemAccess);
+CGLocalStorage c_param(CGTarget*, const CGParamDesc*);
+FrameSlot c_frame_slot(CGTarget*, const FrameSlotDesc*);
+void c_set_loc(CGTarget*, SrcLoc);
+void c_finalize(CGTarget*);
+void c_destroy(CGTarget*);
+
+/* Unimplemented stubs panic specifically per the doc's Phase 0 acceptance
+ * criterion. The method name is encoded in the panic message. */
+#define C_UNIMPL(name) \
+ compiler_panic(((CTarget*)t)->c, ((CTarget*)t)->cur_fn ? \
+ ((CTarget*)t)->cur_fn->loc : (SrcLoc){0,0,0}, \
+ "C target: method " name " not implemented")
+
+static void c_unimpl_func_begin_known_frame(CGTarget* t, const CGFuncDesc* f,
+ const CGKnownFrameDesc* k,
+ FrameSlot* out) {
+ (void)f; (void)k; (void)out;
+ C_UNIMPL("func_begin_known_frame");
+}
+
+static CGLocalStorage c_unimpl_local(CGTarget* t, const CGLocalDesc* d) {
+ (void)d;
+ C_UNIMPL("local");
+}
+
+static void c_unimpl_local_addr(CGTarget* t, Operand dst, const CGLocalDesc* d,
+ CGLocalStorage s) {
+ (void)dst; (void)d; (void)s;
+ C_UNIMPL("local_addr");
+}
+
+static void c_unimpl_spill_reg(CGTarget* t, Operand a, FrameSlot s,
+ MemAccess m) {
+ (void)a; (void)s; (void)m;
+ C_UNIMPL("spill_reg");
+}
+
+static void c_unimpl_reload_reg(CGTarget* t, Operand a, FrameSlot s,
+ MemAccess m) {
+ (void)a; (void)s; (void)m;
+ C_UNIMPL("reload_reg");
+}
+
+/* Register-pool descriptors: virtual_regs=1 means CG skips these, but the
+ * non-null contract requires a callable. Return empty pools. */
+static void c_no_regs(CGTarget* t, RegClass cls, const Reg** out, u32* n) {
+ (void)t; (void)cls;
+ *out = NULL;
+ *n = 0;
+}
+
+static void c_no_phys_regs(CGTarget* t, RegClass cls,
+ const CGPhysRegInfo** out, u32* n) {
+ (void)t; (void)cls;
+ *out = NULL;
+ *n = 0;
+}
+
+static int c_is_caller_saved(CGTarget* t, RegClass cls, Reg r) {
+ (void)t; (void)cls; (void)r;
+ return 0;
+}
+
+static u32 c_zero_mask(CGTarget* t, const CGCallDesc* d, RegClass cls) {
+ (void)t; (void)d; (void)cls;
+ return 0;
+}
+static u32 c_zero_ret_mask(CGTarget* t, const ABIFuncInfo* f, RegClass cls) {
+ (void)t; (void)f; (void)cls;
+ return 0;
+}
+static u32 c_zero_cs_mask(CGTarget* t, RegClass cls) {
+ (void)t; (void)cls;
+ return 0;
+}
+
+static void c_noop_plan_regs(CGTarget* t, RegClass cls, const Reg* r, u32 n) {
+ (void)t; (void)cls; (void)r; (void)n;
+}
+static void c_noop_reserve_regs(CGTarget* t, RegClass cls, const Reg* r, u32 n) {
+ (void)t; (void)cls; (void)r; (void)n;
+}
+static u32 c_call_stack_size_zero(CGTarget* t, const CGCallDesc* d) {
+ (void)t; (void)d;
+ return 0;
+}
+
+static Label c_unimpl_label_new(CGTarget* t) {
+ C_UNIMPL("label_new");
+}
+static void c_unimpl_label_place(CGTarget* t, Label l) {
+ (void)l;
+ C_UNIMPL("label_place");
+}
+static void c_unimpl_jump(CGTarget* t, Label l) {
+ (void)l;
+ C_UNIMPL("jump");
+}
+static void c_unimpl_cmp_branch(CGTarget* t, CmpOp op, Operand a, Operand b,
+ Label l) {
+ (void)op; (void)a; (void)b; (void)l;
+ C_UNIMPL("cmp_branch");
+}
+
+static CGScope c_unimpl_scope_begin(CGTarget* t, const CGScopeDesc* d) {
+ (void)d;
+ C_UNIMPL("scope_begin");
+}
+static void c_unimpl_scope_else(CGTarget* t, CGScope s) {
+ (void)s;
+ C_UNIMPL("scope_else");
+}
+static void c_unimpl_scope_end(CGTarget* t, CGScope s) {
+ (void)s;
+ C_UNIMPL("scope_end");
+}
+static void c_unimpl_break_to(CGTarget* t, CGScope s) {
+ (void)s;
+ C_UNIMPL("break_to");
+}
+static void c_unimpl_continue_to(CGTarget* t, CGScope s) {
+ (void)s;
+ C_UNIMPL("continue_to");
+}
+
+static void c_unimpl_load_const(CGTarget* t, Operand dst, ConstBytes cb) {
+ (void)dst; (void)cb;
+ C_UNIMPL("load_const");
+}
+static void c_unimpl_addr_of(CGTarget* t, Operand dst, Operand lv) {
+ (void)dst; (void)lv;
+ C_UNIMPL("addr_of");
+}
+static void c_unimpl_tls_addr_of(CGTarget* t, Operand dst, ObjSymId sym,
+ i64 addend) {
+ (void)dst; (void)sym; (void)addend;
+ C_UNIMPL("tls_addr_of");
+}
+static void c_unimpl_copy_bytes(CGTarget* t, Operand a, Operand b,
+ AggregateAccess m) {
+ (void)a; (void)b; (void)m;
+ C_UNIMPL("copy_bytes");
+}
+static void c_unimpl_set_bytes(CGTarget* t, Operand a, Operand b,
+ AggregateAccess m) {
+ (void)a; (void)b; (void)m;
+ C_UNIMPL("set_bytes");
+}
+static void c_unimpl_bitfield_load(CGTarget* t, Operand dst, Operand addr,
+ BitFieldAccess bf) {
+ (void)dst; (void)addr; (void)bf;
+ C_UNIMPL("bitfield_load");
+}
+static void c_unimpl_bitfield_store(CGTarget* t, Operand addr, Operand src,
+ BitFieldAccess bf) {
+ (void)addr; (void)src; (void)bf;
+ C_UNIMPL("bitfield_store");
+}
+
+static void c_unimpl_unop(CGTarget* t, UnOp op, Operand d, Operand a) {
+ (void)op; (void)d; (void)a;
+ C_UNIMPL("unop");
+}
+static void c_unimpl_cmp(CGTarget* t, CmpOp op, Operand d, Operand a,
+ Operand b) {
+ (void)op; (void)d; (void)a; (void)b;
+ C_UNIMPL("cmp");
+}
+static void c_unimpl_convert(CGTarget* t, ConvKind k, Operand d, Operand s) {
+ (void)k; (void)d; (void)s;
+ C_UNIMPL("convert");
+}
+
+static void c_unimpl_plan_call(CGTarget* t, const CGCallDesc* d, CGCallPlan* p) {
+ (void)d; (void)p;
+ C_UNIMPL("plan_call");
+}
+static void c_unimpl_load_call_arg(CGTarget* t, Operand d,
+ const CGCallPlanMove* m) {
+ (void)d; (void)m;
+ C_UNIMPL("load_call_arg");
+}
+static void c_unimpl_store_call_arg(CGTarget* t, const CGCallPlanMove* m) {
+ (void)m;
+ C_UNIMPL("store_call_arg");
+}
+static void c_unimpl_store_call_ret(CGTarget* t, const CGCallPlanRet* r,
+ Operand s) {
+ (void)r; (void)s;
+ C_UNIMPL("store_call_ret");
+}
+static void c_unimpl_emit_call_plan(CGTarget* t, const CGCallPlan* p) {
+ (void)p;
+ C_UNIMPL("emit_call_plan");
+}
+
+static void c_unimpl_alloca(CGTarget* t, Operand d, Operand s, u32 a) {
+ (void)d; (void)s; (void)a;
+ C_UNIMPL("alloca_");
+}
+static void c_unimpl_va_start(CGTarget* t, Operand a) {
+ (void)a;
+ C_UNIMPL("va_start_");
+}
+static void c_unimpl_va_arg(CGTarget* t, Operand d, Operand a,
+ CfreeCgTypeId ty) {
+ (void)d; (void)a; (void)ty;
+ C_UNIMPL("va_arg_");
+}
+static void c_unimpl_va_end(CGTarget* t, Operand a) {
+ (void)a;
+ C_UNIMPL("va_end_");
+}
+static void c_unimpl_va_copy(CGTarget* t, Operand d, Operand s) {
+ (void)d; (void)s;
+ C_UNIMPL("va_copy_");
+}
+
+static void c_unimpl_atomic_load(CGTarget* t, Operand d, Operand a, MemAccess m,
+ MemOrder o) {
+ (void)d; (void)a; (void)m; (void)o;
+ C_UNIMPL("atomic_load");
+}
+static void c_unimpl_atomic_store(CGTarget* t, Operand a, Operand s, MemAccess m,
+ MemOrder o) {
+ (void)a; (void)s; (void)m; (void)o;
+ C_UNIMPL("atomic_store");
+}
+static void c_unimpl_atomic_rmw(CGTarget* t, AtomicOp op, Operand d, Operand a,
+ Operand v, MemAccess m, MemOrder o) {
+ (void)op; (void)d; (void)a; (void)v; (void)m; (void)o;
+ C_UNIMPL("atomic_rmw");
+}
+static void c_unimpl_atomic_cas(CGTarget* t, Operand p, Operand ok, Operand a,
+ Operand e, Operand de, MemAccess m, MemOrder so,
+ MemOrder fo) {
+ (void)p; (void)ok; (void)a; (void)e; (void)de; (void)m; (void)so; (void)fo;
+ C_UNIMPL("atomic_cas");
+}
+static void c_unimpl_fence(CGTarget* t, MemOrder o) {
+ (void)o;
+ C_UNIMPL("fence");
+}
+
+static void c_unimpl_intrinsic(CGTarget* t, IntrinKind k, Operand* d, u32 nd,
+ const Operand* a, u32 na) {
+ (void)k; (void)d; (void)nd; (void)a; (void)na;
+ C_UNIMPL("intrinsic");
+}
+static void c_unimpl_asm_block(CGTarget* t, const char* tmpl,
+ const AsmConstraint* outs, u32 no, Operand* oo,
+ const AsmConstraint* ins, u32 ni,
+ const Operand* io, const Sym* clobs, u32 nc) {
+ (void)tmpl; (void)outs; (void)no; (void)oo;
+ (void)ins; (void)ni; (void)io;
+ (void)clobs; (void)nc;
+ C_UNIMPL("asm_block");
+}
+
+static void cgt_cleanup(void* arg) { cgtarget_free((CGTarget*)arg); }
+
+CGTarget* c_cgtarget_new(Compiler* c, ObjBuilder* o, CfreeWriter* w) {
+ CTarget* x = arena_new(c->tu, CTarget);
+ memset(x, 0, sizeof *x);
+
+ x->c = c;
+ x->obj = o;
+ x->w = w;
+ cbuf_init(&x->forwards, c->ctx->heap);
+ cbuf_init(&x->decls, c->ctx->heap);
+ cbuf_init(&x->body, c->ctx->heap);
+
+ CGTarget* t = &x->base;
+ t->c = c;
+ t->obj = o;
+ t->mc = NULL;
+ t->virtual_regs = 1;
+
+ /* ---- function lifecycle ---- */
+ t->func_begin = c_func_begin;
+ t->func_begin_known_frame = c_unimpl_func_begin_known_frame;
+ t->func_end = c_func_end;
+ t->alias = c_alias;
+
+ /* ---- frame slots and locals ---- */
+ t->frame_slot = c_frame_slot;
+ t->local = c_unimpl_local;
+ t->local_addr = c_unimpl_local_addr;
+ t->param = c_param;
+ t->spill_reg = c_unimpl_spill_reg;
+ t->reload_reg = c_unimpl_reload_reg;
+
+ /* ---- regalloc coordination (virtual_regs => mostly inert) ---- */
+ t->get_allocable_regs = c_no_regs;
+ t->get_phys_regs = c_no_phys_regs;
+ t->get_scratch_regs = c_no_regs;
+ t->is_caller_saved = c_is_caller_saved;
+ t->call_clobber_mask = c_zero_mask;
+ t->return_reg_mask = c_zero_ret_mask;
+ t->callee_save_mask = c_zero_cs_mask;
+ t->plan_hard_regs = c_noop_plan_regs;
+ t->reserve_hard_regs = c_noop_reserve_regs;
+ t->call_stack_size = c_call_stack_size_zero;
+
+ /* ---- labels and control flow ---- */
+ t->label_new = c_unimpl_label_new;
+ t->label_place = c_unimpl_label_place;
+ t->jump = c_unimpl_jump;
+ t->cmp_branch = c_unimpl_cmp_branch;
+ t->scope_begin = c_unimpl_scope_begin;
+ t->scope_else = c_unimpl_scope_else;
+ t->scope_end = c_unimpl_scope_end;
+ t->break_to = c_unimpl_break_to;
+ t->continue_to = c_unimpl_continue_to;
+
+ /* ---- data movement ---- */
+ t->load_imm = c_load_imm;
+ t->load_const = c_unimpl_load_const;
+ t->copy = c_copy;
+ t->load = c_load;
+ t->store = c_store;
+ t->addr_of = c_unimpl_addr_of;
+ t->tls_addr_of = c_unimpl_tls_addr_of;
+ t->copy_bytes = c_unimpl_copy_bytes;
+ t->set_bytes = c_unimpl_set_bytes;
+ t->bitfield_load = c_unimpl_bitfield_load;
+ t->bitfield_store = c_unimpl_bitfield_store;
+
+ /* ---- arithmetic, compare, convert ---- */
+ t->binop = c_binop;
+ t->unop = c_unimpl_unop;
+ t->cmp = c_unimpl_cmp;
+ t->convert = c_unimpl_convert;
+
+ /* ---- calls / return ---- */
+ t->call = c_call;
+ t->plan_call = c_unimpl_plan_call;
+ t->load_call_arg = c_unimpl_load_call_arg;
+ t->store_call_arg = c_unimpl_store_call_arg;
+ t->store_call_ret = c_unimpl_store_call_ret;
+ t->emit_call_plan = c_unimpl_emit_call_plan;
+ t->ret = c_ret;
+
+ /* ---- alloca / varargs ---- */
+ t->alloca_ = c_unimpl_alloca;
+ t->va_start_ = c_unimpl_va_start;
+ t->va_arg_ = c_unimpl_va_arg;
+ t->va_end_ = c_unimpl_va_end;
+ t->va_copy_ = c_unimpl_va_copy;
+
+ /* ---- atomics ---- */
+ t->atomic_load = c_unimpl_atomic_load;
+ t->atomic_store = c_unimpl_atomic_store;
+ t->atomic_rmw = c_unimpl_atomic_rmw;
+ t->atomic_cas = c_unimpl_atomic_cas;
+ t->fence = c_unimpl_fence;
+
+ /* ---- intrinsics / asm ---- */
+ t->intrinsic = c_unimpl_intrinsic;
+ t->asm_block = c_unimpl_asm_block;
+ t->resolve_reg_name = NULL;
+
+ t->set_loc = c_set_loc;
+ t->finalize = c_finalize;
+ t->destroy = c_destroy;
+
+ compiler_defer(c, cgt_cleanup, t);
+ return t;
+}
diff --git a/src/cg/asm.c b/src/cg/asm.c
@@ -308,6 +308,14 @@ void cfree_cg_file_scope_asm(CfreeCg* g, const char* asm_source,
size_t asm_source_len) {
AsmLexer* lex;
if (!g || !asm_source) return;
+ /* The C-source backend bypasses MCEmitter entirely; file-scope asm has
+ * no portable C source equivalent (Phase 4 territory — see
+ * doc/CBACKEND.md). Panic with the same shape as other unimplemented
+ * C-target paths so the test harness recognizes the diagnostic. */
+ if (!g->mc) {
+ compiler_panic(g->c, api_no_loc(),
+ "C target: file-scope asm not yet supported");
+ }
api_local_const_memory_boundary(g);
lex = asm_lex_open_mem(g->c, "<file-scope asm>", asm_source, asm_source_len);
if (!lex)
diff --git a/src/cg/session.c b/src/cg/session.c
@@ -1,13 +1,20 @@
#include "cg/internal.h"
+/* C-source CGTarget constructor lives in src/arch/c_target/target.c. Declared
+ * here rather than in arch/arch.h because it's the only consumer of the
+ * CfreeWriter passed via CodeOptions — every other CGTarget is constructed
+ * via the per-arch ArchImpl. */
+CGTarget* c_cgtarget_new(Compiler* c, ObjBuilder* o, CfreeWriter* w);
+
CfreeStatus cfree_cg_new(CfreeCompiler* c, CfreeObjBuilder* out,
const CfreeCodeOptions* opts, CfreeCg** cg_out) {
Heap* h;
CfreeCg* g;
- MCEmitter* mc;
+ MCEmitter* mc = NULL;
CGTarget* target;
Debug* debug = NULL;
int opt_level = opts ? opts->opt_level : 0;
+ int emit_c_source = opts ? (int)opts->emit_c_source : 0;
if (!cg_out) return CFREE_INVALID;
*cg_out = NULL;
if (!c || !out) return CFREE_INVALID;
@@ -15,21 +22,38 @@ CfreeStatus cfree_cg_new(CfreeCompiler* c, CfreeObjBuilder* out,
compiler_panic((Compiler*)c, api_no_loc(),
"CfreeCg: unsupported opt_level %d", opt_level);
}
+ if (emit_c_source) {
+ /* See doc/CBACKEND.md §"Sequencing with opt": opt churns IR we'd
+ * immediately re-flatten to C source. Force opt_level=0 and bypass the
+ * MCEmitter / Debug producers entirely. */
+ if (!opts->c_source_writer) {
+ compiler_panic((Compiler*)c, api_no_loc(),
+ "CfreeCg: emit_c_source requires c_source_writer");
+ }
+ opt_level = 0;
+ }
h = (Heap*)c->ctx->heap;
- mc = mc_new((Compiler*)c, (ObjBuilder*)out);
- if (!mc) return CFREE_NOMEM;
- if (opts && opts->debug_info) {
- debug = debug_new((Compiler*)c, (ObjBuilder*)out);
- if (!debug) {
- mc_free(mc);
- return CFREE_NOMEM;
+ if (!emit_c_source) {
+ mc = mc_new((Compiler*)c, (ObjBuilder*)out);
+ if (!mc) return CFREE_NOMEM;
+ if (opts && opts->debug_info) {
+ debug = debug_new((Compiler*)c, (ObjBuilder*)out);
+ if (!debug) {
+ mc_free(mc);
+ return CFREE_NOMEM;
+ }
+ mc->debug = debug;
}
- mc->debug = debug;
}
- target = cgtarget_new((Compiler*)c, (ObjBuilder*)out, mc);
+ if (emit_c_source) {
+ target =
+ c_cgtarget_new((Compiler*)c, (ObjBuilder*)out, opts->c_source_writer);
+ } else {
+ target = cgtarget_new((Compiler*)c, (ObjBuilder*)out, mc);
+ }
if (!target) {
if (debug) debug_free(debug);
- mc_free(mc);
+ if (mc) mc_free(mc);
return CFREE_UNSUPPORTED;
}
target->debug = debug;
@@ -146,6 +170,14 @@ CfreeCgSym cfree_cg_alias(CfreeCg* g, CfreeCgAlias alias) {
decl_attrs = api_sym_attrs(g, alias.target);
decl_attrs.sym = alias.sym;
api_remember_sym(g, sym, api_sym_type(g, alias.target), decl_attrs);
+ /* Notify the backend so it can mirror the alias in any output form that
+ * isn't a relocatable obj — e.g. the C-source target. Native machine-code
+ * backends leave this hook NULL because obj_symbol_define above already
+ * aliased the underlying bytes. */
+ if (g->target && g->target->alias) {
+ g->target->alias(g->target, sym, (ObjSymId)alias.target,
+ api_sym_type(g, alias.target));
+ }
return (CfreeCgSym)sym;
}
diff --git a/test/parse/harness/parse_runner.c b/test/parse/harness/parse_runner.c
@@ -1,7 +1,10 @@
/* parse-runner — file-driven C front-end test runner.
*
- * parse-runner --emit FILE.c OUT.o # full pipeline → ELF .o
- * parse-runner --jit FILE.c # full pipeline → JIT, call test_main
+ * parse-runner --emit FILE.c OUT.o # full pipeline → ELF .o
+ * parse-runner --emit-c FILE.c OUT.c # full pipeline → C source via
+ * --emit=c CGTarget (Phase 1 C backend;
+ * see doc/CBACKEND.md)
+ * parse-runner --jit FILE.c # full pipeline → JIT, call test_main
*
* Exclusively uses the public cfree.h surface: this is the same path real
* driver consumers take. Built once; the shell runner walks
@@ -327,7 +330,8 @@ static int add_runtime_archive(CfreeJitLinkOptions* opts,
/* ---- modes ---- */
-static int mode_emit(const char* src_path, const char* out_path) {
+static int mode_emit_impl(const char* src_path, const char* out_path,
+ int emit_c) {
uint8_t* src = NULL;
size_t src_len = 0;
CfreeTarget tgt;
@@ -359,10 +363,16 @@ static int mode_emit(const char* src_path, const char* out_path) {
in.len = src_len;
memset(&opts, 0, sizeof opts);
- opts.code.opt_level = opt_level_from_env();
+ /* --emit-c forces opt_level=0 inside CG anyway, but be explicit so the
+ * env-driven opt level doesn't surprise. */
+ opts.code.opt_level = emit_c ? 0 : opt_level_from_env();
add_test_system_includes(&opts);
(void)cfree_writer_mem(&g_heap, &w);
+ if (emit_c) {
+ opts.code.emit_c_source = true;
+ opts.code.c_source_writer = w;
+ }
if (cfree_compile_c_obj_emit(c, &opts, &in, w) != CFREE_OK) {
cfree_writer_close(w);
cfree_compiler_free(c);
@@ -508,8 +518,9 @@ static int mode_jit(const char* src_path) {
static int usage(void) {
fprintf(stderr,
- "usage: parse-runner --emit FILE.c OUT.o\n"
- " parse-runner --jit FILE.c\n");
+ "usage: parse-runner --emit FILE.c OUT.o\n"
+ " parse-runner --emit-c FILE.c OUT.c\n"
+ " parse-runner --jit FILE.c\n");
return 2;
}
@@ -518,7 +529,9 @@ int main(int argc, char** argv) {
if (ps > 0) g_execmem.page_size = (size_t)ps;
if (argc < 2) return usage();
if (!strcmp(argv[1], "--emit") && argc == 4)
- return mode_emit(argv[2], argv[3]);
+ return mode_emit_impl(argv[2], argv[3], 0);
+ if (!strcmp(argv[1], "--emit-c") && argc == 4)
+ return mode_emit_impl(argv[2], argv[3], 1);
if (!strcmp(argv[1], "--jit") && argc == 3) return mode_jit(argv[2]);
return usage();
}
diff --git a/test/parse/run.sh b/test/parse/run.sh
@@ -1,7 +1,7 @@
#!/usr/bin/env bash
# test/parse/run.sh — file-driven C-parser test harness.
#
-# For each test/parse/cases/*.c, runs up to four paths (the test/cg path
+# For each test/parse/cases/*.c, runs up to five paths (the test/cg path
# matrix minus W; DWARF directives may be added later via .dwarf sidecars):
#
# D in-process JIT — parse-runner --jit FILE.c → exit code matches
@@ -11,6 +11,11 @@
# E exec via qemu — parse-runner --emit + start.o → link-exe-runner →
# qemu/podman → exit code. Cross-host friendly.
# J jit-via-file — parse-runner --emit + jit-runner. aarch64 host.
+# C emit-c host — parse-runner --emit-c + host cc + test_main wrapper,
+# run native. Validates the --emit=c C-source backend.
+# Host arch must match cross target. Cases that hit an
+# unimplemented C-target method are reported as SKIP
+# (not FAIL) so phased backend rollout is tolerated.
#
# Reuses the test/link harness binaries (cfree-roundtrip, link-exe-runner,
# jit-runner) and test/link/harness/start.c verbatim.
@@ -24,7 +29,7 @@
# Filtering:
# ./run.sh [name_filter] [paths]
# name_filter substring match against case basename
-# paths subset of "DREJ" (default "DREJ")
+# paths subset of "DREJC" (default "DREJ" — C opt-in)
# Equivalent env vars: CFREE_TEST_FILTER, CFREE_TEST_PATHS.
#
# Optimization levels:
@@ -100,7 +105,8 @@ case "$PATHS" in *D*) RUN_D=1;; *) RUN_D=0;; esac
case "$PATHS" in *R*) RUN_R=1;; *) RUN_R=0;; esac
case "$PATHS" in *E*) RUN_E=1;; *) RUN_E=0;; esac
case "$PATHS" in *J*) RUN_J=1;; *) RUN_J=0;; esac
-T_D=0; T_R=0; T_E=0; T_J=0
+case "$PATHS" in *C*) RUN_C=1;; *) RUN_C=0;; esac
+T_D=0; T_R=0; T_E=0; T_J=0; T_C=0
now_ms() { python3 -c 'import time;print(int(time.time()*1000))'; }
mkdir -p "$BUILD_DIR" "$BUILD_DIR/parse"
@@ -157,6 +163,7 @@ replay_events() {
R) T_R=$(( T_R + b )) ;;
E) T_E=$(( T_E + b )) ;;
J) T_J=$(( T_J + b )) ;;
+ C) T_C=$(( T_C + b )) ;;
esac
;;
QUEUE_E)
@@ -237,6 +244,14 @@ command -v podman >/dev/null 2>&1 && have_podman=1
arch_raw="$(uname -m 2>/dev/null || true)"
{ [ "$arch_raw" = "aarch64" ] || [ "$arch_raw" = "arm64" ]; } && is_aarch64=1
+# Host object format for path C: the emitted C is target-locked, so the
+# C-target must use the host's obj format (controls ELF-vs-Mach-O choices
+# like `__attribute__((alias))` vs a thunk fallback).
+case "$(uname -s 2>/dev/null)" in
+ Darwin) HOST_OBJ_FMT=macho ;;
+ *) HOST_OBJ_FMT=elf ;;
+esac
+
# is_native_target=1 when the cross-target arch matches the host arch.
# Required for in-process JIT (path D) and the jit-runner (path J).
is_native_target=0
@@ -328,6 +343,27 @@ if [ $have_clang_cross -eq 1 ]; then
fi
fi
+# Cached test_main main-wrapper.o — used by path C to link the emitted C
+# source against a host-cc-compiled main() that returns test_main()'s value.
+# Phase 1 C target only supports native host arch (no cross-emit), so this
+# wrapper is built with the host CC, not the cross clang.
+C_WRAPPER_SRC="$BUILD_DIR/parse_c_wrapper.c"
+C_WRAPPER_OBJ="$BUILD_DIR/parse_c_wrapper.o"
+have_c_wrapper=0
+if [ ! -s "$C_WRAPPER_SRC" ]; then
+ cat > "$C_WRAPPER_SRC" <<'EOF'
+/* Generated by test/parse/run.sh — bridges main() to test_main() for path C. */
+extern int test_main(void);
+int main(void) { return test_main(); }
+EOF
+fi
+if $CC -std=c11 -c "$C_WRAPPER_SRC" -o "$C_WRAPPER_OBJ" 2>/dev/null; then
+ have_c_wrapper=1
+ printf ' %s c-wrapper\n' "$(color_grn built)"
+else
+ printf ' %s c-wrapper (host CC failed)\n' "$(color_yel warn)" >&2
+fi
+
printf 'Running cases (%s jobs, opt levels: %s)...\n' "$TEST_JOBS" "$OPT_LEVELS"
# ---- per-case loop ---------------------------------------------------------
@@ -357,6 +393,7 @@ run_parse_case() {
local _idx="$1" item="$2" event="$3"
local opt src base_name name work reason expected expected_byte obj t0 dt d_rc r_ok r_msg rt
local exe link_dt j_rc
+ local c_src c_bin c_rc missing run_c
: "$_idx"
opt="${item%%:*}"
@@ -470,6 +507,72 @@ run_parse_case() {
emit_event "$event" SKIP "$name/J" "no jit-runner (host arch != $TEST_ARCH)"
fi
fi
+
+ # ---- Path C: --emit=c + host cc + run --------------------------------
+ # Phase 1 of the C-source backend only handles a small slice of the
+ # CGTarget vtable (see doc/CBACKEND.md). Cases that hit an
+ # unimplemented method produce a panic that we surface as SKIP, so
+ # the test pass/fail signal reflects the implemented surface rather
+ # than churning while phases land.
+ #
+ # A per-case `<name>.cbackend.skip` sidecar opts the case out of
+ # path C only (other paths still run), for surface gaps that don't
+ # show up as a panic — e.g. emitted-C link errors that need Phase 4
+ # work (aliases, asm definitions) to fix.
+ run_c=$RUN_C
+ if [ $run_c -eq 1 ] && [ -e "$TEST_DIR/cases/$base_name.cbackend.skip" ]; then
+ reason=$(head -n1 "$TEST_DIR/cases/$base_name.cbackend.skip")
+ emit_event "$event" SKIP "$name/C" "$reason"
+ run_c=0
+ fi
+ if [ $run_c -eq 1 ]; then
+ if [ $have_c_wrapper -eq 1 ] && [ $is_native_target -eq 1 ]; then
+ t0=$(now_ms)
+ c_src="$work/$base_name.cfree.c"
+ c_bin="$work/$base_name.cbackend.bin"
+ # Emitted C is target-locked, so we override CFREE_TEST_OBJ to
+ # the host's object format for the --emit-c invocation —
+ # otherwise ELF-only constructs like
+ # __attribute__((alias("x"))) leak into source compiled by a
+ # Mach-O-targeting host cc and fail at compile time.
+ if ! CFREE_TEST_OBJ="$HOST_OBJ_FMT" "$PARSE_RUNNER" \
+ --emit-c "$src" "$c_src" \
+ >"$work/c.emit.out" 2>"$work/c.emit.err"; then
+ dt=$(( $(now_ms) - t0 ))
+ emit_event "$event" TIME C "$dt"
+ # Recognize "C target: ... not implemented" and
+ # "C target: ... not yet supported" as phased-rollout
+ # skips rather than regressions. Anything else is a real
+ # failure that the harness flags so it can't hide.
+ missing=$(grep -oE 'C target: .*(not implemented|not yet supported)' \
+ "$work/c.emit.err" 2>/dev/null | head -n1 || true)
+ if [ -n "$missing" ]; then
+ emit_event "$event" SKIP "$name/C" "$missing"
+ else
+ emit_event "$event" FAIL "$name/C (parse-runner --emit-c failed; see $work/c.emit.err)"
+ fi
+ elif ! $CC -std=c11 "$c_src" "$C_WRAPPER_OBJ" -o "$c_bin" \
+ >"$work/c.cc.out" 2>"$work/c.cc.err"; then
+ dt=$(( $(now_ms) - t0 ))
+ emit_event "$event" TIME C "$dt"
+ emit_event "$event" FAIL "$name/C (host cc rejected emitted source; see $work/c.cc.err)"
+ else
+ "$c_bin" >"$work/c.run.out" 2>"$work/c.run.err"
+ c_rc=$?
+ dt=$(( $(now_ms) - t0 ))
+ emit_event "$event" TIME C "$dt"
+ if [ "$c_rc" -eq "$expected_byte" ]; then
+ emit_event "$event" PASS "$name/C (${dt}ms)"
+ else
+ emit_event "$event" FAIL "$name/C (expected $expected_byte got $c_rc, ${dt}ms)"
+ fi
+ fi
+ elif [ $have_c_wrapper -eq 0 ]; then
+ emit_event "$event" SKIP "$name/C" "no c-wrapper (host CC failed)"
+ else
+ emit_event "$event" SKIP "$name/C" "host arch != $TEST_ARCH (C target is target-locked)"
+ fi
+ fi
return 0
}
@@ -517,8 +620,8 @@ if [ ${#SKIP_NAMES[@]} -gt 0 ] && [ "$ALLOW_SKIP" != "1" ]; then
fi
printf '\nResults: %s pass, %s fail, %s skip\n' "$PASS" "$FAIL" "$SKIP"
-printf 'Time: D=%dms R=%dms E=%dms (batch %dms) J=%dms\n' \
- "$T_D" "$T_R" "$T_E" "$T_E_BATCH" "$T_J"
+printf 'Time: D=%dms R=%dms E=%dms (batch %dms) J=%dms C=%dms\n' \
+ "$T_D" "$T_R" "$T_E" "$T_E_BATCH" "$T_J" "$T_C"
if [ $FAIL -gt 0 ]; then exit 1; fi
if [ $SKIP -gt 0 ] && [ "$ALLOW_SKIP" != "1" ]; then exit 1; fi
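The SKIP-versus-FAIL decision that path C makes on a failed --emit-c run can be sketched in isolation. This is a minimal illustration, not code from run.sh: the helper name classify_emit_failure and the sample stderr lines are hypothetical, and only the grep pattern is taken from the runner above.

```shell
#!/bin/sh
# Hypothetical helper (not part of run.sh): classify a failed --emit-c run
# from its captured stderr. Prints "SKIP <reason>" for phased-rollout
# panics and "FAIL" for anything else.
classify_emit_failure() {
  err_file=$1
  # Same pattern the runners use: only these two phrasings count as
  # "not landed yet"; any other diagnostic is treated as a real failure.
  missing=$(grep -oE 'C target: .*(not implemented|not yet supported)' \
    "$err_file" 2>/dev/null | head -n1 || true)
  if [ -n "$missing" ]; then
    printf 'SKIP %s\n' "$missing"
  else
    printf 'FAIL\n'
  fi
}

# Usage with made-up stderr contents (the exact panic text is an assumption).
tmp=$(mktemp)
printf 'panic: C target: switch not implemented\n' > "$tmp"
classify_emit_failure "$tmp"
printf 'error: undefined symbol foo\n' > "$tmp"
classify_emit_failure "$tmp"
rm -f "$tmp"
```

Matching on the message text keeps the harness decoupled from the C target's internals, at the cost that a reworded panic would silently turn a SKIP into a FAIL.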
diff --git a/test/test.mk b/test/test.mk
@@ -27,13 +27,27 @@
# asm_parse / cfree_disasm_iter_* are still stubs; the harness builds
# and runs end-to-end so the wiring stays exercised. See doc/ASM.md.
-.PHONY: test test-driver test-lex test-pp test-pp-err test-elf test-ar test-ar-driver test-link test-cg-api test-toy test-opt test-dwarf test-debug test-parse test-parse-err test-asm test-wasm-front test-isa test-aa64-inline test-rt-headers test-rt-runtime test-libc test-musl test-glibc test-lib-deps test-smoke-x64 test-smoke-rv64
+.PHONY: test test-driver test-lex test-pp test-pp-err test-elf test-ar test-ar-driver test-link test-cg-api test-toy test-opt test-dwarf test-debug test-parse test-parse-err test-asm test-wasm-front test-isa test-aa64-inline test-rt-headers test-rt-runtime test-libc test-musl test-glibc test-lib-deps test-smoke-x64 test-smoke-rv64 test-cbackend
test: test-driver test-lex test-pp test-pp-err test-elf test-ar test-ar-driver test-link test-toy test-dwarf test-debug test-parse test-parse-err test-asm test-isa test-aa64-inline test-rt-headers test-lib-deps
+# `test-cbackend` is intentionally not in the default `test` target: the
+# Phase 1 C backend skips most fixtures pending later phases, which would
+# add noise to the default summary. Run it explicitly to gate progress.
test-driver: bin
@CFREE=$(abspath $(BIN)) sh test/driver/run.sh
+# test-cbackend: --emit=c C-source backend, driven through three
+# frontends — parse-runner (C), toy-runner (toy), wasm-runner (wat/wasm).
+# Each invokes its existing runner with paths=C so a single corpus per
+# frontend exercises both the existing backends and the C backend.
+# Together they prove the CGTarget seam is frontend-agnostic.
+# Unimplemented CGTarget methods report as SKIP; see doc/CBACKEND.md.
+test-cbackend: bin
+ @CFREE_TEST_PATHS=C CFREE_TEST_ALLOW_SKIP=1 bash test/parse/run.sh
+ @CFREE_TEST_PATHS=C CFREE=$(abspath $(BIN)) bash test/toy/run.sh
+ @CFREE_TEST_PATHS=C CFREE=$(abspath $(BIN)) bash test/wasm/run.sh
+
test-lex: bin
@CFREE=$(abspath $(BIN)) test/lex/run.sh
diff --git a/test/toy/run.sh b/test/toy/run.sh
@@ -5,17 +5,25 @@
# R cfree run -O{level} case.toy
# L cfree cc -O{level} -c case.toy -> cfree ld case.o -> native executable
# X cfree cc -O{level} -target -> cfree ld -> exec_target for Linux cross targets
+# C cfree cc --emit=c case.toy -> host cc -> native exec. Exercises the
+# --emit=c C-source backend driven by a non-C frontend (validates that
+# the CGTarget seam is frontend-agnostic). Phased-rollout panics from
+# the C target report as SKIP. Host cc runs with -std=gnu99
+# -Wno-main-return-type so toy's i64 main is not rejected.
#
# Sidecars:
-# <name>.expected expected process exit code, default 0
-# <name>.objdump fixed substrings expected in `cfree objdump -h -t`
-# after the linked-object compile path
-# err/<name>.expected expected diagnostic substring for compile-fail cases
+# <name>.expected expected process exit code, default 0
+# <name>.objdump fixed substrings expected in `cfree objdump -h -t`
+# after the linked-object compile path
+# <name>.cbackend.skip opts the case out of path C (with reason),
+# without affecting other paths
+# err/<name>.expected expected diagnostic substring for compile-fail cases
#
# Filtering:
# ./run.sh [name_filter] [paths]
-# CFREE_TEST_FILTER / CFREE_TEST_PATHS, where paths is a subset of "RLX".
+# CFREE_TEST_FILTER / CFREE_TEST_PATHS, where paths is a subset of "RLXC".
# X is opt-in cross-arch cc+ld+exec for aa64, x64, and rv64.
+# C is opt-in C-source emit; default paths are "RL".
# CFREE_TOY_OPT_LEVELS selects optimization levels, default "0 1".
set -u
@@ -30,8 +38,10 @@ PATHS="${2:-${CFREE_TEST_PATHS:-RL}}"
case "$PATHS" in *R*) RUN_R=1;; *) RUN_R=0;; esac
case "$PATHS" in *L*) RUN_L=1;; *) RUN_L=0;; esac
case "$PATHS" in *X*) RUN_X=1;; *) RUN_X=0;; esac
+case "$PATHS" in *C*) RUN_C=1;; *) RUN_C=0;; esac
TOY_CROSS_ARCHS="${CFREE_TOY_CROSS_ARCHS:-aa64 x64 rv64}"
TOY_OPT_LEVELS="${CFREE_TOY_OPT_LEVELS:-0 1}"
+HOST_CC="${CC:-cc}"
mkdir -p "$BUILD_DIR"
@@ -293,6 +303,53 @@ run_case_cross() {
done
}
+run_case_emit_c() {
+ local name="$1" src="$2" expected="$3" work="$4" opt="$5"
+ local label="$name/C-O$opt"
+ local cbackend_skip="${src%.toy}.cbackend.skip"
+ if [ -e "$cbackend_skip" ]; then
+ note_skip "$label" "$(head -n1 "$cbackend_skip")"
+ return
+ fi
+ local out_c="$work/$name.cfree.c"
+ local out_bin="$work/$name.cbackend.bin"
+ local emit_err="$work/c.emit.err"
+ local cc_err="$work/c.cc.err"
+ local run_out="$work/c.run.out" run_err="$work/c.run.err"
+ local rc missing
+
+ # --emit=c forces opt_level=0 internally; pass -O$opt anyway so the
+ # driver flag parsing stays exercised.
+ if ! "$CFREE" cc "-O$opt" --emit=c "$src" -o "$out_c" \
+ > "$work/c.emit.out" 2> "$emit_err"; then
+ # Phased-rollout panic from the C target → SKIP, not FAIL.
+ missing=$(grep -oE 'C target: .*(not implemented|not yet supported)' \
+ "$emit_err" 2>/dev/null | head -n1 || true)
+ if [ -n "$missing" ]; then
+ note_skip "$label" "$missing"
+ return
+ fi
+ note_fail "$label"
+ printf ' cfree cc --emit=c failed\n'
+ sed 's/^/ | /' "$emit_err"
+ return
+ fi
+ # Toy's main returns i64; modern clang under -std=c11 makes that a
+ # hard error ("'main' must return 'int'"). The emitted C only needs
+ # the stdint typedefs, so we compile under -std=gnu99 with
+ # -Wno-main-return-type: in GNU mode the check is a suppressible warning.
+ if ! $HOST_CC -std=gnu99 -Wno-main-return-type "$out_c" -o "$out_bin" \
+ > "$work/c.cc.out" 2> "$cc_err"; then
+ note_fail "$label"
+ printf ' host cc rejected emitted source\n'
+ sed 's/^/ | /' "$cc_err"
+ return
+ fi
+ "$out_bin" > "$run_out" 2> "$run_err"
+ rc=$?
+ check_rc "$label" "$rc" "$expected" "$run_err"
+}
+
if [ ! -x "$CFREE" ]; then
printf 'missing cfree binary: %s\n' "$CFREE" >&2
exit 2
@@ -339,6 +396,9 @@ for src in "${cases[@]}"; do
if [ $RUN_X -eq 1 ]; then
run_case_cross "$name" "$src" "$expected" "$work" "$opt"
fi
+ if [ $RUN_C -eq 1 ]; then
+ run_case_emit_c "$name" "$src" "$expected" "$work" "$opt"
+ fi
done
done
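The path-letter filtering shared by the three runners (one `case "$PATHS" in *X*)` line per path) can be expressed as a single helper. This is a sketch under assumptions: path_enabled is a hypothetical name and does not exist in any of the scripts. The default "RL" mirrors test/toy/run.sh.

```shell
#!/bin/sh
# Hypothetical helper mirroring the runners' `case "$PATHS" in *C*)` idiom.
path_enabled() {
  # $1 = requested path letters (e.g. "RL"), $2 = a single path letter
  case "$1" in
    *"$2"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Same default as test/toy/run.sh: C stays off unless explicitly requested.
PATHS="${CFREE_TEST_PATHS:-RL}"
for p in R L X C; do
  if path_enabled "$PATHS" "$p"; then
    printf '%s=1\n' "$p"
  else
    printf '%s=0\n' "$p"
  fi
done
```

Because the match is a plain substring test, letter order in CFREE_TEST_PATHS does not matter, and an opt-in path such as C only runs when its letter is present.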
diff --git a/test/wasm/run.sh b/test/wasm/run.sh
@@ -14,6 +14,24 @@ LINK_EXE_RUNNER="$ROOT/build/test/link-exe-runner"
TEST_ARCH="${CFREE_TEST_ARCH:-aa64}"
TEST_OBJ="${CFREE_TEST_OBJ:-macho}"
+# Path filtering. Default runs the legacy set (everything except C):
+# W wat2wasm
+# D cfree run (JIT)
+# O cfree cc -c (object output)
+# J jit-runner against the produced obj
+# E link + native exec
+# C --emit=c + host cc + native exec (C-source backend; see doc/CBACKEND.md)
+# C is opt-in because Phase 1 of the C backend skips most wasm cases; the
+# combined `make test-cbackend` invokes this runner with CFREE_TEST_PATHS=C.
+PATHS="${CFREE_TEST_PATHS:-WDOJE}"
+case "$PATHS" in *W*) RUN_W=1;; *) RUN_W=0;; esac
+case "$PATHS" in *D*) RUN_D=1;; *) RUN_D=0;; esac
+case "$PATHS" in *O*) RUN_O=1;; *) RUN_O=0;; esac
+case "$PATHS" in *J*) RUN_J=1;; *) RUN_J=0;; esac
+case "$PATHS" in *E*) RUN_E=1;; *) RUN_E=0;; esac
+case "$PATHS" in *C*) RUN_C=1;; *) RUN_C=0;; esac
+HOST_CC="${CC:-cc}"
+
mkdir -p "$BUILD_DIR"
pass=0
@@ -105,6 +123,25 @@ if [ "$TEST_OBJ" = "macho" ] && command -v xcrun >/dev/null 2>&1; then
fi
fi
+# Path C wrapper: emitted C exports `test_main` (i32 → maps to int32_t in
+# the C target); bridge to `int main(void)` via a tiny host-cc-compiled
+# stub linked alongside.
+C_WRAPPER_SRC="$BUILD_DIR/wasm_c_wrapper.c"
+C_WRAPPER_OBJ="$BUILD_DIR/wasm_c_wrapper.o"
+have_c_wrapper=0
+if [ "$RUN_C" -eq 1 ]; then
+ cat > "$C_WRAPPER_SRC" <<'EOF'
+/* Generated by test/wasm/run.sh — bridges main() to test_main(). */
+#include <stdint.h>
+extern int32_t test_main(void);
+int main(void) { return (int)test_main(); }
+EOF
+ if $HOST_CC -std=gnu99 -c "$C_WRAPPER_SRC" -o "$C_WRAPPER_OBJ" \
+ 2>"$BUILD_DIR/wasm_c_wrapper.err"; then
+ have_c_wrapper=1
+ fi
+fi
+
run_expect_rc() {
local label=$1
local expected=$2
@@ -163,54 +200,116 @@ for wat in "$CASES_DIR"/*.wat; do
wat_obj="$work/$name.wat.o"
wasm_obj="$work/$name.wasm.o"
- if [ "$have_wasm_tool" -eq 1 ]; then
- run_expect_zero "$name/W" "$WASM_TOOL" --wat2wasm "$wat" "$wasm"
- else
- note_skip "$name/W" "no wasm-tool"
- continue
+ if [ "$RUN_W" -eq 1 ]; then
+ if [ "$have_wasm_tool" -eq 1 ]; then
+ run_expect_zero "$name/W" "$WASM_TOOL" --wat2wasm "$wat" "$wasm"
+ else
+ note_skip "$name/W" "no wasm-tool"
+ continue
+ fi
+ elif [ "$have_wasm_tool" -eq 1 ]; then
+ # W was not requested, but later paths (D-wasm, O-wasm) consume the
+ # .wasm, so build it best-effort without reporting. Path C reads the
+ # .wat directly and does not need it.
+ "$WASM_TOOL" --wat2wasm "$wat" "$wasm" >/dev/null 2>&1 || true
fi
- run_expect_rc "$name/D-wat" "$expected" "$CFREE_BIN" run -e test_main "$wat"
- run_expect_rc "$name/D-wasm" "$expected" "$CFREE_BIN" run -e test_main "$wasm"
+ if [ "$RUN_D" -eq 1 ]; then
+ run_expect_rc "$name/D-wat" "$expected" "$CFREE_BIN" run -e test_main "$wat"
+ run_expect_rc "$name/D-wasm" "$expected" "$CFREE_BIN" run -e test_main "$wasm"
+ fi
- run_expect_zero "$name/O-wat" "$CFREE_BIN" cc -target "$target_triple" -c \
- "$wat" -o "$wat_obj"
- run_expect_zero "$name/O-wasm" "$CFREE_BIN" cc -target "$target_triple" -c \
- "$wasm" -o "$wasm_obj"
+ if [ "$RUN_O" -eq 1 ]; then
+ run_expect_zero "$name/O-wat" "$CFREE_BIN" cc -target "$target_triple" -c \
+ "$wat" -o "$wat_obj"
+ run_expect_zero "$name/O-wasm" "$CFREE_BIN" cc -target "$target_triple" -c \
+ "$wasm" -o "$wasm_obj"
+ fi
- if [ "$have_jit_runner" -eq 1 ]; then
- run_expect_rc "$name/J-wat-obj" "$expected" env CFREE_TEST_ARCH="$TEST_ARCH" \
- CFREE_TEST_OBJ="$TEST_OBJ" "$JIT_RUNNER" "$wat_obj"
- run_expect_rc "$name/J-wasm-obj" "$expected" env CFREE_TEST_ARCH="$TEST_ARCH" \
- CFREE_TEST_OBJ="$TEST_OBJ" "$JIT_RUNNER" "$wasm_obj"
- else
- note_skip "$name/J" "host arch does not match target or no jit-runner"
+ if [ "$RUN_J" -eq 1 ]; then
+ if [ "$have_jit_runner" -eq 1 ]; then
+ run_expect_rc "$name/J-wat-obj" "$expected" env CFREE_TEST_ARCH="$TEST_ARCH" \
+ CFREE_TEST_OBJ="$TEST_OBJ" "$JIT_RUNNER" "$wat_obj"
+ run_expect_rc "$name/J-wasm-obj" "$expected" env CFREE_TEST_ARCH="$TEST_ARCH" \
+ CFREE_TEST_OBJ="$TEST_OBJ" "$JIT_RUNNER" "$wasm_obj"
+ else
+ note_skip "$name/J" "host arch does not match target or no jit-runner"
+ fi
fi
- if [ "$have_link_runner" -eq 1 ] && [ "$have_wasm_start_obj" -eq 1 ]; then
- exe="$work/$name.exe"
- if "$LINK_EXE_RUNNER" -o "$exe" "${MACHO_DSO_ARGS[@]}" "$wat_obj" "$WASM_START_OBJ" \
- >"$work/link.out" 2>"$work/link.err"; then
- if exec_target_supported "$EXEC_TAG"; then
- exec_target_run "$EXEC_TAG" "$exe" "$work/exec.out" "$work/exec.err"
- rc=$RUN_RC
- if [ "$rc" -eq "$expected" ]; then
- note_pass "$name/E"
+ if [ "$RUN_E" -eq 1 ]; then
+ if [ "$have_link_runner" -eq 1 ] && [ "$have_wasm_start_obj" -eq 1 ]; then
+ exe="$work/$name.exe"
+ if "$LINK_EXE_RUNNER" -o "$exe" "${MACHO_DSO_ARGS[@]}" "$wat_obj" "$WASM_START_OBJ" \
+ >"$work/link.out" 2>"$work/link.err"; then
+ if exec_target_supported "$EXEC_TAG"; then
+ exec_target_run "$EXEC_TAG" "$exe" "$work/exec.out" "$work/exec.err"
+ rc=$RUN_RC
+ if [ "$rc" -eq "$expected" ]; then
+ note_pass "$name/E"
+ else
+ note_fail "$name/E expected $expected got $rc"
+ fi
else
- note_fail "$name/E expected $expected got $rc"
+ note_skip "$name/E" "no execution support for $TEST_ARCH"
fi
else
- note_skip "$name/E" "no execution support for $TEST_ARCH"
+ note_fail "$name/E link failed"
fi
else
- note_fail "$name/E link failed"
+ note_skip "$name/E" "requires link runner and wasm start.o"
+ fi
+ fi
+
+ if [ "$RUN_C" -eq 1 ]; then
+ label="$name/C"
+ if [ "$have_c_wrapper" -eq 0 ]; then
+ note_skip "$label" "host CC wrapper unavailable"
+ elif [ "$host_matches" -eq 0 ]; then
+ note_skip "$label" "host arch != $TEST_ARCH (C target is target-locked)"
+ else
+ c_src="$work/$name.cfree.c"
+ c_bin="$work/$name.cbackend.bin"
+ c_emit_err="$work/c.emit.err"
+ c_cc_err="$work/c.cc.err"
+ # `cfree cc` uses the configured host triple by default. The wasm
+ # frontend doesn't require -target, but pass it explicitly so the
+ # emitted C is locked to the same arch as the host that will compile
+ # it. Skip on phased-rollout panics from the C target.
+ if ! "$CFREE_BIN" cc -target "$target_triple" --emit=c "$wat" \
+ -o "$c_src" >"$work/c.emit.out" 2>"$c_emit_err"; then
+ missing=$(grep -oE 'C target: .*(not implemented|not yet supported)' \
+ "$c_emit_err" 2>/dev/null | head -n1 || true)
+ if [ -n "$missing" ]; then
+ note_skip "$label" "$missing"
+ else
+ note_fail "$label cfree cc --emit=c failed"
+ sed 's/^/ | /' "$c_emit_err"
+ fi
+ elif ! $HOST_CC -std=gnu99 -Wno-main-return-type "$c_src" \
+ "$C_WRAPPER_OBJ" -o "$c_bin" \
+ >"$work/c.cc.out" 2>"$c_cc_err"; then
+ note_fail "$label host cc rejected emitted source"
+ sed 's/^/ | /' "$c_cc_err"
+ else
+ "$c_bin" >"$work/c.run.out" 2>"$work/c.run.err"
+ rc=$?
+ if [ "$rc" -eq "$expected" ]; then
+ note_pass "$label"
+ else
+ note_fail "$label expected $expected got $rc"
+ fi
+ fi
fi
- else
- note_skip "$name/E" "requires link runner and wasm start.o"
fi
done
+# trap/err/meta corpora are diagnostics-focused and orthogonal to the
+# C-source emit path; run them only when W, D, or O is requested (so a
+# C-only run skips them).
+RUN_DIAG=0
+[ "$RUN_W" -eq 1 ] || [ "$RUN_D" -eq 1 ] || [ "$RUN_O" -eq 1 ] && RUN_DIAG=1
+
for wat in "$TRAP_DIR"/*.wat; do
+ [ "$RUN_DIAG" -eq 1 ] || break
[ -e "$wat" ] || continue
name=$(basename "$wat" .wat)
work="$BUILD_DIR/trap-$name"
@@ -229,13 +328,14 @@ for wat in "$TRAP_DIR"/*.wat; do
done
for wat in "$ERR_DIR"/*.wat; do
+ [ "$RUN_DIAG" -eq 1 ] || break
[ -e "$wat" ] || continue
name=$(basename "$wat" .wat)
run_expect_fail "err/$name/cc" "$CFREE_BIN" cc -target "$target_triple" -c \
"$wat" -o "$BUILD_DIR/err-$name.o"
done
-if [ "$have_wasm_tool" -eq 1 ]; then
+if [ "$RUN_DIAG" -eq 1 ] && [ "$have_wasm_tool" -eq 1 ]; then
for wat in "$META_DIR"/*.wat; do
[ -e "$wat" ] || continue
name=$(basename "$wat" .wat)
@@ -250,10 +350,11 @@ if [ "$have_wasm_tool" -eq 1 ]; then
-target "$target_triple" -c "$wasm" -o "$BUILD_DIR/meta-$name.o"
fi
done
-else
+elif [ "$RUN_DIAG" -eq 1 ]; then
note_skip "meta" "no wasm-tool"
fi
+if [ "$RUN_DIAG" -eq 1 ]; then
bad_wasm="$BUILD_DIR/malformed-section.wasm"
printf '\000asm\001\000\000\000\001\005\001\140' > "$bad_wasm"
run_expect_fail "err/malformed-section/wasm" "$CFREE_BIN" cc \
@@ -263,6 +364,7 @@ bad_leb="$BUILD_DIR/malformed-leb.wasm"
printf '\000asm\001\000\000\000\001\200\200\200\200\020' > "$bad_leb"
run_expect_fail "err/malformed-leb/wasm" "$CFREE_BIN" cc \
-target "$target_triple" -c "$bad_leb" -o "$BUILD_DIR/bad-leb.o"
+fi
printf 'test-wasm-front: pass=%d fail=%d skip=%d\n' "$pass" "$fail" "$skip"
[ "$fail" -eq 0 ]
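The RUN_DIAG line above mixes `||` and `&&` without grouping. In POSIX shell these two operators share one precedence level and associate left to right, so the list parses as ((W || D) || O) && RUN_DIAG=1. A small sketch to confirm the behavior; diag_enabled is a hypothetical stand-in, not a function in test/wasm/run.sh.

```shell
#!/bin/sh
# Hypothetical stand-in for the RUN_DIAG computation in test/wasm/run.sh.
diag_enabled() {
  run_w=$1
  run_d=$2
  run_o=$3
  diag=0
  # `||` and `&&` have equal precedence and are left-associative, so the
  # assignment runs when any one of the three tests succeeds.
  [ "$run_w" -eq 1 ] || [ "$run_d" -eq 1 ] || [ "$run_o" -eq 1 ] && diag=1
  printf '%s\n' "$diag"
}

diag_enabled 0 0 0   # prints 0: none of W, D, O requested (e.g. a C-only run)
diag_enabled 0 1 0   # prints 1: D requested
diag_enabled 1 0 0   # prints 1: W requested
```

Wrapping the three tests in `{ ...; }` would make the grouping explicit, but the result is the same either way.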