kit

git clone https://git.ryansepassi.com/git/kit.git

commit 14960865061e5894dedd596b907dd15985883adb
parent 5891d2801fc96a50e2f725ab300076e622704cd4
Author: Ryan Sepassi <rsepassi@gmail.com>
Date:   Thu, 21 May 2026 11:06:52 -0700

Make source frontends lifecycle-driven
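
A minimal sketch of the lifecycle shape, assuming hypothetical stand-in types
(every `Demo*` / `demo_*` / `counting_*` name below is invented for illustration
and is not part of cfree). It only shows the pattern: a vtable of
new/compile/free hooks operating on private state, wrapped by a public handle
that outlives individual compiles.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are NOT the real cfree types. */
typedef enum { DEMO_OK = 0, DEMO_INVALID, DEMO_NOMEM } DemoStatus;
typedef struct DemoObj { int nsyms; } DemoObj; /* plays the object builder */
typedef struct DemoState DemoState;            /* frontend-private state */

/* Lifecycle hooks a frontend registers. */
typedef struct DemoVTable {
  DemoState *(*new_frontend)(void);
  DemoStatus (*compile)(DemoState *, const char *src, DemoObj *out);
  void (*free_frontend)(DemoState *);
} DemoVTable;

/* Public handle: pairs a vtable with the state that vtable created. */
typedef struct DemoFrontend {
  const DemoVTable *vtable;
  DemoState *state;
} DemoFrontend;

/* A trivial frontend whose only state is a compile counter. */
struct DemoState { int compiles; };

static DemoState *counting_new(void) { return calloc(1, sizeof(DemoState)); }

static DemoStatus counting_compile(DemoState *s, const char *src, DemoObj *out) {
  (void)src;
  s->compiles++;            /* state survives between compile calls */
  out->nsyms = s->compiles; /* each call writes into a fresh output object */
  return DEMO_OK;
}

static void counting_free(DemoState *s) { free(s); }

static const DemoVTable counting_vtable = {counting_new, counting_compile,
                                           counting_free};

DemoStatus demo_frontend_new(const DemoVTable *vt, DemoFrontend **out) {
  DemoFrontend *fe;
  if (!out) return DEMO_INVALID;
  *out = NULL;
  if (!vt) return DEMO_INVALID;
  fe = malloc(sizeof(*fe));
  if (!fe) return DEMO_NOMEM;
  fe->vtable = vt;
  fe->state = vt->new_frontend();
  if (!fe->state) {
    free(fe);
    return DEMO_NOMEM;
  }
  *out = fe;
  return DEMO_OK;
}

DemoStatus demo_frontend_compile(DemoFrontend *fe, const char *src,
                                 DemoObj *out) {
  if (!fe || !src || !out) return DEMO_INVALID;
  return fe->vtable->compile(fe->state, src, out);
}

void demo_frontend_free(DemoFrontend *fe) {
  if (!fe) return;
  fe->vtable->free_frontend(fe->state);
  free(fe);
}

/* One frontend instance, two compiles, two separate output objects. */
int demo_session(void) {
  DemoFrontend *fe = NULL;
  DemoObj a = {0}, b = {0};
  DemoStatus st = demo_frontend_new(&counting_vtable, &fe);
  if (st != DEMO_OK) return -1;
  st = demo_frontend_compile(fe, "first snippet", &a);
  assert(st == DEMO_OK);
  st = demo_frontend_compile(fe, "second snippet", &b);
  assert(st == DEMO_OK);
  demo_frontend_free(fe);
  return b.nsyms; /* 2: the counter persisted across both compiles */
}
```

A one-shot caller would create the handle, compile once, and free it; a
REPL-style caller would keep the handle alive and pass a fresh output object
per snippet, as `demo_session` does.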

Diffstat:
M doc/FRONTEND.md | 324 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--------------------
M doc/api-migration.md | 10 ++++++----
M driver/cc.c | 35 +++++++++++++++++++++++++++++++----
M driver/dbg.c | 10 ++++++++--
M driver/inputs.c | 25 ++++++++++++++++++++-----
M include/cfree/compile.h | 23 +++++++++++------------
M lang/c/c.c | 8 ++++----
M lang/toy/compile.c | 8 ++++----
M lang/wasm/wasm.c | 8 ++++----
M src/api/compile.c | 222 ++++++++++++++++++++++++++++++++++++++++++++-----------------------------------
10 files changed, 454 insertions(+), 219 deletions(-)

diff --git a/doc/FRONTEND.md b/doc/FRONTEND.md
@@ -1,55 +1,89 @@
-# Persistent Context for JIT REPL
+# Interactive Frontend REPL

-This plan outlines how to upgrade the single-pass C frontend to maintain a persistent state across multiple `jit` and `expr` evaluations in the debugger REPL.
+This document tracks the work needed to make `cfree dbg` an ergonomic,
+interactive REPL for registered source frontends. The goal is not just "append a
+new object"; snippets must share language context, compile quickly, and make
+bare expression input feel native for Toy, C, and Wasm.

-## Background & Challenge
-Currently, `cfree_c_compile` is a one-shot process. It creates a `Parser`, `Pp`, `Scope`, and `CfreeCg`, processes the input, finalizes the `CfreeObjBuilder`, and then frees all frontend state. To make the REPL truly "hackable", subsequent snippets must see structs, macros, and global variables defined in previous snippets.
+## Current State

-Since `cfree` has no AST and the frontend drives Code Generation (CG) directly, the "context" is entirely contained within the `Parser` symbol table, the `Pp` macro definitions, and the type/symbol handles cached in `DeclTable`.
+- [x] Frontend registration uses a lifecycle vtable:
+  `new_frontend`, `compile`, `free_frontend`.
+- [x] Public source frontend instances are explicit:
+  `cfree_frontend_new`, `cfree_frontend_compile`, `cfree_frontend_free`.
+- [x] `cfree_compile_source_obj{,_emit}` has been removed. Callers now create a
+  `CfreeObjBuilder`, compile through a frontend instance, and emit/link the
+  resulting object explicitly.
+- [x] `driver/cc.c`, `driver/inputs.c`, and `driver/dbg.c` have been migrated to
+  the new frontend instance API.
+- [ ] `driver/dbg.c` still creates a fresh frontend per `jit` snippet.
+- [ ] C, Toy, and Wasm frontend implementations still allocate their parser/CG
+  state per compile.
+- [ ] Expression input still fabricates C source in `driver/dbg.c`; it is not
+  frontend native yet.
+- [ ] Bare REPL input is not yet the default frontend-specific expression/thunk
+  fallback.

-The primary obstacle to keeping the frontend alive is that `CfreeCgSym` handles are currently direct casts of `ObjSymId`. When a new snippet is compiled, a new `ObjBuilder` must be used (so we don't duplicate code output), which invalidates all previously cached `CfreeCgSym` handles.
+## Target UX

-## Proposed Changes
+Interactive sessions should support these workflows:

-### 1. Stateful Frontend API (Single Entrypoint)
-To address the desire for a single entrypoint (rather than keeping the old one and adding an incremental one), we will change the fundamental frontend interface in `include/cfree/compile.h` to be stateful for *all* consumers.
-
-Instead of a single function pointer `CfreeCompileFn`, the frontend registration will provide a vtable or lifecycle functions:
 ```c
-typedef struct CfreeFrontend CfreeFrontend;
+(cfree) :language c
+(cfree) jit { #define SCALE(x) ((x) * 3) }
+(cfree) jit { typedef struct { int x; int y; } Point; Point p = {4, 5}; }
+(cfree) SCALE(p.x + p.y)
+$1 = 27 (0x1b)
+```
+
+```toy
+(cfree) :language toy
+(cfree) jit { type Point = record { x: i64, y: i64 }; let p: Point = .{ .x = 4, .y = 5 }; }
+(cfree) p.x + p.y
+$1 = 9 (0x9)
+```

-typedef CfreeFrontend* (*CfreeFrontendNewFn)(CfreeCompiler*);
-typedef CfreeStatus (*CfreeFrontendCompileFn)(CfreeFrontend*, const CfreeFrontendCompileOptions*, const CfreeSourceInput*, CfreeObjBuilder*);
-typedef void (*CfreeFrontendFreeFn)(CfreeFrontend*);
+```wat
+(cfree) :language wat
+(cfree) jit { (module (func (export "add") (param i64 i64) (result i64) local.get 0 local.get 1 i64.add)) }
+(cfree) invoke add 4 5
+$1 = 9 (0x9)
 ```
-- **AOT Compiler (`driver/cc.c`)**: Creates the frontend, calls `compile` once, and frees it.
-- **JIT REPL (`driver/dbg.c`)**: Creates the frontend at startup, calls `compile` repeatedly for each snippet (passing a new `CfreeObjBuilder` each time), and frees it on exit.
-This keeps the parser (`Parser`), preprocessor (`Pp`, including macros), and scope (`Scope`) alive between snippets for the REPL, without duplicating the frontend API.
+For C and Toy, unrecognized bare input should be the expression/thunk fallback
+and should compile a language-native expression wrapper. An explicit `expr`
+command can remain as an alias, but it is not the primary workflow. For Wasm,
+the natural interactive unit is a module plus explicit export invocation; WAT
+expression shortcuts can come later as sugar over generated modules.

-### 2. Symbol Uniqueness (Cloning vs Decoupling)
-Instead of fully decoupling `CfreeCgSym` from `ObjSymId` (which would require a lot of `CgApiState` mapping), we can keep `CfreeCgSym == ObjSymId` and achieve global uniqueness by carrying the symbol table forward across `ObjBuilder` instances.
+## Shared Design

-When the REPL calls `cfree_frontend_compile` for the second snippet, it passes a fresh `CfreeObjBuilder`. Inside the frontend:
-- The frontend notices it already has a populated `CfreeCg` from the first snippet.
-- We introduce `cfree_cg_swap_obj(CfreeCg* cg, CfreeObjBuilder* new_ob)`.
-- Inside `swap_obj`, `CG` iterates over all its known symbols (`sym_types` array). For each symbol, it injects an *external, undefined* declaration into `new_ob` at the exact same `ObjSymId`.
-- Because `new_ob` starts empty, inserting the existing symbols sequentially guarantees that `new_ob` assigns the exact same `ObjSymId`s to the exact same symbols!
-- When snippet 2 references a variable from snippet 1, `CfreeCg` emits a relocation using the same `ObjSymId`. `new_ob` sees this as a relocation against an external symbol.
-- The JIT linker automatically resolves this external symbol against the definition provided by snippet 1's object file.
+### Frontend API

-This requires minimal changes to `CG` and `ObjBuilder`, preserves `CfreeCgSym == ObjSymId`, and guarantees monotonic uniqueness across the session.
+Landed public shape:

-### 3. REPL Expression Modes
-`expr` and bare REPL input should become frontend compile modes, not
-driver-side source string rewriting. Today `driver/dbg.c` fabricates a C
-translation unit that defines a uniquely named zero-argument thunk, compiles it,
-looks up the thunk symbol, and calls it. That works for C syntax only and forces
-the driver to recover declarations from DWARF or JIT symbol metadata.
+```c
+typedef struct CfreeFrontend CfreeFrontend;
+typedef struct CfreeFrontendState CfreeFrontendState;

-In the persistent frontend model, `dbg` should keep the language-specific
-frontend instance alive and call the same `compile` entrypoint with an input
-kind describing how the source text should be interpreted:
+typedef CfreeFrontendState* (*CfreeFrontendNewFn)(CfreeCompiler*);
+typedef CfreeStatus (*CfreeFrontendCompileFn)(
+    CfreeFrontendState*,
+    const CfreeFrontendCompileOptions*,
+    const CfreeSourceInput*,
+    CfreeObjBuilder*);
+typedef void (*CfreeFrontendFreeFn)(CfreeFrontendState*);
+
+CfreeStatus cfree_frontend_new(CfreeCompiler*, CfreeLanguage,
+                               CfreeFrontend** out);
+CfreeStatus cfree_frontend_compile(CfreeFrontend*,
+                                   const CfreeFrontendCompileOptions*,
+                                   const CfreeSourceInput*,
+                                   CfreeObjBuilder* out);
+void cfree_frontend_free(CfreeFrontend*);
+```
+
+Next required compile options:

 ```c
 typedef enum CfreeFrontendInputKind {
@@ -64,65 +98,193 @@ typedef struct CfreeFrontendCompileOptions {
   CfreeDiagnosticOptions diagnostics;
   const void *language_options;
   CfreeFrontendInputKind input_kind;
-  const char *repl_entry_name; /* for REPL_EXPR / REPL_BLOCK */
+  const char *repl_entry_name;
 } CfreeFrontendCompileOptions;
 ```

-- **`jit { ... }`** uses `CFREE_FRONTEND_INPUT_REPL_TOPLEVEL`: compile
-  language top-level declarations/statements into a fresh object while keeping
-  frontend state.
-- **`expr ...` and raw bare input** use `CFREE_FRONTEND_INPUT_REPL_EXPR`: the
-  frontend wraps the text as a zero-argument function named `repl_entry_name`
-  and returns an integer-compatible value (`uint64_t` / `i64`) for the debugger
-  to print.
-- **`expr { ... }`** uses `CFREE_FRONTEND_INPUT_REPL_BLOCK`: the frontend wraps
-  the body as a zero-argument function and lets the language define whether the
-  block must contain an explicit `return`.
+### Persistent Codegen
+
+`CfreeCgSym` is still an `ObjSymId`, so a persistent frontend cannot simply keep
+old symbol handles while compiling into a new object. The intended bridge is:
+
+- [ ] Add `cfree_cg_swap_obj(CfreeCg*, CfreeObjBuilder*, const CfreeCodeOptions*)`.
+- [ ] On swap, finalize the previous target/MC/debug state before replacing it.
+- [ ] Recreate target/MC/debug state for the new object builder.
+- [ ] Seed the new object builder with external declarations for all known CG
+  symbols in the same `ObjSymId` order.
+- [ ] Preserve `sym_types`, `sym_attrs`, type ids, readonly-data counters, and
+  any source-level symbol metadata owned by the frontend.
+- [ ] Add focused tests proving a second object can relocate against a symbol
+  defined by a first JIT-appended object.
+
+The JIT append path already resolves undefined symbols against the existing
+image, so once the new object contains matching undefined symbols, references to
+prior snippets should link naturally.
+
+### Debugger Driver

-Each frontend owns its own wrapper spelling. C can lower an expression as:
+- [ ] Add a small per-language frontend cache to `DbgState`.
+- [ ] Keep the selected language frontend alive for the whole REPL session.
+- [ ] Compile `jit` snippets by creating a fresh object builder and calling
+  `cfree_frontend_compile` on the cached frontend.
+- [ ] Add `:language c|toy|wat|wasm|asm` and make `jit`, explicit `expr`, and
+  bare fallback input honor the selected language.
+- [ ] Add `input_kind` and `repl_entry_name` wiring for bare expression fallback,
+  explicit `expr`, and block modes.
+- [ ] Remove `dbg_append_expr_prelude` and driver-side C thunk fabrication once
+  C expression mode is implemented.
+- [ ] Keep DWARF/JIT symbol recovery for external/preexisting code inspection,
+  not as the normal path for declarations typed during the session.
+- [ ] Add scripted `dbg` smoke tests that drive stdin and assert output.
+
+## C Checklist
+
+Persistent C context must include the preprocessor, file-scope identifiers,
+tags, typedefs, declaration table, and CG symbol/type handles.
+
+- [ ] Change `CFrontend` to own a long-lived `Pool`.
+- [ ] Change `CFrontend` to own a long-lived `Pp`.
+- [ ] Apply command-line include paths, predefined macros, `-D`, and `-U` once
+  at frontend creation or first compile, with clear behavior if options change.
+- [ ] Keep the file-scope `Scope` alive across snippets.
+- [ ] Keep `DeclTable` alive across snippets.
+- [ ] Keep one persistent `CfreeCg` and use `cfree_cg_swap_obj` per snippet.
+- [ ] Split parser initialization from translation-unit parsing so a parser can
+  reuse file-scope state with a new lexer.
+- [ ] Ensure failed snippets do not corrupt persistent scope or macro state.
+  Initial implementation may mark a session frontend poisoned after hard errors.
+- [ ] Implement `CFREE_FRONTEND_INPUT_REPL_TOPLEVEL` for normal declarations and
+  definitions.
+- [ ] Implement `CFREE_FRONTEND_INPUT_REPL_EXPR` by wrapping the expression as:

 ```c
-unsigned long long __cfree_dbg_expr_1(void) {
+unsigned long long __cfree_dbg_expr_N(void) {
   return (unsigned long long)(USER_EXPR);
 }
 ```

-Toy can lower the same REPL request as:
+- [ ] Implement `CFREE_FRONTEND_INPUT_REPL_BLOCK` by wrapping a block as:
+
+```c
+unsigned long long __cfree_dbg_expr_N(void) {
+  USER_STATEMENTS
+}
+```
+
+- [ ] Decide whether block mode requires an explicit `return` or permits
+  expression-final shorthand.
+- [ ] Support macros across snippets:
+  `jit { #define N 7 }`, then bare `N + 1`.
+- [ ] Support typedefs/tags across snippets:
+  `jit { typedef struct { int x; } S; S s = {41}; }`, then bare `s.x + 1`.
+- [ ] Support function definitions across snippets:
+  `jit { int f(int x) { return x + 1; } }`, then bare `f(41)`.
+- [ ] Diagnose strong redefinition cleanly when a later snippet defines the same
+  global function/object.
+- [ ] Add targeted tests under a new `test/dbg` or `test/repl` harness.
+
+## Toy Checklist
+
+Toy is the best first full REPL target because it already sits cleanly on the
+public CG API and has explicit source syntax for types, globals, functions, and
+expressions.
+
+- [ ] Change `ToyFrontend` to own persistent symbol/type storage instead of
+  rebuilding all parser state per compile.
+- [ ] Keep one persistent `CfreeCg` and use `cfree_cg_swap_obj` per snippet.
+- [ ] Refactor `ToyParser` so lexical state is per snippet but declarations,
+  record/enum/type tables, globals, and function symbols live on `ToyFrontend`.
+- [ ] Implement `CFREE_FRONTEND_INPUT_REPL_TOPLEVEL` for declarations and
+  definitions.
+- [ ] Implement `CFREE_FRONTEND_INPUT_REPL_EXPR` by generating:

 ```toy
-fn __cfree_dbg_expr_1(): i64 {
+fn __cfree_dbg_expr_N(): i64 {
   return USER_EXPR as i64;
 }
 ```

-Because the wrapper is compiled by the persistent frontend, it naturally sees
-macros, typedefs, functions, globals, and Toy declarations entered earlier in
-the session. `dbg` only needs to generate the unique entry name, compile the
-fresh object, append it to the JIT image, look up the entry symbol, and execute
-it through `CfreeJitSession`.
+- [ ] Implement `CFREE_FRONTEND_INPUT_REPL_BLOCK` using Toy block/function
+  syntax and require an explicit return in v1.
+- [ ] Preserve global variables across snippets:
+  `jit { let x: i64 = 41; }`, then bare `x + 1`.
+- [ ] Preserve nominal records/enums/type aliases across snippets.
+- [ ] Preserve functions across snippets, including calls from later snippets.
+- [ ] Add diagnostics for duplicate definitions and type mismatches that include
+  the snippet input name.
+- [ ] Add Toy REPL smoke tests:
+  globals, functions, record field access, enum constants, expression wrapper,
+  and block wrapper.
+
+## Wasm Checklist
+
+Wasm is different from C/Toy: the user normally supplies complete WAT/Wasm
+modules, not declarations in a source namespace. The ergonomic REPL target is a
+module/session model with export invocation and instance-owned runtime state.
+
+- [ ] Decide v1 interaction model:
+  module append plus `invoke EXPORT ARGS...`, not arbitrary Wasm expression
+  snippets.
+- [ ] Keep `WasmFrontend` lifecycle-shaped but treat most parser/module state as
+  per compile unless a specific cross-module context is introduced.
+- [ ] Preserve instance/runtime state for appended modules where supported:
+  memories, tables, globals, start/init calls, and import slots.
+- [ ] Add `dbg` command support for invoking Wasm exports by name with typed
+  integer/float arguments.
+- [ ] Define how duplicate export names are handled across appended modules:
+  reject, shadow by generation, or require module qualification.
+- [ ] Add module qualification in symbol lookup if multiple modules can export
+  the same name.
+- [ ] Add clear diagnostics for unsupported interactive cases:
+  relocatable Wasm object input, multi-memory gaps, unsupported proposals, WASI
+  startup, and wasm64.
+- [ ] Implement optional `CFREE_FRONTEND_INPUT_REPL_EXPR` later as WAT sugar,
+  lowering an expression into a generated module/function.
+- [ ] Add Wasm REPL smoke tests:
+  WAT module append, export invocation, start function behavior, memory/data
+  persistence, imported function call, and duplicate export diagnostics.
+
+## Shared Acceptance Tests
+
+- [ ] `make bin` builds after each milestone.
+- [ ] `make test-toy` stays green after Toy refactors.
+- [ ] `make test-parse test-parse-err test-pp test-pp-err` stay green after C
+  refactors.
+- [ ] `make test-wasm-front` stays green after Wasm REPL command work.
+- [ ] New scripted REPL tests cover:
+  C macro/type/global persistence, Toy type/global/function persistence, Wasm
+  module invoke, bare expression fallback, explicit `expr` alias behavior, block
+  wrappers, duplicate definitions, and clean diagnostics after failed snippets.
+- [ ] Manual smoke:
+
+```text
+cfree dbg test.c
+(cfree) :language c
+(cfree) jit { typedef struct { int a; int b; } Point; Point p = {1, 2}; }
+(cfree) p.a + p.b
+$1 = 3 (0x3)
+```
+
+```text
+cfree dbg test.toy
+(cfree) :language toy
+(cfree) jit { let x: i64 = 40; fn inc(v: i64): i64 { return v + 1; } }
+(cfree) inc(x) + 1
+$1 = 42 (0x2a)
+```
+
+```text
+cfree dbg empty.c
+(cfree) :language wat
+(cfree) jit { (module (func (export "answer") (result i64) i64.const 42)) }
+(cfree) invoke answer
+$1 = 42 (0x2a)
+```
+
+## Notes

 DWARF recovery remains useful for inspecting preexisting objects and external
-debug info, but it should not be the primary mechanism for normal REPL
-expressions. The persistent frontend has strictly better source-level context
-for anything typed during the session.
-
-### Alternative Discussed: DWARF Recovery
-*Tradeoffs of recovering from DWARF instead of a stateful frontend:*
-We *could* keep the frontend single-shot and recover types/symbols by reading DWARF from the JIT session (like LLDB does).
-- **Pros**: Zero state in the compiler; naturally allows scripting against external binaries (C++ libs).
-- **Cons**: Macros (`#define`) are lost between snippets (unless `-g3` is forced and parsed); DWARF-to-C-AST reconstruction is lossy (typedefs, specific attrs); it's vastly more complex to write a DWARF AST importer than to just keep the `Parser` struct alive in memory.
-Given the constraints, making the frontend stateful is vastly simpler and preserves 100% of the C context.
-
-## User Review Required
-> [!IMPORTANT]
-> - Do you approve of changing the primary `CfreeCompileFn` interface to an object-oriented `new`/`compile`/`free` lifecycle?
-> - Do you approve of the "carry-forward external declarations" trick to maintain `ObjSymId` alignment across successive `ObjBuilder`s without decoupling them?
-
-## Verification Plan
-1. Compile the toolchain with `make bin`.
-2. Run `cfree dbg` and test REPL state persistence:
-   ```c
-   (cfree) jit { typedef struct { int a; int b; } Point; Point p = {1, 2}; }
-   (cfree) expr { return p.a + p.b; }
-   ```
-   The second command should successfully compile and evaluate without complaining about undeclared identifiers.
+debug info. It should not be the primary mechanism for normal REPL expressions
+typed during the current session. The persistent frontend has better
+source-level context for macros, typedefs, source-language type aliases,
+front-end-only attributes, and frontend-specific syntax.
diff --git a/doc/api-migration.md b/doc/api-migration.md
@@ -63,7 +63,7 @@ typedef struct CfreeBytes { /* used everywhere except source compil
   size_t len;
 } CfreeBytes;

-typedef struct CfreeSourceInput { /* used by cfree_compile_source_obj* */
+typedef struct CfreeSourceInput { /* used by cfree_frontend_compile */
   CfreeBytes bytes;
   CfreeLanguage lang;
 } CfreeSourceInput;
@@ -113,9 +113,11 @@ typedef struct CfreeFrontendCompileOptions {
 ```

 `cfree_compile_obj` → `cfree_compile_c_obj`, `cfree_compile_asm_obj`,
-`cfree_compile_source_obj` (registered frontend). All have `_emit`
-variants. Frontend hook signature is now
-`CfreeStatus (*)(CfreeCompiler*, const CfreeFrontendCompileOptions*, const CfreeSourceInput*, CfreeObjBuilder*)`.
+registered frontends are driven through `cfree_frontend_new`,
+`cfree_frontend_compile`, and `cfree_frontend_free`. The old
+`cfree_compile_source_obj{,_emit}` convenience entrypoints were removed.
+Frontend hook signature is now
+`CfreeStatus (*)(CfreeFrontendState*, const CfreeFrontendCompileOptions*, const CfreeSourceInput*, CfreeObjBuilder*)`.

 ### Status-returning APIs

diff --git a/driver/cc.c b/driver/cc.c
@@ -1967,13 +1967,24 @@ static int cc_run_compile_one(DriverEnv* env, const CcOptions* o,
       st = cfree_compile_asm_obj_emit(compiler, &aopts, &input, obj_w);
     } else {
       CfreeFrontendCompileOptions fopts = {0};
+      CfreeFrontend *frontend = NULL;
+      CfreeObjBuilder *ob = NULL;
       CfreeSourceInput sin;
       fopts.code = copts.code;
       fopts.diagnostics = copts.diagnostics;
       fopts.language_options = &copts;
       sin.bytes = input;
       sin.lang = lang;
-      st = cfree_compile_source_obj_emit(compiler, &fopts, &sin, obj_w);
+      st = cfree_frontend_new(compiler, lang, &frontend);
+      if (st == CFREE_OK) st = cfree_obj_builder_new(compiler, &ob);
+      if (st == CFREE_OK) {
+        st = cfree_frontend_compile(frontend, &fopts, &sin, ob);
+      }
+      if (st == CFREE_OK && !fopts.code.emit_c_source) {
+        st = cfree_obj_builder_emit(ob, obj_w);
+      }
+      cfree_obj_builder_free(ob);
+      cfree_frontend_free(frontend);
     }
     if (st != CFREE_OK) goto out;
   }
@@ -2154,25 +2165,41 @@ static int cc_run_link_exe(DriverEnv* env, const CcOptions* o,
       st = cfree_compile_asm_obj(compiler, &aopts, &src_bytes[i], &objs[i]);
     } else {
       CfreeFrontendCompileOptions fopts = {0};
+      CfreeFrontend *frontend = NULL;
       CfreeSourceInput sin;
       fopts.code = copts.code;
       fopts.diagnostics = copts.diagnostics;
       fopts.language_options = &copts;
       sin.bytes = src_bytes[i];
       sin.lang = lang;
-      st = cfree_compile_source_obj(compiler, &fopts, &sin, &objs[i]);
+      st = cfree_frontend_new(compiler, lang, &frontend);
+      if (st == CFREE_OK) st = cfree_obj_builder_new(compiler, &objs[i]);
+      if (st == CFREE_OK) {
+        st = cfree_frontend_compile(frontend, &fopts, &sin, objs[i]);
+      }
+      cfree_frontend_free(frontend);
     }
     if (st != CFREE_OK) goto out;
   }
   for (i = 0; i < o->nsource_memory; ++i) {
     CfreeFrontendCompileOptions fopts;
     CfreeFrontendCompileOptions z = {0};
+    CfreeFrontend *frontend = NULL;
+    CfreeStatus st;
     fopts = z;
     fopts.code = copts.code;
     fopts.diagnostics = copts.diagnostics;
     fopts.language_options = &copts;
-    if (cfree_compile_source_obj(compiler, &fopts, &o->source_memory[i],
-                                 &objs[o->nsource_files + i]) != CFREE_OK)
+    st = cfree_frontend_new(compiler, o->source_memory[i].lang, &frontend);
+    if (st == CFREE_OK) {
+      st = cfree_obj_builder_new(compiler, &objs[o->nsource_files + i]);
+    }
+    if (st == CFREE_OK) {
+      st = cfree_frontend_compile(frontend, &fopts, &o->source_memory[i],
+                                  objs[o->nsource_files + i]);
+    }
+    cfree_frontend_free(frontend);
+    if (st != CFREE_OK)
       goto out;
   }
diff --git a/driver/dbg.c b/driver/dbg.c
@@ -1617,7 +1617,9 @@ static int dbg_jit_compile_append(DbgState *s, CfreeLanguage lang,
                                   size_t len) {
   CfreeSourceInput sin;
   CfreeFrontendCompileOptions fopts;
+  CfreeFrontend *frontend = NULL;
   CfreeObjBuilder *ob = NULL;
+  CfreeStatus st;

   s->jit_counter++;
   sin.bytes.name = input_name ? input_name : dbg_jit_default_name(lang);
@@ -1631,8 +1633,12 @@ static int dbg_jit_compile_append(DbgState *s, CfreeLanguage lang,
   fopts.code = s->copts.code;
   fopts.diagnostics = s->copts.diagnostics;
   fopts.language_options = &s->copts;
-  if (cfree_compile_source_obj(s->compiler, &fopts, &sin, &ob) != CFREE_OK ||
-      !ob) {
+  st = cfree_frontend_new(s->compiler, lang, &frontend);
+  if (st == CFREE_OK) st = cfree_obj_builder_new(s->compiler, &ob);
+  if (st == CFREE_OK) st = cfree_frontend_compile(frontend, &fopts, &sin, ob);
+  cfree_frontend_free(frontend);
+  if (st != CFREE_OK || !ob) {
+    if (ob) cfree_obj_builder_free(ob);
     driver_errf(DBG_TOOL, "jit compile failed");
     return 1;
   }
diff --git a/driver/inputs.c b/driver/inputs.c
@@ -158,8 +158,7 @@ int driver_inputs_compile_and_jit(DriverInputs *in, CfreeCompiler *compiler,
     }
   }

-  /* Load source files into CfreeBytes and compile via cfree_compile_c_obj.
-   * In-memory sources (stdin) compile through cfree_compile_source_obj. */
+  /* Load source files into CfreeBytes and compile them into object builders. */
   for (i = 0; i < in->nsources; ++i) {
     if (driver_load_bytes(io, tool, in->sources[i], &src_lf[i],
                           &src_bytes[i]) != 0)
@@ -178,13 +177,19 @@ int driver_inputs_compile_and_jit(DriverInputs *in, CfreeCompiler *compiler,
       st = cfree_compile_asm_obj(compiler, &aopts, &src_bytes[i], &objs[i]);
     } else {
       CfreeFrontendCompileOptions fopts = {0};
+      CfreeFrontend *frontend = NULL;
       CfreeSourceInput sin;
       fopts.code = copts->code;
       fopts.diagnostics = copts->diagnostics;
       fopts.language_options = copts;
       sin.bytes = src_bytes[i];
       sin.lang = lang;
-      st = cfree_compile_source_obj(compiler, &fopts, &sin, &objs[i]);
+      st = cfree_frontend_new(compiler, lang, &frontend);
+      if (st == CFREE_OK) st = cfree_obj_builder_new(compiler, &objs[i]);
+      if (st == CFREE_OK) {
+        st = cfree_frontend_compile(frontend, &fopts, &sin, objs[i]);
+      }
+      cfree_frontend_free(frontend);
     }
     if (st != CFREE_OK) goto out;
   }
@@ -193,12 +198,22 @@ int driver_inputs_compile_and_jit(DriverInputs *in, CfreeCompiler *compiler,
      * lang tagging is honoured. */
     CfreeFrontendCompileOptions fopts;
     CfreeFrontendCompileOptions z = {0};
+    CfreeFrontend *frontend = NULL;
+    CfreeStatus st;
     fopts = z;
     fopts.code = copts->code;
     fopts.diagnostics = copts->diagnostics;
     fopts.language_options = copts;
-    if (cfree_compile_source_obj(compiler, &fopts, &in->source_memory[i],
-                                 &objs[in->nsources + i]) != CFREE_OK)
+    st = cfree_frontend_new(compiler, in->source_memory[i].lang, &frontend);
+    if (st == CFREE_OK) {
+      st = cfree_obj_builder_new(compiler, &objs[in->nsources + i]);
+    }
+    if (st == CFREE_OK) {
+      st = cfree_frontend_compile(frontend, &fopts, &in->source_memory[i],
+                                  objs[in->nsources + i]);
+    }
+    cfree_frontend_free(frontend);
+    if (st != CFREE_OK)
       goto out;
   }
diff --git a/include/cfree/compile.h b/include/cfree/compile.h
@@ -67,12 +67,13 @@ typedef struct CfreeFrontendCompileOptions {
 } CfreeFrontendCompileOptions;

 typedef struct CfreeFrontend CfreeFrontend;
+typedef struct CfreeFrontendState CfreeFrontendState;

-typedef CfreeFrontend *(*CfreeFrontendNewFn)(CfreeCompiler *);
+typedef CfreeFrontendState *(*CfreeFrontendNewFn)(CfreeCompiler *);
 typedef CfreeStatus (*CfreeFrontendCompileFn)(
-    CfreeFrontend *, const CfreeFrontendCompileOptions *,
+    CfreeFrontendState *, const CfreeFrontendCompileOptions *,
     const CfreeSourceInput *, CfreeObjBuilder *out);
-typedef void (*CfreeFrontendFreeFn)(CfreeFrontend *);
+typedef void (*CfreeFrontendFreeFn)(CfreeFrontendState *);

 typedef struct CfreeFrontendVTable {
   CfreeFrontendNewFn new_frontend;
@@ -83,6 +84,13 @@ typedef struct CfreeFrontendVTable {
 CfreeLanguage cfree_language_for_path(const char *path);
 CfreeStatus cfree_register_frontend(CfreeCompiler *, CfreeLanguage,
                                     const CfreeFrontendVTable *);
+CfreeStatus cfree_frontend_new(CfreeCompiler *, CfreeLanguage,
+                               CfreeFrontend **out);
+CfreeStatus cfree_frontend_compile(CfreeFrontend *,
+                                   const CfreeFrontendCompileOptions *,
+                                   const CfreeSourceInput *,
+                                   CfreeObjBuilder *out);
+void cfree_frontend_free(CfreeFrontend *);
 uint32_t cfree_compiler_arch_predefines(CfreeCompiler *,
                                         const CfreePredefinedMacro **out);
@@ -98,15 +106,6 @@ CfreeStatus cfree_compile_asm_obj(CfreeCompiler *,
 CfreeStatus cfree_compile_asm_obj_emit(CfreeCompiler *,
                                        const CfreeAsmCompileOptions *,
                                        const CfreeBytes *, CfreeWriter *out);
-CfreeStatus cfree_compile_source_obj(CfreeCompiler *,
-                                     const CfreeFrontendCompileOptions *,
-                                     const CfreeSourceInput *,
-                                     CfreeObjBuilder **out);
-CfreeStatus cfree_compile_source_obj_emit(CfreeCompiler *,
-                                          const CfreeFrontendCompileOptions *,
-                                          const CfreeSourceInput *,
-                                          CfreeWriter *out);
-
 typedef struct CfreeDepIter CfreeDepIter;

 typedef struct CfreeDepEdge {
diff --git a/lang/c/c.c b/lang/c/c.c
@@ -179,7 +179,7 @@ CfreeStatus cfree_c_dump_tokens(CfreeCompiler* c, const CfreeBytes* input,
   return cfree_frontend_run(c, c_dump_tokens_body, &r);
 }

-static CfreeFrontend* c_frontend_new(CfreeCompiler* c) {
+static CfreeFrontendState* c_frontend_new(CfreeCompiler* c) {
   CfreeHeap* h;
   CFrontend* fe;
   if (!c) return NULL;
@@ -187,10 +187,10 @@ static CfreeFrontend* c_frontend_new(CfreeCompiler* c) {
   fe = (CFrontend*)h->alloc(h, sizeof(*fe), _Alignof(CFrontend));
   if (!fe) return NULL;
   fe->c = c;
-  return (CfreeFrontend*)fe;
+  return (CfreeFrontendState*)fe;
 }

-static CfreeStatus c_frontend_compile(CfreeFrontend* frontend,
+static CfreeStatus c_frontend_compile(CfreeFrontendState* frontend,
                                       const CfreeFrontendCompileOptions* fe_opts,
                                       const CfreeSourceInput* input,
                                       CfreeObjBuilder* out) {
@@ -264,7 +264,7 @@ static CfreeStatus c_frontend_compile(CfreeFrontend* frontend,
   return CFREE_OK;
 }

-static void c_frontend_free(CfreeFrontend* frontend) {
+static void c_frontend_free(CfreeFrontendState* frontend) {
   CFrontend* fe = (CFrontend*)frontend;
   CfreeHeap* h;
   if (!fe) return;
diff --git a/lang/toy/compile.c b/lang/toy/compile.c
@@ -6,7 +6,7 @@ typedef struct ToyFrontend {
   CfreeCompiler* c;
 } ToyFrontend;

-static CfreeFrontend* toy_frontend_new(CfreeCompiler* c) {
+static CfreeFrontendState* toy_frontend_new(CfreeCompiler* c) {
   CfreeHeap* h;
   ToyFrontend* fe;
   if (!c) return NULL;
@@ -14,11 +14,11 @@ static CfreeFrontend* toy_frontend_new(CfreeCompiler* c) {
   fe = (ToyFrontend*)h->alloc(h, sizeof(*fe), _Alignof(ToyFrontend));
   if (!fe) return NULL;
   fe->c = c;
-  return (CfreeFrontend*)fe;
+  return (CfreeFrontendState*)fe;
 }

 static CfreeStatus toy_frontend_compile(
-    CfreeFrontend* frontend, const CfreeFrontendCompileOptions* opts,
+    CfreeFrontendState* frontend, const CfreeFrontendCompileOptions* opts,
     const CfreeSourceInput* input, CfreeObjBuilder* out) {
   ToyFrontend* fe = (ToyFrontend*)frontend;
   CfreeCompiler* c;
@@ -56,7 +56,7 @@ static CfreeStatus toy_frontend_compile(
   return CFREE_OK;
 }

-static void toy_frontend_free(CfreeFrontend* frontend) {
+static void toy_frontend_free(CfreeFrontendState* frontend) {
   ToyFrontend* fe = (ToyFrontend*)frontend;
   CfreeHeap* h;
   if (!fe) return;
diff --git a/lang/wasm/wasm.c b/lang/wasm/wasm.c
@@ -13,7 +13,7 @@ typedef struct WasmFrontend {
   CfreeCompiler* c;
 } WasmFrontend;

-static CfreeFrontend* wasm_frontend_new(CfreeCompiler* c) {
+static CfreeFrontendState* wasm_frontend_new(CfreeCompiler* c) {
   CfreeHeap* h;
   WasmFrontend* fe;
   if (!c) return NULL;
@@ -21,11 +21,11 @@ static CfreeFrontend* wasm_frontend_new(CfreeCompiler* c) {
   fe = (WasmFrontend*)h->alloc(h, sizeof(*fe), _Alignof(WasmFrontend));
   if (!fe) return NULL;
   fe->c = c;
-  return (CfreeFrontend*)fe;
+  return (CfreeFrontendState*)fe;
 }

 static CfreeStatus wasm_frontend_compile(
-    CfreeFrontend* frontend, const CfreeFrontendCompileOptions* opts,
+    CfreeFrontendState* frontend, const CfreeFrontendCompileOptions* opts,
     const CfreeSourceInput* input, CfreeObjBuilder* out) {
   WasmFrontend* fe = (WasmFrontend*)frontend;
   CfreeCompiler* c;
@@ -40,7 +40,7 @@ static CfreeStatus wasm_frontend_compile(
   return CFREE_OK;
 }

-static void wasm_frontend_free(CfreeFrontend* frontend) {
+static void wasm_frontend_free(CfreeFrontendState* frontend) {
   WasmFrontend* fe = (WasmFrontend*)frontend;
   CfreeHeap* h;
   if (!fe) return;
diff --git a/src/api/compile.c b/src/api/compile.c
@@ -17,12 +17,19 @@ typedef struct AsmFrontend {
   Compiler* c;
 } AsmFrontend;

-static CfreeFrontend* asm_frontend_new(CfreeCompiler* c);
-static CfreeStatus asm_frontend_compile(CfreeFrontend* fe,
+struct CfreeFrontend {
+  CfreeCompiler* c;
+  CfreeLanguage lang;
+  const CfreeFrontendVTable* vtable;
+  CfreeFrontendState* state;
+};
+
+static CfreeFrontendState* asm_frontend_new(CfreeCompiler* c);
+static CfreeStatus asm_frontend_compile(CfreeFrontendState* fe,
                                         const CfreeFrontendCompileOptions* opts,
                                         const CfreeSourceInput* input,
                                         CfreeObjBuilder* out);
-static void asm_frontend_free(CfreeFrontend* fe);
+static void asm_frontend_free(CfreeFrontendState* fe);

 static const CfreeFrontendVTable asm_frontend_vtable = {
     asm_frontend_new,
@@ -124,9 +131,85 @@ static const CfreeFrontendVTable* frontend_for_language(Compiler* c,
   return NULL;
 }

-typedef struct FrontendCleanup {
+static void validate_bytes(Compiler* c, const CfreeBytes* in);
+static void compile_frontend_state_into(Compiler* c,
+                                        const CfreeFrontendVTable* vtable,
+                                        CfreeFrontendState* frontend,
+                                        const CfreeFrontendCompileOptions* opts,
+                                        const CfreeSourceInput* input,
+                                        ObjBuilder* ob);
+
+CfreeStatus cfree_frontend_new(CfreeCompiler* c, CfreeLanguage lang,
+                               CfreeFrontend** out) {
   const CfreeFrontendVTable* vtable;
+  CfreeFrontendState* state;
   CfreeFrontend* frontend;
+  Heap* h;
+
+  if (!out) return CFREE_INVALID;
+  *out = NULL;
+  if (!c || (unsigned)lang >= CFREE_LANG_COUNT) return CFREE_INVALID;
+  vtable = frontend_for_language((Compiler*)c, lang);
+  if (!vtable) return CFREE_UNSUPPORTED;
+  state = vtable->new_frontend(c);
+  if (!state) return CFREE_NOMEM;
+  h = (Heap*)c->ctx->heap;
+  frontend =
+      (CfreeFrontend*)h->alloc(h, sizeof(*frontend), _Alignof(CfreeFrontend));
+  if (!frontend) {
+    vtable->free_frontend(state);
+    return CFREE_NOMEM;
+  }
+  frontend->c = c;
+  frontend->lang = lang;
+  frontend->vtable = vtable;
+  frontend->state = state;
+  *out = frontend;
+  return CFREE_OK;
+}
+
+CfreeStatus cfree_frontend_compile(CfreeFrontend* frontend,
+                                   const CfreeFrontendCompileOptions* opts,
+                                   const CfreeSourceInput* input,
+                                   CfreeObjBuilder* out) {
+  Compiler* c;
+  PanicSave saved;
+
+  if (!frontend || !frontend->c || !frontend->vtable || !frontend->state ||
+      !opts || !input || !out) {
+    return CFREE_INVALID;
+  }
+  if (input->lang != frontend->lang) return CFREE_INVALID;
+  c = (Compiler*)frontend->c;
+  compiler_panic_save(c, &saved);
+  if (setjmp(c->panic)) {
+    compiler_run_cleanups(c);
+    compiler_panic_restore(c, &saved);
+    return CFREE_ERR;
+  }
+  validate_bytes(c, &input->bytes);
+  metrics_scope_begin(c, "compile.tu");
+  metrics_count(c, "compile.input_bytes", (u64)input->bytes.len);
+  compile_frontend_state_into(c, frontend->vtable, frontend->state, opts, input,
+                              (ObjBuilder*)out);
+  metrics_scope_end(c, "compile.tu");
+  compiler_panic_restore(c, &saved);
+  return CFREE_OK;
+}
+
+void cfree_frontend_free(CfreeFrontend* frontend) {
+  Heap* h;
+  if (!frontend) return;
+  if (frontend->vtable && frontend->state) {
+    frontend->vtable->free_frontend(frontend->state);
+  }
+  h = (Heap*)frontend->c->ctx->heap;
+  h->free(h, frontend, sizeof(*frontend));
+}
+
+typedef struct FrontendCleanup {
+  const CfreeFrontendVTable* vtable;
+  CfreeFrontendState* frontend;
 } FrontendCleanup;

 static void frontend_cleanup_run(void* arg) {
@@ -137,16 +220,36 @@ static void frontend_cleanup_run(void* arg) {
   }
 }

-static void compile_frontend_into(Compiler* c,
-                                  const CfreeFrontendVTable* vtable,
-                                  const CfreeFrontendCompileOptions* opts,
-                                  const CfreeSourceInput* input,
-                                  ObjBuilder* ob) {
-  CfreeFrontend* frontend;
-  FrontendCleanup* cleanup;
-  CompilerCleanup* cleanup_node;
+static void compile_frontend_state_into(Compiler* c,
+                                        const CfreeFrontendVTable* vtable,
+                                        CfreeFrontendState* frontend,
+                                        const CfreeFrontendCompileOptions* opts,
+                                        const CfreeSourceInput* input,
+                                        ObjBuilder* ob) {
   CfreeStatus st;

+  metrics_scope_begin(c, "compile.frontend");
+  st = vtable->compile(frontend, opts, input, ob);
+  metrics_scope_end(c, "compile.frontend");
+  if (st != CFREE_OK) {
+    compiler_panic(c, no_loc(), "frontend failed for input: %s",
+                   input->bytes.name);
+  }
+
+  metrics_scope_begin(c, "compile.obj_finalize");
+  obj_finalize(ob);
+  metrics_scope_end(c, "compile.obj_finalize");
+  metrics_count(c, "compile.obj_sections", obj_section_count(ob));
+  metrics_count(c, "compile.obj_relocs", obj_reloc_total(ob));
+}
+
+static void compile_frontend_oneshot_into(
+    Compiler* c, const CfreeFrontendVTable* vtable,
+    const CfreeFrontendCompileOptions* opts, const CfreeSourceInput* input,
+    ObjBuilder* ob) {
+  CfreeFrontendState* frontend;
+  FrontendCleanup* cleanup;
+  CompilerCleanup* cleanup_node;
   if (!vtable) {
     compiler_panic(c, no_loc(), "no frontend registered for language: %u",
                    (u32)input->lang);
@@ -170,31 +273,10 @@ static void compile_frontend_into(Compiler* c,
     vtable->free_frontend(frontend);
     compiler_panic(c, no_loc(), "frontend cleanup allocation failed");
   }
-
-  metrics_scope_begin(c, "compile.frontend");
-  st = vtable->compile(frontend, opts, input, ob);
-  metrics_scope_end(c, "compile.frontend");
+  compile_frontend_state_into(c, vtable, frontend, opts, input, ob);
   compiler_undefer(c, cleanup_node);
   vtable->free_frontend(frontend);
   cleanup->frontend = NULL;
-  if (st != CFREE_OK) {
-    compiler_panic(c, no_loc(), "frontend failed for input: %s",
-                   input->bytes.name);
-  }
-
-  metrics_scope_begin(c, "compile.obj_finalize");
-  obj_finalize(ob);
-  metrics_scope_end(c, "compile.obj_finalize");
-  metrics_count(c, "compile.obj_sections", obj_section_count(ob));
-  metrics_count(c, "compile.obj_relocs", obj_reloc_total(ob));
-}
-
-/* Run the source-input-shaped path.
*/ -static void compile_source_into(Compiler* c, - const CfreeFrontendCompileOptions* opts, - const CfreeSourceInput* input, ObjBuilder* ob) { - compile_frontend_into(c, frontend_for_language(c, input->lang), opts, input, - ob); } /* ============================================================ @@ -218,7 +300,7 @@ static CfreeStatus compile_c_into(Compiler* c, const CfreeCCompileOptions* opts, si.bytes = *input; si.lang = CFREE_LANG_C; - compile_frontend_into(c, frontend, &fe, &si, ob); + compile_frontend_oneshot_into(c, frontend, &fe, &si, ob); return CFREE_OK; } @@ -283,7 +365,7 @@ CfreeStatus cfree_compile_c_obj_emit(CfreeCompiler* c, * Asm * ============================================================ */ -static CfreeFrontend* asm_frontend_new(CfreeCompiler* c) { +static CfreeFrontendState* asm_frontend_new(CfreeCompiler* c) { Heap* h; AsmFrontend* fe; if (!c) return NULL; @@ -291,10 +373,10 @@ static CfreeFrontend* asm_frontend_new(CfreeCompiler* c) { fe = (AsmFrontend*)h->alloc(h, sizeof(*fe), _Alignof(AsmFrontend)); if (!fe) return NULL; fe->c = c; - return (CfreeFrontend*)fe; + return (CfreeFrontendState*)fe; } -static CfreeStatus asm_frontend_compile(CfreeFrontend* frontend, +static CfreeStatus asm_frontend_compile(CfreeFrontendState* frontend, const CfreeFrontendCompileOptions* opts, const CfreeSourceInput* input, CfreeObjBuilder* out) { @@ -321,7 +403,7 @@ static CfreeStatus asm_frontend_compile(CfreeFrontend* frontend, return CFREE_OK; } -static void asm_frontend_free(CfreeFrontend* frontend) { +static void asm_frontend_free(CfreeFrontendState* frontend) { AsmFrontend* fe = (AsmFrontend*)frontend; Heap* h; if (!fe) return; @@ -357,7 +439,7 @@ CfreeStatus cfree_compile_asm_obj(CfreeCompiler* c, fe.language_options = opts; si.bytes = *input; si.lang = CFREE_LANG_ASM; - compile_frontend_into(c, &asm_frontend_vtable, &fe, &si, ob); + compile_frontend_oneshot_into(c, &asm_frontend_vtable, &fe, &si, ob); } metrics_scope_end(c, "compile.tu"); *out = ob; @@ 
-390,7 +472,7 @@ CfreeStatus cfree_compile_asm_obj_emit(CfreeCompiler* c, fe.language_options = opts; si.bytes = *input; si.lang = CFREE_LANG_ASM; - compile_frontend_into(c, &asm_frontend_vtable, &fe, &si, ob); + compile_frontend_oneshot_into(c, &asm_frontend_vtable, &fe, &si, ob); } emit_object_bytes(c, ob, out); metrics_scope_end(c, "compile.tu"); @@ -399,64 +481,6 @@ CfreeStatus cfree_compile_asm_obj_emit(CfreeCompiler* c, return CFREE_OK; } -/* ============================================================ - * Source (registered frontend) - * ============================================================ */ - -CfreeStatus cfree_compile_source_obj(CfreeCompiler* c, - const CfreeFrontendCompileOptions* opts, - const CfreeSourceInput* input, - CfreeObjBuilder** out) { - PanicSave saved; - ObjBuilder* ob; - - if (!out) return CFREE_INVALID; - *out = NULL; - if (!c || !opts || !input) return CFREE_INVALID; - compiler_panic_save(c, &saved); - if (setjmp(c->panic)) { - compiler_run_cleanups(c); - compiler_panic_restore(c, &saved); - return CFREE_ERR; - } - validate_bytes(c, &input->bytes); - ob = obj_new(c); - metrics_scope_begin(c, "compile.tu"); - metrics_count(c, "compile.input_bytes", (u64)input->bytes.len); - compile_source_into(c, opts, input, ob); - metrics_scope_end(c, "compile.tu"); - *out = ob; - compiler_panic_restore(c, &saved); - return CFREE_OK; -} - -CfreeStatus cfree_compile_source_obj_emit( - CfreeCompiler* c, const CfreeFrontendCompileOptions* opts, - const CfreeSourceInput* input, CfreeWriter* out) { - PanicSave saved; - ObjBuilder* ob; - if (!c || !opts || !input || !out) return CFREE_INVALID; - compiler_panic_save(c, &saved); - if (setjmp(c->panic)) { - compiler_run_cleanups(c); - compiler_panic_restore(c, &saved); - return CFREE_ERR; - } - validate_bytes(c, &input->bytes); - ob = obj_new(c); - metrics_scope_begin(c, "compile.tu"); - metrics_count(c, "compile.input_bytes", (u64)input->bytes.len); - compile_source_into(c, opts, input, ob); - /* See 
cfree_compile_c_obj_emit: in emit_c_source mode the CGTarget wrote - * portable C source to opts->code.c_source_writer (same destination as - * `out`); skip object serialization. */ - if (!opts->code.emit_c_source) emit_object_bytes(c, ob, out); - metrics_scope_end(c, "compile.tu"); - obj_free(ob); - compiler_panic_restore(c, &saved); - return CFREE_OK; -} - struct CfreeDepIter { Compiler* c; SourceDepIter* inner;