kit

git clone https://git.ryansepassi.com/git/kit.git

commit 41e9d3d610a824173a86e3e581e4921232699bbe
parent e2f339d3eade735eba75b825ca40478aa7d2c579
Author: Ryan Sepassi <rsepassi@gmail.com>
Date:   Thu, 21 May 2026 10:35:21 -0700

persistent frontend plan

Diffstat:
A doc/FRONTEND.md | 62 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 62 insertions(+), 0 deletions(-)

diff --git a/doc/FRONTEND.md b/doc/FRONTEND.md
@@ -0,0 +1,62 @@
+# Persistent Context for JIT REPL
+
+This plan outlines how to upgrade the single-pass C frontend to maintain a persistent state across multiple `jit` and `expr` evaluations in the debugger REPL.
+
+## Background & Challenge
+Currently, `cfree_c_compile` is a one-shot process. It creates a `Parser`, `Pp`, `Scope`, and `CfreeCg`, processes the input, finalizes the `CfreeObjBuilder`, and then frees all frontend state. To make the REPL truly "hackable", subsequent snippets must see structs, macros, and global variables defined in previous snippets.
+
+Since `cfree` has no AST and the frontend drives Code Generation (CG) directly, the "context" is entirely contained within the `Parser` symbol table, the `Pp` macro definitions, and the type/symbol handles cached in `DeclTable`.
+
+The primary obstacle to keeping the frontend alive is that `CfreeCgSym` handles are currently direct casts of `ObjSymId`. When a new snippet is compiled, a new `ObjBuilder` must be used (so we don't duplicate code output), which invalidates all previously cached `CfreeCgSym` handles.
+
+## Proposed Changes
+
+### 1. Stateful Frontend API (Single Entrypoint)
+To address the desire for a single entrypoint (rather than keeping the old one and adding an incremental one), we will change the fundamental frontend interface in `include/cfree/compile.h` to be stateful for *all* consumers.
+
+Instead of a single function pointer `CfreeCompileFn`, the frontend registration will provide a vtable or lifecycle functions:
+```c
+typedef struct CfreeFrontend CfreeFrontend;
+
+typedef CfreeFrontend* (*CfreeFrontendNewFn)(CfreeCompiler*);
+typedef CfreeStatus (*CfreeFrontendCompileFn)(CfreeFrontend*, const CfreeFrontendCompileOptions*, const CfreeSourceInput*, CfreeObjBuilder*);
+typedef void (*CfreeFrontendFreeFn)(CfreeFrontend*);
+```
+- **AOT Compiler (`driver/cc.c`)**: Creates the frontend, calls `compile` once, and frees it.
+- **JIT REPL (`driver/dbg.c`)**: Creates the frontend at startup, calls `compile` repeatedly for each snippet (passing a new `CfreeObjBuilder` each time), and frees it on exit.
+
+This keeps the parser (`Parser`), preprocessor (`Pp`, including macros), and scope (`Scope`) alive between snippets for the REPL, without duplicating the frontend API.
+
+### 2. Symbol Uniqueness (Cloning vs Decoupling)
+Instead of fully decoupling `CfreeCgSym` from `ObjSymId` (which would require a lot of `CgApiState` mapping), we can keep `CfreeCgSym == ObjSymId` and achieve global uniqueness by carrying the symbol table forward across `ObjBuilder` instances.
+
+When the REPL calls `cfree_frontend_compile` for the second snippet, it passes a fresh `CfreeObjBuilder`. Inside the frontend:
+- The frontend notices it already has a populated `CfreeCg` from the first snippet.
+- We introduce `cfree_cg_swap_obj(CfreeCg* cg, CfreeObjBuilder* new_ob)`.
+- Inside `swap_obj`, `CG` iterates over all its known symbols (`sym_types` array). For each symbol, it injects an *external, undefined* declaration into `new_ob` at the exact same `ObjSymId`.
+- Because `new_ob` starts empty, inserting the existing symbols sequentially guarantees that `new_ob` assigns the exact same `ObjSymId`s to the exact same symbols!
+- When snippet 2 references a variable from snippet 1, `CfreeCg` emits a relocation using the same `ObjSymId`. `new_ob` sees this as a relocation against an external symbol.
+- The JIT linker automatically resolves this external symbol against the definition provided by snippet 1's object file.
+
+This requires minimal changes to `CG` and `ObjBuilder`, preserves `CfreeCgSym == ObjSymId`, and guarantees monotonic uniqueness across the session.
+
+### Alternative Discussed: DWARF Recovery
+*Tradeoffs of recovering from DWARF instead of a stateful frontend:*
+We *could* keep the frontend single-shot and recover types/symbols by reading DWARF from the JIT session (like LLDB does).
+- **Pros**: Zero state in the compiler; naturally allows scripting against external binaries (C++ libs).
+- **Cons**: Macros (`#define`) are lost between snippets (unless `-g3` is forced and parsed); DWARF-to-C-AST reconstruction is lossy (typedefs, specific attrs); it's vastly more complex to write a DWARF AST importer than to just keep the `Parser` struct alive in memory.
+Given the constraints, making the frontend stateful is vastly simpler and preserves 100% of the C context.
+
+## User Review Required
+> [!IMPORTANT]
+> - Do you approve of changing the primary `CfreeCompileFn` interface to an object-oriented `new`/`compile`/`free` lifecycle?
+> - Do you approve of the "carry-forward external declarations" trick to maintain `ObjSymId` alignment across successive `ObjBuilder`s without decoupling them?
+
+## Verification Plan
+1. Compile the toolchain with `make bin`.
+2. Run `cfree dbg` and test REPL state persistence:
+   ```c
+   (cfree) jit { typedef struct { int a; int b; } Point; Point p = {1, 2}; }
+   (cfree) expr { return p.a + p.b; }
+   ```
+   The second command should successfully compile and evaluate without complaining about undeclared identifiers.