commit 41e9d3d610a824173a86e3e581e4921232699bbe
parent e2f339d3eade735eba75b825ca40478aa7d2c579
Author: Ryan Sepassi <rsepassi@gmail.com>
Date: Thu, 21 May 2026 10:35:21 -0700
persistent frontend plan
Diffstat:
A doc/FRONTEND.md | 62 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 62 insertions(+), 0 deletions(-)
diff --git a/doc/FRONTEND.md b/doc/FRONTEND.md
@@ -0,0 +1,62 @@
+# Persistent Context for JIT REPL
+
+This plan outlines how to upgrade the single-pass C frontend so that it keeps persistent state across multiple `jit` and `expr` evaluations in the debugger REPL.
+
+## Background & Challenge
+Currently, `cfree_c_compile` is a one-shot process. It creates a `Parser`, `Pp`, `Scope`, and `CfreeCg`, processes the input, finalizes the `CfreeObjBuilder`, and then frees all frontend state. To make the REPL truly "hackable", subsequent snippets must see structs, macros, and global variables defined in previous snippets.
+
+Since `cfree` has no AST and the frontend drives Code Generation (CG) directly, the "context" is entirely contained within the `Parser` symbol table, the `Pp` macro definitions, and the type/symbol handles cached in `DeclTable`.
+
+The primary obstacle to keeping the frontend alive is that `CfreeCgSym` handles are currently direct casts of `ObjSymId`. When a new snippet is compiled, a new `ObjBuilder` must be used (so we don't duplicate code output), which invalidates all previously cached `CfreeCgSym` handles.
+
+## Proposed Changes
+
+### 1. Stateful Frontend API (Single Entrypoint)
+To address the desire for a single entrypoint (rather than keeping the old one and adding an incremental one), we will change the fundamental frontend interface in `include/cfree/compile.h` to be stateful for *all* consumers.
+
+Instead of a single function pointer `CfreeCompileFn`, the frontend registration will provide a set of lifecycle functions (for example, as a vtable):
+```c
+typedef struct CfreeFrontend CfreeFrontend;
+
+typedef CfreeFrontend* (*CfreeFrontendNewFn)(CfreeCompiler*);
+typedef CfreeStatus (*CfreeFrontendCompileFn)(CfreeFrontend*, const CfreeFrontendCompileOptions*, const CfreeSourceInput*, CfreeObjBuilder*);
+typedef void (*CfreeFrontendFreeFn)(CfreeFrontend*);
+```
+- **AOT Compiler (`driver/cc.c`)**: Creates the frontend, calls `compile` once, and frees it.
+- **JIT REPL (`driver/dbg.c`)**: Creates the frontend at startup, calls `compile` repeatedly for each snippet (passing a new `CfreeObjBuilder` each time), and frees it on exit.
+
+This keeps the parser (`Parser`), preprocessor (`Pp`, including macros), and scope (`Scope`) alive between snippets for the REPL, without duplicating the frontend API.
+
+### 2. Symbol Uniqueness (Cloning vs Decoupling)
+Instead of fully decoupling `CfreeCgSym` from `ObjSymId` (which would require a lot of `CgApiState` mapping), we can keep `CfreeCgSym == ObjSymId` and achieve global uniqueness by carrying the symbol table forward across `ObjBuilder` instances.
+
+When the REPL calls `cfree_frontend_compile` for the second snippet, it passes a fresh `CfreeObjBuilder`. Inside the frontend:
+- The frontend notices it already has a populated `CfreeCg` from the first snippet.
+- We introduce `cfree_cg_swap_obj(CfreeCg* cg, CfreeObjBuilder* new_ob)`.
+- Inside `swap_obj`, `CG` iterates over all its known symbols (the `sym_types` array) in their original order. For each symbol, it injects an *external, undefined* declaration into `new_ob`.
+- Because `new_ob` starts empty, replaying the existing symbols in order makes `new_ob` assign each symbol the same `ObjSymId` it had before. This relies on `ObjBuilder` handing out `ObjSymId`s in insertion order.
+- When snippet 2 references a variable from snippet 1, `CfreeCg` emits a relocation using the same `ObjSymId`. `new_ob` sees this as a relocation against an external symbol.
+- The JIT linker automatically resolves this external symbol against the definition provided by snippet 1's object file.
+
+This requires minimal changes to `CG` and `ObjBuilder`, preserves `CfreeCgSym == ObjSymId`, and guarantees monotonic uniqueness across the session.
+
+### Alternative Discussed: DWARF Recovery
+*Tradeoffs of recovering from DWARF instead of a stateful frontend:*
+We *could* keep the frontend single-shot and recover types/symbols by reading DWARF from the JIT session (like LLDB does).
+- **Pros**: Zero state in the compiler; naturally allows scripting against external binaries (C++ libs).
+- **Cons**: Macros (`#define`) are lost between snippets (unless `-g3` is forced and parsed); DWARF-to-C-AST reconstruction is lossy (typedefs, specific attributes); writing a DWARF AST importer is far more complex than keeping the `Parser` struct alive in memory.
+Given these tradeoffs, making the frontend stateful is the simpler option and preserves the full C context, including macros.
+
+## User Review Required
+> [!IMPORTANT]
+> - Do you approve of changing the primary `CfreeCompileFn` interface to an object-oriented `new`/`compile`/`free` lifecycle?
+> - Do you approve of the "carry-forward external declarations" trick to maintain `ObjSymId` alignment across successive `ObjBuilder`s without decoupling them?
+
+## Verification Plan
+1. Compile the toolchain with `make bin`.
+2. Run `cfree dbg` and test REPL state persistence:
+   ```c
+   (cfree) jit { typedef struct { int a; int b; } Point; Point p = {1, 2}; }
+   (cfree) expr { return p.a + p.b; }
+   ```
+   The second command should successfully compile and evaluate without complaining about undeclared identifiers.
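
The carry-forward step from section 2 of the added document can be modeled on its own, outside the diff. The sketch below is a toy, not cfree code: `ToyObjBuilder`, `ToyCg`, `toy_ob_add`, `toy_cg_define`, and `toy_cg_swap_obj` are hypothetical stand-ins for `CfreeObjBuilder`, `CfreeCg`, and the proposed `cfree_cg_swap_obj`. It assumes the builder hands out symbol ids in insertion order, which is the property the plan depends on.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_SYMS 64

/* Hypothetical stand-in for ObjSymId: an index into the builder's table. */
typedef unsigned ToySymId;

typedef struct {
    const char *name;
    int defined; /* 1 = defined in this object, 0 = external/undefined */
} ToySym;

/* Hypothetical stand-in for CfreeObjBuilder: ids are assigned in order. */
typedef struct {
    ToySym syms[TOY_MAX_SYMS];
    size_t nsyms;
} ToyObjBuilder;

/* Hypothetical stand-in for the persistent CG state (the plan's sym_types). */
typedef struct {
    const char *names[TOY_MAX_SYMS];
    size_t nsyms;
    ToyObjBuilder *ob; /* builder for the snippet currently being compiled */
} ToyCg;

/* Append a symbol to the builder; its id is its insertion index. */
static ToySymId toy_ob_add(ToyObjBuilder *ob, const char *name, int defined) {
    assert(ob->nsyms < TOY_MAX_SYMS);
    ob->syms[ob->nsyms].name = name;
    ob->syms[ob->nsyms].defined = defined;
    return (ToySymId)ob->nsyms++;
}

/* Define a new symbol in the current object and record it in the CG. */
static ToySymId toy_cg_define(ToyCg *cg, const char *name) {
    ToySymId id = toy_ob_add(cg->ob, name, 1);
    assert(id == cg->nsyms); /* builder table and CG table stay in lockstep */
    cg->names[cg->nsyms++] = name;
    return id;
}

/* Replay every known symbol into a fresh builder as an undefined external,
 * so each one keeps the id it had in the previous builder.
 * Returns 0 on success, -1 if new_ob is not empty or the ids diverge. */
static int toy_cg_swap_obj(ToyCg *cg, ToyObjBuilder *new_ob) {
    if (new_ob->nsyms != 0) return -1;
    for (size_t i = 0; i < cg->nsyms; i++) {
        ToySymId id = toy_ob_add(new_ob, cg->names[i], 0);
        if (id != i) return -1;
    }
    cg->ob = new_ob;
    return 0;
}
```

In this model, a symbol defined while compiling the first snippet shows up in the second builder under the same id but marked undefined, which corresponds to the external reference the plan expects the JIT linker to resolve. Symbols defined in the second snippet continue numbering after the replayed ones.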