kit

git clone https://git.ryansepassi.com/git/kit.git

commit 633475291878a3af13d9e73ba962a37590ae3826
parent 0a585d8c2f0c9c8b47510de3a00305b74b0464e3
Author: Ryan Sepassi <rsepassi@gmail.com>
Date:   Fri, 29 May 2026 15:15:41 -0700

toy: make the REPL frontend compile transactional

Split durable Toy state (ToyModule: functions, globals, type tables,
counters) from the per-compile ToyParser, and run each compile as a
transaction over the module: watermark the append-only tables, stage in
place, and on failure roll back to the watermarks plus a typed undo
journal for forward-declaration completions. The session drives the
rollback on both the diagnostic-error and compiler_panic (longjmp)
paths, so a failed snippet leaves persistent state untouched and the
next valid snippet runs normally. The old `poisoned` latch is gone.
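
The watermark-plus-journal rollback described above can be sketched as follows. This is a minimal illustration, not the code in lang/toy: `Module`, `Decl`, `Txn`, and the `txn_*` helpers are hypothetical stand-ins for `ToyModule` and `toy_txn_begin`/`toy_txn_abort`/`toy_txn_commit`, and fixed-size arrays replace the real growable tables.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real ToyModule tables hold richer records. */
enum { CAP = 8, JOURNAL_CAP = 8 };

typedef struct { int complete; int size; } Decl;       /* one durable entry */
typedef struct { Decl items[CAP]; size_t n; } Module;  /* append-only table */

typedef struct { size_t index; Decl saved; } UndoRec;  /* snapshot before mutation */
typedef struct {
  Module* m;
  size_t watermark;              /* m->n captured at transaction start */
  UndoRec journal[JOURNAL_CAP];
  size_t njournal;
  int open;
} Txn;

static void txn_begin(Txn* t, Module* m) {
  t->m = m;
  t->watermark = m->n;
  t->njournal = 0;
  t->open = 1;
}

/* Appends past the watermark need no journal record; truncation undoes them. */
static int txn_append(Txn* t, Decl d) {
  if (t->m->n == CAP) return 0;
  t->m->items[t->m->n++] = d;
  return 1;
}

/* Complete an entry in place; journal it only if it was already committed. */
static int txn_complete(Txn* t, size_t i, int size) {
  assert(i < t->m->n);
  if (i < t->watermark) {
    if (t->njournal == JOURNAL_CAP) return 0;
    t->journal[t->njournal].index = i;
    t->journal[t->njournal].saved = t->m->items[i];
    t->njournal++;
  }
  t->m->items[i].complete = 1;
  t->m->items[i].size = size;
  return 1;
}

static void txn_abort(Txn* t) {
  if (!t->open) return;                 /* idempotent */
  while (t->njournal > 0) {             /* replay the journal in reverse */
    UndoRec* r = &t->journal[--t->njournal];
    t->m->items[r->index] = r->saved;
  }
  t->m->n = t->watermark;               /* drop the staged appends */
  t->open = 0;
}

static void txn_commit(Txn* t) {
  if (!t->open) return;                 /* idempotent */
  t->njournal = 0;                      /* appends are already in place */
  t->open = 0;
}
```

In this sketch commit only disarms the transaction, while abort is the one operation that changes the table, which is the property the message relies on for the longjmp path.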

Per-object CfreeCgSym handles are no longer durable identity: the module
holds only names/types/attrs, and a per-compile symbol environment
(fn_syms/global_syms) maps each declaration to its handle in the current
object, seeded by replaying committed declarations.
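
As a rough illustration of the per-compile symbol environment, the sketch below keeps only names in the durable table and rebuilds an index-to-handle map for each object. All names here (`Module`, `Object`, `SymEnv`, `object_declare`, `env_seed`, `env_lookup`) are hypothetical; the real code calls `cfree_cg_decl` and stores `CfreeCgSym` values in `fn_syms`/`global_syms`.

```c
#include <assert.h>
#include <stddef.h>

enum { MAX_DECLS = 8 };
typedef unsigned Sym;            /* stand-in for a per-object handle */
enum { SYM_NONE = 0 };

typedef struct { const char* names[MAX_DECLS]; size_t n; } Module; /* durable */
typedef struct { unsigned next_sym; } Object;                      /* one per compile */
typedef struct { Sym syms[MAX_DECLS]; size_t n; } SymEnv;          /* one per compile */

/* Hypothetical declare call: hands out a handle valid only for this object. */
static Sym object_declare(Object* o, const char* name) {
  (void)name;
  return ++o->next_sym;
}

/* Replay every committed declaration into the current object. */
static int env_seed(SymEnv* env, const Module* m, Object* o) {
  size_t i;
  env->n = 0;
  for (i = 0; i < m->n; ++i) {
    Sym s = object_declare(o, m->names[i]);
    if (s == SYM_NONE) return 0;
    env->syms[env->n++] = s;
  }
  return 1;
}

/* Resolve a declaration index to its handle in the current object. */
static Sym env_lookup(const SymEnv* env, size_t decl_index) {
  assert(decl_index < env->n);
  return env->syms[decl_index];
}
```

Because the module never stores a handle, discarding the environment after a failed compile leaves nothing to undo in durable state.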

Commit is gated on JIT-publish success via a new compile-session API
(cfree_compile_session_stage/commit/abort plus optional frontend vtable
hooks), so a publish rejection (e.g. a duplicate global on redefinition)
rolls the frontend back instead of advertising a symbol the image lacks.
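
The publish-gated flow can be sketched with a toy session. The `Session` type and the `session_*` and `run_snippet` functions are hypothetical and only model the control flow; the real entry points are `cfree_compile_session_stage`, `cfree_compile_session_commit`, and `cfree_compile_session_abort`, and `publish_ok` stands in for the link and JIT-publish result.

```c
#include <assert.h>

/* Hypothetical model of a session with one pending frontend transaction. */
typedef struct { int committed; int pending; } Session;

/* Stage: on success the transaction stays open; on failure nothing is pending. */
static int session_stage(Session* s, int compile_ok) {
  if (!compile_ok) return 0;
  s->pending = 1;
  return 1;
}

static void session_commit(Session* s) {
  if (!s->pending) return;       /* idempotent */
  s->committed += 1;
  s->pending = 0;
}

static void session_abort(Session* s) {
  s->pending = 0;                /* idempotent */
}

/* Driver flow: stage, then publish, then commit or abort. Returns 1 when the
 * snippet's declarations became durable. */
static int run_snippet(Session* s, int compile_ok, int publish_ok) {
  assert(s->pending == 0);       /* no transaction may be left open */
  if (!session_stage(s, compile_ok)) return 0;
  if (!publish_ok) {
    session_abort(s);            /* publish rejected the object */
    return 0;
  }
  session_commit(s);
  return 1;
}
```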

Also:
- compile session: drop the synthetic "frontend failed" fatal; frontend
  diagnostic failures return CFREE_ERR while internal panics still
  longjmp through the existing setjmp boundary.
- dbg: advance the $N result counter only on a fully successful expr,
  using a separate monotonic counter for the thunk symbol name.
- link_jit: fix a double-free in cfree_jit_publish's panic handler,
  which freed the borrowed link session without nulling it (mirrors
  link_session_guard's recovery).
- tests: flip toy-error-recovery green and add toy-rollback-{toplevel,
  type,after-define} and toy-redefine-function.
- doc/TOY_TRANSACTIONAL.md records the design and decisions.
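
The `$N` change in the dbg item above amounts to splitting one counter into two. A minimal sketch, assuming a hypothetical `Counters` struct and `eval_expr` helper (the field names `expr_counter` and `expr_attempt` follow the diff to driver/dbg.c; everything else is illustrative):

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
  uint64_t expr_counter;  /* user-visible $N; advances only on success */
  uint64_t expr_attempt;  /* unique thunk-name id; advances every attempt */
} Counters;

/* Returns the $N assigned to this expression, or 0 if the attempt failed.
 * `succeeds` stands in for compile + publish + call all succeeding. */
static uint64_t eval_expr(Counters* c, int succeeds, uint64_t* attempt_out) {
  uint64_t attempt = ++c->expr_attempt;  /* never rolled back */
  assert(attempt != 0);                  /* overflow guard for the sketch */
  if (attempt_out) *attempt_out = attempt;
  if (!succeeds) return 0;               /* result counter is untouched */
  return ++c->expr_counter;
}
```

With this split a failed first expression consumes attempt 1 but the next valid expression is still reported as `$1`.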

Diffstat:
M doc/DBG_TODO.md | 59 ++++++++++++++++++++++++++++++++---------------------------
A doc/TOY_TRANSACTIONAL.md | 368 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
M driver/dbg.c | 25 +++++++++++++++++++++----
M include/cfree/compile.h | 31 +++++++++++++++++++++++++++++++
M lang/c/c.c | 2 ++
M lang/toy/builtins.c | 2 +-
M lang/toy/compile.c | 44 ++++++++++++++++++++++++++++++++------------
M lang/toy/decls.c | 3 ++-
M lang/toy/expr.c | 20 ++++++++++----------
M lang/toy/internal.h | 116 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++---------------
M lang/toy/parser.c | 12 ++++++------
M lang/toy/parser_core.c | 259 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--------------------
M lang/toy/symbols.c | 103 ++++++++++++++++++++++++++++++++++++++++++++-----------------------------------
M lang/toy/types.c | 84 +++++++++++++++++++++++++++++++++++++++++++++++--------------------------------
M lang/wasm/wasm.c | 2 ++
M src/api/compile.c | 92 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-------------------
M src/link/link_jit.c | 7 +++++++
M test/dbg/cases/toy-error-recovery/stderr | 2 +-
D test/dbg/cases/toy-error-recovery/xfail | 1 -
A test/dbg/cases/toy-redefine-function/args | 2 ++
A test/dbg/cases/toy-redefine-function/expected | 3 +++
A test/dbg/cases/toy-redefine-function/stderr | 2 ++
A test/dbg/cases/toy-redefine-function/stdin | 5 +++++
A test/dbg/cases/toy-rollback-after-define/args | 2 ++
A test/dbg/cases/toy-rollback-after-define/expected | 2 ++
A test/dbg/cases/toy-rollback-after-define/stderr | 2 ++
A test/dbg/cases/toy-rollback-after-define/stdin | 4 ++++
A test/dbg/cases/toy-rollback-toplevel/args | 2 ++
A test/dbg/cases/toy-rollback-toplevel/expected | 4 ++++
A test/dbg/cases/toy-rollback-toplevel/stderr | 2 ++
A test/dbg/cases/toy-rollback-toplevel/stdin | 7 +++++++
A test/dbg/cases/toy-rollback-type/args | 2 ++
A test/dbg/cases/toy-rollback-type/expected | 2 ++
A test/dbg/cases/toy-rollback-type/stderr | 2 ++
A test/dbg/cases/toy-rollback-type/stdin | 5 +++++
35 files changed, 1030 insertions(+), 250 deletions(-)

diff --git a/doc/DBG_TODO.md b/doc/DBG_TODO.md @@ -65,8 +65,8 @@ experience is solid. - [x] function calls through expressions with integer/address-like arguments - [x] expression blocks - - [ ] compile error recovery behavior (`toy-error-recovery` red - transcript is in place) + - [x] compile error recovery behavior (`toy-error-recovery` green; a + failed snippet rolls back and the next valid snippet runs as `$1`) - [x] `:language toy` switching from a non-Toy default - [ ] Add debugger-control Toy coverage for: - [x] `b sym`, `run`, `cont` @@ -96,33 +96,38 @@ experience is solid. ## Toy Transactional Frontend State -- [ ] Make each Toy REPL compile a transaction. A failed snippet must leave the - persistent Toy state exactly as it was before the snippet started, then the - next valid expression/top-level snippet should compile and run normally. -- [ ] Split durable Toy frontend state from per-input parser state. Durable state - should own functions, globals, named types, type ids, and counters that - survive successful snippets. Parser state should own the lexer, current +Implemented; see `doc/TOY_TRANSACTIONAL.md` for the design. Commit is gated on +JIT-publish success (not merely compile success), so a publish rejection such +as a duplicate global cannot leave the Toy table advertising a symbol the image +lacks. + +- [x] Make each Toy REPL compile a transaction. A failed snippet leaves the + persistent Toy state exactly as it was before the snippet started, and the + next valid expression/top-level snippet compiles and runs normally. +- [x] Split durable Toy frontend state from per-input parser state. `ToyModule` + owns functions, globals, named types, type ids, and counters that survive + successful snippets; the per-compile `ToyParser` owns the lexer, current token, current CG object, locals, scopes, labels, goto targets, current function return state, diagnostics, and input kind for one compile. 
-- [ ] Stop storing per-object CG symbol handles as durable Toy symbol identity. - Keep function/global names, types, and attrs as persistent declarations, and - build a per-compile symbol map when replaying previous declarations into - the current object. -- [ ] Stage new functions, globals, and named/type-table entries until the full - snippet succeeds. Lookups during the snippet should see both durable state - and staged declarations; commit appends staged entries, rollback frees - them. -- [ ] Make rollback safe for both ordinary diagnostics and compiler panics from - CG/backend code. Do not rely on copying the parser struct by value; array - growth and panic longjmp paths have already made that unsafe. -- [ ] Preserve the existing object/backend boundary: the compile-session layer - already discards the failed `ObjBuilder`, and dbg only publishes after a - successful compile. The missing transaction is Toy frontend state, not JIT - publication. -- [ ] Consider changing the compile-session error contract so frontend - diagnostic failures return `CFREE_ERR` without an extra fatal - `frontend failed` diagnostic. Internal compiler panics should still use the - panic path. +- [x] Stop storing per-object CG symbol handles as durable Toy symbol identity. + Function/global names, types, and attrs are persistent declarations; the + per-compile symbol environment (`fn_syms`/`global_syms`) maps each to its + `CfreeCgSym` in the current object when replaying previous declarations. +- [x] Stage new functions, globals, and named/type-table entries until the full + snippet succeeds. Lookups see both durable and staged entries (single + append-only tables); commit keeps the staged appends, rollback truncates + them to the compile-start watermarks plus an undo journal for the + forward-declaration in-place mutations. +- [x] Make rollback safe for both ordinary diagnostics and compiler panics from + CG/backend code. 
The session drives `abort` on the soft-error and panic + (longjmp) paths; rollback uses watermarks + a typed undo journal, never a + copy of the parser struct. +- [x] Preserve the existing object/backend boundary: the compile-session layer + discards the failed `ObjBuilder`, and dbg only publishes after a + successful compile. Frontend commit is gated on publish success. +- [x] Change the compile-session error contract so frontend diagnostic failures + return `CFREE_ERR` without an extra fatal `frontend failed` diagnostic. + Internal compiler panics still use the panic path. ## Shared REPL Work diff --git a/doc/TOY_TRANSACTIONAL.md b/doc/TOY_TRANSACTIONAL.md @@ -0,0 +1,368 @@ +# Transactional Toy frontend & incremental compile + +This document is the design + implementation plan for making the Toy +frontend and the REPL compile/link/publish chain **transactional**, so that +`cfree dbg` compilation and linking are incremental and a failed snippet +leaves persistent state exactly as it was. It implements the +"Toy Transactional Frontend State" section of `doc/DBG_TODO.md`. + +Status: IMPLEMENTED. Breaking API changes were made deliberately; the goal was +a clean codebase that supports this mode well, not a minimal patch. + +Decisions taken (see §8): commit is **publish-gated** (the +`cfree_compile_session_stage`/`commit`/`abort` API), and the durable tables use +the **journaled in-place** rollback model (§3.2). + +Implementation notes: + +- A pre-existing latent double-free in `cfree_jit_publish` was fixed as part of + enabling the publish-failure path: on a panic (e.g. duplicate global) its + handler ran `compiler_run_cleanups`, freeing the borrowed link session's + `Linker`, but did not null the session's `linker`/`image`, so the caller's + `cfree_link_session_free` double-freed. It now nulls them, mirroring + `link_session_guard` (`src/link/link_jit.c`). This is exercised by + `test/dbg/cases/toy-redefine-function`. 
+- `test/dbg/cases/toy-error-recovery` was flipped from xfail to green. Its + stderr golden column was corrected from `2:15` to the parser's actual `2:14` + (verified with a plain `cc` compile of the identical wrapper); the old value + was an unvalidated xfail guess. +- The panic-path rollback shares `toy_txn_abort` with the soft-error path and + is reasoned correct; the duplicate-global publish panic exercises the session + `abort` path. A dedicated transcript for a CG-internal panic *during parse* is + not included because there is no deterministic toy construct that triggers one + while remaining a recoverable REPL session. + +## 1. Problem + +`cfree dbg` runs one long-lived `CfreeCompileSession` per language +(`driver/dbg.c:1821-1845`, cached in `s->compile_sessions[lang]`). For Toy, +that session owns a single heap-resident `ToyParser` that accumulates all +REPL declarations across snippets (`lang/toy/compile.c:6-11`). Two things +are wrong: + +1. **No rollback.** A snippet mutates the durable parser arrays *in place* + during the parse. If the snippet then fails — a diagnostic *or* a + `compiler_panic` longjmp out of the CG/backend — the durable state is left + half-mutated. The frontend papers over this with a `poisoned` latch + (`compile.c:152,188,195`) that kills the frontend permanently after the + first error. That is exactly why `test/dbg/cases/toy-error-recovery` is + xfail. + +2. **Per-object handles stored as durable identity.** `ToyFn.sym` / + `ToyGlobal.sym` hold a `CfreeCgSym` that is only valid for one `CfreeCg` + object (`cfree_cg_decl(CfreeCg*, …)`, `cg.h:400`). `toy_seed_repl_symbols` + rewrites every one of them on every compile (`compile.c:88-122`). Identity + and per-object handle are conflated in the durable struct. 
+ +The object/link/JIT layers are **already transactional**: a failed compile +discards the per-compile `ObjBuilder` (`src/api/compile.c:270-279`) and the +JIT image is only touched by `cfree_jit_publish` after success, which +preflights duplicate/undefined symbols before mutating +(`src/link/link_jit.c:949-1014`). The missing transaction is Toy frontend +state plus one driver counter. + +## 2. Key facts (from investigation) + +Lifetimes (verified in `include/cfree/cg.h`): + +| Handle | Scope | Notes | +| --- | --- | --- | +| `CfreeCgSym` (`cfree_cg_decl(CfreeCg*)`) | **per-object** | invalid after `cfree_cg_free`; must not be durable identity | +| `CfreeCgTypeId` (`cfree_cg_type_*(CfreeCompiler*)`) | **compiler-durable** | safe to store across compiles | +| `ToyTypeId` (index into `type_table.types`) | **durable identity** | table is append-only + dedup'd, never compacted → indices stable | +| `CfreeCgLabel` (`cfree_cg_label_new(CfreeCg*)`) | per-object | per-compile | + +Durability boundary (`parser_core.c:112-114`, `toy_parser_reinit`): resets +`nvars/nscopes/nlabels` only. The durable cross-snippet surface is exactly +four append-only arrays: + +- `fns` / `nfns` — appended by `toy_add_fn_typed` (`decls.c:461`) +- `globals` / `nglobals` — appended by `toy_add_global_typed` (`data.c:78`) +- `type_table.types` — appended by `toy_type_add` via + `toy_type_register_*` and **via `toy_type_from_cg` on a miss** (`types.c:444`) +- `type_table.named` — appended/`toy_add_named_type` (`types.c:761`) + +`vars`/`scopes`/`labels`/`goto_targets` are per-compile scratch and need no +transaction. A top-level REPL `let` is **not** a `ToyVar`: `jit { … }` and +bare `{ … }` strip braces and feed the body as `REPL_TOPLEVEL` +(`driver/dbg.c:1981-2034,3034`), parsed by `toy_parse_program` → +`toy_parse_global_var` → a **durable global** (`parser.c:1830`, +`data.c:78`). That is why `value` persists in `toy-empty-repl`. 
+ +In-place mutations of (possibly committed) durable entries — these break a +naive truncate-only rollback: + +- `toy_add_named_type` completing a forward-declared named type + (`types.c:773-779`): overwrites `existing->type/toy_type/kind/base_type`. +- `toy_type_register_named_record` upgrading an incomplete record + (`types.c:520-528`): writes `type->cg` in place. +- `toy_set_named_type_fields` / `toy_set_named_type_enum_values` allocate the + entry's `fields`/`enum_values` arrays (`types.c:796-819`). + +Within one snippet the mutated entry is itself newly-added (safe to drop on +rollback). The only hard case is completing in snippet B a record that was +forward-declared (committed) in snippet A, then B fails. + +Rollback primitive (`src/core/core.c:100-129`): `compiler_defer(c, fn, arg)` +pushes a LIFO cleanup (allocated in `c->scratch`, never reset mid-compile); +`compiler_undefer` unlinks it; `compiler_run_cleanups` fires all of them and +is called by every `setjmp` landing pad on panic (`src/api/compile.c:204-208`) +and at `compiler_fini`. `toy_error` only sets `has_error` and emits a +non-fatal diag — it does **not** longjmp (`parser_core.c:254-261`); only +`compiler_panic`/`cfree_frontend_fatal` longjmp. The established pattern is +`src/link/link.c:87` (defer on construct) / `:98` (undefer on clean free). + +Publish can reject a clean compile: `cfree_jit_publish` with +`APPEND_OBJECTS` fails on a duplicate strong global +(`src/link/link_jit.c:968`) or undefined reference — checks that live in the +JIT layer, not the frontend. The appended `ObjBuilder` is **borrowed** by the +image (`dbg_objs_owned=0`, `link_jit.c:1386-1387`) and intentionally leaks +until `cfree_jit_free`; it must not be freed on the success path. + +Driver `$N`: `id = ++s->expr_counter` runs **before** the compile +(`driver/dbg.c:2195`) and is not rolled back on the `goto out` failure path, +so a failed expression still burns a `$N`. 
The golden requires the failed +`1 +` to leave numbering at `$1` for the next valid input. + +## 3. Design + +### 3.1 Split durable state from per-compile state + +Introduce two structs in place of the monolithic `ToyParser`: + +- **`ToyModule`** (durable, owned by `ToyFrontend`, lives for the session): + the four tables (`fns`, `globals`, `type_table.types`, `type_table.named`) + holding **declaration metadata only** — names, `CfreeCgTypeId`, + `ToyTypeId`, attrs, params, mutability, variadic, kind. **No `CfreeCgSym`.** + Plus durable counters (`static_counter`) and the builtin type + registrations (registered once). + +- **`ToyParser`** (per-compile): lexer, `cur`, `CfreeCg* cg`, builtin-type + cache, `vars`/`scopes`/`labels`/`goto_targets`, `cur_fn_ret*`, `diag`, + `input_name`, `file_id`, `has_error`, island/tail-call scalars, + `input_kind`, plus the **transaction state** (§3.2) and the **per-compile + symbol environment** (§3.3), and a borrowed pointer to the `ToyModule`. + +The ~200 helper signatures in `internal.h` keep taking `ToyParser*`; inside, +durable lookups/appends go through `p->module`. This is wide but mechanical. + +### 3.2 Transaction: journaled append + targeted undo + +Each durable table mutates **in place** during the parse (so lookups stay a +single reverse scan — no two-level lookup), guarded by a per-compile undo +journal owned by the `ToyParser`: + +- At compile start, capture a **watermark** (`nfns`, `nglobals`, `ntypes`, + `named.count`). +- **Appends** past the watermark need no per-record journaling; rollback + truncates each table back to its watermark and frees the per-element + allocations of the dropped tail (`ToyFn.params/toy_params`, + `ToyType.params`, `ToyNamedType.fields/enum_values`). +- **In-place mutation of a committed entry** (index < watermark) — only the + forward-decl-completion sites above — pushes a typed **undo record** that + snapshots the entry by value *before* the mutation. 
Rollback restores the + saved bytes and frees any sub-array the mutation allocated. +- `toy_type_from_cg`'s append-on-miss is covered by the `ntypes` watermark. + +Operations: + +- `toy_txn_begin(p)`: record watermarks, `compiler_defer(toy_txn_abort, p)`, + mark `txn_open`. +- `toy_txn_abort(p)` (idempotent on `!txn_open`): replay undo journal in + reverse, truncate tables to watermarks freeing dropped tails, free the + per-compile sym env, clear `txn_open`. Used both as the deferred + panic cleanup and the explicit abort. +- `toy_txn_commit(p)` (idempotent): the in-place appends are already durable, + so commit just `compiler_undefer`s the cleanup, discards the journal, and + clears `txn_open`. + +Because mutations are applied in place and the *rollback* is the deferred +cleanup, a `compiler_panic` longjmp out of the CG layer fires +`toy_txn_abort` automatically (`run_cleanups`), leaving `ToyModule` +pristine. Commit is a pointer-free disarm; abort frees only per-compile +memory. This avoids the copy-by-value parser hazard documented at +`compile.c:173-178`. + +> Alternative considered: isolated staging arrays with commit-appends (the +> literal phrasing in DBG_TODO). Rejected as the primary design because it +> forces two-level lookups and complicates `ToyTypeId` index stability for +> no behavioral gain; the journaled-in-place model gives identical observable +> semantics with simpler lookups. + +### 3.3 Per-compile symbol environment + +The `ToyModule` holds no `CfreeCgSym`. The `ToyParser` holds per-compile +parallel arrays `fn_syms[fn_index]` / `global_syms[global_index]` sized to the +module table length and grown in lockstep as new fns/globals are appended. + +- **Seed**: at compile start, for each committed fn/global, `cfree_cg_decl` + it as an external `SB_GLOBAL` into the current `CfreeCg` and store the + returned `CfreeCgSym` in the env (this replaces `toy_seed_repl_symbols` + writing into durable structs). 
+- **New decls**: `toy_parse_fn`/`toy_parse_global_var` append metadata to the + module *and* push the defining `CfreeCgSym` into the env. +- **Lookups**: `toy_find_fn`/`toy_find_global`/`toy_find_decl_sym` return a + small ref `{ const ToyFnDecl* decl; CfreeCgSym sym; }` (sym resolved from + the env by index). Emission sites (`expr.c`, `builtins.c`, `decls.c`, + `symbols.c` push/addr/call) use `ref.sym`. + +On rollback the env is per-compile and freed; the module never carried syms, +so there is nothing to undo there. + +### 3.4 Commit gating: publish-success (recommended) + +A snippet's durable commit is gated on the **whole** compile→link→publish +chain succeeding, not just compile, because publish can reject a clean +compile (duplicate global / undefined ref, `link_jit.c:968`). Otherwise Toy +would advertise a symbol the JIT does not have — breaking the next snippet's +seed/lookup. + +This requires the session/driver to drive the transaction explicitly. New +compile-session surface (breaking): + +```c +/* Batch one-shot (cc/as): compile and auto-commit on success. */ +CfreeStatus cfree_compile_session_compile(CfreeCompileSession*, + const CfreeSourceInput*, + CfreeObjBuilder** out); + +/* REPL: compile and leave the frontend transaction OPEN on success. */ +CfreeStatus cfree_compile_session_stage(CfreeCompileSession*, + const CfreeSourceInput*, + CfreeObjBuilder** out); +void cfree_compile_session_commit(CfreeCompileSession*); /* idempotent */ +void cfree_compile_session_abort(CfreeCompileSession*); /* idempotent */ +``` + +`CfreeFrontendVTable` gains optional `commit`/`abort` hooks (NULL ⇒ no-op for +asm/c/wasm). The session routes `compile` = `stage` + auto +`commit`/`abort`; `stage` leaves the txn open on `CFREE_OK`. The driver +(`dbg_jit_compile_append_ex`) calls `stage`, then link+publish, then +`commit` on full success or `abort` on any failure. 
`commit`/`abort` are +idempotent (guarded by `txn_open`) so the panic-fired abort and an explicit +abort never double-run. + +> Simpler alternative: keep `cfree_compile_session_compile` committing at +> compile-success and add only `abort` for the driver to call on publish +> failure (undo of an already-committed txn). Rejected: undoing a committed +> txn means the journal must outlive commit, which is strictly more complex +> and reintroduces a partial-commit window. Publish-gating makes commit the +> trivial disarm and abort the only state-changer. + +### 3.5 Compile-session error contract + +`compile_frontend_state_into` currently `compiler_panic`s +("frontend failed for input", `src/api/compile.c:304`) whenever the frontend +returns non-OK — a synthetic panic for what is an ordinary diagnostic +failure (the only occurrence of that string in the repo; nothing asserts on +it). Change it so a frontend `CFREE_ERR` propagates as `CFREE_ERR` with **no** +synthetic panic and **no** `obj_finalize` on the failed builder. Genuine +internal failures still `compiler_panic` from inside `vtable->compile` (CG +layer) and are caught by `cfree_frontend_compile`'s `setjmp`, which runs +cleanups (firing `toy_txn_abort`) and returns `CFREE_ERR`. This cleanly +separates "diagnostics emitted, fail softly" from "invariant broken, unwind". + +### 3.6 Driver `$N` counter + +Decouple the internal thunk symbol name from the user-visible result number: + +- A monotonic **attempt** counter names the thunk (`__cfree_dbg_expr_<attempt>`) + — must be unique per attempt for lookup; never rolled back. +- A separate **result** counter (`$N`) advances only after `stage` + publish + + call all succeed. On any failure the result counter is untouched. + +This makes the failed `1 +` leave `$1` for the next valid input, matching the +golden. + +## 4. 
Public API changes (breaking) + +- `include/cfree/compile.h`: add `cfree_compile_session_stage`, + `cfree_compile_session_commit`, `cfree_compile_session_abort`; document + `cfree_compile_session_compile` as auto-commit. +- `CfreeFrontendVTable`: add `commit`/`abort` hooks (optional). +- `lang/toy/`: `ToyParser` split into `ToyModule` + per-compile `ToyParser`; + `ToyFn`/`ToyGlobal` lose their `sym` field (→ `ToyFnDecl`/`ToyGlobalDecl`); + `toy_find_fn`/`toy_find_global`/`toy_find_decl_sym` return refs carrying the + per-compile sym; `toy_seed_repl_symbols` becomes the env seeder; remove + `ToyFrontend.poisoned`. +- All in-tree callers of `cfree_compile_session_compile` (cc/as drivers) keep + working unchanged (auto-commit), so churn is limited to the dbg driver. + +## 5. Implementation phases (TDD, red→green) + +Each phase keeps the tree building and the full suite green except the one +target xfail being flipped. + +1. **Harness/contract first.** Confirm `toy-error-recovery` is red for the + right reason; assert the desired stderr (no "frontend failed" line). Keep + it xfail until phase 6. +2. **Session error contract** (§3.5). Drop the synthetic "frontend failed" + panic; `CFREE_ERR` propagates cleanly; skip `obj_finalize` on failure. + No behavior change yet for poisoned Toy (still latches) — verify suite + stays green. +3. **State split** (§3.1). Mechanical extraction of `ToyModule`; durable + helpers read `p->module`. No transaction yet; behavior identical. Green. +4. **Per-compile sym env** (§3.3). Remove `sym` from durable structs; seed + into the env; refs at lookups/emission. Behavior identical. Green. +5. **Transaction** (§3.2) + **vtable commit/abort + session stage/commit/abort** + (§3.4). Arm `toy_txn_begin`/`abort`/`commit`; remove `poisoned`. +6. **Driver wiring** (§3.4, §3.6). `dbg_jit_compile_append_ex` uses + `stage` + commit/abort; fix the `$N` counters. **Flip + `toy-error-recovery` to green.** +7. **New coverage** (§6). 
Add the transactional cases below; run red→green. +8. **Cleanup pass.** Remove dead code (`toy_add_fn`/`toy_add_global` + non-typed are never called), update `doc/DBG_TODO.md` checkboxes. + +## 6. Test plan + +Preserve (must stay green): `toy-persistent-repl`, `toy-empty-repl`, +`toy-expr-scalar`, `toy-expr-call`, `toy-expr-block`, `toy-repl-source-list`, +all debugger-control cases. + +Flip to green: `toy-error-recovery` (remove xfail). + +Remain xfail (out of scope — separate feature): `toy-structured-expr`. + +Add (new `test/dbg/cases`, each red first): + +- `toy-rollback-toplevel` — failed `jit { fn bad( }` then a good fn compiles + and is callable; the bad fn is absent. +- `toy-rollback-type` — failed snippet that declares a record then errors + leaves the type table clean; re-declaring the record later works. +- `toy-rollback-after-define` — define `f`, a failing snippet that references + `f` plus a syntax error still leaves `f` callable. +- `toy-redefine-function` — defining `twice` twice: publish rejects the + duplicate, and the Toy table is **not** left advertising the rejected + second definition (exercises publish-gated commit). Encodes the chosen + redefine semantics explicitly. +- A direct/unit check (where feasible) that a `compiler_panic` mid-parse + rolls back durable state (panic-path abort), per DBG_TODO robustness items. + +## 7. Risks & mitigations + +- **longjmp double-free** (`compile.c:173-178`): mutate the heap-resident + parser in place; never copy `ToyParser`/its arrays by value; rollback uses + watermarks + typed undo records, not pointer copies. +- **Pointer-into-array invalidation**: `toy_parser_reserve` reallocs; never + cache a `ToyFnDecl*`/`ToyNamedType*` across an append to the same table — + re-resolve by index, or pre-reserve before handing out pointers. 
+- **Forgot-to-undefer / double-abort**: `txn_open` flag makes commit/abort + idempotent; ensure every return path through `vtable->compile` leaves the + txn in a defined state (open on OK, the deferred cleanup covers panic). +- **`compiler_defer` OOM** returns NULL: treat as "rollback not armed" and + fail the compile rather than proceed unguarded. +- **Borrowed ObjBuilder**: do not free the appended `ObjBuilder` on publish + success; the image and debug view borrow it. +- **Seed resurrecting rolled-back globals**: rollback truncates the module + before the next compile's seed runs, so seed never sees dropped decls. + +## 8. Open decisions (for confirmation) + +1. **Commit gating / API shape** — publish-gated `stage`+`commit`/`abort` + (recommended, atomic end-to-end, fixes redefine inconsistency) vs + compile-gated + `abort`-only (smaller API, leaves a window). +2. **Redefine semantics** — with publish-gated commit, redefining a function + currently fails at publish (duplicate strong global). Options for a + follow-up: (a) keep failing with a clear message; (b) teach the REPL to + supersede a prior definition (hot-reload territory, `doc/HOT_RELOAD.md`). + This plan only guarantees the table stays consistent on the failure. 
diff --git a/driver/dbg.c b/driver/dbg.c @@ -399,7 +399,8 @@ typedef struct DbgState { uint32_t nbps; uint32_t bps_cap; int next_bp_id; - uint64_t expr_counter; + uint64_t expr_counter; /* user-visible $N; advances only on a successful expr */ + uint64_t expr_attempt; /* unique thunk-symbol id; advances every attempt */ uint64_t source_counter; int has_stop; @@ -1932,7 +1933,10 @@ static int dbg_jit_compile_append_ex(DbgState* s, CfreeLanguage lang, sin.input_kind = input_kind; sin.repl_entry_name = cfree_slice_cstr(repl_entry_name); st = dbg_compile_session_for(s, lang, &session); - if (st == CFREE_OK) st = cfree_compile_session_compile(session, &sin, &ob); + /* Stage the compile: on success the frontend's durable declarations are left + * pending so we only commit them once the object has actually been published + * into the JIT image. A failed compile is already rolled back internally. */ + if (st == CFREE_OK) st = cfree_compile_session_stage(session, &sin, &ob); if (st != CFREE_OK || !ob) { if (ob) cfree_obj_builder_free(ob); driver_errf(DBG_TOOL, "jit compile failed"); @@ -1952,9 +1956,15 @@ static int dbg_jit_compile_append_ex(DbgState* s, CfreeLanguage lang, cfree_link_session_free(link); } if (st != CFREE_OK) { + /* Publish rejected the object (e.g. a duplicate global). Roll back the + * staged declarations so the frontend never advertises a symbol the JIT + * image does not have. */ + cfree_compile_session_abort(session); driver_errf(DBG_TOOL, "jit append failed"); return 1; } + /* Published: make the staged declarations durable. 
*/ + cfree_compile_session_commit(session); if (generated_source_name) s->source_counter = source_id; if (lang == CFREE_LANG_TOY && input_kind == CFREE_FRONTEND_INPUT_REPL_TOPLEVEL && @@ -2201,6 +2211,7 @@ static void dbg_cmd_expr(DbgState* s, const char* expr) { size_t prefix_len; void* entry; uint64_t ret = 0; + uint64_t attempt; uint64_t id; int is_block = 0; @@ -2252,8 +2263,13 @@ static void dbg_cmd_expr(DbgState* s, const char* expr) { } } - id = ++s->expr_counter; - num_len = dbg_u64_dec(num, sizeof(num), id); + /* The thunk symbol name uses a monotonic per-attempt counter so each + * compiled thunk gets a unique linkage name even across failed attempts. + * The user-visible result number ($N) is a separate counter advanced only + * after a full success (compile + publish + call), so a failed snippet does + * not consume a result number. */ + attempt = ++s->expr_attempt; + num_len = dbg_u64_dec(num, sizeof(num), attempt); if (!num_len) { driver_errf(DBG_TOOL, "expression counter overflow"); goto out; @@ -2285,6 +2301,7 @@ static void dbg_cmd_expr(DbgState* s, const char* expr) { goto out; } if (dbg_call_u64_entry(s, entry, NULL, 0, &ret) == 0) { + id = ++s->expr_counter; driver_printf("$%llu = %llu (0x%llx)\n", (unsigned long long)id, (unsigned long long)ret, (unsigned long long)ret); } diff --git a/include/cfree/compile.h b/include/cfree/compile.h @@ -69,6 +69,10 @@ typedef CfreeStatus (*CfreeFrontendCompileFn)( CfreeFrontendState*, const CfreeFrontendCompileOptions*, const CfreeSourceInput*, CfreeObjBuilder* out); typedef void (*CfreeFrontendFreeFn)(CfreeFrontendState*); +/* Transaction hooks for frontends with durable cross-compile state (REPL + * declarations). See the `commit`/`abort` fields below. 
*/ +typedef void (*CfreeFrontendCommitFn)(CfreeFrontendState*); +typedef void (*CfreeFrontendAbortFn)(CfreeFrontendState*); typedef struct CfreeFrontendVTable { CfreeFrontendNewFn new_frontend; @@ -83,6 +87,20 @@ typedef struct CfreeFrontendVTable { * `"s"` entry. */ const CfreeSlice* extensions; uint32_t nextensions; + + /* Optional transaction hooks for incremental/REPL frontends. A compile + * stages new durable declarations; `commit` makes the most recent + * successful compile's declarations permanent, `abort` discards them and + * restores the frontend to its pre-compile state. Both must be idempotent + * (a no-op when there is nothing pending). NULL for frontends without + * durable cross-compile state (asm, c, wasm). The compile session drives + * them: `abort` fires on a failed or panicking compile; + * cfree_compile_session_compile auto-commits on success, while + * cfree_compile_session_stage leaves the transaction open for an explicit + * cfree_compile_session_commit / cfree_compile_session_abort once the + * caller has linked and published the object. */ + CfreeFrontendCommitFn commit; + CfreeFrontendAbortFn abort; } CfreeFrontendVTable; CFREE_API CfreeLanguage cfree_language_for_path(CfreeCompiler*, @@ -101,9 +119,22 @@ typedef struct CfreeCompileSessionOptions { CFREE_API CfreeStatus cfree_compile_session_new(CfreeCompiler*, const CfreeCompileSessionOptions*, CfreeCompileSession** out); +/* Compile and auto-commit on success (batch path: cc/as). On failure the + * frontend transaction is rolled back and *out is left NULL. */ CFREE_API CfreeStatus cfree_compile_session_compile(CfreeCompileSession*, const CfreeSourceInput*, CfreeObjBuilder** out); +/* Compile but leave the frontend transaction OPEN on success, so the caller + * can link and publish the resulting object before deciding the outcome. 
+ * Follow a successful stage with exactly one of cfree_compile_session_commit
+ * (the object was published) or cfree_compile_session_abort (publish failed).
+ * On a failed compile the transaction is already rolled back and *out is
+ * NULL, so neither call is needed. */
+CFREE_API CfreeStatus cfree_compile_session_stage(CfreeCompileSession*,
+                                                  const CfreeSourceInput*,
+                                                  CfreeObjBuilder** out);
+CFREE_API void cfree_compile_session_commit(CfreeCompileSession*);
+CFREE_API void cfree_compile_session_abort(CfreeCompileSession*);
 CFREE_API void cfree_compile_session_free(CfreeCompileSession*);
 typedef struct CfreeDepIter CfreeDepIter;
diff --git a/lang/c/c.c b/lang/c/c.c
@@ -141,4 +141,6 @@ static void c_frontend_free(CfreeFrontendState* frontend) {
  * `c`/`h` here would only duplicate that fallback. */
 const CfreeFrontendVTable cfree_c_frontend_vtable = {
     c_frontend_new, c_frontend_compile, c_frontend_free, NULL, 0,
+    /* commit/abort: C has no durable cross-compile state yet */
+    NULL, NULL,
 };
diff --git a/lang/toy/builtins.c b/lang/toy/builtins.c
@@ -400,7 +400,7 @@ static CfreeCgTypeId toy_parse_call_builtin(ToyParser* p) {
   }
   if (direct)
-    cfree_cg_call_symbol(p->cg, fn->sym, (uint32_t)nargs, attrs);
+    cfree_cg_call_symbol(p->cg, toy_fn_cur_sym(p, fn), (uint32_t)nargs, attrs);
   else
     cfree_cg_call(p->cg, (uint32_t)nargs, fn_ty, attrs);
   p->last_type = ret_toy_type;
diff --git a/lang/toy/compile.c b/lang/toy/compile.c
@@ -5,9 +5,9 @@
 typedef struct ToyFrontend {
   CfreeCompiler* c;
-  ToyParser parser;
+  ToyModule module; /* durable cross-snippet declarations */
+  ToyParser parser; /* per-compile; borrows &module */
   int parser_live;
-  int poisoned;
 } ToyFrontend;
 static int toy_buf_append(ToyFrontend* fe, char** buf, size_t* len, size_t* cap,
@@ -87,8 +87,8 @@ oom:
 static int toy_seed_repl_symbols(ToyParser* p) {
   size_t i;
-  for (i = 0; i < p->nfns; ++i) {
-    ToyFn* fn = &p->fns[i];
+  for (i = 0; i < p->module->nfns; ++i) {
+    ToyFn* fn = &p->module->fns[i];
     CfreeCgDecl decl;
     CfreeCgSym sym;
     memset(&decl, 0, sizeof decl);
@@ -101,10 +101,10 @@ static int toy_seed_repl_symbols(ToyParser* p) {
     decl.as.func = fn->func_attrs;
     sym = cfree_cg_decl(p->cg, decl);
     if (sym == CFREE_CG_SYM_NONE) return 0;
-    fn->sym = sym;
+    if (!toy_fn_set_cur_sym(p, i, sym)) return 0;
   }
-  for (i = 0; i < p->nglobals; ++i) {
-    ToyGlobal* g = &p->globals[i];
+  for (i = 0; i < p->module->nglobals; ++i) {
+    ToyGlobal* g = &p->module->globals[i];
     CfreeCgDecl decl;
     CfreeCgSym sym;
     memset(&decl, 0, sizeof decl);
@@ -117,7 +117,7 @@ static int toy_seed_repl_symbols(ToyParser* p) {
     decl.as.object = g->object_attrs;
     sym = cfree_cg_decl(p->cg, decl);
     if (sym == CFREE_CG_SYM_NONE) return 0;
-    g->sym = sym;
+    if (!toy_global_set_cur_sym(p, i, sym)) return 0;
   }
   return 1;
 }
@@ -149,7 +149,6 @@ static CfreeStatus toy_frontend_compile(CfreeFrontendState* frontend,
   size_t owned_source_cap = 0;
   if (!fe || !fe->c || !opts || !input || !out) return CFREE_INVALID;
-  if (fe->poisoned) return CFREE_ERR;
   c = fe->c;
   (void)opts->language_options; /* toy frontend has no per-language options */
@@ -163,7 +162,8 @@ static CfreeStatus toy_frontend_compile(CfreeFrontendState* frontend,
   if (st != CFREE_OK) goto done_status;
   if (!fe->parser_live) {
-    toy_parser_init(&fe->parser, c, cg, source, source_len, input->name.s);
+    toy_parser_init(&fe->parser, c, cg, &fe->module, source, source_len,
+                    input->name.s);
     fe->parser.input_kind = opts->input_kind;
     fe->parser_live = 1;
   } else {
@@ -177,6 +177,11 @@ static CfreeStatus toy_frontend_compile(CfreeFrontendState* frontend,
    * pointers that the local `p`'s grow path had freed and replaced —
    * dispose then double-freed via the stale pointer. */
   p = &fe->parser;
+  /* Open the transaction over the durable module before staging anything. A
+   * failed or panicking compile is rolled back by the session via the abort
+   * hook; a successful compile leaves the transaction open for the caller to
+   * commit (after a successful publish) or abort (if publish fails). */
+  toy_txn_begin(p);
   if (opts->input_kind != CFREE_FRONTEND_INPUT_TRANSLATION_UNIT &&
       !toy_seed_repl_symbols(p)) {
     cfree_cg_free(cg);
@@ -185,14 +190,12 @@ static CfreeStatus toy_frontend_compile(CfreeFrontendState* frontend,
   }
   if (!toy_parse_program(p) || p->has_error) {
-    fe->poisoned = 1;
     cfree_cg_free(cg);
     st = CFREE_ERR;
     goto done_status;
   }
   if (p->cur.kind != TOK_EOF) {
     toy_error(p, p->cur.loc, "unexpected token after program end");
-    fe->poisoned = 1;
     cfree_cg_free(cg);
     st = CFREE_ERR;
     goto done_status;
@@ -209,11 +212,24 @@ done_status:
   return st;
 }
+static void toy_frontend_commit(CfreeFrontendState* frontend) {
+  ToyFrontend* fe = (ToyFrontend*)frontend;
+  if (fe && fe->parser_live) toy_txn_commit(&fe->parser);
+}
+
+static void toy_frontend_abort(CfreeFrontendState* frontend) {
+  ToyFrontend* fe = (ToyFrontend*)frontend;
+  if (fe && fe->parser_live) toy_txn_abort(&fe->parser);
+}
+
 static void toy_frontend_free(CfreeFrontendState* frontend) {
   ToyFrontend* fe = (ToyFrontend*)frontend;
   CfreeHeap* h;
   if (!fe) return;
-  if (fe->parser_live) toy_parser_dispose(&fe->parser);
+  if (fe->parser_live) {
+    toy_parser_dispose(&fe->parser);
+    toy_module_dispose(&fe->module, fe->c);
+  }
   h = cfree_compiler_context(fe->c)->heap;
   h->free(h, fe, sizeof(*fe));
 }
@@ -226,4 +242,6 @@ const CfreeFrontendVTable cfree_toy_frontend_vtable = {
     toy_frontend_free,
     toy_extensions,
     (uint32_t)(sizeof toy_extensions / sizeof toy_extensions[0]),
+    toy_frontend_commit,
+    toy_frontend_abort,
 };
diff --git a/lang/toy/decls.c b/lang/toy/decls.c
@@ -483,7 +483,8 @@ int toy_parse_fn(ToyParser* p, int is_extern, int is_pub) {
     return 1;
   }
-  cfree_cg_func_begin_attrs(p->cg, fn_entry->sym, fn_entry->func_attrs);
+  cfree_cg_func_begin_attrs(p->cg, toy_fn_cur_sym(p, fn_entry),
+                            fn_entry->func_attrs);
   p->nvars = 0;
   p->nlabels = 0;
diff --git a/lang/toy/expr.c b/lang/toy/expr.c
@@ -49,7 +49,7 @@ CfreeCgTypeId toy_push_named_rvalue(ToyParser* p, CfreeSym name) {
   {
     ToyGlobal* g = toy_find_global(p, name);
     if (g) {
-      cfree_cg_push_symbol_addr(p->cg, g->sym, 0);
+      cfree_cg_push_symbol_addr(p->cg, toy_global_cur_sym(p, g), 0);
       if (cfree_cg_type_kind(p->c, g->type) != CFREE_CG_TYPE_RECORD)
         cfree_cg_load(p->cg, toy_mem_access(p, g->type),
                       (CfreeCgEffAddr){0, 0});
@@ -61,7 +61,7 @@ CfreeCgTypeId toy_push_named_rvalue(ToyParser* p, CfreeSym name) {
     ToyFn* fn = toy_find_fn(p, name);
     if (fn) {
       CfreeCgTypeId ptr_ty = cfree_cg_type_ptr(p->c, fn->type, 0);
-      cfree_cg_push_symbol_addr(p->cg, fn->sym, 0);
+      cfree_cg_push_symbol_addr(p->cg, toy_fn_cur_sym(p, fn), 0);
       p->last_type = toy_type_register_ptr(p, ptr_ty, fn->toy_type, 0);
       return ptr_ty;
     }
@@ -463,7 +463,7 @@ static int toy_emit_var_addr(ToyParser* p, CfreeSym name) {
   {
     ToyGlobal* g = toy_find_global(p, name);
     if (g) {
-      cfree_cg_push_symbol_addr(p->cg, g->sym, 0);
+      cfree_cg_push_symbol_addr(p->cg, toy_global_cur_sym(p, g), 0);
       return 1;
     }
   }
@@ -479,7 +479,7 @@ CfreeCgTypeId toy_emit_var_lvalue(ToyParser* p, CfreeSym name) {
   {
     ToyGlobal* g = toy_find_global(p, name);
     if (g) {
-      cfree_cg_push_symbol_addr(p->cg, g->sym, 0);
+      cfree_cg_push_symbol_addr(p->cg, toy_global_cur_sym(p, g), 0);
       return g->type;
     }
   }
@@ -739,7 +739,7 @@ static CfreeCgTypeId toy_parse_expr_primary(ToyParser* p) {
       CfreeCgCallAttrs attrs;
       memset(&attrs, 0, sizeof attrs);
       attrs.inline_policy = fn->func_attrs.inline_policy;
-      cfree_cg_call_symbol(p->cg, fn->sym, (uint32_t)nargs, attrs);
+      cfree_cg_call_symbol(p->cg, toy_fn_cur_sym(p, fn), (uint32_t)nargs, attrs);
     }
     p->last_type = fn->toy_ret;
     return fn->ret;
@@ -791,7 +791,7 @@ static CfreeCgTypeId toy_parse_expr_primary(ToyParser* p) {
     ToyGlobal* g = toy_find_global(p, name);
     if (g && (cfree_cg_type_kind(p->c, g->type) == CFREE_CG_TYPE_ARRAY ||
               cfree_cg_type_kind(p->c, g->type) == CFREE_CG_TYPE_RECORD)) {
-      cfree_cg_push_symbol_addr(p->cg, g->sym, 0);
+      cfree_cg_push_symbol_addr(p->cg, toy_global_cur_sym(p, g), 0);
       p->last_type = g->toy_type;
       return g->type;
     }
@@ -808,7 +808,7 @@ static CfreeCgTypeId toy_parse_expr_primary(ToyParser* p) {
   {
     ToyGlobal* g = toy_find_global(p, name);
     if (g && toy_type_is_slice(p, g->toy_type)) {
-      cfree_cg_push_symbol_addr(p->cg, g->sym, 0);
+      cfree_cg_push_symbol_addr(p->cg, toy_global_cur_sym(p, g), 0);
       p->last_type = g->toy_type;
       return g->type;
     }
@@ -1099,7 +1099,7 @@ static CfreeCgTypeId toy_parse_expr_unary(ToyParser* p) {
       ty = v->type;
       ty_toy = v->toy_type;
     } else if (g) {
-      cfree_cg_push_symbol_addr(p->cg, g->sym, 0);
+      cfree_cg_push_symbol_addr(p->cg, toy_global_cur_sym(p, g), 0);
       ty = g->type;
       ty_toy = g->toy_type;
     } else {
@@ -1255,14 +1255,14 @@ static CfreeCgTypeId toy_parse_expr_unary(ToyParser* p) {
     ToyGlobal* g = toy_find_global(p, name);
     if (g) {
       CfreeCgTypeId ptr_ty = cfree_cg_type_ptr(p->c, g->type, 0);
-      cfree_cg_push_symbol_addr(p->cg, g->sym, 0);
+      cfree_cg_push_symbol_addr(p->cg, toy_global_cur_sym(p, g), 0);
       p->last_type = toy_type_register_ptr(p, ptr_ty, g->toy_type, 0);
       return ptr_ty;
     } else {
       ToyFn* fn = toy_find_fn(p, name);
       if (fn) {
         CfreeCgTypeId ptr_ty = cfree_cg_type_ptr(p->c, fn->type, 0);
-        cfree_cg_push_symbol_addr(p->cg, fn->sym, 0);
+        cfree_cg_push_symbol_addr(p->cg, toy_fn_cur_sym(p, fn), 0);
         p->last_type = toy_type_register_ptr(p, ptr_ty, fn->toy_type, 0);
         return ptr_ty;
       }
diff --git a/lang/toy/internal.h b/lang/toy/internal.h
@@ -25,7 +25,6 @@ typedef struct ToyVar {
 typedef struct ToyFn {
   CfreeSym name;
-  CfreeCgSym sym;
   CfreeCgTypeId type;
   ToyTypeId toy_type;
   CfreeCgTypeId ret;
@@ -40,7 +39,6 @@ typedef struct ToyFn {
 typedef struct ToyGlobal {
   CfreeSym name;
-  CfreeCgSym sym;
   CfreeCgTypeId type;
   ToyTypeId toy_type;
   CfreeCgSymbolAttrs sym_attrs;
@@ -146,6 +144,63 @@ typedef struct ToyTypeTable {
   size_t cap;
 } ToyTypeTable;
+/* Durable Toy frontend state.
Owned by the frontend, it survives across every
+ * successful REPL snippet and holds the persistent declarations: functions,
+ * globals, named types, the structural type table, and counters. It holds
+ * declaration *identity* only — names, types, attrs — never per-object CG
+ * symbol handles, which are rebuilt per compile in the parser's symbol env.
+ *
+ * Each of fns/globals/type_table.types/type_table.named is an append-only
+ * array; ToyTypeId values are 1-based indices into type_table.types and stay
+ * valid across snippets because the table is never compacted. A per-compile
+ * transaction (see ToyParser) stages appends in place and rolls back to the
+ * watermarks captured at compile start if the snippet fails. */
+typedef struct ToyModule {
+  ToyFn* fns;
+  size_t nfns;
+  size_t cap_fns;
+
+  ToyGlobal* globals;
+  size_t nglobals;
+  size_t cap_globals;
+
+  ToyTypeTable type_table;
+
+  uint32_t static_counter;
+} ToyModule;
+
+typedef enum ToyUndoKind {
+  TOY_UNDO_NAMED,
+  TOY_UNDO_TYPE,
+} ToyUndoKind;
+
+/* Before-image of a committed module entry that a transaction mutates in
+ * place (forward-declaration completion). Restored verbatim on rollback. */
+typedef struct ToyUndo {
+  ToyUndoKind kind;
+  size_t index;
+  union {
+    ToyNamedType named;
+    ToyType type;
+  } saved;
+} ToyUndo;
+
+/* Per-compile transaction over the durable module. Watermarks capture the
+ * committed table sizes at compile start; staged appends past them are
+ * truncated on rollback. The undo journal restores committed entries that
+ * were mutated in place. Commit keeps the staged appends (they are already in
+ * the module) and drops the journal; abort does both undos. */
+typedef struct ToyTxn {
+  int open;
+  size_t fns;
+  size_t globals;
+  size_t types;
+  size_t named;
+  ToyUndo* undo;
+  size_t nundo;
+  size_t cap_undo;
+} ToyTxn;
+
 typedef enum ToyScopeKind {
   TOY_SCOPE_LOOP,
   TOY_SCOPE_SWITCH,
@@ -186,19 +241,24 @@ typedef struct ToyParser {
   CfreeCgTypeId va_list_type;
   CfreeTarget target;
+  /* Durable cross-snippet state lives in the module; the parser borrows it
+   * for one compile and stages/commits/rolls back through it. */
+  ToyModule* module;
+
   ToyVar* vars;
   size_t nvars;
   size_t cap_vars;
-  ToyFn* fns;
-  size_t nfns;
-  size_t cap_fns;
+  /* Per-compile symbol environment: the CfreeCgSym each durable fn/global
+   * resolves to in the *current* CfreeCg object, indexed in lockstep with
+   * module->fns / module->globals. Seeded from committed decls and extended
+   * as new decls are staged; never durable identity. */
+  CfreeCgSym* fn_syms;
+  size_t cap_fn_syms;
+  CfreeCgSym* global_syms;
+  size_t cap_global_syms;
-  ToyGlobal* globals;
-  size_t nglobals;
-  size_t cap_globals;
-
-  ToyTypeTable type_table;
+  ToyTxn txn;
   ToyScope* scopes;
   size_t nscopes;
@@ -217,7 +277,6 @@ typedef struct ToyParser {
   const char* input_name;
   uint32_t file_id;
   int has_error;
-  uint32_t static_counter;
   uint32_t expr_island_mask;
   ToyTypeId last_type;
   int allow_tail_call_expr;
@@ -228,11 +287,26 @@ typedef struct ToyParser {
 CfreeCgTypeId toy_builtin_type(ToyParser* p, CfreeCgBuiltinType ty);
 void toy_parser_init(ToyParser* p, CfreeCompiler* c, CfreeCg* cg,
-                     const uint8_t* data, size_t len, const char* input_name);
+                     ToyModule* module, const uint8_t* data, size_t len,
+                     const char* input_name);
 void toy_parser_reinit(ToyParser* p, CfreeCompiler* c, CfreeCg* cg,
                        const uint8_t* data, size_t len, const char* input_name,
                        CfreeFrontendInputKind input_kind);
 void toy_parser_dispose(ToyParser* p);
+/* Frees the durable module arrays. Called once at frontend teardown. */
+void toy_module_dispose(ToyModule* m, CfreeCompiler* c);
+
+/* Transaction over the durable module for one compile. begin captures
+ * watermarks; commit keeps staged appends and drops the journal; abort
+ * truncates staged appends and replays the undo journal. commit/abort are
+ * idempotent (no-op unless a transaction is open). The record_* helpers
+ * snapshot a committed entry before it is mutated in place; they no-op for
+ * staged entries and for already-recorded ones, and return 0 only on OOM. */
+void toy_txn_begin(ToyParser* p);
+void toy_txn_commit(ToyParser* p);
+void toy_txn_abort(ToyParser* p);
+int toy_txn_record_named(ToyParser* p, size_t index);
+int toy_txn_record_type(ToyParser* p, size_t index);
 void* toy_parser_zalloc(ToyParser* p, size_t count, size_t elem_size,
                         const char* what);
 void toy_parser_free_mem(ToyParser* p, void* items, size_t size);
@@ -365,18 +439,11 @@ int toy_set_named_type_enum_values(ToyParser* p, ToyNamedType* named,
                                    const ToyEnumConst* values, size_t nvalues);
 ToyVar* toy_find_var(ToyParser* p, CfreeSym name);
-int toy_add_local(ToyParser* p, CfreeSym name, CfreeCgTypeId ty,
-                  CfreeCgLocal slot, int mutable);
 int toy_add_local_typed(ToyParser* p, CfreeSym name, CfreeCgTypeId ty,
                         ToyTypeId toy_type, CfreeCgLocal slot, int mutable);
-int toy_add_static_local(ToyParser* p, CfreeSym name, CfreeCgTypeId ty,
-                         CfreeCgSym sym, int mutable);
 int toy_add_static_local_typed(ToyParser* p, CfreeSym name, CfreeCgTypeId ty,
                                ToyTypeId toy_type, CfreeCgSym sym, int mutable);
 ToyFn* toy_find_fn(ToyParser* p, CfreeSym name);
-ToyFn* toy_add_fn(ToyParser* p, CfreeSym name, CfreeCgSym sym,
-                  CfreeCgTypeId type, CfreeCgTypeId ret,
-                  const CfreeCgTypeId* params, size_t nparams, int variadic);
 ToyFn* toy_add_fn_typed(ToyParser* p, CfreeSym name, CfreeCgSym sym,
                         CfreeCgTypeId type, ToyTypeId toy_type,
                         CfreeCgTypeId ret, ToyTypeId toy_ret,
@@ -384,8 +451,6 @@ ToyFn* toy_add_fn_typed(ToyParser* p, CfreeSym name, CfreeCgSym sym,
                         const ToyTypeId*
toy_params, size_t nparams,
                         int variadic);
 ToyGlobal* toy_find_global(ToyParser* p, CfreeSym name);
-int toy_add_global(ToyParser* p, CfreeSym name, CfreeCgSym sym,
-                   CfreeCgTypeId type, int mutable);
 int toy_add_global_typed(ToyParser* p, CfreeSym name, CfreeCgSym sym,
                          CfreeCgTypeId type, ToyTypeId toy_type, int mutable);
 ToyNamedType* toy_find_named_type(ToyParser* p, CfreeSym name);
@@ -403,6 +468,15 @@ void toy_addr_index(ToyParser* p, uint64_t elem_size,
                     CfreeCgTypeId result_ptr_ty);
 CfreeCgSym toy_find_decl_sym(ToyParser* p, CfreeSym name);
+/* Per-compile symbol-environment accessors. *_cur_sym map a durable
+ * fn/global (a pointer into module->fns / module->globals) to its CfreeCgSym
+ * in the current object; *_set_cur_sym reserve and record that mapping by
+ * index (returns 0 on OOM). */
+CfreeCgSym toy_fn_cur_sym(ToyParser* p, const ToyFn* fn);
+CfreeCgSym toy_global_cur_sym(ToyParser* p, const ToyGlobal* g);
+int toy_fn_set_cur_sym(ToyParser* p, size_t index, CfreeCgSym sym);
+int toy_global_set_cur_sym(ToyParser* p, size_t index, CfreeCgSym sym);
+
 int toy_parse_program(ToyParser* p);
 #endif
diff --git a/lang/toy/parser.c b/lang/toy/parser.c
@@ -132,7 +132,7 @@ static int toy_copy_record_lvalue_to_var(ToyParser* p, CfreeCgTypeId src_ty,
     if (dst_var) {
       toy_push_var_lvalue(p, dst_var);
     } else {
-      cfree_cg_push_symbol_addr(p->cg, dst_global->sym, 0);
+      cfree_cg_push_symbol_addr(p->cg, toy_global_cur_sym(p, dst_global), 0);
     }
     cfree_cg_swap(p->cg);
     cfree_cg_store(p->cg, toy_mem_access(p, field.type),
@@ -734,7 +734,7 @@ static int toy_parse_let_stmt(ToyParser* p) {
         return 0;
       }
     }
-    snprintf(sym_name, sizeof sym_name, ".Ltoy_static_%u", p->static_counter++);
+    snprintf(sym_name, sizeof sym_name, ".Ltoy_static_%u", p->module->static_counter++);
     linkage_name = cfree_sym_intern(p->c, cfree_slice_cstr(sym_name));
     memset(&decl, 0, sizeof decl);
     decl.kind = CFREE_CG_DECL_OBJECT;
@@ -1367,9 +1367,9 @@ static int toy_parse_return_stmt(ToyParser* p) {
     }
     if (fn) {
       if (must_tail)
-        cfree_cg_musttail_call_symbol(p->cg, fn->sym, (uint32_t)nargs);
+        cfree_cg_musttail_call_symbol(p->cg, toy_fn_cur_sym(p, fn), (uint32_t)nargs);
       else
-        cfree_cg_tail_call_symbol(p->cg, fn->sym, (uint32_t)nargs);
+        cfree_cg_tail_call_symbol(p->cg, toy_fn_cur_sym(p, fn), (uint32_t)nargs);
     } else {
       if (must_tail)
         cfree_cg_musttail_call(p->cg, (uint32_t)nargs, fn_ty);
@@ -1545,7 +1545,7 @@ static int toy_parse_stmt(ToyParser* p) {
       toy_push_var_addr(p, v);
       lhs_ty = v->type;
     } else if (g) {
-      cfree_cg_push_symbol_addr(p->cg, g->sym, 0);
+      cfree_cg_push_symbol_addr(p->cg, toy_global_cur_sym(p, g), 0);
       lhs_ty = g->type;
     } else {
       lhs_ty = CFREE_CG_TYPE_NONE;
@@ -1779,7 +1779,7 @@ static int toy_parse_stmt(ToyParser* p) {
       }
       return 1;
     }
-    cfree_cg_push_symbol_addr(p->cg, g->sym, 0);
+    cfree_cg_push_symbol_addr(p->cg, toy_global_cur_sym(p, g), 0);
     cfree_cg_swap(p->cg);
     if (expr_ty != g->type) cfree_cg_bitcast(p->cg, g->type);
     cfree_cg_store(p->cg, toy_mem_access(p, g->type),
diff --git a/lang/toy/parser_core.c b/lang/toy/parser_core.c
@@ -44,8 +44,10 @@ static uint32_t toy_source_file_id(CfreeCompiler* c, const char* name) {
 }
 void toy_parser_init(ToyParser* p, CfreeCompiler* c, CfreeCg* cg,
-                     const uint8_t* data, size_t len, const char* input_name) {
+                     ToyModule* module, const uint8_t* data, size_t len,
+                     const char* input_name) {
   memset(p, 0, sizeof *p);
+  p->module = module;
   p->file_id = toy_source_file_id(c, input_name);
   p->input_name = input_name;
   toy_lexer_init(&p->lex, data, len, p->file_id);
@@ -60,18 +62,8 @@ void toy_parser_init(ToyParser* p, CfreeCompiler* c, CfreeCg* cg,
   p->nvars = 0;
   p->cap_vars = 0;
   p->vars = NULL;
-  p->nfns = 0;
-  p->cap_fns = 0;
-  p->fns = NULL;
-  p->nglobals = 0;
-  p->cap_globals = 0;
-  p->globals = NULL;
-  p->type_table.named = NULL;
-  p->type_table.count = 0;
-  p->type_table.cap = 0;
-  p->type_table.types = NULL;
-  p->type_table.ntypes = 0;
-  p->type_table.cap_types = 0;
+  /* The durable module is zero-initialized by the frontend; register the
+   * builtin types into it exactly once, on this first compile. */
   toy_type_register_builtins(p);
   p->nscopes = 0;
   p->cap_scopes = 0;
@@ -86,7 +78,6 @@ void toy_parser_init(ToyParser* p, CfreeCompiler* c, CfreeCg* cg,
   p->diag = cfree_compiler_context(c)->diag;
   p->input_name = input_name;
   p->has_error = 0;
-  p->static_counter = 0;
   p->expr_island_mask = 0;
   p->last_type = TOY_TYPE_NONE;
   p->allow_tail_call_expr = 0;
@@ -125,72 +116,212 @@ void toy_parser_reinit(ToyParser* p, CfreeCompiler* c, CfreeCg* cg,
   p->input_kind = input_kind;
 }
+/* Frees one heap block sized for the module owner. Mirrors
+ * toy_parser_free_mem but does not need a live parser, so the durable module
+ * can be torn down independently of any compile. */
+static void toy_mem_free(CfreeCompiler* c, void* items, size_t size) {
+  CfreeHeap* h;
+  if (!items) return;
+  h = cfree_compiler_context(c)->heap;
+  h->free(h, items, size ? size : 1u);
+}
+
 void toy_parser_dispose(ToyParser* p) {
-  size_t i;
-  /* Only free per-element pointer arrays when their companion count is
-   * non-zero — allocation only happens in that case, so a zero count means
-   * the pointer field was never written by an allocation. This guards
-   * against rare paths that leave a pointer slot uninitialized while
-   * leaving the count at 0; without it, freeing a garbage pointer
-   * aborts libc malloc with "pointer being freed was not allocated". */
+  /* Per-compile scratch only; the durable arrays belong to the module and
+   * are freed by toy_module_dispose. */
   toy_parser_free_mem(p, p->vars, p->cap_vars * sizeof *p->vars);
-  for (i = 0; i < p->nfns; ++i) {
-    if (p->fns[i].nparams) {
-      toy_parser_free_mem(p, p->fns[i].params,
-                          p->fns[i].nparams * sizeof *p->fns[i].params);
-      toy_parser_free_mem(p, p->fns[i].toy_params,
-                          p->fns[i].nparams * sizeof *p->fns[i].toy_params);
-    }
-  }
-  toy_parser_free_mem(p, p->fns, p->cap_fns * sizeof *p->fns);
-  toy_parser_free_mem(p, p->globals, p->cap_globals * sizeof *p->globals);
-  for (i = 0; i < p->type_table.ntypes; ++i) {
-    if (p->type_table.types[i].nparams) {
-      toy_parser_free_mem(p, p->type_table.types[i].params,
-                          p->type_table.types[i].nparams *
-                              sizeof *p->type_table.types[i].params);
-    }
-  }
-  toy_parser_free_mem(p, p->type_table.types,
-                      p->type_table.cap_types * sizeof *p->type_table.types);
-  for (i = 0; i < p->type_table.count; ++i) {
-    if (p->type_table.named[i].cap_enum_values) {
-      toy_parser_free_mem(p, p->type_table.named[i].enum_values,
-                          p->type_table.named[i].cap_enum_values *
-                              sizeof *p->type_table.named[i].enum_values);
-    }
-    if (p->type_table.named[i].cap_fields) {
-      toy_parser_free_mem(p, p->type_table.named[i].fields,
-                          p->type_table.named[i].cap_fields *
-                              sizeof *p->type_table.named[i].fields);
-    }
-  }
-  toy_parser_free_mem(p, p->type_table.named,
-                      p->type_table.cap * sizeof *p->type_table.named);
   toy_parser_free_mem(p, p->scopes, p->cap_scopes * sizeof *p->scopes);
   toy_parser_free_mem(p, p->labels, p->cap_labels * sizeof *p->labels);
   toy_parser_free_mem(p, p->goto_targets,
                       p->cap_goto_targets * sizeof *p->goto_targets);
+  toy_parser_free_mem(p, p->fn_syms, p->cap_fn_syms * sizeof *p->fn_syms);
+  toy_parser_free_mem(p, p->global_syms,
+                      p->cap_global_syms * sizeof *p->global_syms);
+  toy_parser_free_mem(p, p->txn.undo, p->txn.cap_undo * sizeof *p->txn.undo);
+  p->txn.undo = NULL;
+  p->txn.cap_undo = 0;
+  p->txn.nundo = 0;
+  p->txn.open = 0;
   p->vars = NULL;
-  p->fns = NULL;
-  p->globals = NULL;
-  p->type_table.types = NULL;
-  p->type_table.named = NULL;
   p->scopes = NULL;
   p->labels = NULL;
   p->goto_targets = NULL;
-  p->nvars = p->nfns = p->nglobals = 0;
-  p->type_table.count = 0;
-  p->type_table.ntypes = 0;
+  p->fn_syms = NULL;
+  p->global_syms = NULL;
+  p->cap_fn_syms = 0;
+  p->cap_global_syms = 0;
+  p->nvars = 0;
   p->nscopes = p->nlabels = 0;
-  p->cap_vars = p->cap_fns = p->cap_globals = 0;
-  p->type_table.cap = 0;
-  p->type_table.cap_types = 0;
+  p->cap_vars = 0;
   p->cap_scopes = p->cap_labels = 0;
   p->cap_goto_targets = 0;
   p->last_type = TOY_TYPE_NONE;
 }
+void toy_module_dispose(ToyModule* m, CfreeCompiler* c) {
+  size_t i;
+  ToyTypeTable* tt = &m->type_table;
+  /* Only free per-element pointer arrays when their companion count is
+   * non-zero — allocation only happens in that case, so a zero count means
+   * the pointer field was never written by an allocation. This guards
+   * against rare paths that leave a pointer slot uninitialized while
+   * leaving the count at 0; without it, freeing a garbage pointer aborts
+   * libc malloc with "pointer being freed was not allocated". */
+  for (i = 0; i < m->nfns; ++i) {
+    if (m->fns[i].nparams) {
+      toy_mem_free(c, m->fns[i].params,
+                   m->fns[i].nparams * sizeof *m->fns[i].params);
+      toy_mem_free(c, m->fns[i].toy_params,
+                   m->fns[i].nparams * sizeof *m->fns[i].toy_params);
+    }
+  }
+  toy_mem_free(c, m->fns, m->cap_fns * sizeof *m->fns);
+  toy_mem_free(c, m->globals, m->cap_globals * sizeof *m->globals);
+  for (i = 0; i < tt->ntypes; ++i) {
+    if (tt->types[i].nparams) {
+      toy_mem_free(c, tt->types[i].params,
+                   tt->types[i].nparams * sizeof *tt->types[i].params);
+    }
+  }
+  toy_mem_free(c, tt->types, tt->cap_types * sizeof *tt->types);
+  for (i = 0; i < tt->count; ++i) {
+    if (tt->named[i].cap_enum_values) {
+      toy_mem_free(c, tt->named[i].enum_values,
+                   tt->named[i].cap_enum_values * sizeof *tt->named[i].enum_values);
+    }
+    if (tt->named[i].cap_fields) {
+      toy_mem_free(c, tt->named[i].fields,
+                   tt->named[i].cap_fields * sizeof *tt->named[i].fields);
+    }
+  }
+  toy_mem_free(c, tt->named, tt->cap * sizeof *tt->named);
+  m->fns = NULL;
+  m->globals = NULL;
+  tt->types = NULL;
+  tt->named = NULL;
+  m->nfns = m->nglobals = 0;
+  tt->count = tt->ntypes = 0;
+  m->cap_fns = m->cap_globals = 0;
+  tt->cap = tt->cap_types = 0;
+}
+
+void toy_txn_begin(ToyParser* p) {
+  ToyModule* m = p->module;
+  p->txn.open = 1;
+  p->txn.fns = m->nfns;
+  p->txn.globals = m->nglobals;
+  p->txn.types = m->type_table.ntypes;
+  p->txn.named = m->type_table.count;
+  p->txn.nundo = 0; /* reuse the journal buffer; cap_undo persists */
+}
+
+void toy_txn_commit(ToyParser* p) {
+  if (!p->txn.open) return;
+  /* Staged appends are already in the module; just drop the journal. */
+  p->txn.open = 0;
+  p->txn.nundo = 0;
+}
+
+int toy_txn_record_named(ToyParser* p, size_t index) {
+  size_t i;
+  if (!p->txn.open || index >= p->txn.named) return 1; /* staged or no txn */
+  for (i = 0; i < p->txn.nundo; ++i) {
+    if (p->txn.undo[i].kind == TOY_UNDO_NAMED && p->txn.undo[i].index == index)
+      return 1; /* keep the earliest before-image */
+  }
+  if (!toy_parser_reserve(p, (void**)&p->txn.undo, &p->txn.cap_undo,
+                          p->txn.nundo + 1u, sizeof *p->txn.undo,
+                          "undo journal")) {
+    return 0;
+  }
+  p->txn.undo[p->txn.nundo].kind = TOY_UNDO_NAMED;
+  p->txn.undo[p->txn.nundo].index = index;
+  p->txn.undo[p->txn.nundo].saved.named = p->module->type_table.named[index];
+  p->txn.nundo++;
+  return 1;
+}
+
+int toy_txn_record_type(ToyParser* p, size_t index) {
+  size_t i;
+  if (!p->txn.open || index >= p->txn.types) return 1; /* staged or no txn */
+  for (i = 0; i < p->txn.nundo; ++i) {
+    if (p->txn.undo[i].kind == TOY_UNDO_TYPE && p->txn.undo[i].index == index)
+      return 1;
+  }
+  if (!toy_parser_reserve(p, (void**)&p->txn.undo, &p->txn.cap_undo,
+                          p->txn.nundo + 1u, sizeof *p->txn.undo,
+                          "undo journal")) {
+    return 0;
+  }
+  p->txn.undo[p->txn.nundo].kind = TOY_UNDO_TYPE;
+  p->txn.undo[p->txn.nundo].index = index;
+  p->txn.undo[p->txn.nundo].saved.type = p->module->type_table.types[index];
+  p->txn.nundo++;
+  return 1;
+}
+
+void toy_txn_abort(ToyParser* p) {
+  ToyModule* m = p->module;
+  ToyTypeTable* tt = &m->type_table;
+  size_t i;
+  if (!p->txn.open) return;
+  p->txn.open = 0;
+  /* 1. Restore committed entries that were mutated in place, newest first.
+   * The current entry may have allocated a fields/enum/params array since the
+   * snapshot; free it before restoring the (typically NULL) saved pointer. A
+   * committed entry is only mutated when completing a forward declaration, so
+   * the saved array is NULL and there is no realloc-of-saved hazard. */
+  while (p->txn.nundo > 0) {
+    ToyUndo* u = &p->txn.undo[--p->txn.nundo];
+    if (u->kind == TOY_UNDO_NAMED) {
+      ToyNamedType* cur = &tt->named[u->index];
+      if (cur->fields != u->saved.named.fields) {
+        toy_mem_free(p->c, cur->fields, cur->cap_fields * sizeof *cur->fields);
+      }
+      if (cur->enum_values != u->saved.named.enum_values) {
+        toy_mem_free(p->c, cur->enum_values,
+                     cur->cap_enum_values * sizeof *cur->enum_values);
+      }
+      *cur = u->saved.named;
+    } else {
+      ToyType* cur = &tt->types[u->index];
+      if (cur->params != u->saved.type.params) {
+        toy_mem_free(p->c, cur->params, cur->nparams * sizeof *cur->params);
+      }
+      *cur = u->saved.type;
+    }
+  }
+  /* 2. Truncate staged appends, freeing each dropped entry's sub-arrays. */
+  for (i = p->txn.fns; i < m->nfns; ++i) {
+    if (m->fns[i].nparams) {
+      toy_mem_free(p->c, m->fns[i].params,
+                   m->fns[i].nparams * sizeof *m->fns[i].params);
+      toy_mem_free(p->c, m->fns[i].toy_params,
+                   m->fns[i].nparams * sizeof *m->fns[i].toy_params);
+    }
+  }
+  m->nfns = p->txn.fns;
+  m->nglobals = p->txn.globals; /* globals own no sub-arrays */
+  for (i = p->txn.types; i < tt->ntypes; ++i) {
+    if (tt->types[i].nparams) {
+      toy_mem_free(p->c, tt->types[i].params,
+                   tt->types[i].nparams * sizeof *tt->types[i].params);
+    }
+  }
+  tt->ntypes = p->txn.types;
+  for (i = p->txn.named; i < tt->count; ++i) {
+    if (tt->named[i].cap_fields) {
+      toy_mem_free(p->c, tt->named[i].fields,
+                   tt->named[i].cap_fields * sizeof *tt->named[i].fields);
+    }
+    if (tt->named[i].cap_enum_values) {
+      toy_mem_free(p->c, tt->named[i].enum_values,
+                   tt->named[i].cap_enum_values * sizeof *tt->named[i].enum_values);
+    }
+  }
+  tt->count = p->txn.named;
+}
+
 int toy_parser_reserve(ToyParser* p, void** items, size_t* cap, size_t want,
                        size_t elem_size, const char* what) {
   CfreeHeap* h;
diff --git a/lang/toy/symbols.c b/lang/toy/symbols.c
@@ -8,12 +8,6 @@ ToyVar* toy_find_var(ToyParser* p, CfreeSym name) {
   return NULL;
 }
-int toy_add_local(ToyParser* p, CfreeSym name,
CfreeCgTypeId ty,
-                  CfreeCgLocal slot, int mutable) {
-  return toy_add_local_typed(p, name, ty, toy_type_from_cg(p, ty), slot,
-                             mutable);
-}
-
 int toy_add_local_typed(ToyParser* p, CfreeSym name, CfreeCgTypeId ty,
                         ToyTypeId toy_type, CfreeCgLocal slot, int mutable) {
   if (!toy_parser_reserve(p, (void**)&p->vars, &p->cap_vars, p->nvars + 1u,
@@ -32,12 +26,6 @@ int toy_add_local_typed(ToyParser* p, CfreeSym name, CfreeCgTypeId ty,
   return 1;
 }
-int toy_add_static_local(ToyParser* p, CfreeSym name, CfreeCgTypeId ty,
-                         CfreeCgSym sym, int mutable) {
-  return toy_add_static_local_typed(p, name, ty, toy_type_from_cg(p, ty), sym,
-                                    mutable);
-}
-
 int toy_add_static_local_typed(ToyParser* p, CfreeSym name, CfreeCgTypeId ty,
                                ToyTypeId toy_type, CfreeCgSym sym,
                                int mutable) {
@@ -59,20 +47,12 @@ int toy_add_static_local_typed(ToyParser* p, CfreeSym name, CfreeCgTypeId ty,
 ToyFn* toy_find_fn(ToyParser* p, CfreeSym name) {
   size_t i;
-  for (i = p->nfns; i > 0; --i) {
-    if (p->fns[i - 1].name == name) return &p->fns[i - 1];
+  for (i = p->module->nfns; i > 0; --i) {
+    if (p->module->fns[i - 1].name == name) return &p->module->fns[i - 1];
   }
   return NULL;
 }
-ToyFn* toy_add_fn(ToyParser* p, CfreeSym name, CfreeCgSym sym,
-                  CfreeCgTypeId type, CfreeCgTypeId ret,
-                  const CfreeCgTypeId* params, size_t nparams, int variadic) {
-  return toy_add_fn_typed(p, name, sym, type, toy_type_from_cg(p, type), ret,
-                          toy_type_from_cg(p, ret), params, NULL, nparams,
-                          variadic);
-}
-
 ToyFn* toy_add_fn_typed(ToyParser* p, CfreeSym name, CfreeCgSym sym,
                         CfreeCgTypeId type, ToyTypeId toy_type,
                         CfreeCgTypeId ret, ToyTypeId toy_ret,
@@ -81,11 +61,15 @@ ToyFn* toy_add_fn_typed(ToyParser* p, CfreeSym name, CfreeCgSym sym,
                         int variadic) {
   ToyFn* fn;
   size_t i;
-  if (!toy_parser_reserve(p, (void**)&p->fns, &p->cap_fns, p->nfns + 1u,
-                          sizeof *p->fns, "functions")) {
+  if (!toy_parser_reserve(p, (void**)&p->module->fns, &p->module->cap_fns,
+                          p->module->nfns + 1u, sizeof *p->module->fns,
+                          "functions")) {
     return NULL;
   }
-  fn = &p->fns[p->nfns];
+  /* Record the current-object symbol for this index before touching the slot,
+   * so an OOM here leaves nothing half-added (nfns is bumped only at the end). */
+  if (!toy_fn_set_cur_sym(p, p->module->nfns, sym)) return NULL;
+  fn = &p->module->fns[p->module->nfns];
   fn->params = NULL;
   fn->toy_params = NULL;
   if (nparams != 0) {
@@ -102,7 +86,6 @@ ToyFn* toy_add_fn_typed(ToyParser* p, CfreeSym name, CfreeCgSym sym,
     }
   }
   fn->name = name;
-  fn->sym = sym;
   fn->type = type;
   fn->toy_type =
       toy_type != TOY_TYPE_NONE ? toy_type : toy_type_from_cg(p, type);
@@ -116,37 +99,32 @@ ToyFn* toy_add_fn_typed(ToyParser* p, CfreeSym name, CfreeCgSym sym,
                             ? toy_params[i]
                             : toy_type_from_cg(p, params[i]);
   }
-  p->nfns++;
+  p->module->nfns++;
   return fn;
 }
 ToyGlobal* toy_find_global(ToyParser* p, CfreeSym name) {
   size_t i;
-  for (i = p->nglobals; i > 0; --i) {
-    if (p->globals[i - 1].name == name) return &p->globals[i - 1];
+  for (i = p->module->nglobals; i > 0; --i) {
+    if (p->module->globals[i - 1].name == name) return &p->module->globals[i - 1];
   }
   return NULL;
 }
-int toy_add_global(ToyParser* p, CfreeSym name, CfreeCgSym sym,
-                   CfreeCgTypeId type, int mutable) {
-  return toy_add_global_typed(p, name, sym, type, toy_type_from_cg(p, type),
-                              mutable);
-}
-
 int toy_add_global_typed(ToyParser* p, CfreeSym name, CfreeCgSym sym,
                          CfreeCgTypeId type, ToyTypeId toy_type, int mutable) {
-  if (!toy_parser_reserve(p, (void**)&p->globals, &p->cap_globals,
-                          p->nglobals + 1u, sizeof *p->globals, "globals")) {
+  if (!toy_parser_reserve(p, (void**)&p->module->globals,
+                          &p->module->cap_globals, p->module->nglobals + 1u,
+                          sizeof *p->module->globals, "globals")) {
     return 0;
   }
-  p->globals[p->nglobals].name = name;
-  p->globals[p->nglobals].sym = sym;
-  p->globals[p->nglobals].type = type;
-  p->globals[p->nglobals].toy_type =
+  if (!toy_global_set_cur_sym(p, p->module->nglobals, sym)) return 0;
+  p->module->globals[p->module->nglobals].name = name;
+  p->module->globals[p->module->nglobals].type = type;
+  p->module->globals[p->module->nglobals].toy_type =
       toy_type != TOY_TYPE_NONE ? toy_type : toy_type_from_cg(p, type);
-  p->globals[p->nglobals].mutable = mutable;
-  p->nglobals++;
+  p->module->globals[p->module->nglobals].mutable = mutable;
+  p->module->nglobals++;
   return 1;
 }
@@ -244,10 +222,45 @@ void toy_addr_index(ToyParser* p, uint64_t elem_size,
 CfreeCgSym toy_find_decl_sym(ToyParser* p, CfreeSym name) {
   ToyGlobal* g = toy_find_global(p, name);
-  if (g) return g->sym;
+  if (g) return toy_global_cur_sym(p, g);
   {
     ToyFn* fn = toy_find_fn(p, name);
-    if (fn) return fn->sym;
+    if (fn) return toy_fn_cur_sym(p, fn);
   }
   return CFREE_CG_SYM_NONE;
 }
+
+CfreeCgSym toy_fn_cur_sym(ToyParser* p, const ToyFn* fn) {
+  size_t index;
+  if (!fn) return CFREE_CG_SYM_NONE;
+  index = (size_t)(fn - p->module->fns);
+  if (index >= p->module->nfns) return CFREE_CG_SYM_NONE;
+  return p->fn_syms[index];
+}
+
+CfreeCgSym toy_global_cur_sym(ToyParser* p, const ToyGlobal* g) {
+  size_t index;
+  if (!g) return CFREE_CG_SYM_NONE;
+  index = (size_t)(g - p->module->globals);
+  if (index >= p->module->nglobals) return CFREE_CG_SYM_NONE;
+  return p->global_syms[index];
+}
+
+int toy_fn_set_cur_sym(ToyParser* p, size_t index, CfreeCgSym sym) {
+  if (!toy_parser_reserve(p, (void**)&p->fn_syms, &p->cap_fn_syms, index + 1u,
+                          sizeof *p->fn_syms, "fn symbol env")) {
+    return 0;
+  }
+  p->fn_syms[index] = sym;
+  return 1;
+}
+
+int toy_global_set_cur_sym(ToyParser* p, size_t index, CfreeCgSym sym) {
+  if (!toy_parser_reserve(p, (void**)&p->global_syms, &p->cap_global_syms,
+                          index + 1u, sizeof *p->global_syms,
+                          "global symbol env")) {
+    return 0;
+  }
+  p->global_syms[index] = sym;
+  return 1;
+}
diff --git a/lang/toy/types.c b/lang/toy/types.c
@@ -3,14 +3,14 @@
 #include "internal.h"
 static ToyType* toy_type_slot(ToyParser* p, ToyTypeId id) {
-  if (id == TOY_TYPE_NONE || id > p->type_table.ntypes) return NULL;
-  return &p->type_table.types[id - 1u];
+ if (id == TOY_TYPE_NONE || id > p->module->type_table.ntypes) return NULL; + return &p->module->type_table.types[id - 1u]; } static ToyTypeId toy_type_find(ToyParser* p, const ToyType* key) { size_t i; - for (i = 0; i < p->type_table.ntypes; ++i) { - ToyType* ty = &p->type_table.types[i]; + for (i = 0; i < p->module->type_table.ntypes; ++i) { + ToyType* ty = &p->module->type_table.types[i]; if (ty->kind == key->kind && ty->cg == key->cg && ty->name == key->name && ty->base == key->base && ty->elem == key->elem && ty->pointee == key->pointee && ty->ret == key->ret && @@ -30,14 +30,14 @@ static ToyTypeId toy_type_find(ToyParser* p, const ToyType* key) { static ToyTypeId toy_type_add(ToyParser* p, const ToyType* type) { ToyTypeId existing = toy_type_find(p, type); if (existing != TOY_TYPE_NONE) return existing; - if (!toy_parser_reserve(p, (void**)&p->type_table.types, - &p->type_table.cap_types, p->type_table.ntypes + 1u, - sizeof *p->type_table.types, "toy types")) { + if (!toy_parser_reserve(p, (void**)&p->module->type_table.types, + &p->module->type_table.cap_types, p->module->type_table.ntypes + 1u, + sizeof *p->module->type_table.types, "toy types")) { return TOY_TYPE_NONE; } - p->type_table.types[p->type_table.ntypes] = *type; - p->type_table.ntypes++; - return (ToyTypeId)p->type_table.ntypes; + p->module->type_table.types[p->module->type_table.ntypes] = *type; + p->module->type_table.ntypes++; + return (ToyTypeId)p->module->type_table.ntypes; } static ToyTypeId toy_type_register(ToyParser* p, ToyTypeKind kind, @@ -446,8 +446,8 @@ ToyTypeId toy_type_from_cg(ToyParser* p, CfreeCgTypeId cg) { CfreeCgTypeKind kind; size_t i; if (cg == CFREE_CG_TYPE_NONE) return TOY_TYPE_NONE; - for (i = 0; i < p->type_table.ntypes; ++i) { - ToyType* existing = &p->type_table.types[i]; + for (i = 0; i < p->module->type_table.ntypes; ++i) { + ToyType* existing = &p->module->type_table.types[i]; if (existing->cg == cg && (existing->kind == TOY_TYPE_NOMINAL_RECORD || existing->kind 
== TOY_TYPE_TUPLE_RECORD || existing->kind == TOY_TYPE_ENUM)) { @@ -517,11 +517,13 @@ ToyTypeId toy_type_register_named_record(ToyParser* p, CfreeSym name, CfreeCgTypeId cg, int is_tuple) { ToyTypeKind kind = is_tuple ? TOY_TYPE_TUPLE_RECORD : TOY_TYPE_NOMINAL_RECORD; size_t i; - for (i = 0; i < p->type_table.ntypes; ++i) { - ToyType* type = &p->type_table.types[i]; + for (i = 0; i < p->module->type_table.ntypes; ++i) { + ToyType* type = &p->module->type_table.types[i]; if (type->kind == kind && type->name == name) { - if (type->cg == CFREE_CG_TYPE_NONE && cg != CFREE_CG_TYPE_NONE) + if (type->cg == CFREE_CG_TYPE_NONE && cg != CFREE_CG_TYPE_NONE) { + if (!toy_txn_record_type(p, i)) return TOY_TYPE_NONE; type->cg = cg; + } return (ToyTypeId)(i + 1u); } } @@ -569,8 +571,8 @@ ToyTypeId toy_type_register_slice(ToyParser* p, CfreeCgTypeId elem_cg, size_t i; if (elem == TOY_TYPE_NONE || elem_cg == CFREE_CG_TYPE_NONE) return TOY_TYPE_NONE; - for (i = 0; i < p->type_table.ntypes; ++i) { - ToyType* existing = &p->type_table.types[i]; + for (i = 0; i < p->module->type_table.ntypes; ++i) { + ToyType* existing = &p->module->type_table.types[i]; if (existing->kind == TOY_TYPE_SLICE && existing->elem == elem) return (ToyTypeId)(i + 1u); } @@ -606,7 +608,7 @@ ToyTypeId toy_type_register_func(ToyParser* p, CfreeCgTypeId cg, ToyTypeId ret, memcpy(type.params, params, nparams * sizeof *type.params); } id = toy_type_add(p, &type); - if (id == TOY_TYPE_NONE || p->type_table.types[id - 1u].params != type.params) + if (id == TOY_TYPE_NONE || p->module->type_table.types[id - 1u].params != type.params) toy_parser_free_mem(p, type.params, nparams * sizeof *type.params); return id; } @@ -742,18 +744,18 @@ int toy_record_field_index(ToyParser* p, CfreeCgTypeId record_ty, ToyNamedType* toy_find_named_type(ToyParser* p, CfreeSym name) { size_t i; - for (i = p->type_table.count; i > 0; --i) { - if (p->type_table.named[i - 1].name == name) - return &p->type_table.named[i - 1]; + for (i = 
p->module->type_table.count; i > 0; --i) { + if (p->module->type_table.named[i - 1].name == name) + return &p->module->type_table.named[i - 1]; } return NULL; } ToyNamedType* toy_find_named_type_by_type(ToyParser* p, CfreeCgTypeId type) { size_t i; - for (i = p->type_table.count; i > 0; --i) { - if (p->type_table.named[i - 1].type == type) - return &p->type_table.named[i - 1]; + for (i = p->module->type_table.count; i > 0; --i) { + if (p->module->type_table.named[i - 1].type == type) + return &p->module->type_table.named[i - 1]; } return NULL; } @@ -771,31 +773,41 @@ int toy_add_named_type(ToyParser* p, CfreeSym name, CfreeCgTypeId type, toy_type_register_named_record(p, name, type, kind == TOY_NAMED_TUPLE); if (toy_type == TOY_TYPE_NONE && type != CFREE_CG_TYPE_NONE) return 0; if (existing && existing->type == CFREE_CG_TYPE_NONE) { + /* Completing a forward declaration mutates an existing entry in place; if + * it was committed before this compile, snapshot it for rollback. */ + if (!toy_txn_record_named( + p, (size_t)(existing - p->module->type_table.named))) { + return 0; + } existing->type = type; existing->toy_type = toy_type; existing->kind = kind; existing->base_type = base_type; return 1; } - if (!toy_parser_reserve(p, (void**)&p->type_table.named, &p->type_table.cap, - p->type_table.count + 1u, sizeof *p->type_table.named, + if (!toy_parser_reserve(p, (void**)&p->module->type_table.named, &p->module->type_table.cap, + p->module->type_table.count + 1u, sizeof *p->module->type_table.named, "named types")) { return 0; } - memset(&p->type_table.named[p->type_table.count], 0, - sizeof p->type_table.named[p->type_table.count]); - p->type_table.named[p->type_table.count].name = name; - p->type_table.named[p->type_table.count].type = type; - p->type_table.named[p->type_table.count].toy_type = toy_type; - p->type_table.named[p->type_table.count].kind = kind; - p->type_table.named[p->type_table.count].base_type = base_type; - p->type_table.count++; + 
memset(&p->module->type_table.named[p->module->type_table.count], 0, + sizeof p->module->type_table.named[p->module->type_table.count]); + p->module->type_table.named[p->module->type_table.count].name = name; + p->module->type_table.named[p->module->type_table.count].type = type; + p->module->type_table.named[p->module->type_table.count].toy_type = toy_type; + p->module->type_table.named[p->module->type_table.count].kind = kind; + p->module->type_table.named[p->module->type_table.count].base_type = base_type; + p->module->type_table.count++; return 1; } int toy_set_named_type_fields(ToyParser* p, ToyNamedType* named, const ToyRecordFieldInfo* fields, size_t nfields) { + if (!toy_txn_record_named(p, + (size_t)(named - p->module->type_table.named))) { + return 0; + } if (!toy_parser_reserve(p, (void**)&named->fields, &named->cap_fields, nfields, sizeof *named->fields, "record fields")) { return 0; @@ -808,6 +820,10 @@ int toy_set_named_type_fields(ToyParser* p, ToyNamedType* named, int toy_set_named_type_enum_values(ToyParser* p, ToyNamedType* named, const ToyEnumConst* values, size_t nvalues) { if (!named) return 0; + if (!toy_txn_record_named(p, + (size_t)(named - p->module->type_table.named))) { + return 0; + } if (!toy_parser_reserve(p, (void**)&named->enum_values, &named->cap_enum_values, nvalues, sizeof *named->enum_values, "enum values")) { diff --git a/lang/wasm/wasm.c b/lang/wasm/wasm.c @@ -57,6 +57,8 @@ const CfreeFrontendVTable cfree_wasm_frontend_vtable = { wasm_frontend_free, wasm_extensions, (uint32_t)(sizeof wasm_extensions / sizeof wasm_extensions[0]), + NULL, /* commit: wasm has no durable cross-compile state */ + NULL, /* abort */ }; CFREE_API int cfree_wasm_wat_to_wasm(CfreeCompiler* c, const CfreeSlice* input, diff --git a/src/api/compile.c b/src/api/compile.c @@ -47,6 +47,8 @@ const CfreeFrontendVTable cfree_asm_frontend_vtable = { asm_frontend_free, asm_extensions, (uint32_t)(sizeof asm_extensions / sizeof asm_extensions[0]), + NULL, /* commit: 
asm has no durable cross-compile state */ + NULL, /* abort */ }; static SrcLoc no_loc(void) { @@ -152,12 +154,10 @@ static const CfreeFrontendVTable* frontend_for_language(Compiler* c, } static void validate_bytes(Compiler* c, const CfreeSourceInput* in); -static void compile_frontend_state_into(Compiler* c, - const CfreeFrontendVTable* vtable, - CfreeFrontendState* frontend, - const CfreeFrontendCompileOptions* opts, - const CfreeSourceInput* input, - ObjBuilder* ob); +static CfreeStatus compile_frontend_state_into( + Compiler* c, const CfreeFrontendVTable* vtable, CfreeFrontendState* frontend, + const CfreeFrontendCompileOptions* opts, const CfreeSourceInput* input, + ObjBuilder* ob); static CfreeStatus cfree_frontend_new(CfreeCompiler* c, CfreeLanguage lang, CfreeFrontend** out) { @@ -188,11 +188,24 @@ static CfreeStatus cfree_frontend_new(CfreeCompiler* c, CfreeLanguage lang, return CFREE_OK; } +static void cfree_frontend_commit(CfreeFrontend* frontend) { + if (frontend && frontend->vtable && frontend->vtable->commit) { + frontend->vtable->commit(frontend->state); + } +} + +static void cfree_frontend_abort(CfreeFrontend* frontend) { + if (frontend && frontend->vtable && frontend->vtable->abort) { + frontend->vtable->abort(frontend->state); + } +} + static CfreeStatus cfree_frontend_compile( CfreeFrontend* frontend, const CfreeFrontendCompileOptions* opts, const CfreeSourceInput* input, CfreeObjBuilder* out) { Compiler* c; PanicSave saved; + CfreeStatus st; if (!frontend || !frontend->c || !frontend->vtable || !frontend->state || !opts || !input || !out) { @@ -202,18 +215,27 @@ static CfreeStatus cfree_frontend_compile( c = (Compiler*)frontend->c; compiler_panic_save(c, &saved); if (setjmp(c->panic)) { + /* A genuine internal panic (CG/backend) longjmp'd here. Run cleanups, then + * roll back any durable frontend state staged during this compile, and + * propagate as a soft error. 
Ordinary diagnostic failures do NOT take this + * path: the frontend returns CFREE_ERR below without panicking. */ compiler_run_cleanups(c); + cfree_frontend_abort(frontend); compiler_panic_restore(c, &saved); return CFREE_ERR; } validate_bytes(c, input); metrics_scope_begin(c, "compile.tu"); metrics_count(c, "compile.input_bytes", (u64)input->bytes.len); - compile_frontend_state_into(c, frontend->vtable, frontend->state, opts, input, - (ObjBuilder*)out); + st = compile_frontend_state_into(c, frontend->vtable, frontend->state, opts, + input, (ObjBuilder*)out); metrics_scope_end(c, "compile.tu"); + /* On a soft diagnostic failure, roll back the staged transaction here so the + * frontend is left exactly as it was before this compile. On success the + * transaction is left open for the caller to commit or abort. */ + if (st != CFREE_OK) cfree_frontend_abort(frontend); compiler_panic_restore(c, &saved); - return CFREE_OK; + return st; } static void cfree_frontend_free(CfreeFrontend* frontend) { @@ -256,9 +278,14 @@ CfreeStatus cfree_compile_session_new(CfreeCompiler* c, return CFREE_OK; } -CfreeStatus cfree_compile_session_compile(CfreeCompileSession* s, - const CfreeSourceInput* input, - CfreeObjBuilder** out) { +/* Shared compile path. On failure the frontend transaction has already been + * rolled back by cfree_frontend_compile and *out is NULL. On success, when + * commit_on_success is set (the batch path), the transaction is committed + * before returning; otherwise it is left open for the caller to resolve. 
*/ +static CfreeStatus compile_session_run(CfreeCompileSession* s, + const CfreeSourceInput* input, + CfreeObjBuilder** out, + int commit_on_success) { ObjBuilder* ob; CfreeFrontendCompileOptions opts; CfreeStatus st; @@ -277,10 +304,31 @@ CfreeStatus cfree_compile_session_compile(CfreeCompileSession* s, obj_free(ob); return st; } + if (commit_on_success) cfree_frontend_commit(s->frontend); *out = (CfreeObjBuilder*)ob; return CFREE_OK; } +CfreeStatus cfree_compile_session_compile(CfreeCompileSession* s, + const CfreeSourceInput* input, + CfreeObjBuilder** out) { + return compile_session_run(s, input, out, /*commit_on_success=*/1); +} + +CfreeStatus cfree_compile_session_stage(CfreeCompileSession* s, + const CfreeSourceInput* input, + CfreeObjBuilder** out) { + return compile_session_run(s, input, out, /*commit_on_success=*/0); +} + +void cfree_compile_session_commit(CfreeCompileSession* s) { + if (s && s->frontend) cfree_frontend_commit(s->frontend); +} + +void cfree_compile_session_abort(CfreeCompileSession* s) { + if (s && s->frontend) cfree_frontend_abort(s->frontend); +} + void cfree_compile_session_free(CfreeCompileSession* s) { Heap* h; if (!s) return; @@ -289,27 +337,27 @@ void cfree_compile_session_free(CfreeCompileSession* s) { h->free(h, s, sizeof(*s)); } -static void compile_frontend_state_into(Compiler* c, - const CfreeFrontendVTable* vtable, - CfreeFrontendState* frontend, - const CfreeFrontendCompileOptions* opts, - const CfreeSourceInput* input, - ObjBuilder* ob) { +static CfreeStatus compile_frontend_state_into( + Compiler* c, const CfreeFrontendVTable* vtable, CfreeFrontendState* frontend, + const CfreeFrontendCompileOptions* opts, const CfreeSourceInput* input, + ObjBuilder* ob) { CfreeStatus st; metrics_scope_begin(c, "compile.frontend"); st = vtable->compile(frontend, opts, input, ob); metrics_scope_end(c, "compile.frontend"); - if (st != CFREE_OK) { - compiler_panic(c, no_loc(), "frontend failed for input: %.*s", - SLICE_ARG(input->name)); - } 
+  /* Ordinary diagnostic failure: fail softly with the status the frontend
+   * already reported. No synthetic fatal, and do not finalize a half-built
+   * object. Genuine internal failures panic from inside vtable->compile and
+   * never reach here. */
+  if (st != CFREE_OK) return st;
   metrics_scope_begin(c, "compile.obj_finalize");
   obj_finalize(ob);
   metrics_scope_end(c, "compile.obj_finalize");
   metrics_count(c, "compile.obj_sections", obj_section_count(ob));
   metrics_count(c, "compile.obj_relocs", obj_reloc_total(ob));
+  return CFREE_OK;
 }
 /* ============================================================
diff --git a/src/link/link_jit.c b/src/link/link_jit.c
@@ -1411,7 +1411,14 @@ CfreeStatus cfree_jit_publish(CfreeJit* jit, const CfreeJitPublishOptions* opts,
   c = jit->c;
   compiler_panic_save(c, &saved);
   if (setjmp(c->panic)) {
+    /* A failed append (e.g. duplicate global) panics from
+     * jit_append_obj_inner. compiler_run_cleanups fires the borrowed link
+     * session's deferred linker_cleanup, which frees its Linker; null the
+     * session's references so the caller's cfree_link_session_free does not
+     * double-free. Mirrors link_session_guard's recovery. */
     compiler_run_cleanups(c);
+    link->linker = NULL;
+    link->image = NULL;
     compiler_panic_restore(c, &saved);
     return CFREE_ERR;
   }
diff --git a/test/dbg/cases/toy-error-recovery/stderr b/test/dbg/cases/toy-error-recovery/stderr
@@ -1,2 +1,2 @@
-<dbg-jit.toy>:2:15: error: expected expression
+<dbg-jit.toy>:2:14: error: expected expression
 dbg: jit compile failed
diff --git a/test/dbg/cases/toy-error-recovery/xfail b/test/dbg/cases/toy-error-recovery/xfail
@@ -1 +0,0 @@
-Toy frontend state is poisoned instead of rolled back after a failed snippet.
diff --git a/test/dbg/cases/toy-redefine-function/args b/test/dbg/cases/toy-redefine-function/args
@@ -0,0 +1,2 @@
+--language
+toy
diff --git a/test/dbg/cases/toy-redefine-function/expected b/test/dbg/cases/toy-redefine-function/expected
@@ -0,0 +1,3 @@
+cfree dbg — 'h' for help, 'q' to quit
+$1 = 42 (0x2a)
+$2 = 42 (0x2a)
diff --git a/test/dbg/cases/toy-redefine-function/stderr b/test/dbg/cases/toy-redefine-function/stderr
@@ -0,0 +1,2 @@
+fatal: cfree_jit_append_obj: duplicate global 'twice'
+dbg: jit append failed
diff --git a/test/dbg/cases/toy-redefine-function/stdin b/test/dbg/cases/toy-redefine-function/stdin
@@ -0,0 +1,5 @@
+jit { fn twice(v: i64): i64 { return v * 2; } }
+twice(21)
+jit { fn twice(v: i64): i64 { return v * 3; } }
+twice(21)
+q
diff --git a/test/dbg/cases/toy-rollback-after-define/args b/test/dbg/cases/toy-rollback-after-define/args
@@ -0,0 +1,2 @@
+--language
+toy
diff --git a/test/dbg/cases/toy-rollback-after-define/expected b/test/dbg/cases/toy-rollback-after-define/expected
@@ -0,0 +1,2 @@
+cfree dbg — 'h' for help, 'q' to quit
+$1 = 10 (0xa)
diff --git a/test/dbg/cases/toy-rollback-after-define/stderr b/test/dbg/cases/toy-rollback-after-define/stderr
@@ -0,0 +1,2 @@
+<dbg-jit-2.toy>:1:29: error: expected expression
+dbg: jit compile failed
diff --git a/test/dbg/cases/toy-rollback-after-define/stdin b/test/dbg/cases/toy-rollback-after-define/stdin
@@ -0,0 +1,4 @@
+jit { fn f(): i64 { return 10; } }
+jit { fn g(): i64 { return f() + ; } }
+f()
+q
diff --git a/test/dbg/cases/toy-rollback-toplevel/args b/test/dbg/cases/toy-rollback-toplevel/args
@@ -0,0 +1,2 @@
+--language
+toy
diff --git a/test/dbg/cases/toy-rollback-toplevel/expected b/test/dbg/cases/toy-rollback-toplevel/expected
@@ -0,0 +1,4 @@
+cfree dbg — 'h' for help, 'q' to quit
+$1 = 42 (0x2a)
+$2 = 42 (0x2a)
+$3 = 42 (0x2a)
diff --git a/test/dbg/cases/toy-rollback-toplevel/stderr b/test/dbg/cases/toy-rollback-toplevel/stderr
@@ -0,0 +1,2 @@
+<dbg-jit-2.toy>:1:29: error: expected expression
+dbg: jit compile failed
diff --git a/test/dbg/cases/toy-rollback-toplevel/stdin b/test/dbg/cases/toy-rollback-toplevel/stdin
@@ -0,0 +1,7 @@
+jit { fn twice(v: i64): i64 { return v * 2; } }
+twice(21)
+jit { fn bad(): i64 { return 1 + ; } }
+twice(21)
+jit { fn add(a: i64, b: i64): i64 { return a + b; } }
+add(20, 22)
+q
diff --git a/test/dbg/cases/toy-rollback-type/args b/test/dbg/cases/toy-rollback-type/args
@@ -0,0 +1,2 @@
+--language
+toy
diff --git a/test/dbg/cases/toy-rollback-type/expected b/test/dbg/cases/toy-rollback-type/expected
@@ -0,0 +1,2 @@
+cfree dbg — 'h' for help, 'q' to quit
+$1 = 42 (0x2a)
diff --git a/test/dbg/cases/toy-rollback-type/stderr b/test/dbg/cases/toy-rollback-type/stderr
@@ -0,0 +1,2 @@
+<dbg-jit-1.toy>:1:62: error: expected expression
+dbg: jit compile failed
diff --git a/test/dbg/cases/toy-rollback-type/stdin b/test/dbg/cases/toy-rollback-type/stdin
@@ -0,0 +1,5 @@
+jit { record Point { x: i64, y: i64, } fn bad(): i64 { return 1 + ; } }
+jit { record Point { x: i64, y: i64, } }
+jit { fn sum(): i64 { let p: Point = Point { x: 40, y: 2 }; return p.x + p.y; } }
+sum()
+q
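
Reviewer note: the rollback scheme in the commit message (watermark the append-only tables, snapshot in-place completions of forward declarations into an undo journal, truncate and restore on abort) can be illustrated with a small standalone sketch. This is not the code from this commit: every Demo* name below is hypothetical, the tables are fixed-size for brevity, and the real toy_txn_record_named / toy_txn_record_type bodies are not shown in this part of the diff, so their actual behavior may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of a durable module table. Entries are appended,
 * and a forward declaration can later be completed in place. */
enum { DEMO_CAP = 16, DEMO_UNDO_CAP = 16 };

typedef struct {
  int name;     /* stand-in for an interned symbol */
  int complete; /* 0 = forward declaration, 1 = completed */
} DemoEntry;

typedef struct {
  size_t index;     /* which committed entry was mutated */
  DemoEntry before; /* snapshot taken before the first mutation */
} DemoUndo;

typedef struct {
  DemoEntry entries[DEMO_CAP];
  size_t count;
  size_t watermark; /* value of count when the transaction began */
  DemoUndo undo[DEMO_UNDO_CAP];
  size_t nundo;
} DemoModule;

static void demo_begin(DemoModule* m) {
  m->watermark = m->count;
  m->nundo = 0;
}

/* Snapshot a committed entry before it is mutated in place. Entries at or
 * above the watermark were appended by this transaction and disappear on
 * truncation, so they need no snapshot. */
static int demo_record(DemoModule* m, size_t index) {
  size_t i;
  if (index >= m->watermark) return 1;
  for (i = 0; i < m->nundo; ++i) {
    if (m->undo[i].index == index) return 1; /* keep the oldest snapshot */
  }
  if (m->nundo == DEMO_UNDO_CAP) return 0;
  m->undo[m->nundo].index = index;
  m->undo[m->nundo].before = m->entries[index];
  m->nundo++;
  return 1;
}

static int demo_append(DemoModule* m, int name, int complete) {
  if (m->count == DEMO_CAP) return 0;
  m->entries[m->count].name = name;
  m->entries[m->count].complete = complete;
  m->count++;
  return 1;
}

/* Complete a forward declaration, journaling the old value first. */
static int demo_complete(DemoModule* m, size_t index) {
  if (index >= m->count) return 0;
  if (!demo_record(m, index)) return 0;
  m->entries[index].complete = 1;
  return 1;
}

static void demo_commit(DemoModule* m) {
  m->watermark = m->count;
  m->nundo = 0;
}

/* Undo in-place edits in reverse order, then truncate to the watermark. */
static void demo_abort(DemoModule* m) {
  size_t i;
  for (i = m->nundo; i > 0; --i) {
    m->entries[m->undo[i - 1].index] = m->undo[i - 1].before;
  }
  m->count = m->watermark;
  m->nundo = 0;
}
```

In this sketch a caller would run demo_begin, stage appends and completions, and then call demo_commit only once the downstream publish step has succeeded, or demo_abort otherwise. That is the same ordering the commit message describes for gating the frontend commit on JIT-publish success.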