commit 2686dfe936ae06310cd00dd61166417df8b3b8a7
parent 1806e4076ddd06706caa95e512c54844869694b9
Author: Ryan Sepassi <rsepassi@gmail.com>
Date: Sat, 30 May 2026 18:25:20 -0700
doc: file-based incremental linking design (obj+link internals)
Diffstat:
1 file changed, 281 insertions(+), 99 deletions(-)
diff --git a/doc/INCREMENTAL_OBJLINK.md b/doc/INCREMENTAL_OBJLINK.md
@@ -66,12 +66,13 @@ Plus: `link_resolve_at`/`link_resolve_extend` are panic stubs
`LinkRelocApply` records that *produce* those writes are preserved as data first
(invariant, internal `src/link/link.h:234-246`).
-**Benchmark (true shape).** `tmp/projects/lua`: 35 `.c` files; the Makefile
-compiles **32 objects** (CORE_O=20 + LIB_O=12) into `liblua.a`, then links **two**
-executables — `lua` and `luac` — that share the archive. So the substrate must
-model (a) archive members as link inputs and (b) one edited TU fanning out to
-multiple final images. `sqlite-amalg` (1 huge TU) and `yyjson` (1 TU) exercise
-the single-TU degenerate case.
+**Benchmark shape.** The substrate must model (a) archive members as link inputs
+and (b) one edited TU fanning out to multiple final images. The acceptance suite
+uses a small **synthetic** multi-TU fixture for exactly this (§19.2) — a handful
+of core TUs archived into a static library, linked into two executables that
+share it — rather than vendoring a third-party project as a test dependency.
+Real codebases (amalgamations like sqlite, multi-TU libraries like lua) remain
+useful as *later* wall-clock perf targets, but are deliberately not test deps.
---
@@ -215,7 +216,8 @@ typedef struct LinkSession {
u64 cursor[SEG_NBUCKETS]; /* append cursor per class (from JIT) */
u64 limit[SEG_NBUCKETS]; /* reserved ceiling per class */
LinkFreeList free[SEG_NBUCKETS]; /* vacated slots, first-fit reuse */
- u32 slack_pct; /* per-atom reserve, default 10% */
+ u32 code_slack_pct; /* per-code-atom reserve; modest (code can relocate) */
+ u32 data_slack_pct; /* per-data-atom reserve; generous (data-grow forces fallback) */
/* atom placement table, keyed by content_id; the persisted core (§10) */
LinkAtomPlace* atoms; u32 natoms;
} LinkSession;
@@ -291,10 +293,14 @@ control flow.
## 8. Placement, slack, and the move-on-grow primitive
**Slack.** Today sections are contiguous with only alignment padding
-(`link_layout.c:340-348`). Under `--incremental`, reserve per-atom slack
-(`slack_pct`, gold's `--incremental-patch=n` analog) so overwrite-in-place is the
-common case. A two-level free-list (one of free file blocks, one per segment
-bucket) recycles vacated slots, first-fit.
+(`link_layout.c:340-348`). Under `--incremental`, reserve per-atom slack so
+overwrite-in-place is the common case. **Code and data get separate, tunable
+budgets** (decision §20.2): `code_slack_pct` is modest because code atoms can
+relocate cheaply (§8.1), while `data_slack_pct` is more generous because a data
+atom that outgrows its slot forces a full-link fallback (data can't be thunked).
+Both default sensibly and are overridable via a link option (gold's
+`--incremental-patch=n` style). A two-level free-list (one of free file blocks,
+one per segment bucket) recycles vacated slots, first-fit.
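
As a rough illustration of the separate budgets, the sketch below computes a slot capacity from an atom's size and a slack percentage. The helper names (`align_up`, `slot_capacity`, `fits_in_slot`), the round-up rule, and any concrete percentages fed to it are assumptions for illustration, not the linker's actual code or defaults.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: round v up to a power-of-two alignment. */
static uint64_t align_up(uint64_t v, uint64_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    return (v + align - 1) & ~(align - 1);
}

/* Hypothetical: capacity reserved for an atom is its size plus slack_pct
   percent of that size (rounded up), then rounded up to the alignment.
   A code atom would pass code_slack_pct, a data atom data_slack_pct. */
static uint64_t slot_capacity(uint64_t size, uint32_t slack_pct, uint64_t align) {
    uint64_t slack = (size * slack_pct + 99) / 100;
    return align_up(size + slack, align);
}

/* An edited atom can be overwritten in place only if it still fits. */
static bool fits_in_slot(uint64_t new_size, uint64_t capacity) {
    return new_size <= capacity;
}
```

Under these assumptions, a code atom that no longer fits would go to the move primitive, while a data atom that no longer fits would trigger the full-link fallback.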
**The move primitive — swappable.** When an atom moves, callers must still reach
it without their bytes changing. Abstract this as one hook with two
@@ -412,11 +418,14 @@ off) are canonical and reproducible.
- **Stability (falsifiable):** after a patch, `nm`/`addr2line` on an *unchanged*
symbol must return the identical vaddr as before. Enforced by
overwrite-in-slack / append-to-free-slot, never compact.
-- **Determinism audit (prerequisite for dedup, not correctness):** confirm that
- identical `(source, flags, target)` yields byte-identical objects — audit
- symbol ordering and `pool_intern` first-access order in obj emit. With
- content/name keying (§10) a nondeterministic order only costs cache dedup, not
- a wrong patch; but byte-stability is still wanted so two machines agree.
+- **Determinism (decision §20.4 — lock with a test, keep content-keying):** obj
+ emission is *already* byte-deterministic — sections/symbols/relocs emit in
+ insertion order, `.strtab` dedups by linear search, and there are no
+ timestamps, embedded addresses, hash-map iteration, or threading in the emit
+ path (`src/obj/elf/emit.c:298,386,505`). Lock this with a regression test (two
+ compiles ⇒ identical bytes), which enables cross-machine / shared-cache dedup.
+ Content/name keying (§10) remains the *correctness* backbone: if a future
+ change ever reintroduces nondeterminism, it degrades dedup, never correctness.
- **Reloc re-derivation:** never store an absolute `write_vaddr`; always
`atom.vaddr + offset_within_atom` (principle 4).
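
The re-derivation rule can be sketched as follows. The struct shapes and field names here (`AtomPlace`, `RelocSite`) are hypothetical stand-ins; the real `LinkAtomPlace` and `LinkRelocApply` records may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical placement record: where an atom currently lives. */
typedef struct { uint64_t vaddr; uint64_t size; } AtomPlace;

/* Hypothetical reloc site: stored relative to its atom, never absolute. */
typedef struct { uint32_t atom; uint64_t offset_within_atom; } RelocSite;

/* Derive the write address from the atom's current placement, so a moved
   atom needs no per-reloc fixup of stored addresses. */
static uint64_t reloc_write_vaddr(const AtomPlace *atoms, const RelocSite *r) {
    const AtomPlace *a = &atoms[r->atom];
    assert(r->offset_within_atom < a->size);
    return a->vaddr + r->offset_within_atom;
}
```

If the atom moves, only `AtomPlace.vaddr` is updated and every site inside it resolves to the new location on the next call.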
@@ -424,27 +433,61 @@ off) are canonical and reproducible.
## 13. Debug info (DWARF) consistency
-- A moved atom's `.debug_info`/`.debug_line`/`.debug_aranges` address ranges
- change → reapply that atom's debug relocs (re-derived like §9). Unchanged
- atoms' debug stays byte-stable because their addresses do.
-- v1 stance: rebuild only the *changed TU's* debug sections, `O(change)`. An
- in-slack overwrite that does not move the atom leaves addresses (and therefore
- `.debug_line` byte content) unchanged — free, but see the open question on
- line-number-only shifts (§20).
-- `addr2line` and `cfree dbg` re-read debug from the patched image. The JIT path
- invalidates a cached view by generation counter; a file consumer re-reads the
- file, so the build-id change (§11) is the staleness signal.
+**Decision §20.1: regenerate the changed TU's debug on any body change.**
+Investigation (`src/debug/debug_emit.c:823-905`, `:650-703`) established that
+debug update is **`O(changed TU)`, not `O(atom)`**, and that skipping it is
+*incoherent*:
+
+- cfree emits **one monolithic `.debug_line` program per CU** and a single
+ `.debug_info` CU whose DIEs reference each other by intra-CU `DW_FORM_ref4`
+ offsets. You cannot splice one function's line rows or patch one subprogram DIE
+ in isolation — a change shifts subsequent offsets across the CU.
+- A body change rewrites the instruction→line mapping **regardless of whether
+ the atom moved** (an in-slack overwrite has the same vaddr but different
+ instruction offsets). So "keep `.debug_line` because the address didn't change"
+ is wrong — there is no correct skip.
+- Therefore: on any changed atom, **re-emit that TU's full `.debug_*`**. This is
+ `O(changed TU)`, which is fine — unchanged TUs' debug is byte-stable because
+ their atoms keep their addresses, and one TU's debug regen is cheap relative to
+ the rest of the patch. Address-bearing fields (`DW_AT_low_pc`, `.debug_aranges`,
+ `.debug_rnglists`) re-derive from current placements (§9); `DW_AT_high_pc` is a
+ size, not an address.
+- `addr2line` and `cfree dbg` re-read debug from the patched image; the build-id
+ change (§11) is the staleness signal for file consumers.
+- *Future option (not pursued now):* per-function CUs / split line programs would
+ make debug `O(atom)`, but the gain is marginal versus the per-TU regen cost.
---
## 14. Multi-format & multi-arch
-- **ELF first.** The atom + slack + move-primitive model is format-agnostic, but
- Mach-O carries whole-image structures (chained fixups, `LC_DYLD_INFO`,
- code-signature, `LC_UUID`) that resist in-place patching; enumerate which load
- commands must be regenerated before attempting Mach-O incremental. COFF later.
- Persisted state is side-band CAS for all three (§10), so no per-format on-disk
- incremental metadata.
+- **No format is fundamentally unsuitable (decision §20.6); they differ only in
+  how much machinery each needs, so: ELF first, then COFF, then Mach-O.**
+ move-primitive core is format-agnostic; the persisted state is side-band CAS
+ for all three (§10). Per-format cost:
+ - **ELF — least machinery (first).** Relocations are side data (`.rela.*`)
+ applied at emit, the symtab is flat fixed-offset records (patch the changed
+ entry in place), section headers are file-only, signing is optional. An atom
+ move touches only its own symbol's `st_value` and the relocs targeting it.
+ - **COFF/PE — incremental-friendly (second).** PE is the canonical incremental
+ target (MSVC `/INCREMENTAL` + `.ilk`): imported calls indirect through the
+ IAT, base relocs are simple per-page RVA lists, debug lives in a separate
+ PDB, and Authenticode is optional. The practical gate is cfree's COFF
+ maturity, not the format. (Reasoned from the PE/MSVC precedent, not a code
+ dive of cfree's COFF link.)
+ - **Mach-O — heaviest but feasible (last).** `__LINKEDIT` folds loader fixup
+ metadata into compact whole-image structures co-designed with a *mandatory*
+ dyld: chained fixups (`LC_DYLD_CHAINED_FIXUPS`), the export trie, the
+ indirect symtab. Each needs its own incremental updater — but they are
+ **bounded, not `O(image)`**: a code-atom move updates only the chained-fixup
+ *slots pointing at it* (chain `next`-links don't move unless pointer slots
+ move), and the symtab patch is `O(changed symbols)` like ELF (an earlier
+ survey overstated these as whole-image). The one real floor is **mandatory
+ code signing** on Apple Silicon: every patch must re-sign, and the
+ CodeDirectory is a per-4KiB-page hash array — naively `O(image)`, but
+ cacheable to `O(changed pages)` by retaining unchanged pages' hashes.
+ - Until a format's updater lands, that format falls back to the fast in-process
+ full link.
- **Per-arch surface is small:** only (a) the move primitive's island/cell shape
and (b) the branch-into-island/cell reloc kind. aa64 has the jit-stub shape to
reuse; x64 (`src/obj/x64/link.c:40`) and rv64 each have a trampoline shape to
@@ -496,28 +539,66 @@ out of caching; Toy's **batch/file** compile conforms like any other frontend.
## 16. The interface boundary the build system consumes
-The separate build-system plan (build graph, cache, watch) calls only this
-public surface; it never touches `src/link` internals.
+**Incrementality is not a parallel API — it is the existing `CfreeLinkSession`
+made fully mutable.** A full link is the degenerate cold case (no prior state,
+nothing replaced); an incremental relink seeds prior state and replaces the
+changed inputs. The build system always drives *the same* session, and
+`resolve` internally decides patch-vs-full and reports which — there is no
+separate "incremental" entry point to keep in sync with the full-link path. This
+matches the internal direction (`link_resolve` is "inputs → image";
+`link_resolve_at`/`extend`, `link.c:629,638`, make that `resolve` extend-capable).
+
+Changes from today's surface (`new`/`add_obj…`/`resolve`/`emit`/`jit`/`free`,
+`link.h:189-207`) — all **additive**:
```c
-/* include/cfree/object.h */
+/* include/cfree/object.h — object identity */
CfreeStatus cfree_obj_content_id(CfreeObjBuilder*, uint8_t out[CFREE_BLAKE2B_LEN]);
-/* include/cfree/link.h — new incremental session surface */
-typedef enum { CFREE_LINK_PATCHED, CFREE_LINK_FELL_BACK_FULL } CfreeLinkOutcome;
+/* include/cfree/link.h */
+typedef enum { CFREE_LINK_FULL, /* cold: no prior state */
+ CFREE_LINK_PATCHED, /* incremental fast path applied */
+ CFREE_LINK_FELL_BACK_FULL /* was incremental, gate forced full */
+} CfreeLinkOutcome;
-CfreeStatus cfree_link_session_open_incremental(CfreeLinkSession*,
- const void* persisted, size_t persisted_len); /* NULL = cold */
+/* add_* gain a stable input handle so a live session can mutate one slot;
+ the handle is optional (NULL) for the cold/file path. */
+CfreeStatus cfree_link_session_add_obj(CfreeLinkSession*, CfreeObjBuilder*,
+ CfreeLinkInputId* out /*nullable*/);
CfreeStatus cfree_link_session_replace_input(CfreeLinkSession*, CfreeLinkInputId,
- CfreeObjBuilder* changed); /* by content */
-CfreeStatus cfree_link_session_patch_emit(CfreeLinkSession*, CfreeWriter* image,
- CfreeWriter* persisted_out, CfreeLinkOutcome* outcome);
+ CfreeObjBuilder* changed);
+CfreeStatus cfree_link_session_remove_input(CfreeLinkSession*, CfreeLinkInputId);
+
+/* seed prior incremental state (opaque bytes); unset/empty => cold full link. */
+CfreeStatus cfree_link_session_set_prior_state(CfreeLinkSession*, const CfreeSlice*);
+
+/* resolve() and emit() are the SAME calls, now incremental-aware:
+ resolve reconciles the CURRENT input set against prior state BY CONTENT (§10)
+ — unchanged atoms reuse placement, changed atoms patch, non-local edits fall
+ back to a full re-resolve. Idempotent: safe to re-call after mutations. */
+/* CfreeStatus cfree_link_session_resolve(CfreeLinkSession*); (existing) */
+/* CfreeStatus cfree_link_session_emit(CfreeLinkSession*, CfreeWriter*); (existing) */
+
+/* emit the new persisted incremental state (opaque); query the last outcome. */
+CfreeStatus cfree_link_session_serialize_state(CfreeLinkSession*, CfreeWriter*);
+CfreeStatus cfree_link_session_outcome(CfreeLinkSession*, CfreeLinkOutcome* out);
```
-The build system supplies changed objects (it decides *which* via its cache),
-gets back the patched image, the new persisted blob, and — crucially — the
-**outcome** so it knows whether the fast path applied or the link fell back. The
-object content id lets it detect "this TU's object is byte-identical, skip it."
+- **Cold full link (today, unchanged):** `new → add_obj… → resolve → emit`.
+- **Incremental relink:** `new → set_prior_state(blob) → add_obj…/replace_input →
+ resolve → emit + serialize_state`, then read `outcome`.
+
+Because reconciliation is **by content hash** (§10, decision §20.4), the
+cross-process path needs no stable-id continuity: a fresh session seeded with
+prior state matches re-added inputs to prior placements by content.
+`replace_input`/`remove_input` are a live-session (daemon) convenience for
+slot-precise mutation; they still match by content underneath.
+
+**Ownership (decision §20.5):** the build system owns the persisted blob's key,
+CAS storage, and lifetime; the session only reads it via `set_prior_state` and
+writes it via `serialize_state` as opaque bytes through `CfreeWriter`. libcfree
+does no file IO/CAS (both are driver-only — `driver/dist`; libcfree reads bytes
+via `Compiler.env->file_io`, `src/link/link.h:150`).
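
To make the patch-vs-full decision concrete, here is a toy model of content-based reconciliation. Everything in it is an assumption for illustration: the `Toy*` names, the 64-bit stand-in for a BLAKE2b content id, the simplification that any change to the symbol set falls back, and reporting a no-op relink as `PATCHED` with zero changed atoms. It is not the session's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef enum { TOY_LINK_FULL, TOY_LINK_PATCHED, TOY_LINK_FELL_BACK_FULL } ToyOutcome;

/* Hypothetical atom record: unique symbol name plus a stand-in content id. */
typedef struct { const char *name; uint64_t content_id; } ToyAtom;

static const ToyAtom *toy_find(const ToyAtom *set, size_t n, const char *name) {
    for (size_t i = 0; i < n; i++)
        if (strcmp(set[i].name, name) == 0) return &set[i];
    return NULL;
}

/* No prior state => cold full link. A changed symbol set => fall back.
   Otherwise patch, and report how many atoms changed content. */
static ToyOutcome toy_reconcile(const ToyAtom *prior, size_t nprior,
                                const ToyAtom *cur, size_t ncur,
                                size_t *nchanged) {
    assert(nchanged != NULL);
    *nchanged = 0;
    if (prior == NULL || nprior == 0) return TOY_LINK_FULL;
    if (nprior != ncur) return TOY_LINK_FELL_BACK_FULL;
    size_t changed = 0;
    for (size_t i = 0; i < ncur; i++) {
        const ToyAtom *p = toy_find(prior, nprior, cur[i].name);
        if (p == NULL) return TOY_LINK_FELL_BACK_FULL; /* added global */
        if (p->content_id != cur[i].content_id) changed++;
    }
    *nchanged = changed;
    return TOY_LINK_PATCHED;
}
```

Matching is by name and content only, so a fresh process seeded with the prior blob reaches the same decision as a live session would.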
---
@@ -532,73 +613,174 @@ A patch is all-or-nothing from the consumer's view:
---
-## 18. Implementation sequence
+## 18. Implementation sequence (acceptance-test-first, red → green)
+
+The **first phase builds the acceptance suite** (§19), which encodes "done for
+ELF" as an executable spec. It starts fully red; each milestone below drives a
+named set of its scenarios green. Each milestone also has its own narrow
+red-green unit cycles (listed inline); the acceptance scenarios are the
+integration capstones.
+
+**Phase 0 (first) — author the ELF acceptance suite, RED.**
+Land the additive public surface (§16) as not-implemented stubs:
+`_set_prior_state` / `_replace_input` / `_remove_input` / `_serialize_state` /
+`_outcome`, the input-id out-param on `add_obj`, the `CfreeLinkOutcome` enum, and
+`cfree_obj_content_id`. Each stub function returns a "not implemented" status,
+and `resolve`/`emit` initially do only the cold full link (mirroring the existing
+`link_resolve_at`/`extend` panic stubs, `link.c:629,638`), so the suite compiles
+and links. Then write `test/link-incremental/` with the §19 scenarios A–F, the
+synthetic fixture build, and the `link_resolve` whole-program instrumentation. Every
+scenario is red. This nails the spec before any implementation and is the red
+baseline. (No parallel "incremental" surface — §16: the one mutable session.)
**M0 — atom identity & obj indices (no behavior change).**
`obj_content_id` / `obj_atom_content_id`, per-atom reloc index, symbol-by-name
-hash, deterministic round-trip + a determinism test. Wire `CfreeFrontendCaps`
-and the contract (C deps via `CfreeDepIter`; trivial for others).
+hash, deterministic round-trip, `CfreeFrontendCaps` (C deps via `CfreeDepIter`;
+trivial for others). → **turns Scenario E (determinism) green** and provides the
+`obj_content_id` the harness keys on. Narrow: one-byte body edit ⇒ exactly that
+atom's content id changes, others stable.
**M1 — `LinkSession` + append-only extend (Stage A).**
-Introduce `LinkSession`, implement `link_resolve_extend` for append-only against
-a file image, reusing JIT cursor/slack *placement* but **falling back, not
-panicking**. Persisted blob round-trips (§10).
+Introduce `LinkSession`; implement the append-only subset of
+`link_resolve_extend` against a file image, reusing JIT cursor/slack *placement*
+but **falling back, not panicking**; persisted-blob round-trip (§10). → **turns
+Scenario F (no-op relink) green**. Narrow: appended object whose code calls an
+initial function links; appended duplicate-strong-def falls back (not panic);
+unresolved ref is transactional (image unchanged).
**M2 — patch changed atoms in slack (Stage B, no move yet).**
-Per-atom diff, overwrite-in-slack, reapply that atom's relocs (§9), per-segment
-build-id (§11), the soundness gate + transactional rollback (§7.3, §17). Atoms
-that would grow past capacity ⇒ fall back (no move primitive yet).
+Per-atom diff, overwrite-in-slack, reapply the changed atom's relocs (§9),
+per-segment build-id (§11), regenerate the changed TU's debug (§13), the
+soundness gate + transactional rollback (§7.3, §17). Atoms that would grow past
+capacity fall back here (no move primitive yet). → **turns Scenarios A (in-slack
+edit), C (fallback), and D (multi-output) green.**
**M3 — move-on-grow via thunk (`LinkMoveOps` = thunk).**
-Free-list, grow-relocate code atoms, jump islands (reuse jit-stub shape), data
-slack + fall-back-on-data-grow. ELF/aa64 then ELF/x64.
+Free-list, relocate grown code atoms, jump islands (reuse the
+`link_layout_jit_stubs` shape, `link_reloc_layout.c:429`), separate code/data
+slack with data-grow → fallback. → **turns Scenario B (grow past slack) green.**
+At the end of M3 the full §19 suite is green on **ELF/aa64 + ELF/x64 — this is
+"done for ELF."**
-**M4 — converge on GOT-cell (`LinkMoveOps` = got), if/when hot reload needs it.**
-`--incremental` codegen mode for cross-unit calls + movable data, reserved GOT
-slack + free-list. Shares the primitive with `doc/HOT_RELOAD.md`.
+**M4 (deferred) — converge on GOT-cell (`LinkMoveOps` = got).**
+Built when hot reload is scheduled (decision §20.3), designed to serve both
+paths: a `--incremental` codegen mode for cross-unit calls + movable data,
+reserved GOT slack + free-list. Not required for the §19 suite.
-Mach-O/COFF and rv64 patching follow M3/M4 per §14.
+Then COFF, then Mach-O updaters (§14); the rv64 patch path follows aa64/x64.
---
-## 19. Test plan (narrow, per-arch, red-green)
-
-Prefer targeted runs; redirect output to a file and read it (project rules).
-
-- **M0:** compile `tmp/projects/lua/src/ltable.c` twice ⇒ identical
- `obj_content_id` (determinism). Edit one function body ⇒ exactly that atom's
- content id changes, others stable. aa64 + x64 only.
-- **M1:** initial object + appended object where appended code calls an initial
- function; appended duplicate-strong-def ⇒ fall back (not panic); unresolved
- ⇒ transactional, image unchanged.
-- **M2:** build `liblua.a` + `lua`; patch one in-slack function body ⇒ unchanged
- symbols keep vaddrs (`nm` diff), binary runs (`test/lib` `exec_target`), and
- `link_resolve` whole-program path was *not* taken (instrument a counter, dump
- to file). Negative: add a new global ⇒ fall back; weak↔strong flip ⇒ fall back;
- new archive pull-in ⇒ fall back.
-- **M3:** grow a function past its slack ⇒ it relocates, an island appears at the
- old slot, callers' bytes are byte-identical, result runs. Grow a *global* past
- data slack ⇒ fall back. `addr2line` an unchanged function after a patch ⇒
- correct file:line.
-- **Multi-output:** edit a core TU shared by `lua` and `luac` ⇒ both images
- patch (or both fall back) consistently.
+## 19. ELF definition of done: outcome & acceptance suite
+
+This is the executable specification authored **first** (Phase 0, §18) and the
+north star that M0–M3 drive to green. It covers ELF/aa64 + ELF/x64 only (per the
+"narrow runs" rule) and lives in `test/link-incremental/`.
+
+### 19.1 Outcome — what a fully-built ELF implementation produces
+
+Under `--incremental` (`-O0`/`-O1`), the build system seeds the session with prior
+state, replaces the changed input(s) (`cfree_link_session_replace_input`, or
+re-adds inputs on the cold path), then calls the same `resolve` → `emit` +
+`serialize_state` (§16):
+
+- **In-slack body edit:** the changed atom's bytes overwrite in place, only its
+ relocs re-derive/reapply, the changed TU's `.debug_*` regenerates, only the
+ changed segment's build-id subhash recomputes. Outcome `CFREE_LINK_PATCHED`;
+ every other atom and every unchanged vaddr is byte-identical.
+- **Grow past slack:** the atom relocates to a free-list slot, a jump island is
+ left at its old address, **callers' bytes do not change**; only the moved
+ atom's relocs re-derive. Still `PATCHED`, still `O(change)`.
+- **Non-local edit:** added/removed global, weak↔strong flip, new archive
+ pull-in, COMDAT-ownership flip, TLS/import-size change, or slack exhaustion
+ (incl. any data-atom grow) ⇒ `CFREE_LINK_FELL_BACK_FULL`: a correct full
+ in-process link. **Never a wrong binary.**
+
+Guarantees: address stability (unchanged symbols keep their vaddr), debug
+correctness after a patch (`addr2line`/`cfree dbg`), byte-deterministic objects
+(release builds with `--incremental` off are the reproducible artifact), and
+consistent multi-output (a core-TU edit patches or falls back for both apps).
+Cost: a one-function in-slack edit takes the link from `O(all objects)` to
+`O(one atom + its relocs + one TU's debug + one segment rehash)`. For the
+synthetic fixture (§19.2), that is the difference between "relink both
+executables over the whole archive" and "patch one atom."
+
+### 19.2 Harness & instrumentation
+
+- Run the patched binary via `test/lib` `exec_target`/`exec_kernel`.
+- Instrument the whole-program `link_resolve` entry with a counter dumped to a
+ file (read it back; don't re-run). A `PATCHED` outcome must **not** increment
+ it; a fallback must.
+- **Fixture — a synthetic multi-TU codebase** under
+ `test/link-incremental/fixture/` (hand-written, deterministic, no third-party
+ deps; freestanding-style apps that return a *computed* status, like the smoke
+ harness, so the only archive in play is the fixture's own). Five core TUs
+ archived into `libcore.a`, linked into two executables that share it:
+ - `arith.c` — leaf callees `arith_add` / `arith_mul`.
+ - `table.c` — `tbl_get` (the **in-slack** edit target) and `tbl_sum` (the
+ **grow** target); both call `arith_*` cross-TU (so callers live in other TUs).
+ - `state.c` — data globals `g_config[8]`, `g_name[]` (the **data** atoms for
+ data-slack / data-grow).
+ - `weakdef.c` — `feature_level()` defined **weak** (the **weak→strong** target).
+ - `optional.c` — `opt_helper()`, an archive member **referenced by no one
+ initially** (so it isn't pulled — the **archive-pull-in** target).
+ - `app_a.c`, `app_b.c` — two `main`s with overlapping core use, each linking
+ `app_*.o` + `libcore.a` → executables `app_a`, `app_b` (the multi-output
+ shape; `table.c` is shared by both).
+- Build the fixture once with `--incremental`; capture each app's baseline `nm`
+ vaddr map, the persisted-state blob, and each app's expected computed status
+ (run via `test/lib` `exec_target`).
+- Prefer targeted runs; redirect output to a file (project rules).
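
The counter rule above could be checked with something like the following sketch. The names (`HarnessOutcome`, `counter_consistent`) are hypothetical, and treating a cold `FULL` link as one that also increments the counter is an assumption; the text only specifies the `PATCHED` and fallback cases.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef enum { H_FULL, H_PATCHED, H_FELL_BACK_FULL } HarnessOutcome;

/* `before` and `after` are the whole-program link_resolve counter values
   read back from the dump file around a single relink. */
static bool counter_consistent(HarnessOutcome o, uint64_t before, uint64_t after) {
    assert(after >= before);          /* the counter only ever grows */
    bool ran_full = after > before;
    /* PATCHED must not have run the whole-program path; anything else must. */
    return (o == H_PATCHED) ? !ran_full : ran_full;
}
```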
+
+### 19.3 Acceptance scenarios (each assertion falsifies one claim)
+
+| Scenario | Action | Assertions | Green at |
+|---|---|---|---|
+| **A — in-slack edit** | edit `tbl_get`'s body within its slack | `outcome==PATCHED`; whole-program `link_resolve` counter **did not** increment; `nm` diff: **every** symbol keeps its vaddr; `app_b` (a cross-TU caller) runs and its status reflects the edit; `addr2line` correct for an unchanged *and* the edited function; only the changed segment's bytes differ; build-id changed | M2 |
+| **B — grow past slack** | grow `tbl_sum` ~10 KB past its slot | `PATCHED`; `tbl_sum` moved to a new vaddr; jump island at its **old** vaddr; **`app_a`'s caller bytes byte-identical** to baseline; both apps run | M3 |
+| **C — soundness gate** | c1 add `int g_extra;` to `state.c` · c2 flip `feature_level` weak→strong · c3 edit `table.c` to call `opt_helper` (pulls `optional.c`) · c4 grow `g_config` past `data_slack` | each ⇒ `FELL_BACK_FULL`; binary matches a from-scratch full link | M2 |
+| **D — multi-output** | edit `table.c` (shared by both `app_a` and `app_b`) | both images patch or fall back **consistently**; both run | M2 |
+| **E — determinism** | compile `table.c` twice | identical `obj_content_id` **and** identical bytes; one-byte edit ⇒ only that atom's id changes | M0 |
+| **F — no-op relink** | `replace_input` with a byte-identical object | no atoms diff ⇒ image unchanged, near-zero link work | M1 |
+
+### 19.4 The two gates that define correctness
+
+**Scenario A's vaddr-stability assertion** (a patch must move nothing it
+shouldn't) and **Scenario C** (non-local edits must fall back, never silently
+mislink) are the two ways this feature can be wrong. Both must be green before
+"done for ELF" is claimed.
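
A minimal sketch of the vaddr-stability gate, assuming the harness has already parsed baseline and patched `nm` output into name/address pairs. The `Sym` type and `count_unstable` are hypothetical names, and symbol names are assumed unique.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical parsed `nm` row. */
typedef struct { const char *name; uint64_t vaddr; } Sym;

/* Count baseline symbols that are missing from, or have a different vaddr
   in, the patched image. Scenario A expects this to be zero. */
static size_t count_unstable(const Sym *base, size_t nbase,
                             const Sym *patched, size_t npatched) {
    assert(base != NULL && patched != NULL);
    size_t unstable = 0;
    for (size_t i = 0; i < nbase; i++) {
        const Sym *hit = NULL;
        for (size_t j = 0; j < npatched; j++) {
            if (strcmp(base[i].name, patched[j].name) == 0) {
                hit = &patched[j];
                break;
            }
        }
        if (hit == NULL || hit->vaddr != base[i].vaddr) unstable++;
    }
    return unstable;
}
```

For Scenario B the same check would be expected to report exactly the relocated atom, so a harness could compare the result against the set of symbols it intended to move.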
---
-## 20. Open questions / decisions
-
-1. **DWARF on in-slack overwrite:** accept that an overwrite that does not move
- the atom leaves `.debug_line` byte-identical (free) even if source *line
- numbers* shifted within the body — or always re-emit the atom's `.debug_line`
- on any body change (correct, slightly slower)? (§13)
-2. **Data movement under thunk mode:** v1 forbids moving data (slack + fall
- back). Is the slack budget for data atoms tunable per project, or fixed?
-3. **GOT convergence trigger:** build M4 only when hot reload needs the shared
- cell, or proactively to unify the two paths sooner? (§8.2)
-4. **Determinism guarantee strength:** require byte-stable objects (enables
- cross-machine dedup) or only content-keyed correctness (§12)?
-5. **Persisted-blob lifetime/keying:** the link action id is a build-system
- concern (§16) — confirm the boundary: does the build system own the CAS key,
- or does the link session?
-6. **Mach-O/COFF scope:** confirm ELF-only for v1 (§14); enumerate Mach-O
- whole-image structures before committing to patch them.
+## 20. Decisions (resolved)
+
+All six original open questions were investigated against the code and resolved
+with the user. Recorded here as the binding design choices; the relevant sections
+above were updated to match.
+
+1. **DWARF (§13) — regenerate the changed TU's debug on any body change.** The
+ "keep stale `.debug_line`" option is *incoherent* (a body change rewrites the
+ line mapping regardless of move) and debug is per-CU monolithic, so update is
+ `O(changed TU)`, not `O(atom)`. Per-function CUs for `O(atom)` debug are a
+ noted future option, not pursued now.
+2. **Slack (§6, §8) — separate, tunable code/data budgets.** `code_slack_pct`
+ modest (code relocates cheaply via the move primitive); `data_slack_pct`
+ generous (a data-grow forces a full-link fallback since data can't be
+ thunked). Both default sensibly, overridable via a link option.
+3. **GOT timing (§3, §8.2, §18) — defer; thunk-first.** Ship thunk-on-grow for
+ file-incremental now (zero codegen change). Build the shared indirection-cell
+ primitive when hot reload is scheduled, designing it then to serve both.
+ Rationale: hot reload needs its *own* per-function slots regardless
+ (`doc/HOT_RELOAD.md:34-48,144-157`), thunk-on-grow doesn't advance it, and
+ neither is implemented — so unifying now is speculative.
+4. **Determinism (§12) — lock with a test, keep content-keying.** Objects are
+ already byte-deterministic (`src/obj/elf/emit.c:298,386,505`); a regression
+ test locks it (enabling cross-machine dedup), while content/name keying stays
+ the correctness backbone so any future drift degrades dedup, not correctness.
+5. **Persisted-state keying (§16) — build system owns key/storage/lifetime; the
+ session emits opaque bytes** (+ exposes the blob's content id). Keeps libcfree
+ IO/CAS-free, matching the driver-only CAS boundary.
+6. **Format scope (§14) — ELF first; COFF then Mach-O as follow-on milestones.**
+ None of the three is fundamentally unsuitable; the difference is how much
+ format-specific machinery each needs. Formats without an updater yet fall back
+ to the fast in-process full link.