kit

git clone https://git.ryansepassi.com/git/kit.git

Linker (planned work)

This roadmap covers where the kit linker is headed beyond the static and JIT linking it does today. It is dominated by incremental linking: two related but distinct workstreams — append-only growth of a live JIT image (the kit dbg / kit emu consumer) and file-based incremental object linking (the build-system consumer, the "m2" redesign). Both rest on the same linker invariants — address stability, durable non-destructive relocation records, content-keyed reuse — and both fall back to a correct full link whenever a change cannot be proven local. For the linker's current architecture, passes, and invariants see ../LINK.md; for how a resolved image runs in process see ../JIT.md; for the object substrate see ../OBJ.md; for the build-system layer that consumes the file-incremental interface see ../BUILD.md and the distribution CAS in ../DISTRIBUTE.md.

Why incremental, and the shared invariants

The full link is always available and always correct. Incremental linking is an accelerator gated on a soundness check: a correct-but-slow result always beats a fast-but-wrong one. Three invariants hold across both workstreams and must never be violated by any incremental path: address stability, durable non-destructive relocation records, and content-keyed reuse.

Workstream 1 — append-only incremental JIT link

Grow one live KitJit image with additional compiled objects while keeping every previously published runtime address stable. New code may reference old symbols; old debugger surfaces (kit_jit_lookup, kit_jit_addr_to_sym, symbol iteration, breakpoints, PC translation, the JIT debug view) must see new symbols. This is explicitly not hot reload: existing code is never replaced or repatched (see ../DBG.md for the debugger and the separate hot-reload design).

Done (baseline)

The in-process append path is implemented and is the foundation the rest builds on: the kit_jit_publish surface (an append/replace batch driven by a KitLinkSession, reporting a bumped generation); append cursors with reserved RX/R/RW/TLS slack over one contiguous master VA reservation committed page-by-page; transactional rollback of cursors and symbol/section/reloc counts on failure; generation-bumped invalidation of the cached kit_jit_view; symbol resolution against the existing image, the append batch, and the external resolver with duplicate-strong detection; and a dbg REPL that drives compile → append → DWARF refresh with the worker stopped so line-table replacement never races a running thread.
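The cursor-plus-rollback behaviour described above can be sketched in a few lines. This is a toy model, not kit's implementation: the names Cursor, Image, and publish are hypothetical, page-by-page commit is not modelled, and the external resolver is left out. It shows only that a failed batch restores cursors and symbols and leaves the generation unchanged.

```python
from dataclasses import dataclass, field


@dataclass
class Cursor:
    """Append cursor over one reserved region (hypothetical stand-in for RX/R/RW/TLS)."""
    base: int
    capacity: int  # reserved bytes, including slack
    used: int = 0

    def alloc(self, size, align):
        start = (self.used + align - 1) // align * align
        if start + size > self.capacity:
            raise MemoryError("reserved slack exhausted")
        self.used = start + size
        return self.base + start


@dataclass
class Image:
    cursors: dict
    symbols: dict = field(default_factory=dict)
    generation: int = 0

    def publish(self, batch):
        """Append (name, region, size, align) entries; all of them or none."""
        saved_used = {k: c.used for k, c in self.cursors.items()}
        saved_symbols = dict(self.symbols)
        try:
            for name, region, size, align in batch:
                if name in self.symbols:
                    raise ValueError(f"duplicate strong symbol: {name}")
                self.symbols[name] = self.cursors[region].alloc(size, align)
        except (MemoryError, ValueError):
            # Transactional rollback: restore cursors and the symbol table.
            for k, used in saved_used.items():
                self.cursors[k].used = used
            self.symbols = saved_symbols
            raise
        self.generation += 1  # cached views keyed on the old generation are now stale
        return self.generation
```

Addresses handed out by earlier batches are never touched, which is the address-stability property the append path depends on.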

Remaining

Workstream 2 — file-based incremental object link (the "m2" redesign)

The goal is "instant" relinks for dev builds: after editing one translation unit in a project of N TUs, the link cost should be O(changed atoms + their relocations), not O(whole program). Compile cost (caching, dependency scanning, the build graph and watch/daemon modes) is the build system's problem and is out of scope here; this workstream is the obj/link substrate that layer stands on. Incremental link is a -O0/-O1 dev feature; release builds (--incremental off) always do a clean full link and remain the canonical reproducible artifact.

Done (baseline) — "Done for ELF"

The first cut landed on ELF as the reference format, with the acceptance suite (test/link-incremental/) green on ELF/aa64 + ELF/x64: atom content identity, per-atom reloc/symbol indices, the LinkSession with per-segment cursors/slack/free-list, append-only extend, patch-in-slack, the soundness gate with transactional rollback, per-segment build-id, per-changed-TU debug regen, and move-on-grow via thunk. This is the starting point; the rest of this section is what is not yet built.

The m2 redesign — design intent

The redesign's central decision: incrementality is not a parallel API — it is the existing link session made fully mutable. A full link is the degenerate cold case (no prior state, nothing replaced); an incremental relink seeds prior state and replaces the changed inputs. The build system always drives the same session, and resolve internally decides patch-vs-full and reports which via an outcome enum (FULL / PATCHED / FELL_BACK_FULL). There is no separate "incremental" entry point that could drift from the full-link path. This directly matches the internal direction where link_resolve is "inputs → image" and the link_resolve_at / link_resolve_extend surface makes that resolve extend-capable.
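A minimal sketch of the "one session, three outcomes" shape, under stated assumptions: SessionSketch, add_input, and the is_local callback are hypothetical names, and the callback stands in for the soundness gate. Only the outcome names FULL / PATCHED / FELL_BACK_FULL come from the design.

```python
from enum import Enum


class Outcome(Enum):
    FULL = "FULL"
    PATCHED = "PATCHED"
    FELL_BACK_FULL = "FELL_BACK_FULL"


class SessionSketch:
    def __init__(self, prior=None):
        # prior: input name -> content id from an earlier link, or None when cold
        self.prior = prior
        self.inputs = {}

    def add_input(self, name, content_id):
        self.inputs[name] = content_id

    def resolve(self, is_local):
        """One entry point; the session decides patch-vs-full and reports which."""
        if self.prior is None:
            outcome = Outcome.FULL  # cold case: no prior state, nothing replaced
        else:
            names = set(self.prior) | set(self.inputs)
            changed = sorted(n for n in names
                             if self.prior.get(n) != self.inputs.get(n))
            outcome = Outcome.PATCHED if is_local(changed) else Outcome.FELL_BACK_FULL
        self.prior = dict(self.inputs)  # this link seeds the next one
        return outcome
```

Because the cold link and the incremental relink go through the same resolve call, the caller never has to choose between two entry points.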

The atom is the patch unit: one function or one data object. Under --incremental, frontends emit one section per function/global (a -ffunction-sections/-fdata-sections equivalent) so each atom is independently placeable; kit already lays out kept atoms as individual LinkSections. Each atom gets a BLAKE2b content id over its canonical form (bytes || align || flags || canonical(relocs)); this id is the diff key.
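As an illustration of the content id, here is a sketch using Python's hashlib.blake2b. The exact canonical encoding is an assumption: the field widths, the length prefixes, and the sort order for relocations are chosen here only so that the id is independent of relocation order and unambiguous across field boundaries. kit's real encoding may differ.

```python
import hashlib
import struct


def canonical_relocs(relocs):
    """relocs: iterable of (offset, kind, symbol_name, addend) tuples.

    Assumed canonical form: sorted, symbolic (by name, not by address).
    """
    out = bytearray()
    for offset, kind, symbol, addend in sorted(relocs):
        name = symbol.encode("utf-8")
        out += struct.pack("<QIqI", offset, kind, addend, len(name)) + name
    return bytes(out)


def atom_content_id(data, align, flags, relocs):
    """BLAKE2b over bytes || align || flags || canonical(relocs)."""
    h = hashlib.blake2b(digest_size=32)
    h.update(struct.pack("<Q", len(data)))  # length prefix keeps fields unambiguous
    h.update(data)
    h.update(struct.pack("<II", align, flags))
    h.update(canonical_relocs(relocs))
    return h.hexdigest()
```

Two atoms with equal ids can be treated as unchanged for diffing; any change to the bytes, alignment, flags, or relocations yields a different id.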

The soundness gate

Reuse is correct only when the change cannot alter symbol resolution. The edit is local only when the changed object's interface (defined global names + bindings, COMMON sizes/aligns, set of undefs) is unchanged and no archive pull-in changes; anything that can shift layout or resolution — symbol-set/binding flips, new archive members, COMDAT/COMMON merge changes, TLS-size shifts, import-set changes, slack/free-list exhaustion (data is never thunked), or layout-affecting flags — forces a fall-back. On fall-back the half-mutated session is discarded via the LinkPatchTxn watermark and a correct full link runs; the JIT append path's duplicate-global preflight is the precedent, but it panics, so converting "detect non-local" into "roll back + full link" is the new control flow at the heart of the redesign. See ../LINK.md for the full trigger set and rollback mechanics.
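The interface comparison at the core of the gate can be sketched as follows. This covers only the triggers that are visible from an object's interface and the archive member set; TLS-size shifts, import-set changes, slack exhaustion, and layout-affecting flags are omitted. Interface and non_local_reasons are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interface:
    defined: frozenset  # (name, binding) for each defined global
    commons: frozenset  # (name, size, align) for each COMMON symbol
    undefs: frozenset   # names of undefined references


def non_local_reasons(old, new, members_before, members_after):
    """Return why an edit is non-local; an empty list means it may be patched."""
    reasons = []
    if old.defined != new.defined:
        reasons.append("defined globals or bindings changed")
    if old.commons != new.commons:
        reasons.append("COMMON sizes or alignments changed")
    if old.undefs != new.undefs:
        reasons.append("undefined-symbol set changed")
    if members_before != members_after:
        reasons.append("archive pull-in changed")
    return reasons
```

Any non-empty result would trigger the roll-back-and-full-link path rather than a panic.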

The move-on-grow primitive (swappable)

When an atom outgrows its slot it must move, and callers must still reach it without their bytes changing. This is abstracted behind a single LinkMoveOps.atom_moved hook with two implementations; the rest of the design (atoms, slack, free-list, persisted session, the gate) is identical either way.
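A sketch of patch-in-slack versus move-on-grow, assuming a single append cursor and no free-list. Only the atom_moved hook name comes from the design; Segment and JumpIslandOps are hypothetical, and the hook shown here just records a redirect from the old address to the new one instead of emitting real branch code.

```python
class JumpIslandOps:
    """Hypothetical hook implementation: remember old address -> new address."""

    def __init__(self):
        self.islands = {}

    def atom_moved(self, name, old_vaddr, new_vaddr):
        self.islands[old_vaddr] = new_vaddr


class Segment:
    def __init__(self, base, size, move_ops):
        self.cursor = base
        self.end = base + size
        self.places = {}  # atom name -> (vaddr, capacity)
        self.move_ops = move_ops

    def _append(self, capacity):
        if self.cursor + capacity > self.end:
            raise MemoryError("segment slack exhausted; fall back to a full link")
        vaddr = self.cursor
        self.cursor += capacity
        return vaddr

    def place(self, name, size, slack):
        capacity = size + slack
        self.places[name] = (self._append(capacity), capacity)

    def update(self, name, new_size, slack):
        vaddr, capacity = self.places[name]
        if new_size <= capacity:
            return vaddr  # patch in slack: the address does not change
        new_vaddr = self._append(new_size + slack)
        self.places[name] = (new_vaddr, new_size + slack)
        self.move_ops.atom_moved(name, vaddr, new_vaddr)  # callers stay untouched
        return new_vaddr
```

Swapping the hook object changes how callers reach a moved atom without touching placement, which is the point of keeping the primitive behind one interface.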

Persisted incremental state

Side-band and content-addressed — not ELF-embedded incremental sections, because kit is multi-format. Store one blob in the existing driver/dist BLAKE2b CAS, recording per input and per atom: object + atom content ids, the LinkAtomPlace table (vaddr / file_offset / size / capacity / bucket), symbol→vaddr bindings keyed by name, relocations in relative+symbolic form, and free-list + per-segment cursor state. The session reads/writes it as opaque bytes through KitWriter; the build system owns the key, CAS storage, and lifetime — libkit stays IO/CAS-free.
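To make the shape of the side-band blob concrete, here is a sketch that serializes a state record deterministically and derives a CAS key from it. The JSON encoding, the field names, and the sample values are assumptions made for illustration; the design only requires that the session treat the blob as opaque bytes and that the build system own the key and storage.

```python
import hashlib
import json


def encode_state(state):
    """Deterministic bytes: sorted keys, no insignificant whitespace."""
    return json.dumps(state, sort_keys=True, separators=(",", ":")).encode("utf-8")


def decode_state(blob):
    return json.loads(blob.decode("utf-8"))


def cas_key(blob):
    """Content address of the blob (computed by the build system, not the linker)."""
    return hashlib.blake2b(blob, digest_size=32).hexdigest()


# Hypothetical sample record; ids are shortened placeholders.
state = {
    "inputs": {"a.o": {"object_id": "ab12", "atoms": {"f": "cd34"}}},
    "places": {"f": {"vaddr": 4096, "file_offset": 4096, "size": 16,
                     "capacity": 32, "bucket": 0}},
    "bindings": {"f": 4096},
    "relocs": [{"atom": "f", "offset": 4, "kind": 2, "symbol": "puts", "addend": -4}],
    "cursors": {"text": 4128},
    "free_list": [],
}
blob = encode_state(state)
key = cas_key(blob)
```

Storing relocations by symbol name rather than by resolved address is what lets a later session re-apply them after an atom moves.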

Remaining work

Frontend contract and debug-info consistency

All frontends converge to ObjBuilder and join the shared path at obj_finalize, so the machinery attaches once, frontend-agnostically — Toy, asm, and WASM get incremental link with no frontend-specific code. To be incrementally safe a frontend must produce deterministic output for identical (source, flags, target, deps), declare its external dependency set (C reuses KitDepIter; single-source frontends report none), use stable source-derived symbol names, and expose a frontend_id + schema_version that salts the build-system key. Toy's durable-module REPL path is not a pure function of source, so it folds the module snapshot into the input key or opts out of caching; Toy's batch/file compile conforms like any other frontend.
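One way a build system might derive an input key that honours this contract is sketched below. The function name, the field order, and the length-prefixed encoding are all assumptions; the only points taken from the text are that the key covers (source, flags, target, deps), is salted by frontend_id and schema_version, and can fold in a module snapshot for inputs that are not a pure function of source.

```python
import hashlib


def _field(h, data):
    # Length-prefix each field so adjacent fields cannot be confused.
    h.update(len(data).to_bytes(8, "little"))
    h.update(data)


def input_key(frontend_id, schema_version, source, flags, target, dep_ids,
              snapshot=b""):
    """Hypothetical build-system key for one compiled input."""
    h = hashlib.blake2b(digest_size=32)
    _field(h, frontend_id.encode("utf-8"))
    _field(h, str(schema_version).encode("ascii"))
    _field(h, source)
    _field(h, "\0".join(flags).encode("utf-8"))            # flag order is preserved
    _field(h, target.encode("utf-8"))
    _field(h, "\0".join(sorted(dep_ids)).encode("utf-8"))  # dep order is irrelevant
    _field(h, snapshot)                                    # empty for batch compiles
    return h.hexdigest()
```

Bumping schema_version invalidates every cached object from that frontend at once, without touching other frontends' entries.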

On debug info: on any changed atom, re-emit that TU's full .debug_*. kit emits one monolithic .debug_line program and one .debug_info CU with intra-CU DW_FORM_ref4 offsets, so a function's rows cannot be spliced in isolation; and a body change rewrites the instruction→line mapping even when the atom did not move, so "keep stale .debug_line" is incoherent. Per-TU regen is O(changed TU), cheap relative to the rest of the patch, and unchanged TUs' debug stays byte-stable because their atoms keep their addresses. Per-function CUs for O(atom) debug are a future option, not pursued now. See ../DWARF.md.
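The per-TU regeneration rule reduces to a small set computation, sketched here with hypothetical names: given each atom's owning TU and the atom content ids before and after an edit, the TUs whose debug sections must be re-emitted are exactly those owning an added, removed, or changed atom.

```python
def tus_needing_debug_regen(atom_to_tu, old_ids, new_ids):
    """Return the sorted TUs that own at least one added, removed, or changed atom."""
    names = set(old_ids) | set(new_ids)
    changed = {n for n in names if old_ids.get(n) != new_ids.get(n)}
    return sorted({atom_to_tu[n] for n in changed})
```

Every other TU keeps its existing debug bytes, which is safe only because its atoms keep their addresses.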

Acceptance: definition of done per format

The executable spec lives in test/link-incremental/, authored test-first (red → green). Its synthetic multi-TU fixture (core TUs archived into a static library linked into two executables that share it; no third-party deps) covers an in-slack body edit (PATCHED, every vaddr stable, whole-program link_resolve counter does not increment), a grow-past-slack edit (PATCHED, atom moves, jump island at the old address, caller bytes byte-identical), the soundness gate (each non-local edit ⇒ FELL_BACK_FULL matching a from-scratch link), multi-output consistency, determinism, and a no-op relink. The two gates that define correctness are vaddr-stability on a patch and fall-back on a non-local edit; both must be green before a format is "done." ELF/aa64 + ELF/x64 are done; COFF, Mach-O, and the rv64 patch path each repeat this bar. See ../TESTING.md.
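The vaddr-stability gate can be expressed as a small checker, sketched here outside any real test harness. patch_violations and its arguments are hypothetical: the address maps stand in for the session's placement table, and the caller byte strings stand in for the bytes of functions that reference a moved atom.

```python
def patch_violations(before, after, allowed_moves, callers_before, callers_after):
    """Compare two links of the same program; return a list of problems.

    before/after: atom name -> vaddr. allowed_moves: atoms expected to relocate
    (empty for an in-slack edit, the grown atom for a grow-past-slack edit).
    """
    problems = []
    for name, vaddr in sorted(before.items()):
        if name in after and after[name] != vaddr and name not in allowed_moves:
            problems.append(f"{name} moved from {vaddr:#x} to {after[name]:#x}")
    if callers_before != callers_after:
        problems.append("caller bytes changed")
    return problems
```

An in-slack edit should produce no violations with an empty allowed_moves set; a grow-past-slack edit should produce none once the grown atom is listed.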