kit

git clone https://git.ryansepassi.com/git/kit.git

kit Interfaces

Modularity and clean interfaces are a top project priority. This document is the operational companion to DESIGN.md: where DESIGN.md tells the layering narrative, this doc is the interface inventory and the interface-review checklist. It catalogues every boundary worth reviewing — the public API, the backend/codegen contracts, the internal subsystem seams, the core utilities, and the frontend-to-library edge — names the responsibility each one carries, and gives a checklist to apply when adding to or changing one.

The aim is to make the boundaries legible so they stay clean: an interface that nobody can locate is an interface nobody will defend.


Boundary map

From the outside in (see DESIGN.md for the full narrative):

driver/                 CLI policy + host I/O. Built with -Iinclude only:
  │                     the public surface is all it can reach. No -Isrc.
  └─ lang/              Frontends (c, cpp, toy, wasm). Compiled INTO libkit.
       │                Public-API consumers: c/cpp/toy build with NO -Isrc and
       │                reach nothing internal. Only wasm builds with -Isrc, to
       │                reach the shared wasm module model (src/wasm/wasm.h).
       └─ include/kit/   PUBLIC BOUNDARY. The library's entire external contract.
            └─ src/api/     Composition layer: public handles <-> internal subsystems.
                 └─ src/    Internal subsystems. Share private headers; expose
                            nothing outward except through include/kit/.

The boundary that the build enforces: the hard layering line is the driver, not lang/. The driver is compiled with -Iinclude (plus -Ilang solely to reach a frontend's public c/c.h for the JIT REPL) and deliberately without -Isrc, so internal headers (core/..., link/..., cg/...) are physically unreachable from it. That makes the driver the first true consumer of libkit's public API and the thing that proves the public surface is sufficient.

Frontends in lang/ are a softer boundary, but in practice they hold the public line. The C, cpp, and toy frontends build with no -Isrc at all, so internal headers are physically unreachable from them — they are pure include/kit/ consumers, exactly like the driver. (The C frontend's ABI lowering lives in its own lang/c/abi/ module built on the public CG type API and kit_compiler_target; despite the TargetABI-flavored history it does not touch src/abi/.)

The lone exception is the wasm frontend, which builds with -Isrc to reach src/wasm/wasm.h. That is not a frontend punching above the public API: the wasm module model is a Tier-3 shared internal subsystem with three peer consumers — the frontend (lang/wasm), the wasm object format (src/obj/wasm), and the wasm codegen backend (src/arch/wasm) — and the frontend reaches it through its single boundary header, the same way the other two do. It stays internal rather than public because the module IR is large and unstable; the public wasm.h is a deliberately different, narrow surface (host-import binding). Everything else the wasm frontend touches is public — e.g. lang/wasm/cg.c allocates through the public <kit/support/arena.h> mirror, not src/core/arena.h. Any new frontend reach into src/ beyond this one edge is a signal the public API is missing something — add it to include/kit/, don't grow the edge.

Invariants (keep them true):


Tier 1 — Public API (include/kit/)

The library's entire stable contract: nineteen headers in include/kit/ plus two in include/kit/support/. No umbrella header — each consumer includes what it uses.

| Header | Purpose | Key opaque type(s) | Primary consumer |
|---|---|---|---|
| core.h | Foundational substrate: compiler lifecycle, target triple, slices, status codes, host vtables (KitHeap/KitWriter/KitDiagSink/KitContext), symbol interning. | KitCompiler | everyone |
| config.h | Build-time component enable flags (arch / obj-format / language / subsystem / tool). Preprocessor-only. | | build |
| compile.h | High-level source->object compilation; frontend registration vtable; dep iteration. | KitCompileSession, KitDepIter | driver, frontends |
| cg.h | Code-generation API (the largest contract): a stack-machine typed IR over KitCg. Types/ABI, functions, control flow, memory, arithmetic, calls, intrinsics, inline+file asm, static data. | KitCg | frontends |
| frontend.h | Frontend front door: re-exports cg.h / compile.h / source.h / support/arena.h, plus the panic boundary (kit_frontend_run), metrics scopes, and fatal helpers. Including this alone is enough to write and register a frontend. | | frontends |
| source.h | Source registry: stable file IDs + include-edge recording. | | frontends |
| preprocess.h | Standalone C preprocessor entry. | | driver |
| object.h | Format-neutral object model: builder + read-only inspection; section/symbol/reloc enums. | KitObjBuilder, KitObjFile | cg, link, jit, disasm, dwarf |
| link.h | Linker: byte/object/archive/DSO inputs, linker-script model, emit or JIT. | KitLinkSession, KitLinkScript | driver, jit |
| jit.h | JIT image: mapped pages, symbol resolution, publish/append/replace, object view. | KitJit | runtime, dbg |
| interp.h | Threaded-bytecode interpreter over the optimizer IR; host-identity and emu/guest configurations. | KitInterpProgram | run --no-jit, emu |
| dbg.h | In-process JIT execution control: breakpoints, stepping, regs/mem, signal host. | KitJitSession | debuggers |
| dwarf.h | DWARF5 consumer: PC<->line, type/var/subprogram queries, structural iterators. | KitDebugInfo, KitDwarfType | debuggers, dumpers |
| disasm.h | Disassembly of byte ranges and objects, with symbol/reloc annotation. | KitDisasmIter | objdump, dbg |
| emu.h | User-mode guest-ELF emulator (per-block JIT). | KitEmu | emu tool |
| arch.h | Arch-agnostic register / unwind-frame metadata helpers. | | dbg, dwarf, disasm |
| archive.h | POSIX ar reader/writer + symbol index. | KitArIter | ar/ranlib |
| asm_emit.h | Emit assembled object bytes as GAS text. | | objdump |
| wasm.h | WebAssembly host-import resolver/binder. | KitWasmInstance | wasm runners |
| support/arena.h | Public bump allocator (narrowed mirror of src/core/arena.h). | KitArena | frontends |
| support/hashmap.h | Header-only KIT_HASHMAP_DEFINE template + hash fns. | none (macro) | frontends |
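
The table pairs most headers with an opaque handle, and core.h with host vtables. A minimal sketch of that shape follows. Every name in it (DemoWriter, DemoSession, demo_session_*) is hypothetical and is not kit's API, and it calls malloc directly for brevity where the real library would be expected to go through the host-supplied KitHeap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* ---- what a public header would expose (all names hypothetical) ---- */

/* Host-supplied output sink: the library calls back, it never opens files. */
typedef struct DemoWriter {
    void *user;                                          /* host state */
    void (*write)(void *user, const char *p, size_t n);  /* host callback */
} DemoWriter;

/* Opaque handle: consumers see the name only, never the fields. */
typedef struct DemoSession DemoSession;

DemoSession *demo_session_new(DemoWriter out);
void         demo_session_emit(DemoSession *s, const char *text);
size_t       demo_session_bytes(const DemoSession *s);
void         demo_session_free(DemoSession *s);

/* ---- what stays private to the library ---- */

struct DemoSession {
    DemoWriter out;
    size_t     bytes_written;
};

DemoSession *demo_session_new(DemoWriter out) {
    DemoSession *s = malloc(sizeof *s);  /* simplification: no host heap here */
    if (!s) return NULL;
    s->out = out;
    s->bytes_written = 0;
    return s;
}

void demo_session_emit(DemoSession *s, const char *text) {
    assert(s && text);
    size_t n = strlen(text);
    s->out.write(s->out.user, text, n);
    s->bytes_written += n;
}

size_t demo_session_bytes(const DemoSession *s) { return s->bytes_written; }

void demo_session_free(DemoSession *s) { free(s); }

/* ---- a host-side writer that appends into a fixed buffer ---- */

typedef struct DemoCapture {
    char   data[64];
    size_t len;
} DemoCapture;

void demo_capture_write(void *user, const char *p, size_t n) {
    DemoCapture *c = user;
    size_t room = sizeof c->data - 1 - c->len;
    if (n > room) n = room;              /* truncate instead of overflowing */
    memcpy(c->data + c->len, p, n);
    c->len += n;
    c->data[c->len] = '\0';
}
```

Because the struct body is private, fields can change without touching any consumer; the only way in is the function list, which is the property a public-tier header is meant to protect.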

Public-tier notes:


Tier 2 — Backend / codegen contract (internal)

The codegen path is tiered: each tier is a distinct contract struct (a vtable) that a backend or layer fills in. A frontend records into the highest tier; the bytes come out of the lowest. The native backends — aarch64, x64, and rv64 — are all live and all satisfy the same contracts. aarch64 is the reference implementation; x64 and rv64 are full peers built on the shared NativeTarget / NativeOps / NativeFrame substrate rather than bespoke per-arch frame and lowering code.

| Tier | Header | Contract type | What implements it | Role |
|---|---|---|---|---|
| ABI | src/abi/abi.h (+ abi_internal.h) | TargetABI | per-ABI TUs (aapcs64, aapcs64_windows, sysv_x64, rv64, wasm32, apple_arm64, apple_x64, win64_x64) | calling-convention + storage-layout queries; abi_new(Compiler) selects the implementation by (arch, obj-format) |
| Arch registry | src/arch/arch.h | ArchImpl | one singleton per arch, via arch_lookup(kind) | discovery + dispatch to backend/decode/emu/link/dbg/dwarf surfaces; CFI defaults |
| Semantic CG | src/cg/cgtarget.h | CgTarget | native_direct_target (-O0) or opt_cgtarget (-O>=1), wrapping a per-arch CgTarget | frontend-facing typed lowering, pre-regalloc |
| -O0 adapter | src/cg/native_direct_target.h | NativeDirectTarget + NativeOps | shared adapter, parameterized by each arch's NativeOps | adapts a NativeTarget to CgTarget for the direct -O0 path |
| Physical emit | src/arch/native_target.h | NativeTarget | aa64/x64/rv64 *_native_target_new() | hard-register, machine-code emission + frame/CFI |
| Frame model (shared) | src/cg/native_frame.h | NativeFrame | shared native_frame.c, embedded by all three native backends | arch-neutral frame-slot bookkeeping the NativeTarget impls delegate to |
| Machine code | src/arch/mc.h | MCEmitter | one generic impl, mc_new(Compiler, ObjBuilder) | section/label/reloc/CFI byte emission for all machine-code archs |
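
To make the tiering concrete, here is a small sketch of one shared adapter presenting a frontend-facing vtable on top of a per-arch hard-register vtable, in the spirit of the -O0 adapter row. All types and functions are hypothetical stand-ins (DemoCgTarget, DemoNativeTarget, DemoNativeOps), the "backend" writes toy text rather than machine code, and the one-slot value stack is far simpler than a real lowering.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Lowest tier: hard-register emission. One implementation per arch. */
typedef struct DemoNativeTarget {
    void *self;
    void (*mov_imm)(void *self, int reg, long imm);
    void (*ret)(void *self);
} DemoNativeTarget;

/* Per-arch facts the shared adapter is parameterized by. */
typedef struct DemoNativeOps {
    int return_reg;
} DemoNativeOps;

/* Highest tier: the stack-machine surface a frontend records into. */
typedef struct DemoCgTarget {
    void *self;
    void (*push_const)(void *self, long value);
    void (*ret_top)(void *self);
} DemoCgTarget;

/* Shared adapter state: a one-slot value stack over a native target. */
typedef struct DemoDirect {
    DemoNativeTarget native;
    DemoNativeOps    ops;
    long             top;
    int              has_top;
} DemoDirect;

static void direct_push_const(void *self, long value) {
    DemoDirect *d = self;
    d->top = value;
    d->has_top = 1;
}

static void direct_ret_top(void *self) {
    DemoDirect *d = self;
    assert(d->has_top);  /* contract: a value must have been pushed */
    d->native.mov_imm(d->native.self, d->ops.return_reg, d->top);
    d->native.ret(d->native.self);
    d->has_top = 0;
}

/* Wrap any native target; the adapter code is written once for all archs. */
DemoCgTarget demo_direct_target(DemoDirect *d, DemoNativeTarget native,
                                DemoNativeOps ops) {
    d->native = native;
    d->ops = ops;
    d->top = 0;
    d->has_top = 0;
    DemoCgTarget cg = { d, direct_push_const, direct_ret_top };
    return cg;
}

/* Toy "backend": appends text lines instead of encoding instructions. */
typedef struct DemoText {
    char        buf[128];
    size_t      len;
    const char *reg_prefix;
} DemoText;

static void text_line(DemoText *t, const char *line) {
    size_t n = strlen(line);
    size_t room = sizeof t->buf - 1 - t->len;
    if (n > room) n = room;
    memcpy(t->buf + t->len, line, n);
    t->len += n;
    t->buf[t->len] = '\0';
}

static void text_mov_imm(void *self, int reg, long imm) {
    DemoText *t = self;
    char line[64];
    snprintf(line, sizeof line, "mov %s%d, %ld\n", t->reg_prefix, reg, imm);
    text_line(t, line);
}

static void text_ret(void *self) { text_line(self, "ret\n"); }

DemoNativeTarget demo_text_native(DemoText *t, const char *reg_prefix) {
    t->len = 0;
    t->buf[0] = '\0';
    t->reg_prefix = reg_prefix;
    DemoNativeTarget n = { t, text_mov_imm, text_ret };
    return n;
}
```

The frontend-facing calls never mention a register; the only per-arch inputs are the native vtable and the small ops struct, which is what lets one adapter serve every backend.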

Per-arch entry points — the surface each backend exposes to the rest of the compiler. The native archs each expose the same pair; rv64 additionally exports its raw word/halfword emit helpers for the assembler path:

| Arch | Header | Entry points |
|---|---|---|
| aa64 (reference) | src/arch/aa64/aa64.h | aa64_native_target_new, aa64_native_direct_ops |
| x64 | src/arch/x64/x64.h | x64_native_target_new, x64_native_direct_ops |
| rv64 | src/arch/rv64/rv64.h | rv64_native_target_new, rv64_native_direct_ops, rv64_emit32/16 |
| c_target | src/arch/c_target/{c_emit,ir_emit}.h | C-source emission backend (standalone CGBackend, no ArchImpl) |
| wasm | src/arch/wasm/* | wasm emission backend |

Backend-tier notes:

Shared native frame model (src/cg/native_frame.h)

Every native backend lays out a stack frame the same way at the bookkeeping level, so that bookkeeping was lifted into one shared module. NativeFrame owns the arch-neutral parts; each NativeTarget embeds one and keeps the ISA/ABI specifics. All three native backends embed a NativeFrame.

The split — what NativeFrame owns vs. what stays in the backend:

| Owned by NativeFrame (arch-neutral) | Stays in the backend (ISA/ABI-specific) |
|---|---|
| Slot table + cumulative-offset arithmetic (native_frame_slot_alloc/_at) | Coordinate transform from off to anchor-relative disp (fp/s0/rbp; aa64 top- vs bottom-record) |
| frame_final gate (no new slots after the prologue) | Prologue/epilogue + slim-variant instruction encoding |
| Used-callee-save set derived from optimizer per-class masks (native_frame_set_callee_saves/_collect_saves) | Callee-save placement (aa64 reserves slots here; rv64/x64 compute offsets below the locals) |
| max_outgoing tracking (native_frame_note_outgoing) | Deferred-patch application, variadic register-save stores |
| Vararg save-area size from the ABI's va_list layout (native_frame_va_save_bytes) | |

Why it is shared: the slot arithmetic, the no-grow-after-prologue gate, and the derivation of the used-callee-save set from the optimizer's per-class masks are identical across the three archs. Consolidating them also folds the three per-arch vararg-save magic numbers (rv64 64, x64 176, aa64 64+128) into a single ABI-driven native_frame_va_save_bytes query, in line with the no-magic-numbers rule. Frame-slot handles are 1-indexed with NATIVE_FRAME_SLOT_NONE as the sentinel; callers and the shared allocator must agree on that.
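
The arch-neutral bookkeeping described above can be sketched in a few functions. This is an illustration under assumptions, not the contents of native_frame.h: the demo_frame_* names, the fixed-size slot table, returning the sentinel when allocation is refused, and the 16-byte rounding of the total are all choices made for the sketch. The register counts used to reproduce the three vararg-save sizes come from the respective psABIs (SysV x86-64: 6 GP x 8 bytes + 8 vector x 16 bytes; AAPCS64: 8 GP x 8 + 8 vector x 16; RV64: 8 GP x 8).

```c
#include <assert.h>
#include <stdint.h>

enum { DEMO_SLOT_NONE = 0, DEMO_MAX_SLOTS = 32 };

typedef struct DemoFrame {
    uint32_t offs[DEMO_MAX_SLOTS]; /* offs[i] is the offset of handle i + 1 */
    uint32_t nslots;
    uint32_t size;                 /* cumulative bytes allocated so far */
    uint32_t max_outgoing;         /* largest outgoing-argument area seen */
    int      final;                /* set once the prologue has been emitted */
} DemoFrame;

static uint32_t demo_align_up(uint32_t v, uint32_t a) {
    return (v + a - 1) & ~(a - 1);
}

void demo_frame_init(DemoFrame *f) { *f = (DemoFrame){0}; }

/* Returns a 1-based handle, or DEMO_SLOT_NONE if the frame is already
 * final or the table is full (assumption: the real gate may panic). */
uint32_t demo_frame_slot_alloc(DemoFrame *f, uint32_t size, uint32_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    if (f->final || f->nslots == DEMO_MAX_SLOTS) return DEMO_SLOT_NONE;
    uint32_t off = demo_align_up(f->size, align);
    f->offs[f->nslots] = off;
    f->size = off + size;
    return ++f->nslots;
}

/* Arch-neutral offset; the backend turns it into an fp/s0/rbp displacement. */
uint32_t demo_frame_slot_at(const DemoFrame *f, uint32_t handle) {
    assert(handle != DEMO_SLOT_NONE && handle <= f->nslots);
    return f->offs[handle - 1];
}

void demo_frame_note_outgoing(DemoFrame *f, uint32_t bytes) {
    if (bytes > f->max_outgoing) f->max_outgoing = bytes;
}

void demo_frame_finalize(DemoFrame *f) { f->final = 1; }

/* Locals plus outgoing area, rounded to 16 bytes (sketch assumption). */
uint32_t demo_frame_total(const DemoFrame *f) {
    return demo_align_up(f->size + f->max_outgoing, 16);
}

/* Vararg save area derived from an ABI description, not a literal. */
typedef struct DemoVaLayout {
    uint32_t gp_regs, gp_bytes; /* integer argument registers to spill */
    uint32_t fp_regs, fp_bytes; /* vector/FP argument registers to spill */
} DemoVaLayout;

uint32_t demo_frame_va_save_bytes(const DemoVaLayout *l) {
    return l->gp_regs * l->gp_bytes + l->fp_regs * l->fp_bytes;
}
```

Handle 0 never names a slot, so a zero-initialized field reads as "no slot" without extra setup, which is the practical reason callers and the allocator must agree on the 1-based convention.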


Tier 3 — Internal subsystem boundaries

Each subsystem exposes a single shared header (its boundary) and may keep an *_internal.h private to its own TUs. The boundary header is what other subsystems are allowed to include; the internal header is not.

| Subsystem | Boundary header | Internal header | Role |
|---|---|---|---|
| obj | src/obj/obj.h | format headers private | Format-neutral object model (ObjBuilder, sections, symbols, relocs) plus read side; the hub cg/link/jit/disasm/dwarf depend on. |
| ↳ formats | src/obj/{elf,macho,coff}/*.h, format.h, reloc_apply.h | | Per-format emit/read behind ObjFormatImpl; link_reloc_apply for relocation. |
| link | src/link/link.h (+ link_arch.h) | link_internal.h | Byte/object/archive/DSO inputs, symbol resolution (single-shot and incremental), ELF/JIT output; kit_jit_from_image. |
| opt | src/opt/opt.h (+ ir.h) | opt_internal.h | SSA construction, CFG passes, register allocation, MIR lowering; opt_cgtarget_new(Compiler, CgTarget, level) wraps a backend target. |
| cg | src/cg/{ir,ir_recorder,type}.h | internal.h | IR recording and the codegen type system (cg_type_*). |
| debug | src/debug/debug.h (+ dwarf_defs.h) | debug_internal.h, dwarf_internal.h | DWARF producer: types, subprograms, line program, emit. |
| emu | <kit/emu.h> (public face) | src/emu/emu.h | Guest-ELF emulator; format hooks via ObjFormatEmuOps. |
| dbg | <kit/dbg.h> (public face) | src/dbg/dbg.h | JIT execution control; the real contract is the public header. |
| asm | src/asm/asm.h (+ asm_lex.h) | asm_helpers.h | shared asm_parse(Compiler, AsmLexer, MCEmitter); driver helpers. |
| jit | src/jit/tlv_thunk.h | | Mach-O TLV thunk; the rest of JIT runs through LinkImage. |
| wasm | src/wasm/wasm.h | | Module model / codec / WAT / validate, in public Kit types. Shared by three peers: the wasm frontend (lang/wasm), object format (src/obj/wasm), and codegen backend (src/arch/wasm). |
| api | src/api/lang_registry.h | | lang_registry_init(Compiler) wires the enabled frontends. |

Subsystem-tier notes:


Tier 4 — Core utilities (src/core/)

Foundational data structures. They enforce two project rules at the type level: no global state (everything takes an explicit Heap* / Arena* / Compiler*) and no VLAs.

| Header | Purpose | Takes explicit allocator? | Public mirror |
|---|---|---|---|
| core.h | Type aliases, Compiler struct, panic/defer machinery | Compiler holds context | partial, via include/kit/core.h |
| arena.h | Bump allocator; reset frees all | Heap* | include/kit/support/arena.h (narrowed) |
| pool.h | Symbol interning (Sym canonical IDs) | Heap* | |
| buf.h | Chunked byte buffer with patch/seek | Heap* | |
| vec.h | Doubling-growth vector (VEC_GROW) | Heap* | |
| segvec.h | Segmented append-only array, stable pointers (SEGVEC_DEFINE) | Heap* | |
| hashmap.h | Alias to the public template | n/a | include/kit/support/hashmap.h |
| heap.h | Heap abstraction + JIT exec-mmap helper | wraps KitHeap | KitHeap in core.h |
| strbuf.h | Bounded text builder, caller-owned buffer | none (caller buffer) | |
| slice.h | Fat-pointer byte view (alias of KitSlice) | Arena* for dup | KitSlice in core.h |
| bytes.h | LE/BE integer serialize helpers | none | |
| diag.h | Diagnostic-sink convenience wrappers | none (delegates) | |
| metrics.h | Telemetry dispatch to optional callbacks | none (reads Compiler) | |
| sha256.h | Streaming SHA-256 | none | |
| util.h | MIN/MAX/ALIGN_*/CONTAINER_OF macros | none | |
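
As an illustration of the "explicit allocator, no global state" rule, here is a minimal bump arena that draws its chunks from a caller-supplied heap vtable and releases everything on reset. It is a sketch only: DemoHeap, DemoArena, and the demo_arena_* functions are hypothetical names and are not the signatures in src/core/arena.h or the public mirror.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Host heap passed in explicitly; the arena keeps no global state. */
typedef struct DemoHeap {
    void *user;
    void *(*alloc)(void *user, size_t n);
    void  (*free)(void *user, void *p);
} DemoHeap;

typedef struct DemoChunk {
    struct DemoChunk *next;
    size_t cap, used;   /* payload bytes follow the header */
} DemoChunk;

typedef struct DemoArena {
    DemoHeap   heap;
    DemoChunk *head;
    size_t     chunk_size;
} DemoArena;

void demo_arena_init(DemoArena *a, DemoHeap heap, size_t chunk_size) {
    a->heap = heap;
    a->head = NULL;
    a->chunk_size = chunk_size;
}

/* Bump-allocate n bytes at the given power-of-two alignment. */
void *demo_arena_alloc(DemoArena *a, size_t n, size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    DemoChunk *c = a->head;
    if (c) {
        uintptr_t base = (uintptr_t)(c + 1);
        uintptr_t p = (base + c->used + align - 1) & ~(uintptr_t)(align - 1);
        if (p + n <= base + c->cap) {
            c->used = (size_t)(p + n - base);
            return (void *)p;
        }
    }
    /* Current chunk is full (or absent): start a new one that surely fits. */
    size_t cap = a->chunk_size > n + align ? a->chunk_size : n + align;
    c = a->heap.alloc(a->heap.user, sizeof *c + cap);
    if (!c) return NULL;
    c->next = a->head;
    c->cap = cap;
    a->head = c;
    uintptr_t base = (uintptr_t)(c + 1);
    uintptr_t p = (base + align - 1) & ~(uintptr_t)(align - 1);
    c->used = (size_t)(p + n - base);
    return (void *)p;
}

/* Reset frees every chunk; all pointers handed out become invalid. */
void demo_arena_reset(DemoArena *a) {
    while (a->head) {
        DemoChunk *next = a->head->next;
        a->heap.free(a->heap.user, a->head);
        a->head = next;
    }
}

/* A host heap that counts live blocks, useful for leak checks in tests. */
typedef struct DemoCounter { int live; } DemoCounter;

static void *counting_alloc(void *user, size_t n) {
    DemoCounter *c = user;
    void *p = malloc(n);
    if (p) c->live++;
    return p;
}

static void counting_free(void *user, void *p) {
    DemoCounter *c = user;
    if (p) c->live--;
    free(p);
}

DemoHeap demo_counting_heap(DemoCounter *c) {
    DemoHeap h = { c, counting_alloc, counting_free };
    return h;
}
```

Threading the heap through the arena is what makes a counting or failing heap trivial to substitute in tests, which is hard to do once an allocator is reached through a global.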

Core-tier notes:


Tier 5 — Frontend <-> library boundary

Frontends live in lang/, are API consumers, and register per-KitCompiler. Each implements KitFrontendVTable (include/kit/compile.h): a constructor, a compile function that turns a source slice into a KitObjBuilder, a destructor, and the list of file extensions it claims. A frontend is registered with kit_register_frontend(compiler, language, vtable); src/api/lang_registry.h::lang_registry_init auto-wires the enabled KIT_LANG_* frontends at compiler construction.
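
The registration flow described above (a vtable with constructor, compile function, destructor, and claimed extensions, registered against a specific compiler instance) can be sketched as follows. The names are hypothetical (DemoFrontendVTable, demo_register_frontend, demo_frontend_for_path); the real KitFrontendVTable and kit_register_frontend signatures are not reproduced here, and the toy compile function counts lines where a real one would produce a KitObjBuilder.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical frontend contract: lifecycle + compile + claimed extensions. */
typedef struct DemoFrontendVTable {
    void *(*create)(void);
    int   (*compile)(void *self, const char *src, size_t len, size_t *out_lines);
    void  (*destroy)(void *self);
    const char *const *extensions;   /* NULL-terminated, e.g. ".toy" */
} DemoFrontendVTable;

enum { DEMO_MAX_FRONTENDS = 8 };

typedef struct DemoRegistered {
    const char               *language;
    const DemoFrontendVTable *vt;
} DemoRegistered;

/* Registration is per compiler instance, not process-wide. */
typedef struct DemoCompiler {
    DemoRegistered fe[DEMO_MAX_FRONTENDS];
    size_t         nfe;
} DemoCompiler;

void demo_compiler_init(DemoCompiler *c) { c->nfe = 0; }

int demo_register_frontend(DemoCompiler *c, const char *language,
                           const DemoFrontendVTable *vt) {
    if (c->nfe == DEMO_MAX_FRONTENDS) return -1;
    c->fe[c->nfe].language = language;
    c->fe[c->nfe].vt = vt;
    c->nfe++;
    return 0;
}

/* Pick the frontend that claims the path's extension, if any. */
const DemoFrontendVTable *demo_frontend_for_path(const DemoCompiler *c,
                                                 const char *path) {
    const char *dot = strrchr(path, '.');
    if (!dot) return NULL;
    for (size_t i = 0; i < c->nfe; i++)
        for (const char *const *e = c->fe[i].vt->extensions; *e; e++)
            if (strcmp(*e, dot) == 0) return c->fe[i].vt;
    return NULL;
}

/* A toy frontend: its state is a call counter, its output a line count. */
static void *toy_create(void) { return calloc(1, sizeof(int)); }

static int toy_compile(void *self, const char *src, size_t len,
                       size_t *out_lines) {
    int *calls = self;
    size_t lines = 0;
    for (size_t i = 0; i < len; i++)
        if (src[i] == '\n') lines++;
    ++*calls;
    *out_lines = lines;
    return 0;
}

static void toy_destroy(void *self) { free(self); }

static const char *const toy_exts[] = { ".toy", NULL };

const DemoFrontendVTable demo_toy_vtable = {
    toy_create, toy_compile, toy_destroy, toy_exts
};
```

Keeping the registry inside the compiler object means two compilers in one process can enable different frontends, consistent with the no-global-state rule.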

What frontends consume is overwhelmingly the public API, reached through a single front door: <kit/frontend.h> re-exports cg.h, compile.h, source.h, and support/arena.h (and transitively core.h + object.h), so a frontend TU includes just that one header. The remaining direct public includes are the ones outside the facade — support/hashmap.h for hashed tables, and preprocess.h / jit.h / wasm.h where a specific frontend or runner needs them — plus the single documented frontend→subsystem edge noted in the boundary map (the wasm frontend reaching src/wasm/wasm.h).

| Frontend | Public entry | Notable internal headers |
|---|---|---|
| C (lang/c/) | kit_c_frontend_vtable (c.h) | none in src/; type/, decl/, sem/, parse/, abi/ are frontend-local modules over the public CG API |
| cpp (lang/cpp/) | shared by C; pp/pp.h, lex/lex.h | cpp_support.h, pp/pp_priv.h |
| toy (lang/toy/) | kit_toy_frontend_vtable (toy.h) | internal.h, lexer.h |
| wasm (lang/wasm/) | kit_wasm_frontend_vtable (wasm.h) | src/wasm/wasm.h: the shared module-model subsystem; the one frontend→src/ edge |

Frontend-tier notes:


Interface review checklist

Apply this to any interface (header / vtable) you add or change. Tier-1 (public) and Tier-2 (backend contract) changes warrant the full list; lower tiers can be lighter.

Boundary & layering

Surface shape

State & ownership

Errors & contracts

Vtable / backend contracts (Tier 2)

Stability & docs