kit

git clone https://git.ryansepassi.com/git/kit.git

kit Design

kit is a freestanding C11 compiler multi-tool, written in C11. This document is the front door to the design docs: it states what kit is, the principles that shape it, the layered architecture, the primary data flows, and an index of every sibling design doc. It is a map, not a manual — API signatures and struct layouts live in the headers under include/kit/; per-subsystem detail lives in the docs indexed at the end.

What kit is

A single multi-call binary (kit) that bundles a complete C toolchain plus the machinery to JIT, debug, and emulate what it produces. Capabilities:

Design principles

Layered architecture

From outside in, each layer depends only on the layer beneath it:

driver/            CLI policy + host I/O. Includes ONLY <kit/*.h>.
  lang/            Frontends (c, cpp, toy, wasm). API consumers; ONLY
                   <kit/*.h> + their own private headers.
    include/kit/   PUBLIC BOUNDARY. The library's entire stable contract.
      src/api/     Composition: public handles <-> internal subsystems.
        src/...    Internal subsystems. Share private headers among their own
                   TUs; expose nothing except through include/kit/.

The layering invariant: driver/ and lang/ include only <kit/*.h> — never a src/ header. Anything a frontend or tool needs is promoted into the public headers; reaching into src/ is a layering violation. Subsystem *_internal.h headers stay private to their own translation units.
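The invariant is mechanical enough to check with a small predicate. The following is a minimal sketch, not part of kit: the `Layer` enum and `include_allowed` are hypothetical names, and the rule for quoted includes is read literally from the diagram above (frontends may use their own private headers, the driver may not).

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical layer tags for this sketch; not kit identifiers. */
typedef enum { LAYER_DRIVER, LAYER_LANG, LAYER_SRC } Layer;

static bool starts_with(const char *s, const char *prefix) {
    return strncmp(s, prefix, strlen(prefix)) == 0;
}

/* True if the path names something under a src/ directory. */
static bool reaches_into_src(const char *path) {
    return starts_with(path, "src/") || strstr(path, "/src/") != NULL;
}

/* path is the text between the #include delimiters;
 * angled is true for <...> and false for "...". */
bool include_allowed(Layer from, const char *path, bool angled) {
    if (from == LAYER_SRC)
        return true;                      /* internal code is not policed by this rule */
    if (reaches_into_src(path))
        return false;                     /* layering violation */
    if (angled)
        return starts_with(path, "kit/"); /* only the public boundary headers */
    return from == LAYER_LANG;            /* quoted: a frontend's own private header */
}
```

A build-time lint could run a predicate like this over every `#include` line in driver/ and lang/ and fail on the first violation.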

Key abstractions

Primary data flows

1. C source -> object

driver cc -> KitContext + KitCompiler -> kit_compile_* (src/api/compile.c)
  -> registered C frontend (lang/c): lex -> preprocess -> parse/type/decl
  -> KitCg (public CG API)
  -> CgTarget  ( -O0 NativeDirect | -O1 opt wrapper )
  -> NativeTarget -> MCEmitter
  -> ObjBuilder -> object writer

The driver loads source bytes and picks options; src/api/compile.c creates an ObjBuilder and dispatches to the frontend registered for the input language. Assembly (.s) takes a shortcut: the asm frontend feeds the MCEmitter/ObjBuilder path directly, bypassing KitCg because it is already target-level.
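Dispatching to "the frontend registered for the input language" can be pictured as a lookup in a small table of function pointers. The sketch below only illustrates that shape; `Frontend`, `frontend_register`, `compile_dispatch`, and `demo_compile` are hypothetical names, and the real frontend vtable contract is the one described in FRONTENDS.md and the public headers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical frontend entry: a language tag plus one compile hook.
 * The builder is passed as an opaque pointer in this sketch. */
typedef struct {
    const char *lang;
    int (*compile)(const unsigned char *src, size_t len, void *obj_builder);
} Frontend;

enum { MAX_FRONTENDS = 8 };
static Frontend g_frontends[MAX_FRONTENDS];
static size_t g_frontend_count;

/* Adds a frontend to the registry; returns -1 when the table is full. */
int frontend_register(Frontend fe) {
    assert(fe.lang != NULL && fe.compile != NULL);
    if (g_frontend_count == MAX_FRONTENDS)
        return -1;
    g_frontends[g_frontend_count++] = fe;
    return 0;
}

/* Finds the frontend for lang and forwards the caller-owned source bytes.
 * Returns -1 if no frontend is registered for that language. */
int compile_dispatch(const char *lang, const unsigned char *src, size_t len,
                     void *obj_builder) {
    for (size_t i = 0; i < g_frontend_count; i++)
        if (strcmp(g_frontends[i].lang, lang) == 0)
            return g_frontends[i].compile(src, len, obj_builder);
    return -1;
}

/* Stand-in frontend: records the input length in a size_t "builder". */
int demo_compile(const unsigned char *src, size_t len, void *obj_builder) {
    (void)src;
    *(size_t *)obj_builder = len;
    return 0;
}
```

The point of the table is that the composition layer never names a specific frontend; adding a language means registering one more entry.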

2. File link -> executable

objects / object bytes / archives / DSO inputs
  -> kit_link_* (src/api/link.c -> src/link/)
  -> object/archive readers -> symbol resolution -> layout
  -> relocation -> executable (or incremental patch) writer

The linker owns archive member selection, symbol resolution, section/segment layout, relocation (per-arch fixups behind ArchImpl), build/image-id handling, and final emission, for any enabled object format.

3. Run / JIT in-process

source/object inputs -> compile/link to a JIT LinkImage
  -> kit_link_jit  (KitExecMem from KitJitHost maps + protects pages)
  -> KitJit / KitJitSession
  -> run (invoke entry) | dbg (breakpoints, stepping, regs/mem via KitDbgHost)

The JIT shares the same compile/object/relocation machinery as file output; only the final sink differs. Mapping executable memory and installing TLS are delegated to the host through KitJitHost. run --no-jit instead attaches a bytecode InterpProgram (src/interp/) and executes the entry through the interpreter while still using the JIT image for real data/extern addresses.
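Delegating executable memory to the host means the library only ever sees a few callbacks. The following is a rough sketch of that split, assuming a three-hook interface; `ExecMemHost`, `jit_publish`, and the `demo_*` host are hypothetical and are not the actual KitJitHost or KitExecMem definitions.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical host hooks: the library never maps memory itself. */
typedef struct {
    void *(*map)(void *user, size_t len);                 /* returns writable memory */
    int (*protect_exec)(void *user, void *addr, size_t len); /* make it executable */
    void (*unmap)(void *user, void *addr, size_t len);
    void *user;
} ExecMemHost;

/* Copies finished code into host-provided memory, then asks the host to
 * flip the protection. Returns NULL if the host refuses either step. */
void *jit_publish(const ExecMemHost *host, const void *code, size_t len) {
    void *mem = host->map(host->user, len);
    if (mem == NULL)
        return NULL;
    memcpy(mem, code, len);
    if (host->protect_exec(host->user, mem, len) != 0) {
        host->unmap(host->user, mem, len);
        return NULL;
    }
    return mem;
}

/* Stand-in host for the sketch: one fixed buffer and two status flags.
 * A real host would call mmap/mprotect or the platform equivalent. */
typedef struct {
    unsigned char pages[64];
    int mapped;
    int executable;
} DemoHost;

void *demo_map(void *user, size_t len) {
    DemoHost *h = user;
    if (h->mapped || len > sizeof h->pages)
        return NULL;
    h->mapped = 1;
    return h->pages;
}

int demo_protect_exec(void *user, void *addr, size_t len) {
    (void)addr; (void)len;
    ((DemoHost *)user)->executable = 1;
    return 0;
}

void demo_unmap(void *user, void *addr, size_t len) {
    (void)addr; (void)len;
    DemoHost *h = user;
    h->mapped = 0;
    h->executable = 0;
}
```

Because the write happens before the protection change, a host can keep pages non-executable while they are writable; whether kit enforces that ordering is not stated here.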

4. Emulate a guest ELF

guest ELF bytes -> emu ELF loader (src/emu/)
  -> decode/lift guest basic blocks
  -> CgTarget -> JIT image
  -> emu runtime (syscall + memory model)

The emulator is a user-mode ELF runner that translates guest basic blocks into the same backend/JIT infrastructure used for native JIT, executing them under a guest memory and syscall model.
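Per-block translation is commonly paired with a cache keyed by guest program counter, so a block is lifted once and reused. The sketch below shows a direct-mapped version of that idea; it is an assumption about how such a runtime could be organised, not a description of src/emu/, and `BlockCache`, `block_lookup`, and `demo_translate` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

enum { CACHE_SLOTS = 256 }; /* must be a power of two for the mask below */

typedef struct {
    uint64_t guest_pc;
    void *host_code; /* NULL marks an empty slot */
} BlockEntry;

typedef struct {
    BlockEntry slots[CACHE_SLOTS];
    /* Translates the block starting at guest_pc; returns NULL on failure. */
    void *(*translate)(void *user, uint64_t guest_pc);
    void *user;
} BlockCache;

/* Returns translated code for guest_pc, translating on a miss and
 * overwriting whatever previously occupied the slot. */
void *block_lookup(BlockCache *c, uint64_t guest_pc) {
    BlockEntry *e = &c->slots[(guest_pc >> 2) & (CACHE_SLOTS - 1)];
    if (e->host_code == NULL || e->guest_pc != guest_pc) {
        void *code = c->translate(c->user, guest_pc);
        if (code == NULL)
            return NULL;
        e->guest_pc = guest_pc;
        e->host_code = code;
    }
    return e->host_code;
}

/* Stand-in translator: counts calls and returns a non-NULL token. */
void *demo_translate(void *user, uint64_t guest_pc) {
    (void)guest_pc;
    ++*(int *)user;
    return user;
}
```

A direct-mapped table keeps lookup to one compare on the hot path, at the cost of retranslating when two hot blocks collide on the same slot.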

State and ownership

The host owns storage and side effects (heap, file I/O, executable memory, TLS, debugger OS hooks); libkit owns compilation, object construction, linking, JIT layout, and relocation policy. Public APIs take explicit options and handles; internal state hangs off KitCompiler, the subsystem handles, or frontend context structs. Compile inputs are caller-owned byte buffers that must outlive the call; builders returned by compile are owned by the compiler until consumers finish; object/archive/DSO bytes handed to link calls are borrowed for the call unless an API states otherwise.
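The borrowed-versus-owned distinction can be made visible in types. Here is a minimal sketch, assuming a callee that must keep some input past the call and therefore copies it; `ByteSpan`, `Retained`, `retain_bytes`, and `retained_free` are hypothetical names, and this example uses malloc directly where kit would go through host-provided storage.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Non-owning view: valid only while the caller keeps the buffer alive. */
typedef struct {
    const unsigned char *data;
    size_t len;
} ByteSpan;

/* Owning copy: must be released with retained_free. */
typedef struct {
    unsigned char *data;
    size_t len;
} Retained;

/* Copies the borrowed bytes so they can outlive the call.
 * Returns 0 on success, -1 if allocation fails. */
int retain_bytes(Retained *out, ByteSpan in) {
    unsigned char *copy = malloc(in.len ? in.len : 1);
    if (copy == NULL)
        return -1;
    if (in.len)
        memcpy(copy, in.data, in.len);
    out->data = copy;
    out->len = in.len;
    return 0;
}

/* Releases the copy and clears the handle so reuse is detectable. */
void retained_free(Retained *r) {
    free(r->data);
    r->data = NULL;
    r->len = 0;
}
```

Under a borrow-for-the-call rule, anything the callee does not copy like this must be treated as gone once the call returns.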

Documentation index

Doc             Covers
DESIGN.md       This map: what kit is, principles, layering, data flows, index.
INTERFACES.md   Interface inventory and review checklist across all tiers (public, backend, subsystem, core, frontend).
FRONTENDS.md    The lang/ frontends: C (preprocess/parse/type/decl), cpp, toy, wasm; and the frontend vtable contract.
CODEGEN.md      The KitCg public CG API and the tiered CgTarget -> NativeDirect/opt -> NativeTarget lowering path.
IR.md           The recording/optimizer IR: instructions, types, and how CG operations become analyzable functions.
ARCH.md         Per-arch backends (aarch64/x86-64/riscv64), ArchImpl dispatch, MCEmitter, register files, and fixups.
ASM.md          The standalone + inline assembler, GAS-subset syntax, and the shared emitter.
OPT.md          The -O1 optimizer: SSA construction, register allocation, combine/DCE, and replay into the backend.
INTERPRETER.md  The bytecode interpreter over the optimizer IR used by run --no-jit.
OBJ.md          The format-neutral object model and ELF/Mach-O/COFF/Wasm read/write behind ObjFormatImpl.
LINK.md         Linking: symbol resolution, layout, relocation, linker scripts, and incremental linking.
JIT.md          The JIT image model, executable-memory and TLS host hooks, and publish/append/replace.
EMU.md          The user-mode guest-ELF emulator and its per-block JIT translation.
DWARF.md        DWARF debug-info production and the consumer used by the debugger and dumpers.
DBG.md          The debugger: breakpoints, single-step, displaced execution, register/memory access.
CBACKEND.md     The portable C-source backend (src/arch/c_target/).
WASM.md         The WebAssembly backend, object form, and host-import binding.
DISTRIBUTE.md   Signed .kpkg packaging and the content-addressed store (src/dist/, <kit/cas.h> / <kit/package.h>, cas/pkg tools).
DRIVER.md       The multi-call binary, tool registry, and command-line policy.
RUNTIME.md      The freestanding headers and compiler-rt/libc-style support in rt/.
BUILD.md        The build system and KIT_*_ENABLED component gating.
TESTING.md      The test suites and harnesses under test/.
CODE_SIZE.md    Line counts per component (per-format/per-target split from core).

Planned work and roadmaps live under doc/plan/.