kit

git clone https://git.ryansepassi.com/git/kit.git

commit bde9b5848be4f7d9b2587abaaf7c4ad02417143c
parent 8d3bb285e71aed150d99db5019df5876c1f245a0
Author: Ryan Sepassi <rsepassi@gmail.com>
Date:   Thu,  4 Jun 2026 06:49:20 -0700

doc/plan: design for build-exe/build-lib/build-obj (replacing compile)

Kit-native build verbs that compile a mixed-language source set in memory
and link/archive in one shot, replacing the no-link 'compile' tool. Captures
the --group flag-scoping grammar, the global-vs-scopable taxonomy, -X<lang>
frontend-flag routing, hybrid naming (kit --emit/-o + Zig -static/-dynamic),
the link_engine/archive_engine extraction plan, migration, and the resolved
design decisions.

Diffstat:
A doc/plan/BUILD_COMMANDS.md | 292 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
M doc/plan/README.md | 1 +
2 files changed, 293 insertions(+), 0 deletions(-)

diff --git a/doc/plan/BUILD_COMMANDS.md b/doc/plan/BUILD_COMMANDS.md
@@ -0,0 +1,292 @@
+# kit build commands
+
+Forward-looking roadmap for the kit-native build verbs — `build-exe`,
+`build-lib`, `build-obj` — that **replace** the `compile` tool. They compile a
+mixed-language set of sources entirely in memory and produce a final artifact
+(executable / static or shared library / object) in one invocation, with full
+control over both the per-source compile and the whole-build link. Design doc
+when shipped: [../DRIVER.md](../DRIVER.md).
+
+Distinct from [BUILD.md](BUILD.md) (the CAS-backed incremental build
+*coordinator*) and from [../BUILD.md](../BUILD.md) (kit's own Makefile build).
+This is about the driver's single-shot build commands.
+
+## Motivation
+
+Today the kit-native compile path splits awkwardly:
+
+- **`compile`** (driver/cmd/compile.c) resolves exactly one frontend (by `-x`
+  or suffix), forwards frontend-specific flags (e.g. wasm `-mfeature=`), and
+  emits objects / `.s` / portable C / IR — but **never links** and rejects
+  `.o`/`.a` inputs. Linking means writing intermediate objects to disk and
+  invoking `ld`/`cc`/`run` separately.
+- **`cc`** (driver/cmd/cc.c) already compiles a *polyglot* source set
+  (`.c .s .S .toy .wat .wasm`, language resolved per-file) to in-memory
+  `KitObjBuilder*`, links them with byte-loaded `.o`/`.a`/`.so` via a single
+  `KitLinkSession`, and emits an executable or shared library — **with no
+  intermediate files**. But `cc` is deliberately a GCC-compatible C driver: its
+  flag surface is a GCC subset and it does **not** expose frontend-specific
+  flags.
+
+So the in-memory, no-temp-files, polyglot compile+link pipeline already exists
+and is proven inside `cc`'s link path (driver/cmd/cc.c `cc_run_link_exe`). What
+is missing is a **kit-native front door** to that pipeline: one that is polyglot,
+forwards per-language frontend flags, exposes the full link-flag surface, and
+lets the caller scope compile flags to individual sources or groups of sources —
+without pretending to be `gcc`.
+
+The public API already supports every output we need:
+
+| Artifact | API |
+|----------|-----|
+| executable | `KitLinkSession` + `KIT_LINK_OUTPUT_EXE` → `kit_link_session_emit` |
+| shared library | `KitLinkSession` + `KIT_LINK_OUTPUT_SHARED` |
+| combined relocatable object | `KitLinkSession` + `KIT_LINK_OUTPUT_RELOCATABLE` |
+| single object | `kit_obj_builder_emit` (include/kit/object.h) |
+| static archive | `kit_obj_builder_emit` each member, then `kit_ar_write` (include/kit/archive.h) |
+
+This work is therefore **almost entirely a driver-layer reorganization**, not new
+core machinery: lift `cc`'s link path into a shared engine, add a kit-native
+argument grammar on top, and retire `compile`.
+
+## The command set
+
+A Zig-inspired trio. Every command is polyglot, compiles in memory, and writes
+no intermediate files.
+
+| Command | Produces | Backend |
+|---------|----------|---------|
+| `kit build-exe` | executable | link session, `OUTPUT_EXE` |
+| `kit build-lib` | static `.a` (default) or shared library (`-dynamic`) | `kit_ar_write` / `OUTPUT_SHARED` |
+| `kit build-obj` | a single object; or `--emit=asm\|c\|ir`; or `-fsyntax-only` check | one `KitObjBuilder`, or `OUTPUT_RELOCATABLE` for multi-source |
+
+`build-obj` is the full replacement for `compile`: it keeps `--emit=obj|asm|c|ir`,
+`-fsyntax-only`, the single-frontend-or-polyglot source handling, and frontend
+flag forwarding — and it gains the ability to **combine several sources into one
+relocatable object** (`ld -r` style) via `KIT_LINK_OUTPUT_RELOCATABLE`. The
+standalone `kit check` (cc.c `driver_check`) and `cc`'s own `--emit=`/`-S` are
+unaffected and remain available.
+
+The three share ~90% of their code; like `cc`/`check` they live in one file
+(`driver/cmd/build.c`) with three thin entry points (`driver_build_exe`,
+`driver_build_lib`, `driver_build_obj`) over a shared parse+run parameterized by
+output kind.
+
+## Command-line grammar
+
+### Two flag tiers
+
+1. **Global / per-output flags** apply to the whole build and may appear
+   anywhere outside a group. These are everything that must agree across the
+   link, plus the optimization and debug knobs (per the decision below):
+
+   - `-target TRIPLE` / `--target=`, and target-feature flags
+   - `-O0|-O1|-O2`, `-g`
+   - `-fPIC|-fPIE`, `-fvisibility=hidden|default`
+   - `-ffunction-sections`, `-fdata-sections`
+   - all **link** flags: `-l`, `-L`, `-e`, `-T`, `-static`/`-dynamic`,
+     `-pie`/`-no-pie`, `--build-id=`, `-Wl,…`, soname/rpath, subsystem, …
+   - all **output** flags: `-o`, `--emit=`, `-S`, `-fsyntax-only`
+   - `-Werror`, `-fmax-errors=N`
+
+2. **Scopable flags** may appear globally (baseline for every source) *and*
+   inside a `--group` (override for that group's sources only). The scopable set
+   is intentionally small — only what is genuinely per-translation-unit:
+
+   - preprocessor: `-I`, `-isystem`, `-D`, `-U`
+   - language selection: `-x LANG`
+   - frontend-specific: `-X<lang> FLAG` (see below)
+
+Placing a global flag inside a `--group` is a **usage error** with a pointed
+diagnostic (e.g. `-O is a per-output flag; place it before any --group`). This
+keeps the rule a one-liner: *outside a group = whole build; inside a group =
+those sources.*
+
+### Groups
+
+```
+--group [scopable flags…] -- source [source…]
+```
+
+Each `--group` bundles scopable overrides with the sources listed up to the next
+`--group` or the end of arguments. The `--` separates the group's flags from its
+sources. Sources listed **outside** any group ("bare" sources) receive only the
+global flags.
+
+Inheritance and precedence within a group, relative to the global baseline:
+
+- **Include dirs** (`-I`/`-isystem`): group dirs are prepended to the global
+  search path (searched first), global dirs still apply.
+- **Defines** (`-D`/`-U`): additive; a group `-D` of an already-defined name
+  overrides it for that group.
+- **Language** (`-x`): a group `-x` overrides suffix resolution for that group.
+- **Frontend flags** (`-X<lang>`): a group's apply only to that group's sources
+  of `<lang>`; global `-X<lang>` applies to all sources of `<lang>`.
+
+Link order is the left-to-right order of source/object/archive appearance; a
+group contributes its sources at the group's position. Bare inputs (`.o`/`.a`/
+`.so`) keep their command-line position for the linker.
+
+### Per-language frontend flags: `-X<lang>`
+
+`compile` could forward leftover flags unambiguously because it resolved exactly
+one frontend. A polyglot build cannot, so frontend flags are explicitly
+language-scoped:
+
+```
+-X<lang> FLAG        # e.g. -Xwasm -mfeature=simd128
+```
+
+`-X<lang>` consumes exactly one following token and routes it to that frontend's
+`kit_frontend_parse_options` (the same entry `compile` uses). Repeatable.
+`<lang>` is `c|asm|toy|wasm`. Works both globally and inside a group. (Current
+kit frontend flags are single-token; a multi-token form is a future extension if
+ever needed.)
+
+### Naming conventions (hybrid)
+
+Keep kit/`cc`'s established output vocabulary; adopt Zig's clearer link-kind
+selectors:
+
+- **Output form**: `--emit=obj|asm|c|ir` and `-o PATH` (unchanged from `compile`).
+  `-S` is sugar for `--emit=asm`.
+- **Link kind**: `-static` / `-dynamic` instead of `-shared`. `build-lib`
+  defaults to a static `.a`; `-dynamic` makes a shared library. `build-exe`
+  defaults to the target's normal dynamic linking; `-static` produces a fully
+  static executable. `-shared` is accepted on `build-lib` as a **hidden alias**
+  for `-dynamic` (eases `cc`/`gcc` muscle memory) but is omitted from help, which
+  steers to `-dynamic`.
+
+### Output defaults
+
+- `build-exe`: `-o` optional; default `a.out` (`a.exe` on Windows).
+- `build-lib`: `-o` **required** (no single obvious base name across N sources);
+  shared output respects soname/`--version`.
+- `build-obj`: single source → default `<base>.o` (as `compile` does today, via
+  the equivalent of `compile_default_out`); multiple sources → `-o` required and
+  output is one relocatable object; `--emit=c` still requires `-o`; `--emit=ir`
+  still requires `-O1+`.
+
+`-o -` writes the emit to stdout for **all** emit forms (obj/asm/c/ir),
+reusing the existing `driver_stdout_writer` that `cc` uses — natural for
+pipelines (e.g. `build-obj --emit=ir -o - kernel.wat | less`). Binary objects to
+a tty are unusual but harmless and not specially rejected.
+
+A single `-target` governs the whole build — mixing targets in one invocation is
+an error (one link, one machine).
+
+## Worked examples
+
+```sh
+# Polyglot executable: C + a hand-written asm TU + a Wasm module, in memory.
+kit build-exe -target aarch64-linux-gnu -O2 -o app \
+  main.c util.c \
+  --group -DFAST -Iinc/fast -- hot1.c hot2.c \
+  --group -Xwasm -mfeature=simd128 -- kernel.wat \
+  prebuilt.o -Llib -lfoo
+
+# Static library from mixed sources (default kind).
+kit build-lib -O2 -o libmix.a a.c b.toy c.s
+
+# Shared library with a soname.
+kit build-lib -dynamic -fPIC -Wl,-soname=libmix.so.1 -o libmix.so.1 a.c b.c
+
+# Combine three TUs into one relocatable object (ld -r).
+kit build-obj -O1 -o combined.o a.c b.c c.c
+
+# Inspect: emit IR for a Wasm module compiled with a frontend feature flag.
+kit build-obj -O1 --emit=ir -Xwasm -mfeature=simd128 -o k.ir kernel.wat
+
+# Check only, no output.
+kit build-obj -fsyntax-only main.c util.c
+```
+
+## Implementation plan
+
+The work is a factor-out + new-grammar exercise. Proposed file moves:
+
+1. **`driver/lib/link_engine.{h,c}`** — lift the body of cc.c `cc_run_link_exe`
+   into a reusable step. Input: a populated link plan (in-memory `KitObjBuilder*`
+   list, byte-loaded objects/archives/DSOs, an ordered `KitLinkInputOrder`
+   list, and a filled `KitLinkSessionOptions`). It opens the writer, builds the
+   `KitLinkSession`, adds inputs in order, and emits. `cc_run_link_exe` becomes a
+   thin caller, so `cc` and `build-*` share one link path (no behavior change to
+   `cc`). The runtime-archive insertion (`libkit_rt.a`), hosted-libc wiring
+   (driver/lib/hosted), and `-l`/`-L` resolution (driver/lib/lib_resolve) are
+   already factored and are reused as-is.
+
+2. **`driver/lib/archive_engine.{h,c}`** (small) — `driver_archive_emit(objs[],
+   names[], n, writer)`: `kit_obj_builder_emit` each member to bytes, then
+   `kit_ar_write`. Used by `build-lib` (static) and reusable by a future `ar`
+   pipeline.
+
+3. **`driver/cmd/build.c`** — the new grammar and the three entry points. Reuses
+   `driver_compile_run` (driver/lib/compile_engine.h) for the per-source compile,
+   `DriverCflags` (driver/lib/cflags) for `-I/-D/-U`, and
+   `driver_target_features_*`. New here: the `--group … --` parser, the
+   global-vs-scoped validation, the `-X<lang>` router, and per-group cflag/
+   frontend-option contexts (one `DriverCflags` baseline plus per-group deltas).
+
+4. **`driver/main.c`** — register `build-exe`/`build-lib`/`build-obj` in
+   `driver_tools[]`, gated by new `KIT_TOOL_BUILD_*_ENABLED` flags
+   (include/kit/config.h); add them to the default install group. Remove the
+   `compile` entry and its `KIT_TOOL_COMPILE_ENABLED` gate.
+
+5. **Remove `driver/cmd/compile.c`** and its help. Its capabilities are fully
+   covered by `build-obj`.
+
+### Per-group compile state
+
+The compile loop already builds one `KitObjBuilder*` per source through a shared
+`KitCompiler`. The only new state is per-group compile options: each source
+carries (a) a `KitPreprocessOptions` derived from global cflags + the group's
+cflag delta, (b) a resolved `KitLanguage` (group `-x` or suffix), and (c) the
+`lang_extra` from that group's `-X<lang>` flags. This mirrors how `compile`
+already calls `kit_frontend_parse_options` per frontend — now keyed per group.
+
+## Migration
+
+- **Tests**: `test/toy/run.sh` and any harness invoking `kit compile` move to
+  `kit build-obj` (same flags: `--emit=`, `-x`, `-fsyntax-only`, frontend flags).
+  The toy corpus exercises CG via the toy frontend → `build-obj`.
+- **Config/install**: drop `KIT_TOOL_COMPILE_ENABLED`; add
+  `KIT_TOOL_BUILD_EXE_ENABLED` / `_LIB_` / `_OBJ_`. Update the `install` default
+  tool set and the centralized tool table in main.c.
+- **Docs**: update [../DRIVER.md](../DRIVER.md) and the project `CLAUDE.md` code
+  map (the `compile` bullet → the three `build-*` bullets) when this ships.
+- **`cc` unaffected**: it keeps its GCC-compatible surface; it just calls the
+  shared `link_engine` instead of its inlined copy.
+
+## Future work (post-v1)
+
+- **`@file` response files** and **attach-by-name overrides**
+  (`-Con GLOB : FLAGS`) are deferred. Both layer onto the `--group` grammar
+  later without breaking it (`@file` is pure argv preprocessing; attach is
+  additive). Add `@file` first if build-system drivers hit command-line length
+  limits — it is net-new (no existing expander in the driver) but small and
+  standard (gcc/ld/ar).
+- **JIT/`run` reuse** — `KIT_LINK_OUTPUT_JIT` already backs `kit run`; a future
+  `build-exe --run` could share the same `link_engine` plan.
+
+## Verification notes
+
+- **Relocatable-object combine** — `build-obj` multi-source
+  (`KIT_LINK_OUTPUT_RELOCATABLE`) must match `ld -r` for symbol visibility and
+  common symbols. Cover with tests against the existing relocatable-link path
+  before release; this is the one v1 feature whose semantics need confirming
+  rather than just wiring.
+
+## Decisions (2026-06-04)
+
+| Decision | Choice |
+|----------|--------|
+| Replace `compile`? | Yes — trio `build-exe`/`build-lib`/`build-obj`; `build-obj` subsumes `compile`. |
+| Flag scoping syntax | Explicit `--group [flags] -- sources` blocks. Outside = global/per-output, inside = scoped; a group of one = per-source. |
+| Global (per-output) flags | `-O`, `-g`, `-fPIC/-fPIE`, `-fvisibility` are all global (plus `-target`, all link, all output flags). |
+| Scopable-in-group set | `-I/-isystem/-D/-U`, `-x`, `-X<lang>` frontend flags. |
+| Naming conventions | Hybrid: keep kit `--emit=`/`-o`; adopt Zig `-static`/`-dynamic` for link kind. |
+| Inspection / check home | `build-obj` keeps `--emit=asm\|c\|ir`, `-fsyntax-only`, and gains multi-source → relocatable `.o`. |
+| `-shared` on `build-lib` | Accepted as a hidden alias for `-dynamic` (not shown in help). |
+| `build-obj` multi-source | Relocatable combine ships in v1, gated by `ld -r` parity tests. |
+| `-o -` to stdout | Supported for all emit forms (obj/asm/c/ir) via `driver_stdout_writer`. |
+| v1 input ergonomics | `--group` grammar only; `@file` and attach-by-name deferred to post-v1. |
diff --git a/doc/plan/README.md b/doc/plan/README.md
@@ -17,3 +17,4 @@ shrinks to whatever remains open.
 | [BOOTSTRAP.md](BOOTSTRAP.md) | The 3-stage self-build reproducibility goal and the open `-O1` issues blocking it. | [../BUILD.md](../BUILD.md) |
 | [IMAGE_INSPECT.md](IMAGE_INSPECT.md) | Extending object inspection to executables and shared libraries. | [../OBJ.md](../OBJ.md) |
 | [BUILD.md](BUILD.md) | A new content-addressed build coordinator (Bazel/Nix-style incremental builds layered on the CAS) — storage state machine, caching algorithm, recipe protocol. Distinct from `../BUILD.md` (kit's own Makefile build). | — (new subsystem) |
+| [BUILD_COMMANDS.md](BUILD_COMMANDS.md) | The kit-native `build-exe`/`build-lib`/`build-obj` verbs that replace `compile`: polyglot, in-memory compile+link with `--group` flag scoping and full link-flag control. Distinct from `BUILD.md` (the CAS coordinator). | [../DRIVER.md](../DRIVER.md) |
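
Reading aid (not part of the commit): the `--group [flags] -- sources` grammar and the "global flag inside a group is a usage error" rule from BUILD_COMMANDS.md can be sketched as a small three-state scan over argv. Everything named here (`Plan`, `Source`, `plan_parse`, `is_scopable`, `takes_value`, the error strings, the fixed `MAX_SRC` cap) is hypothetical and does not come from kit's driver. The list of flags that take a separate argument is a partial guess. The plan does not say how a flag that follows a group's sources is treated, so this sketch assumes such a flag closes the group and returns to global scope.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* All names here are hypothetical; this is not kit's driver code. */
enum { MAX_SRC = 64 };

typedef struct {
  const char *path;
  int group; /* 0 = bare input, n = listed in the nth --group */
} Source;

typedef struct {
  Source src[MAX_SRC];
  int nsrc, ngroups;
  const char *error; /* first usage error, or NULL */
} Plan;

/* Scopable set from the plan: -I, -isystem, -D, -U, -x, -X<lang>. */
static int is_scopable(const char *a) {
  return strncmp(a, "-I", 2) == 0 || strncmp(a, "-D", 2) == 0 ||
         strncmp(a, "-U", 2) == 0 || strncmp(a, "-x", 2) == 0 ||
         strncmp(a, "-X", 2) == 0 || strcmp(a, "-isystem") == 0;
}

/* Flags whose argument is the next token (assumed, partial list). */
static int takes_value(const char *a) {
  static const char *const sep[] = {"-I", "-D", "-U", "-isystem", "-x", "-o",
                                    "-target", "-L", "-l", "-e", "-T"};
  if (strncmp(a, "-X", 2) == 0 && a[2] != '\0') return 1; /* -X<lang> FLAG */
  for (size_t i = 0; i < sizeof sep / sizeof sep[0]; i++)
    if (strcmp(a, sep[i]) == 0) return 1;
  return 0;
}

static void plan_parse(Plan *p, int argc, const char *const *argv) {
  enum { GLOBAL, GROUP_FLAGS, GROUP_SRCS } st = GLOBAL;
  int group = 0;
  assert(p != NULL && argv != NULL);
  *p = (Plan){0};
  for (int i = 0; i < argc && p->error == NULL; i++) {
    const char *a = argv[i];
    if (strcmp(a, "--group") == 0) {
      group = ++p->ngroups; /* a new group starts; its flags come first */
      st = GROUP_FLAGS;
    } else if (strcmp(a, "--") == 0) {
      if (st == GROUP_FLAGS) st = GROUP_SRCS;
      else p->error = "'--' is only valid after --group flags";
    } else if (a[0] == '-' && a[1] != '\0') {
      if (st == GROUP_FLAGS && !is_scopable(a)) {
        p->error = "per-output flag inside --group; place it before any --group";
      } else {
        if (st == GROUP_SRCS) st = GLOBAL; /* assumption: a flag closes the group */
        if (takes_value(a) && i + 1 < argc) i++; /* skip the flag's argument */
      }
    } else if (st == GROUP_FLAGS) {
      p->error = "source listed before '--' in --group";
    } else if (p->nsrc == MAX_SRC) {
      p->error = "too many sources";
    } else {
      /* Inputs keep command-line order, which the plan uses as link order. */
      p->src[p->nsrc].path = a;
      p->src[p->nsrc].group = (st == GROUP_SRCS) ? group : 0;
      p->nsrc++;
    }
  }
  if (p->error == NULL && st == GROUP_FLAGS)
    p->error = "--group is missing its '--' separator";
}
```

Fed the arguments of the first worked example (everything after `kit build-exe`), this sketch records `main.c` and `util.c` as bare, `hot1.c` and `hot2.c` in group 1, and `kernel.wat` plus `prebuilt.o` in group 2, since the grammar extends a group's source list to the next `--group` or the end of arguments. A real driver would also store each group's flag delta and would probably exempt `.o`/`.a`/`.so` inputs from compile scoping; both are left out here.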