plan: build system plan v1 - kit

commit 97146e78460c001d48c9dea8ed0f8144b8280537
parent e618f928841783b78b52452854d70240ab4dbd3f
Author: Ryan Sepassi <rsepassi@gmail.com>
Date:   Tue,  2 Jun 2026 13:50:32 -0700

plan: build system plan v1

Diffstat:
A doc/plan/BUILD.md  | 441 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
M doc/plan/README.md  | 1 +

2 files changed, 442 insertions(+), 0 deletions(-)
diff --git a/doc/plan/BUILD.md b/doc/plan/BUILD.md
@@ -0,0 +1,441 @@
+# Content-addressed build coordinator
+
+> **Status: design, not yet built.** This is a forward-looking roadmap for a new
+> subsystem layered on the content store in [../DISTRIBUTE.md](../DISTRIBUTE.md)
+> (`<cfree/cas.h>`). It specifies the on-disk storage state machine, the
+> coordinator's in-memory state and caching algorithm, and the command protocol
+> recipes use to talk back to the coordinator.
+>
+> **Naming, to avoid confusion:** this is *not* [../BUILD.md](../BUILD.md), which
+> documents how cfree itself is built (the Makefile, `CFREE_*_ENABLED` component
+> gating, and the 3-stage bootstrap). This doc designs a *user-facing*,
+> Bazel/Nix-style incremental build engine that compiles arbitrary projects by
+> running recipes and caching their outputs by content. The two are unrelated
+> beyond sharing the word "build."
+
+## What it is
+
+A **build coordinator** is a long-lived process that turns a *build request*
+(build target `T` under configuration `C`) into a materialized **output tree** on
+disk, doing the least work necessary. It does this by running **recipes** —
+opaque executables, typically shell scripts — that produce output directories,
+and by caching every recipe result keyed by the exact set of inputs that
+produced it. A second request whose inputs are unchanged returns the cached
+output without re-running anything; a request whose *direct dependencies'*
+outputs are unchanged (even if some deep source changed) can still skip its own
+recipe.
+
+The whole design rests on two properties:
+
+- **Content identity.** Every input and every output has a stable id: the
+  BLAKE2b-256 of its bytes (`cfree_blob_info`, `cfree_cas_*`). Sameness is hash
+  equality, never timestamps.
+- **Determinism.** A recipe is assumed to be a pure function of its *declared*
+  inputs: identical declared inputs ⇒ byte-identical output tree. Caching is only
+  as correct as this assumption (see [Determinism and hermeticity](#determinism-and-hermeticity)).
+
+It reuses the existing CAS wholesale: source-file bytes, output directories, and
+trace bodies are all CAS blobs and trees. The coordinator adds two things the
+content store does not have — a *mutable* per-target trace index, and the
+resolution algorithm that drives it.
+
+## Concepts and vocabulary
+
+| Term | Meaning |
+|------|---------|
+| **Configuration** `C` | An immutable key→value map snapshotted for the life of one build. Supplies tunables (target triple, opt level, feature flags) to recipes. |
+| **Target** `T` | A named unit of work, e.g. `//app:server`. The build definition maps each target name to exactly one recipe + argv. |
+| **Recipe** | The executable the coordinator runs to produce a target's output. Identified by `recipe-id` = hash of (recipe bytes ‖ argv). |
+| **Output tree** | The directory a recipe produces, captured as a CAS tree (`tree-id`). The unit a build request returns. |
+| **Base inputs** | The *leaves* of the dependency graph, dynamically requested by recipes: **config values**, **source files**, and **globs**. |
+| **Target dep** | A dependency of one target on the *output tree* of another target. The graph's interior edges. |
+| **Shallow trace** | A record of one build: its direct base inputs + its direct target deps (by `tree-id`) → its output `tree-id`. |
+| **Deep trace** | A record of one build: the *transitive* closure of base inputs (target deps recursively expanded to their leaves) → its output `tree-id`. |
+| **Tree cache** | On-disk *materialized* output directories, keyed by `tree-id`, ready to hand back as a filesystem path. |
+
+The three base-input kinds are exactly the dynamically-requested leaves: a recipe
+asks the coordinator for a config value, the bytes of a source file, or the
+expansion of a glob, and each request is logged as a dependency so a later
+rebuild knows precisely what to re-check. A glob's "value" is the sorted set of
+matched paths *and* their content — see [glob](#glob-pattern) below.
+
+### Why two trace kinds
+
+The deep trace answers *"has anything in my entire world changed?"* in one pass
+over a flat input set — no graph walk. When it matches, the output is determined
+and we skip straight to materialization.
+
+The shallow trace answers the harder case the deep trace gives up on: *some deep
+source changed, but did it actually change my dependencies' outputs?* A
+formatting-only edit to a header, a comment change, a touched-but-identical file
+— these perturb base inputs without perturbing the dep output that consumes
+them. The shallow path rebuilds the *direct* deps (cheaply, via their own
+traces), compares their resulting `tree-id`s against what the trace recorded, and
+if they match, the recipe can still be skipped. This is the payoff of recording
+deps by output identity rather than by input identity.
+
+## On-disk storage
+
+The build store sits beside (or contains) a CAS. Everything content-addressed is
+immutable and self-verifying; exactly one class of object is mutable.
+
+```
+<store>/
+  cas/                         # the shared content store, per DISTRIBUTE.md
+    blob/<pp>/<blob-id>        #   raw file bytes
+    tree/<pp>/<tree-id>        #   canonical directory manifests
+    ...                        #   (chunk/, index/ as in DISTRIBUTE.md)
+  build/
+    trace/<pp>/<trace-id>      # IMMUTABLE canonical trace bodies (deep + shallow)
+    target/<pp>/<target-key>   # MUTABLE per-target trace set (the only mutable state)
+    cache/<pp>/<tree-id>/      # tree cache: materialized output directories
+    tmp/                       # staging for atomic writes + recipe sandboxes
+```
+
+`<pp>` is the first two lowercase hex chars of the relevant id (the DISTRIBUTE.md
+convention). `trace-id` = BLAKE2b-256 of the canonical trace body.
+`target-key` = `BLAKE2b("cfree build target v1" ‖ target-name)` — note it depends
+on the target *name only*, never on config or recipe-id, so a target's index
+entry is stable while its inputs churn.
+
+### Trace bodies (immutable, content-addressed)
+
+Both kinds are strict, byte-stable, INI-style text in the family of
+`cfree-tree 1` / `cfree-package 3`: a version line, ordered scalar fields, then
+sorted sections. Unknown keys/sections, duplicate rows, and non-canonical
+ordering are parse errors (a parse error is treated as *absent*, never as a
+match — fail safe). All ids are lowercase hex. Config values, source bytes, and
+glob expansions are stored *by hash*, never inline, so a body's size is
+independent of its inputs' sizes.
+
+**Shallow trace** — direct inputs and direct deps:
+
+```
+cfree-build-shallow 1
+target //app:server
+recipe <recipe-id>
+output <output-tree-id>
+[config]                       ; sorted by key
+opt <value-hash>
+triple <value-hash>
+[source]                       ; sorted by path
+src/main.c <blob-id>
+[glob]                         ; sorted by pattern
+src/*.c <glob-result-hash>
+[dep]                          ; sorted by dep target-name
+//lib:core <dep-output-tree-id>
+//lib:util <dep-output-tree-id>
+```
+
+**Deep trace** — transitive base-input closure, *no* `[dep]` section (deps are
+dissolved into their leaves):
+
+```
+cfree-build-deep 1
+target //app:server
+recipe <recipe-id>
+output <output-tree-id>
+[config]                       ; union over the whole closure, sorted, deduped
+opt <value-hash>
+triple <value-hash>
+[source]
+src/main.c <blob-id>
+lib/core.c <blob-id>           ; pulled in transitively from //lib:core
+[glob]
+src/*.c <glob-result-hash>
+```
+
+- `<value-hash>` = `BLAKE2b(config-value-bytes)`. The literal value is recoverable
+  from the config snapshot; the trace stores only the hash so it is uniform and
+  size-stable. A *requested-but-unset* key is recorded with a reserved
+  sentinel hash so that later *adding* the key invalidates the trace.
+- `<blob-id>` = the source file's CAS blob id (`cfree_blob_info`).
+- `<glob-result-hash>` = `BLAKE2b` of the canonical listing of `(path, blob-id)`
+  pairs the pattern matched (sorted by path). It changes if any matched file is
+  added, removed, or edited — so a single glob row covers the existence *and*
+  content of its whole match set, and files read through a glob need no separate
+  `[source]` rows.
+- `recipe-id` is recorded as an ordinary input and verified on lookup, so editing
+  a recipe (or its argv, or which recipe a target maps to) invalidates traces the
+  same way a source edit does.
+
+### Target record (mutable, the only mutable object)
+
+`build/target/<pp>/<target-key>` lists this target's candidate traces,
+newest-first, capped:
+
+```
+cfree-build-record 1
+target //app:server
+deep <trace-id>
+deep <trace-id>
+shallow <trace-id>
+shallow <trace-id>
+```
+
+Each row points at a trace body in `build/trace/`. On every successful resolution
+the coordinator prepends the fresh trace, de-duplicates, and truncates each kind
+to a small cap (`CFREE_BUILD_RECORD_CAP`, e.g. 8) — a bounded MRU window. A trace
+older than the window ages out and becomes GC-eligible. Multiple candidates exist
+because the same target may have been built under several distinct input
+combinations recently (e.g. two config values flipped back and forth).
+
+### Storage state machine (atomicity and crash safety)
+
+The store must never hand back a wrong output after a crash. The rules:
+
+1. **Content objects** (`cas/blob`, `cas/tree`, `build/trace`, and a materialized
+   `build/cache/<tree-id>/`) are written into `build/tmp/`, fsync'd, then
+   **atomically renamed** to their final content-keyed path. A half-written object
+   only ever exists under `tmp/` and is orphaned, never under its content key.
+   Re-deriving the same content re-creates the identical path — writes are
+   idempotent, so a racing or retried producer is harmless.
+
+2. **The target record is the only ordering-sensitive state**, and it is updated
+   by **read-modify-write into `tmp/` then atomic rename**. A reader therefore
+   always sees a complete prior version or a complete next version, never a torn
+   one. A crash loses *at most* the most recent record update; losing a record
+   entry is *safe* — the next build re-derives and rewrites it. A corrupt record
+   (failed `cfree-build-record 1` parse) is treated as empty.
+
+3. **Recipe sandboxes** live in `build/tmp/run-<n>/`. On success the output subdir
+   is ingested into the CAS (`cfree_cas_add_tree_from_dir`) and the sandbox is
+   removed; on failure or crash it is orphaned and swept later.
+
+**Invariant.** Anything reachable under a content key (`blob/`, `tree/`,
+`trace/`) is complete and matches its key. A target record may *dangle* — point
+at a trace that GC removed — and readers tolerate that by treating the missing
+candidate as absent. The store degrades toward "rebuild," never toward "wrong
+answer."
+
+**Concurrency.** Two coordinators sharing a store are safe for content writes
+(idempotent atomic renames). Target-record updates are last-writer-wins; both
+writers wrote *valid* records, so the only loss is MRU ordering, which costs at
+most a rebuild. An optional per-target-key advisory lock around the
+read-modify-write removes even that.
+
+**GC** (out of scope to implement now, in scope to design for): mark-and-sweep
+rooted at the live target records → reachable trace bodies → referenced
+trees/blobs; sweep unreachable `trace/`, `cas/`, and `cache/`. The tree cache and
+CAS are independently size-boundable by LRU since both are re-derivable.
+
+## Build coordinator state and caching
+
+### In-memory state
+
+All state hangs off one `BuildCoordinator` context (no globals — the project
+rule). For the life of the process it memoizes every base-input probe and every
+target resolution, so a diamond in the graph is built once and a source file is
+hashed once:
+
+| Field | Contents | Invalidated |
+|-------|----------|-------------|
+| `config` | The immutable configuration snapshot (key→value). | Never (per process). |
+| `source_hashes` | `path → blob-id` (with stat for sanity). | Never (per process). |
+| `glob_results` | `pattern → (sorted paths, glob-result-hash)`. | Never (per process). |
+| `targets` | `target-name → resolution state`: `in-progress` \| `done(output-tree-id, deep-input-set, materialized-path)`. | Never (per process). |
+| `cas` | Open `CfreeCas*` handle over `cas/`. | — |
+| `store` | Paths + the target-record reader/writer over `build/`. | — |
+
+"Cached over the life of the coordinator process" (the request brief) means
+exactly these memo tables: the coordinator assumes the workspace does not change
+*under it* mid-build. Each table also doubles as cycle/dedup control — a
+`target-name` found `in-progress` when requested again is a dependency cycle and
+is an error.
+
+A resolved target yields two things its parent needs: the **output `tree-id`**
+(to satisfy a `need`) and the **deep-input-set** (its deep trace's leaf closure,
+to fold into the parent's own deep trace).
+
+### Resolution algorithm
+
+A request for `(T, C)` runs three phases. `C` is fixed for the process; the
+config-dependence of `T` is *recorded in and verified against* its traces, so the
+coordinator keys nothing by config and a change to an *unconsumed* config key
+never busts `T`.
+
+```
+resolve(T):
+  if T in targets:                      # memoized / cycle check
+     return targets[T] or error("cycle")
+  mark T in-progress
+
+  # ---- Phase 1: deep-trace fast path ----------------------------------
+  for each deep trace D of T (newest first):
+     if refresh(D.base_inputs) all match current state:   # rehash sources,
+        return materialize(D.output)                       # re-run globs,
+                                                           # recompare config,
+                                                           # recheck recipe-id
+  # ---- Phase 2: shallow-trace path ------------------------------------
+  for each shallow trace S of T (newest first):
+     if any of S.direct_base_inputs changed: continue      # this trace can't hold
+     ok = true
+     for each (dep, recorded_tree_id) in S.deps:
+        cur = resolve(dep)                                  # recurse
+        if cur.output_tree_id != recorded_tree_id: ok = false; break
+     if ok:
+        write_deep_trace(T, union of S.direct_base_inputs
+                            and each dep's deep-input-set)  # refresh the deep trace
+        return materialize(S.output)
+  # ---- Phase 3: run the recipe ----------------------------------------
+  return run_recipe(T)
+```
+
+with the materialization ladder shared by every cache hit:
+
+```
+materialize(tree-id):
+  if tree-id in build/cache/  : return that path            # already on disk
+  if tree-id in cas/          : restore into build/cache/, return path
+  else                        : return run_recipe(T)        # CAS evicted: rebuild
+```
+
+This is precisely the brief's flow. Reading it back against the request:
+
+- *"Check if a deep trace is available; if so, refresh all inputs."* — Phase 1.
+  `refresh` consults the memo tables, so after the first target touches a file or
+  glob the rest of the closure is free.
+- *"If inputs have not changed, check tree cache → CAS restore → else run."* — the
+  `materialize` ladder. "Run" is the last resort only when the bytes were
+  GC'd/evicted; we already know *what* the output is, we just no longer *have* it.
+- *"If inputs changed, or no deep trace, check shallow trace; if none, rerun."* —
+  fall-through from Phase 1 into Phase 2, and Phase 2's fall-through into Phase 3.
+- *"If shallow trace available, rebuild target deps; if dep output hashes match
+  the trace and other inputs match, attempt CAS restore; if any differ, rerun."*
+  — the Phase 2 loop. "Other inputs match" is the `direct_base_inputs` guard;
+  "dep output hashes match" is the per-dep `tree-id` compare.
+
+The non-obvious move is **Phase 2 succeeding after a base input changed.** A
+changed source busts the *deep* trace (Phase 1 gives up), but if that source only
+feeds a dep whose recipe maps it to an unchanged output (`tree-id` identical),
+the dep compare in Phase 2 still passes and `T`'s own recipe is skipped. The
+deep trace is the cheap "nothing moved" check; the shallow trace is the "did the
+churn actually reach me?" check. On a Phase-2 hit we also write a *fresh* deep
+trace from the now-known closure, so the next request gets the Phase-1 fast path
+back.
+
+### Running a recipe
+
+```
+run_recipe(T):
+  sandbox = build/tmp/run-<n>/ ;  out = sandbox/out/
+  spawn recipe with env: { CFREE_BUILD_SOCK, CFREE_BUILD_OUT=out,
+                           CFREE_BUILD_TARGET=T, workspace root }
+  service the recipe's protocol requests, logging each as a dep (next section)
+  on nonzero exit: propagate failure, write NO trace
+  on success:
+     output-tree-id = cfree_cas_add_tree_from_dir(out)
+     write_shallow_trace(T, direct inputs logged this run, dep tree-ids,
+                         output-tree-id)
+     write_deep_trace   (T, transitive base-input closure, output-tree-id)
+     prepend both to T's target record (dedup, truncate to cap)
+     install out in build/cache/<output-tree-id>/
+     return { output-tree-id, deep-input-set, path }
+```
+
+Writing *both* trace kinds on every real build is what lets a later request take
+whichever path fits the change it sees.
+
+## Recipe protocol
+
+Recipes communicate back to the coordinator over a transport left unspecified
+here — a pipe on a well-known fd, a unix socket named in `$CFREE_BUILD_SOCK`, or
+an in-process callback table for recipes that are cfree library calls rather than
+subprocesses. The command *set* is fixed; the wire is not.
+
+Every request both **returns a value** and **logs a dependency**. The logged
+dependency is what lands in the shallow trace and (after expansion) the deep
+trace. The contract that makes caching correct: **an input the recipe reads but
+does not request is invisible to the cache** — so every input must flow through a
+command.
+
+| Command | Returns | Dependency logged |
+|---------|---------|-------------------|
+| `config-get <key>` | the config value (or *unset*) | config dep: `(key, value-hash)`, sentinel hash if unset |
+| `source <path>` | `blob-id` + an absolute path to read | source dep: `(path, blob-id)` |
+| `glob <pattern>` | sorted list of matching paths | glob dep: `(pattern, glob-result-hash)` |
+| `need <target>...` | each dep's output `tree-id` + a materialized path | target dep edge: `(dep, output-tree-id)` |
+| `log <text>` | — | none (diagnostic passthrough) |
+
+- **`config-get`** records the value's hash, so changing a *consumed* key
+  invalidates while changing an unconsumed key does not. If a changed consumed
+  key causes the recipe to branch and request a *new* key next time, the old trace
+  already fails to match on the changed key, so the new key-set is discovered on
+  the rebuild — the scheme is self-correcting and never needs to predict the
+  input set ahead of time.
+- **`source`** hands back a path inside the workspace (or, in a hermetic variant,
+  a path staged into the sandbox) plus the blob-id the recipe's read is pinned to.
+- **`glob`** records the whole match set's content via `glob-result-hash`; the
+  recipe then reads the returned paths without further declaration.
+- **`need`** is the dynamic-dependency primitive. The coordinator `resolve()`s
+  each target recursively (Phases 1–3), returns its output `tree-id` and a
+  readable materialized path, and records the edge by *output identity*. This is
+  what the shallow path re-checks. `need` is also where cycles are caught.
+- **Outputs** need no command: the recipe writes under `$CFREE_BUILD_OUT` and the
+  coordinator snapshots that directory on exit. (An optional `declare-output
+  <path>` could assert an expected output for earlier error reporting.)
+
+## Determinism and hermeticity
+
+Cache correctness is exactly the assumption *output = f(declared inputs)*,
+deterministically. The design's guard rails:
+
+- **Declared-inputs-are-complete.** Reading an undeclared file is a hermeticity
+  violation. It cannot be *detected* without sandboxing/FS tracing, but the
+  protocol makes declaration the only way to get an input, and the brief is to
+  enforce it later (a sandbox that denies undeclared reads, or an `strace`/FUSE
+  layer that records them). Until then it is a contract recipes must honor.
+- **Determinism.** Timestamps, RNG, network fetches, and `$RANDOM`-style
+  nondeterminism break the model: the second build's captured tree differs from
+  the trace's recorded output, and the cache would serve a stale-but-believed-
+  current result. Mitigations: declare such inputs (a "now" config value, a
+  pinned URL+hash), or mark a target **no-cache** so it always runs Phase 3.
+- **Verify mode.** A diagnostic mode re-runs a recipe whose trace says "unchanged"
+  and compares the fresh `tree-id` to the recorded one; a mismatch flags a
+  nondeterministic or under-declared recipe. This is the recommended way to audit
+  recipes before trusting their traces.
+
+## Worked example
+
+`//app:server` depends on `//lib:core`; the recipe globs `src/*.c`, reads
+`src/main.c`, and consumes config `opt`.
+
+1. **Cold build.** No record. Phase 3 runs both recipes. `//lib:core` builds,
+   yields `tree L0`. `//app:server` runs, logs `config opt`, `glob src/*.c`,
+   `source src/main.c`, `need //lib:core → L0`; yields `tree A0`. Shallow + deep
+   traces written for both. `build/cache/A0/` materialized, path returned.
+2. **No-op rebuild.** Phase 1 deep trace of `//app:server` refreshes `opt`,
+   `src/*.c`, `src/main.c`, and the *transitively folded* `lib/core.c` — all
+   memoized, all match. `A0` is in the tree cache. Returns instantly, nothing
+   re-run, no graph walk.
+3. **Comment-only edit to `lib/core.c`.** `lib/core.c`'s blob-id changed → the
+   `//app:server` deep trace fails Phase 1 (its folded `[source]` moved). Phase 2:
+   direct base inputs (`opt`, `src/*.c`, `src/main.c`) are unchanged, so probe the
+   one dep — `resolve(//lib:core)`. That recurse's *own* recipe re-runs (its
+   source moved) but emits a byte-identical `tree L0` (comment stripped). The dep
+   compare `L0 == L0` passes → `//app:server`'s recipe is **skipped**, `A0`
+   restored, and a refreshed deep trace written so step 2's fast path returns.
+4. **Flip `opt`.** `//app:server`'s deep and shallow traces both fail on the
+   `config opt` row → Phase 3 re-runs its recipe. `//lib:core` (which never
+   consumed `opt`) is untouched by Phase 1 and returns `L0` from cache.
+
+## Open questions / future work
+
+- **Transport.** Pipe vs unix socket vs in-process table; framing and the exact
+  request/response encoding. Deliberately unspecified above.
+- **Hermeticity enforcement.** Sandbox that denies undeclared reads; FS-trace
+  fallback to *detect* (not prevent) them and warn.
+- **Remote CAS.** The `materialize` ladder's "restore from CAS" naturally extends
+  to a remote/shared cache (the DISTRIBUTE.md external-object fetch hints are the
+  precedent) so a clean checkout can download outputs instead of rebuilding.
+- **Compact deep traces.** A flat transitive leaf set can be large for deep
+  graphs. An alternative stores a merkle of per-dep deep-input hashes (composing
+  deep traces by reference) at the cost of a shallow walk on refresh — a
+  size/latency trade to revisit if flat traces bloat.
+- **GC policy.** Concrete roots, schedule, and LRU bounds for `trace/`, the CAS,
+  and the tree cache.
+- **Parallelism.** Independent `need`s can resolve concurrently; the memo table
+  must then carry futures (`in-progress` becomes "await") rather than a flag.
+- **Public surface.** Whether this lands as a libcfree subsystem behind a public
+  header (the `cfree/cas.h` precedent) with a thin `cfree build` CLI, gated by a
+  `CFREE_BUILD_ENABLED` flag.
diff --git a/doc/plan/README.md b/doc/plan/README.md
@@ -16,3 +16,4 @@ shrinks to whatever remains open.
 | [ARCH.md](ARCH.md) | Remaining native-backend completeness for x64/rv64 relative to the aa64 reference, and per-call cost follow-ups. | [../ARCH.md](../ARCH.md) |
 | [BOOTSTRAP.md](BOOTSTRAP.md) | The 3-stage self-build reproducibility goal and the open `-O1` issues blocking it. | [../BUILD.md](../BUILD.md) |
 | [IMAGE_INSPECT.md](IMAGE_INSPECT.md) | Extending object inspection to executables and shared libraries. | [../OBJ.md](../OBJ.md) |
+| [BUILD.md](BUILD.md) | A new content-addressed build coordinator (Bazel/Nix-style incremental builds layered on the CAS) — storage state machine, caching algorithm, recipe protocol. Distinct from `../BUILD.md` (cfree's own Makefile build). | — (new subsystem) |

	kit kit
	git clone https://git.ryansepassi.com/git/kit.git
	Log \| Files \| Refs \| README

A	doc/plan/BUILD.md	\|	441	+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
M	doc/plan/README.md	\|	1	+