plan: kit build v2 - kit

commit 96608783797124dd00d823fc222bddf3ffa0891e
parent 17b1cc89e8496f6ccdf10e84326b174494792a1e
Author: Ryan Sepassi <rsepassi@gmail.com>
Date:   Tue,  2 Jun 2026 14:57:07 -0700

plan: kit build v2

Diffstat:
M doc/plan/BUILD.md  | 680 ++++++++++++++++++++++++++++++++++++++++++++++++++++---------------------------

1 file changed, 450 insertions(+), 230 deletions(-)
diff --git a/doc/plan/BUILD.md b/doc/plan/BUILD.md
@@ -2,9 +2,10 @@
 
 > **Status: design, not yet built.** This is a forward-looking roadmap for a new
 > subsystem layered on the content store in [../DISTRIBUTE.md](../DISTRIBUTE.md)
-> (`<kit/cas.h>`). It specifies the on-disk storage state machine, the
-> coordinator's in-memory state and caching algorithm, and the command protocol
-> recipes use to talk back to the coordinator.
+> (`<kit/cas.h>`). It specifies the configuration model, the on-disk storage
+> state machine, the coordinator's in-memory state and caching algorithm, the
+> command protocol recipes use to talk back, and how builds share work over a
+> network (remote CAS + signed traces).
 >
 > **Naming, to avoid confusion:** this is *not* [../BUILD.md](../BUILD.md), which
 > documents how kit itself is built (the Makefile, `KIT_*_ENABLED` component
@@ -16,14 +17,13 @@
 ## What it is
 
 A **build coordinator** is a long-lived process that turns a *build request*
-(build target `T` under configuration `C`) into a materialized **output tree** on
-disk, doing the least work necessary. It does this by running **recipes** —
-opaque executables, typically shell scripts — that produce output directories,
-and by caching every recipe result keyed by the exact set of inputs that
-produced it. A second request whose inputs are unchanged returns the cached
-output without re-running anything; a request whose *direct dependencies'*
-outputs are unchanged (even if some deep source changed) can still skip its own
-recipe.
+(target `T` under configuration `C`) into a materialized **output tree** on disk,
+doing the least work necessary. It does this by running **recipes** — opaque
+executables, typically shell scripts — that produce output directories, and by
+caching every recipe result keyed by the exact set of inputs that produced it. A
+second request whose inputs are unchanged returns the cached output without
+re-running anything; a request whose *direct dependencies'* outputs are unchanged
+(even if some deep source changed) can still skip its own recipe.
 
 The whole design rests on two properties:
 
@@ -32,47 +32,91 @@ The whole design rests on two properties:
   equality, never timestamps.
 - **Determinism.** A recipe is assumed to be a pure function of its *declared*
   inputs: identical declared inputs ⇒ byte-identical output tree. Caching is only
-  as correct as this assumption (see [Determinism and hermeticity](#determinism-and-hermeticity)).
+  as correct as this assumption (see [Determinism](#determinism-and-hermeticity)).
 
-It reuses the existing CAS wholesale: source-file bytes, output directories, and
-trace bodies are all CAS blobs and trees. The coordinator adds two things the
-content store does not have — a *mutable* per-target trace index, and the
-resolution algorithm that drives it.
+It reuses the existing CAS wholesale: source-file bytes, output directories,
+config snapshots, and trace bodies are all CAS blobs and trees. The coordinator
+adds two things the content store does not have — a *mutable* per-target trace
+index, and the resolution algorithm that drives it.
 
 ## Concepts and vocabulary
 
 | Term | Meaning |
 |------|---------|
-| **Configuration** `C` | An immutable key→value map snapshotted for the life of one build. Supplies tunables (target triple, opt level, feature flags) to recipes. |
-| **Target** `T` | A named unit of work, e.g. `//app:server`. The build definition maps each target name to exactly one recipe + argv. |
-| **Recipe** | The executable the coordinator runs to produce a target's output. Identified by `recipe-id` = hash of (recipe bytes ‖ argv). |
+| **Target** `T` | A named unit of work, e.g. `//app:server`. A build definition maps each target name to one recipe + local argv. |
+| **Recipe** | The executable the coordinator runs to produce a target's output. Identified by `recipe-id` = `BLAKE2b(recipe file bytes)`. |
+| **Configuration** | A key→value map of tunables (target triple, opt level, feature flags). Two scopes — *propagated* and *local* — see [below](#configuration-model). Identified by `config-id` = `BLAKE2b(canonical map)`. |
 | **Output tree** | The directory a recipe produces, captured as a CAS tree (`tree-id`). The unit a build request returns. |
 | **Base inputs** | The *leaves* of the dependency graph, dynamically requested by recipes: **config values**, **source files**, and **globs**. |
-| **Target dep** | A dependency of one target on the *output tree* of another target. The graph's interior edges. |
-| **Shallow trace** | A record of one build: its direct base inputs + its direct target deps (by `tree-id`) → its output `tree-id`. |
-| **Deep trace** | A record of one build: the *transitive* closure of base inputs (target deps recursively expanded to their leaves) → its output `tree-id`. |
+| **Target dep** | A dependency of one target on the *output tree* of another. The graph's interior edges, created by `need`. |
+| **Shallow trace** | A record of one build: its direct base inputs + its direct target deps (by output `tree-id`) → its output `tree-id`. |
+| **Deep trace** | A record of one build: the *transitive* closure of source/glob leaves + the root `config-id` → its output `tree-id`. |
 | **Tree cache** | On-disk *materialized* output directories, keyed by `tree-id`, ready to hand back as a filesystem path. |
 
 The three base-input kinds are exactly the dynamically-requested leaves: a recipe
 asks the coordinator for a config value, the bytes of a source file, or the
 expansion of a glob, and each request is logged as a dependency so a later
-rebuild knows precisely what to re-check. A glob's "value" is the sorted set of
-matched paths *and* their content — see [glob](#glob-pattern) below.
+rebuild knows precisely what to re-check.
+
+## Configuration model
+
+Configuration has two scopes, distinguished by visibility:
+
+- **Propagated configuration** flows down the whole subtree. The top-level
+  request supplies it; a recipe reads a value with `config-get <key>` (logged as a
+  config dep); and a `need` may **overlay** it for the sub-build it triggers. A
+  target's *effective* propagated config = its inherited config with overlays
+  applied along the `need` path from the root. It is canonicalized to byte-stable
+  text and content-addressed: `config-id = BLAKE2b(canonical map)`, stored as a
+  CAS blob so any recorded `config-id` resolves back to the actual map.
+
+- **Local configuration** is the target's **argv**, fixed by the build
+  definition. It is visible *only* to that one target's recipe (delivered as the
+  process's actual argv) and is **not** propagated to deps. Treating argv as
+  config keeps the model uniform: everything a recipe sees is configuration, some
+  inherited, some local. Local config is recorded per trace but is not part of
+  `config-id`. It too is serialized to a canonical CAS blob, `argv-id`.
+
+**Serialized for replay, not just hashed.** Both the effective config map and the
+argv vector are persisted *by value* as canonical, content-addressed CAS blobs
+(`config-id`, `argv-id`) — never reduced to an opaque one-way hash. A trace
+references them by id, and they are part of its reachable closure (GC-rooted, and
+bundled with [shared traces](#shared-traces-trusted-as-signed-packages)). This is
+what lets the shallow path **replay every `need` request exactly** — reconstruct
+the dep's target name, its full effective config with overlays applied, and its
+argv — even when driving from an imported shallow trace with no access to the
+original build definition or workspace. Source files and globs are the opposite
+case: kept by hash only, because they are *verified* (did this still hash the
+same?), never replayed, and their bytes already live in the CAS.
+
+The **resolution identity** of a build is therefore `(target-name, config-id)`,
+where `config-id` is the effective *propagated* config. Local argv is determined
+by the target name (via the build definition), so it does not widen the key. Two
+requests `(T, C1)` and `(T, C2)` — the same target under different propagated
+config — are genuinely distinct builds; both may be in flight at once.
+
+`recipe-id` covers only the recipe *file bytes*. Two other things that can change
+a target's behavior — *which* recipe it maps to, and its argv — live in the
+**build definition**, which the coordinator reads to resolve `T`. That read is
+itself a tracked source dependency of `T`, so editing a target's stanza
+(repointing its recipe, or changing argv) invalidates its traces through the
+ordinary source-leaf mechanism.
 
 ### Why two trace kinds
 
-The deep trace answers *"has anything in my entire world changed?"* in one pass
-over a flat input set — no graph walk. When it matches, the output is determined
-and we skip straight to materialization.
+The deep trace answers *"is this exact configuration unchanged and has any source
+moved?"* in one pass over a flat leaf set — no graph walk. When it matches, the
+output is determined and we skip straight to materialization. It is the inner
+dev-loop fast path: edit code, rebuild under the same config.
 
-The shallow trace answers the harder case the deep trace gives up on: *some deep
-source changed, but did it actually change my dependencies' outputs?* A
-formatting-only edit to a header, a comment change, a touched-but-identical file
-— these perturb base inputs without perturbing the dep output that consumes
-them. The shallow path rebuilds the *direct* deps (cheaply, via their own
-traces), compares their resulting `tree-id`s against what the trace recorded, and
-if they match, the recipe can still be skipped. This is the payoff of recording
-deps by output identity rather than by input identity.
+The shallow trace handles everything the deep path defers — a changed source that
+might not actually move a dep's output, and *any* configuration change. It
+rebuilds the *direct* deps (cheaply, via their own traces, threading the new
+config down through `need`), compares their resulting `tree-id`s against what it
+recorded, and if they match, the recipe is still skipped. This is the payoff of
+recording deps by *output* identity rather than by input identity: a subtree that
+ignores a changed config key, or absorbs a comment-only edit, produces the same
+output tree and short-circuits its parent's recipe.
 
 ## On-disk storage
 
@@ -82,8 +126,8 @@ immutable and self-verifying; exactly one class of object is mutable.
 ```
 <store>/
   cas/                         # the shared content store, per DISTRIBUTE.md
-    blob/<pp>/<blob-id>        #   raw file bytes
-    tree/<pp>/<tree-id>        #   canonical directory manifests
+    blob/<pp>/<blob-id>        #   raw file bytes (sources, config maps, argv vectors, trace bodies)
+    tree/<pp>/<tree-id>        #   canonical directory manifests (output trees)
     ...                        #   (chunk/, index/ as in DISTRIBUTE.md)
   build/
     trace/<pp>/<trace-id>      # IMMUTABLE canonical trace bodies (deep + shallow)
@@ -93,10 +137,12 @@ immutable and self-verifying; exactly one class of object is mutable.
 ```
 
 `<pp>` is the first two lowercase hex chars of the relevant id (the DISTRIBUTE.md
-convention). `trace-id` = BLAKE2b-256 of the canonical trace body.
+convention). `trace-id` = `BLAKE2b` of the canonical trace body.
 `target-key` = `BLAKE2b("kit build target v1" ‖ target-name)` — note it depends
 on the target *name only*, never on config or recipe-id, so a target's index
-entry is stable while its inputs churn.
+entry is a stable handle while its inputs churn. Config maps and trace bodies are
+ordinary CAS blobs; only `target/` records and the `cache/` materializations are
+build-specific.
 
 ### Trace bodies (immutable, content-addressed)
 
@@ -108,56 +154,74 @@ match — fail safe). All ids are lowercase hex. Config values, source bytes, an
 glob expansions are stored *by hash*, never inline, so a body's size is
 independent of its inputs' sizes.
 
-**Shallow trace** — direct inputs and direct deps:
+**Shallow trace** — direct inputs and direct deps, for fine-grained rechecking:
 
 ```
 kit-build-shallow 1
 target //app:server
 recipe <recipe-id>
 output <output-tree-id>
-[config]                       ; sorted by key
-opt <value-hash>
-triple <value-hash>
+config <config-id>             ; this target's effective propagated config (full map blob)
+argv <argv-id>                 ; this target's local config (serialized vector blob)
+[config]                       ; CONSUMED propagated key NAMES, sorted (values live in the map)
+opt
 [source]                       ; sorted by path
 src/main.c <blob-id>
+build.kit <blob-id>            ; the build definition stanza is a tracked source
 [glob]                         ; sorted by pattern
 src/*.c <glob-result-hash>
-[dep]                          ; sorted by dep target-name
-//lib:core <dep-output-tree-id>
-//lib:util <dep-output-tree-id>
+[dep]                          ; sorted by (dep-name, dep-config-id) — each row replays one `need`
+//lib:core <dep-config-id> <dep-output-tree-id>
 ```
 
-**Deep trace** — transitive base-input closure, *no* `[dep]` section (deps are
-dissolved into their leaves):
+**Deep trace** — root config-id plus the transitive source/glob closure, *no*
+`[dep]` and *no* per-key `[config]` section:
 
 ```
 kit-build-deep 1
 target //app:server
 recipe <recipe-id>
 output <output-tree-id>
-[config]                       ; union over the whole closure, sorted, deduped
-opt <value-hash>
-triple <value-hash>
-[source]
+root-config <config-id>        ; this target's effective propagated config (full map blob)
+argv <argv-id>                 ; local config (serialized vector blob)
+[source]                       ; union over the whole closure, sorted, deduped
 src/main.c <blob-id>
 lib/core.c <blob-id>           ; pulled in transitively from //lib:core
 [glob]
 src/*.c <glob-result-hash>
 ```
 
-- `<value-hash>` = `BLAKE2b(config-value-bytes)`. The literal value is recoverable
-  from the config snapshot; the trace stores only the hash so it is uniform and
-  size-stable. A *requested-but-unset* key is recorded with a reserved
-  sentinel hash so that later *adding* the key invalidates the trace.
-- `<blob-id>` = the source file's CAS blob id (`kit_blob_info`).
-- `<glob-result-hash>` = `BLAKE2b` of the canonical listing of `(path, blob-id)`
-  pairs the pattern matched (sorted by path). It changes if any matched file is
-  added, removed, or edited — so a single glob row covers the existence *and*
-  content of its whole match set, and files read through a glob need no separate
-  `[source]` rows.
-- `recipe-id` is recorded as an ordinary input and verified on lookup, so editing
-  a recipe (or its argv, or which recipe a target maps to) invalidates traces the
-  same way a source edit does.
+- `config <config-id>` and `argv <argv-id>` reference the **serialized** effective
+  config map and argv vector — canonical CAS blobs, kept by value and bundled with
+  shared traces — so the target's invocation and every `need` it issues replay
+  *exactly*, not merely check (see [Configuration model](#configuration-model)).
+- `[config]` lists the **names** of consumed propagated keys (consumed whether set
+  *or* unset); the values live in the `config-id` map this trace references. A key
+  *absent* from that map was consumed while unset, so later *adding* it busts the
+  trace — no sentinel value needed. Matching compares, for each consumed key, the
+  request's value against the built map's value (absent == unset).
+- `<blob-id>` = the source file's CAS blob id (`kit_blob_info`); source and glob
+  leaves are kept by hash because they are *verified*, never replayed.
+- `<glob-result-hash>` = `BLAKE2b` of the canonical `(path, blob-id)` listing the
+  pattern matched (sorted by path). It changes if any matched file is added,
+  removed, or edited — so one glob row covers the existence *and* content of its
+  whole match set, and files read through a glob need no separate `[source]` rows.
+- `[dep]` rows carry the **dep's config-id** (and output `tree-id`); together with
+  the dep's own serialized config/argv this re-resolves each dep under the exact
+  propagated config it used, overlays included.
+
+**Why the deep trace needs only `root-config` + source/glob leaves.** If the
+request's `config-id` equals `root-config` *and* every source/glob leaf still
+matches the live workspace, then every recipe in the closure has byte-identical
+inputs — so every `need` overlay it computes is identical, every effective config
+downstream is identical, and by determinism the output is identical. Any input
+that could perturb a downstream overlay (a source a recipe branches on, a config
+value) is itself either a recorded source/glob leaf or folded into `root-config`,
+so nothing escapes the check. Using the whole `root-config` (rather than just
+consumed keys) is a deliberate, conservative simplification: an *unconsumed*
+config change busts the deep fast path, but the [shallow path](#resolution-algorithm)
+recovers — it re-runs only the recipes whose output actually changed, so unrelated
+subtrees re-verify by hash without re-running.
 
 ### Target record (mutable, the only mutable object)
 
@@ -175,158 +239,222 @@ shallow <trace-id>
 
 Each row points at a trace body in `build/trace/`. On every successful resolution
 the coordinator prepends the fresh trace, de-duplicates, and truncates each kind
-to a small cap (`KIT_BUILD_RECORD_CAP`, e.g. 8) — a bounded MRU window. A trace
-older than the window ages out and becomes GC-eligible. Multiple candidates exist
-because the same target may have been built under several distinct input
-combinations recently (e.g. two config values flipped back and forth).
+to a small cap (`KIT_BUILD_RECORD_CAP`, e.g. 8) — a bounded MRU window. Older
+traces age out and become GC-eligible. Multiple candidates exist because the same
+target may have been built under several distinct input/config combinations
+recently (e.g. two config values flipped back and forth).
 
 ### Storage state machine (atomicity and crash safety)
 
 The store must never hand back a wrong output after a crash. The rules:
 
-1. **Content objects** (`cas/blob`, `cas/tree`, `build/trace`, and a materialized
-   `build/cache/<tree-id>/`) are written into `build/tmp/`, fsync'd, then
-   **atomically renamed** to their final content-keyed path. A half-written object
-   only ever exists under `tmp/` and is orphaned, never under its content key.
-   Re-deriving the same content re-creates the identical path — writes are
+1. **Content objects** (`cas/blob`, `cas/tree`, `build/trace`, config maps, and a
+   materialized `build/cache/<tree-id>/`) are written into `build/tmp/`, fsync'd,
+   then **atomically renamed** to their final content-keyed path. A half-written
+   object only ever exists under `tmp/` and is orphaned, never under its content
+   key. Re-deriving the same content re-creates the identical path — writes are
    idempotent, so a racing or retried producer is harmless.
 
-2. **The target record is the only ordering-sensitive state**, and it is updated
-   by **read-modify-write into `tmp/` then atomic rename**. A reader therefore
-   always sees a complete prior version or a complete next version, never a torn
-   one. A crash loses *at most* the most recent record update; losing a record
-   entry is *safe* — the next build re-derives and rewrites it. A corrupt record
-   (failed `kit-build-record 1` parse) is treated as empty.
+2. **The target record is the only ordering-sensitive state**, updated by
+   **read-modify-write into `tmp/` then atomic rename**. A reader always sees a
+   complete prior or complete next version, never a torn one. A crash loses *at
+   most* the most recent record update; losing a record entry is *safe* — the next
+   build re-derives and rewrites it. A corrupt record (failed `kit-build-record 1`
+   parse) is treated as empty.
 
 3. **Recipe sandboxes** live in `build/tmp/run-<n>/`. On success the output subdir
    is ingested into the CAS (`kit_cas_add_tree_from_dir`) and the sandbox is
    removed; on failure or crash it is orphaned and swept later.
 
 **Invariant.** Anything reachable under a content key (`blob/`, `tree/`,
-`trace/`) is complete and matches its key. A target record may *dangle* — point
-at a trace that GC removed — and readers tolerate that by treating the missing
+`trace/`) is complete and matches its key. A target record may *dangle* — point at
+a trace that GC removed — and readers tolerate that by treating the missing
 candidate as absent. The store degrades toward "rebuild," never toward "wrong
 answer."
 
 **Concurrency.** Two coordinators sharing a store are safe for content writes
 (idempotent atomic renames). Target-record updates are last-writer-wins; both
-writers wrote *valid* records, so the only loss is MRU ordering, which costs at
-most a rebuild. An optional per-target-key advisory lock around the
-read-modify-write removes even that.
-
-**GC** (out of scope to implement now, in scope to design for): mark-and-sweep
-rooted at the live target records → reachable trace bodies → referenced
-trees/blobs; sweep unreachable `trace/`, `cas/`, and `cache/`. The tree cache and
-CAS are independently size-boundable by LRU since both are re-derivable.
+writers wrote *valid* records, so the only loss is MRU ordering, costing at most a
+rebuild. An optional per-target-key advisory lock around the read-modify-write
+removes even that.
 
 ## Build coordinator state and caching
 
 ### In-memory state
 
-All state hangs off one `BuildCoordinator` context (no globals — the project
+All state hangs off one `KitBuildCoordinator` context (no globals — the project
 rule). For the life of the process it memoizes every base-input probe and every
 target resolution, so a diamond in the graph is built once and a source file is
 hashed once:
 
-| Field | Contents | Invalidated |
-|-------|----------|-------------|
-| `config` | The immutable configuration snapshot (key→value). | Never (per process). |
-| `source_hashes` | `path → blob-id` (with stat for sanity). | Never (per process). |
-| `glob_results` | `pattern → (sorted paths, glob-result-hash)`. | Never (per process). |
-| `targets` | `target-name → resolution state`: `in-progress` \| `done(output-tree-id, deep-input-set, materialized-path)`. | Never (per process). |
-| `cas` | Open `KitCas*` handle over `cas/`. | — |
-| `store` | Paths + the target-record reader/writer over `build/`. | — |
-
-"Cached over the life of the coordinator process" (the request brief) means
-exactly these memo tables: the coordinator assumes the workspace does not change
-*under it* mid-build. Each table also doubles as cycle/dedup control — a
-`target-name` found `in-progress` when requested again is a dependency cycle and
-is an error.
-
-A resolved target yields two things its parent needs: the **output `tree-id`**
-(to satisfy a `need`) and the **deep-input-set** (its deep trace's leaf closure,
-to fold into the parent's own deep trace).
+| Field | Contents | Notes |
+|-------|----------|-------|
+| `source_hashes` | `path → blob-id` (+ stat). | Per-process memo. |
+| `glob_results` | `pattern → (sorted paths, glob-result-hash)`. | Per-process memo. |
+| `config_maps` | `config-id → map`, and overlay results. | Immutable; content-addressed. |
+| `targets` | `(target-name, config-id) → future` of `{output-tree-id, deep-leafset, path}`. | Memo **and** in-flight dedup. |
+| `cas` | Open `KitCas*` handle over `cas/`. | Plus optional remote (below). |
+| `store` | Paths + target-record reader/writer over `build/`. | — |
+| `jobs` | Semaphore bounding concurrent recipe processes. | The global parallelism limit. |
+
+"Cached over the life of the coordinator process" means exactly these memo
+tables: the coordinator assumes the workspace does not change *under it*
+mid-build.
+
+The `targets` table is keyed by the full `(target-name, config-id)` and holds a
+**future**, not a flag — so concurrent `need`s of the same `(T, config-id)` await
+one resolution rather than racing (see [Parallelism](#parallelism)). A resolved
+build yields the **output `tree-id`** (to satisfy a `need`), a materialized
+**path**, and its **deep leafset** (the transitive source/glob closure, to fold
+into a parent's deep trace).
+
+### Cycle detection
+
+The `targets` memo alone cannot catch a cycle — a self-dependent target would
+simply await its own unresolved future and deadlock. So each resolution carries an
+explicit **build chain**: the ordered list of `(target-name, config-id)` frames
+from the root request down to here. A `need` whose `(dep, config-id)` already
+appears on the chain is a dependency cycle; the chain *is* the error message
+(`//a → //b → //a`). The chain is per-path and distinct from the cross-path memo:
+the same `(T, config-id)` may appear in many chains (a shared dep) but never twice
+in one.
 
 ### Resolution algorithm
 
-A request for `(T, C)` runs three phases. `C` is fixed for the process; the
-config-dependence of `T` is *recorded in and verified against* its traces, so the
-coordinator keys nothing by config and a change to an *unconsumed* config key
-never busts `T`.
+A request is `resolve(T, cfg, chain)` where `cfg` is the effective propagated
+config (with `cfg.id` its config-id). The config-dependence is recorded in and
+verified against the traces; the *deep* path checks whole-config-id equality, the
+*shallow* path checks consumed keys.
 
 ```
-resolve(T):
-  if T in targets:                      # memoized / cycle check
-     return targets[T] or error("cycle")
-  mark T in-progress
-
-  # ---- Phase 1: deep-trace fast path ----------------------------------
-  for each deep trace D of T (newest first):
-     if refresh(D.base_inputs) all match current state:   # rehash sources,
-        return materialize(D.output)                       # re-run globs,
-                                                           # recompare config,
-                                                           # recheck recipe-id
-  # ---- Phase 2: shallow-trace path ------------------------------------
-  for each shallow trace S of T (newest first):
-     if any of S.direct_base_inputs changed: continue      # this trace can't hold
-     ok = true
-     for each (dep, recorded_tree_id) in S.deps:
-        cur = resolve(dep)                                  # recurse
-        if cur.output_tree_id != recorded_tree_id: ok = false; break
+resolve(T, cfg, chain):
+  key = (T, cfg.id)
+  if key in chain:  error cycle(chain + key)
+  if key in targets:  return await targets[key]        # cross-path memo / in-flight dedup
+  targets[key] = new future ; chain' = chain + key
+
+  # ---- Phase 1: deep fast path (same config, did sources move?) -------
+  for D in deep_traces(T), newest-first:
+     if D.root_config == cfg.id
+        and D.argv     == argv_id(T)
+        and refresh(D.source_glob_leaves) all match live:   # memoized rehash / reglob
+        return done(materialize(D.output, T, cfg), D.leafset)
+
+  # ---- Phase 2: shallow path (config and/or sources moved) ------------
+  for S in shallow_traces(T), newest-first:
+     if S.recipe != recipe_id(T):           continue
+     if S.argv   != argv_id(T):             continue       # local config
+     M = config_by_id(S.config)                            # the built map (serialized blob)
+     if any consumed key in S.config-keys differs between cfg and M:  continue
+     if any direct source/glob leaf of S changed:                     continue
+     ok = true ; child_leafsets = []
+     for (dep, dep_cfg_id, recorded_tree) in S.deps:       # may run in parallel
+        r = resolve(dep, config_by_id(dep_cfg_id), chain')  # replay the need: map recovered by id
+        if r.output != recorded_tree: ok = false; break
+        child_leafsets.push(r.leafset)
      if ok:
-        write_deep_trace(T, union of S.direct_base_inputs
-                            and each dep's deep-input-set)  # refresh the deep trace
-        return materialize(S.output)
+        leafset = union(direct source/glob leaves of S, child_leafsets)
+        write_deep_trace(T, cfg.id, leafset, S.output)      # refresh the deep trace
+        return done(materialize(S.output, T, cfg), leafset)
+
   # ---- Phase 3: run the recipe ----------------------------------------
-  return run_recipe(T)
+  return run_recipe(T, cfg, chain')
 ```
 
 with the materialization ladder shared by every cache hit:
 
 ```
-materialize(tree-id):
-  if tree-id in build/cache/  : return that path            # already on disk
+materialize(tree-id, T, cfg):
+  if tree-id in build/cache/  : return that path             # already on disk
   if tree-id in cas/          : restore into build/cache/, return path
-  else                        : return run_recipe(T)        # CAS evicted: rebuild
+  if remote configured        : fetch tree-id (+ blobs) into cas/, verify, restore
+  else                        : return run_recipe(T, cfg)    # bytes gone: rebuild
 ```
 
-The non-obvious move is **Phase 2 succeeding after a base input changed.** A
-changed source busts the *deep* trace (Phase 1 gives up), but if that source only
-feeds a dep whose recipe maps it to an unchanged output (`tree-id` identical),
-the dep compare in Phase 2 still passes and `T`'s own recipe is skipped. The
-deep trace is the cheap "nothing moved" check; the shallow trace is the "did the
-churn actually reach me?" check. On a Phase-2 hit we also write a *fresh* deep
-trace from the now-known closure, so the next request gets the Phase-1 fast path
-back.
+This is the brief's flow, sharpened by config:
+
+- *Phase 1* is the deep fast path: identical config-id and an all-clear source/glob
+  refresh ⇒ reuse. `refresh` hits the per-process memo, so after the first target
+  touches a file or glob the rest of the closure is free.
+- *Phase 2* is the shallow path, taken when Phase 1 finds nothing (config changed,
+  or a source moved). It guards on the target's own consumed config and direct
+  leaves, then re-resolves each recorded dep **under the dep's recorded config-id**
+  and compares outputs. Re-using the recorded `dep_cfg_id` is correct: if the
+  parent's consumed config were different, the consumed-key guard above would have
+  failed first and the parent would re-run, recomputing its `need` overlays.
+- A Phase-2 hit also writes a *fresh* deep trace from the now-known closure, so the
+  next request gets the Phase-1 fast path back.
+- *Phase 3* runs the recipe only when no trace holds, or when a hit's bytes were
+  evicted and no remote can supply them.
+
+The non-obvious win is **Phase 2 succeeding after an input changed.** A changed
+source busts the deep trace (Phase 1 gives up), but if that source only feeds a dep
+whose recipe maps it to an unchanged output (`tree-id` identical), the dep compare
+passes and `T`'s recipe is skipped. The deep trace is the cheap "nothing moved"
+check; the shallow trace is the "did the churn actually reach me?" check.
 
 ### Running a recipe
 
 ```
-run_recipe(T):
-  sandbox = build/tmp/run-<n>/ ;  out = sandbox/out/
-  spawn recipe with env: { KIT_BUILD_SOCK, KIT_BUILD_OUT=out,
-                           KIT_BUILD_TARGET=T, workspace root }
-  service the recipe's protocol requests, logging each as a dep (next section)
+run_recipe(T, cfg, chain):
+  acquire jobs                                              # global parallelism limit
+  sandbox = build/tmp/run-<n>/ ; out = sandbox/out/
+  spawn recipe (argv = argv(T), recovered by argv_id) with env:
+     { KIT_BUILD_SOCK, KIT_BUILD_OUT=out, KIT_BUILD_TARGET=T, workspace root }
+  service the recipe's protocol requests, logging each as a dep (next section);
+     config-get reads cfg; need recurses resolve(...) under cfg ⊕ overrides
+  release jobs
   on nonzero exit: propagate failure, write NO trace
   on success:
-     output-tree-id = kit_cas_add_tree_from_dir(out)
-     write_shallow_trace(T, direct inputs logged this run, dep tree-ids,
-                         output-tree-id)
-     write_deep_trace   (T, transitive base-input closure, output-tree-id)
+     output = kit_cas_add_tree_from_dir(out)
+     put serialized cfg map (cfg.id) and argv vector (argv_id(T)) into the CAS
+     write_shallow_trace(T, recipe_id, argv_id, cfg.id, consumed key names,
+                         direct leaves, dep edges with their config-ids, output)
+     write_deep_trace   (T, cfg.id, argv_id, transitive source/glob closure, output)
      prepend both to T's target record (dedup, truncate to cap)
-     install out in build/cache/<output-tree-id>/
-     return { output-tree-id, deep-input-set, path }
+     install out in build/cache/<output>/
+     return done({output, path}, leafset)
 ```
 
 Writing *both* trace kinds on every real build is what lets a later request take
 whichever path fits the change it sees.
 
+### Parallelism
+
+The coordinator owns all parallelism: it is the single process that launches and
+manages every recipe, so the overall job limit is configured in it (the `jobs`
+semaphore). Independent `need`s resolve concurrently up to that limit; the
+`targets` futures guarantee a shared `(T, config-id)` is built once even when many
+parents request it at the same instant. Scheduling is at recipe granularity — a
+recipe may parallelize internally, but the coordinator counts it as one job.
+Cycle detection composes with concurrency through the per-path build chain
+described above; the global memo handles dedup, the chain handles cycles.
+
 ## Recipe protocol
 
-Recipes communicate back to the coordinator over a transport left unspecified
-here — a pipe on a well-known fd, a unix socket named in `$KIT_BUILD_SOCK`, or
-an in-process callback table for recipes that are kit library calls rather than
-subprocesses. The command *set* is fixed; the wire is not.
+Recipes communicate back over a **transport the driver defines** and the library
+abstracts behind a host vtable — the same "host supplies all side effects"
+principle the rest of kit follows. The library defines the *command set* and the
+length-prefixed request/response framing; the host supplies connect/read/write.
+
+```c
+typedef struct KitBuildTransport {
+  /* Accept one recipe connection (the recipe dialed the endpoint named in
+   * $KIT_BUILD_SOCK); read/write length-prefixed frames; close. */
+  int (*accept)(void* user, KitBuildConn** out);
+  int (*read_frame)(void* user, KitBuildConn*, uint8_t* buf, size_t cap, size_t* n);
+  int (*write_frame)(void* user, KitBuildConn*, const uint8_t* buf, size_t n);
+  void (*close)(void* user, KitBuildConn*);
+  void* user;
+} KitBuildTransport;
+```
+
+Defaults and portability: a **unix-domain socket** (Linux, macOS, FreeBSD, and
+Windows 10+, which all support `AF_UNIX`) named in `$KIT_BUILD_SOCK`, or an
+**anonymous pipe pair** on inherited fds where a socket is undesirable; Windows
+may instead use a **named pipe**. An **in-process** transport (a direct callback
+table) serves recipes that are kit library calls rather than subprocesses. The
+command set is identical across all of them.
 
 Every request both **returns a value** and **logs a dependency**. The logged
 dependency is what lands in the shallow trace and (after expansion) the deep
@@ -336,89 +464,181 @@ command.
 
 | Command | Returns | Dependency logged |
 |---------|---------|-------------------|
-| `config-get <key>` | the config value (or *unset*) | config dep: `(key, value-hash)`, sentinel hash if unset |
-| `source <path>` | `blob-id` + an absolute path to read | source dep: `(path, blob-id)` |
+| `config-get <key>` | the propagated value (or *unset*) | config dep: key recorded as consumed; its value is the one in the effective `config-id` map (absent = unset) |
+| `source <path>` | `blob-id` + a path to read | source dep: `(path, blob-id)` |
 | `glob <pattern>` | sorted list of matching paths | glob dep: `(pattern, glob-result-hash)` |
-| `need <target>...` | each dep's output `tree-id` + a materialized path | target dep edge: `(dep, output-tree-id)` |
-
-- **`config-get`** records the value's hash, so changing a *consumed* key
-  invalidates while changing an unconsumed key does not. If a changed consumed
-  key causes the recipe to branch and request a *new* key next time, the old trace
-  already fails to match on the changed key, so the new key-set is discovered on
-  the rebuild — the scheme is self-correcting and never needs to predict the
-  input set ahead of time.
-- **`source`** hands back a path inside the workspace (or, in a hermetic variant,
-  a path staged into the sandbox) plus the blob-id the recipe's read is pinned to.
+| `need <target> [k=v…]` | each dep's output `tree-id` + a path | target dep edge: `(dep, dep-config-id, output-tree-id)` |
+
+- **`config-get`** reads the target's *effective propagated* config and records the
+  value's hash, so changing a consumed key invalidates while changing an unconsumed
+  key does not. If a changed consumed key makes the recipe branch and request a
+  *new* key next time, the old trace already fails to match on the changed key, so
+  the new key-set is discovered on the rebuild — self-correcting, never needing to
+  predict the input set ahead of time. Local config (argv) is *not* read here; it
+  arrives as the process's argv.
+- **`source`** hands back a path inside the workspace (or, in a hermetic variant, a
+  path staged into the sandbox) plus the blob-id the read is pinned to.
 - **`glob`** records the whole match set's content via `glob-result-hash`; the
   recipe then reads the returned paths without further declaration.
-- **`need`** is the dynamic-dependency primitive. The coordinator `resolve()`s
-  each target recursively (Phases 1–3), returns its output `tree-id` and a
-  readable materialized path, and records the edge by *output identity*. This is
-  what the shallow path re-checks. `need` is also where cycles are caught.
+- **`need`** is the dynamic-dependency primitive. The optional `k=v` pairs
+  **overlay** propagated config for that sub-build (local argv never propagates).
+  The coordinator resolves `(dep, cfg ⊕ overrides)` recursively, returns its output
+  `tree-id` and a readable path, and records the edge with the dep's `config-id`
+  so the shallow path can re-resolve identically. `need` is also where cycles are
+  caught.
 - **Outputs** need no command: the recipe writes under `$KIT_BUILD_OUT` and the
   coordinator snapshots that directory on exit.
 
+## Remote CAS and shared traces
+
+Two independent network capabilities, split along the DISTRIBUTE.md line —
+**content is trustless (hash-verified); claims are trusted (signed).**
+
+### Remote object fetch (trustless, via a fetch recipe)
+
+The coordinator performs no network I/O itself. When `materialize` (or a blob
+restore) needs an object that is absent locally and a remote is configured, it
+invokes a user-provided **fetch recipe** — an executable or command template, e.g.
+`curl -fsS -o "$KIT_FETCH_OUT" "https://cache.example/{kind}/{pp}/{id}"`, rendered
+with `{kind}` ∈ `blob`/`tree`, `{pp}`, `{id}` exactly like DISTRIBUTE.md's
+external-fetch templates. The fetched bytes land in `build/tmp/` and are
+**verified against the requested content id** before being installed into the
+local CAS; a corrupt or malicious mirror fails the hash check and is discarded. So
+the remote and the fetch recipe are *untrusted* — the existing self-verifying CAS
+makes that safe. This adds one rung to the materialization ladder (tree cache →
+local CAS → remote fetch → run recipe) and lets a clean checkout *download* an
+output instead of rebuilding it.
+
+### Shared traces (trusted, as signed packages)
+
+A trace is a **claim**: "these inputs ⇒ this output tree." Unlike content, it is
+**not self-verifying** — confirming it means re-running the (assumed-deterministic)
+recipe, which is exactly the work we are trying to avoid. To safely import someone
+else's trace we must **trust the claimant**. Traces are therefore shared as
+**signed trace bundles**, reusing the DISTRIBUTE.md package + minisign + trust
+machinery wholesale:
+
+- A bundle is a signed manifest (`kit-build-traces 1`, signed exactly like a
+  `kit-package` manifest) listing `(target-name, kind, trace-id, output-tree-id)`
+  claims, carried in a `.kpkg` (portable tar.gz or native kpkg) alongside the
+  trace bodies and the **serialized config-map and argv blobs** they reference
+  (required — without them an imported shallow trace cannot replay its `need`s),
+  plus, optionally, the referenced output trees/blobs.
+- Trust is the DISTRIBUTE.md model unchanged: verify the minisign signature
+  against the trusted-keys file (`-p KEY`, the anchor file, or `--tofu`); the
+  signed trusted comment binds the signature to the manifest hash.
+- Import: verify signature → anchor key → install the trace bodies into
+  `build/trace/` and prepend them to the relevant target records. Output bytes come
+  from the bundle or from the remote CAS — either way **hash-verified** on use.
+
+The result is a precise security split: a local build can now deep/shallow-**hit**
+on a remote builder's trace — obtaining the output *without running the recipe* —
+while trusting only the *signed claim*; the output bytes themselves remain
+trustless (verified by `tree-id`/`blob-id`). Trusting a trace signer is exactly
+like trusting a package signer in DISTRIBUTE.md: it is trust in their build
+outputs, gated by the trusted-keys allowlist, and auditable by re-running under
+[verify mode](#determinism-and-hermeticity).
+
 ## Determinism and hermeticity
 
 Cache correctness is exactly the assumption *output = f(declared inputs)*,
-deterministically. The design's guard rails:
+deterministically.
 
 - **Declared-inputs-are-complete.** Reading an undeclared file is a hermeticity
-  violation. It cannot be *detected* without sandboxing/FS tracing, but the
-  protocol makes declaration the only way to get an input, and the brief is to
-  enforce it later (a sandbox that denies undeclared reads, or an `strace`/FUSE
-  layer that records them). Until then it is a contract recipes must honor.
-- **Determinism.** Timestamps, RNG, network fetches, and `$RANDOM`-style
-  nondeterminism break the model: the second build's captured tree differs from
-  the trace's recorded output, and the cache would serve a stale-but-believed-
-  current result. Mitigations: declare such inputs (a "now" config value, a
-  pinned URL+hash), or mark a target **no-cache** so it always runs Phase 3.
+  violation. The protocol makes declaration the only *intended* way to get an
+  input, but **enforcement is deferred** (see below): for now it is a contract
+  recipes must honor.
+- **Determinism.** Timestamps, RNG, and unpinned network fetches break the model —
+  a second build's captured tree differs from the recorded output, and the cache
+  would serve a stale-but-believed-current result. Mitigations: declare such
+  inputs (a "now" config value, a pinned URL+hash), or mark a target **no-cache**
+  so it always runs Phase 3.
 - **Verify mode.** A diagnostic mode re-runs a recipe whose trace says "unchanged"
   and compares the fresh `tree-id` to the recorded one; a mismatch flags a
-  nondeterministic or under-declared recipe. This is the recommended way to audit
-  recipes before trusting their traces.
+  nondeterministic or under-declared recipe — the recommended audit before
+  trusting (or signing and sharing) traces.
 
 ## Worked example
 
 `//app:server` depends on `//lib:core`; the recipe globs `src/*.c`, reads
-`src/main.c`, and consumes config `opt`.
-
-1. **Cold build.** No record. Phase 3 runs both recipes. `//lib:core` builds,
-   yields `tree L0`. `//app:server` runs, logs `config opt`, `glob src/*.c`,
-   `source src/main.c`, `need //lib:core → L0`; yields `tree A0`. Shallow + deep
-   traces written for both. `build/cache/A0/` materialized, path returned.
-2. **No-op rebuild.** Phase 1 deep trace of `//app:server` refreshes `opt`,
-   `src/*.c`, `src/main.c`, and the *transitively folded* `lib/core.c` — all
-   memoized, all match. `A0` is in the tree cache. Returns instantly, nothing
-   re-run, no graph walk.
-3. **Comment-only edit to `lib/core.c`.** `lib/core.c`'s blob-id changed → the
-   `//app:server` deep trace fails Phase 1 (its folded `[source]` moved). Phase 2:
-   direct base inputs (`opt`, `src/*.c`, `src/main.c`) are unchanged, so probe the
-   one dep — `resolve(//lib:core)`. That recurse's *own* recipe re-runs (its
-   source moved) but emits a byte-identical `tree L0` (comment stripped). The dep
-   compare `L0 == L0` passes → `//app:server`'s recipe is **skipped**, `A0`
-   restored, and a refreshed deep trace written so step 2's fast path returns.
-4. **Flip `opt`.** `//app:server`'s deep and shallow traces both fail on the
-   `config opt` row → Phase 3 re-runs its recipe. `//lib:core` (which never
-   consumed `opt`) is untouched by Phase 1 and returns `L0` from cache.
-
-## Open questions / future work
-
-- **Transport.** Pipe vs unix socket vs in-process table; framing and the exact
-  request/response encoding. Deliberately unspecified above.
-- **Hermeticity enforcement.** Sandbox that denies undeclared reads; FS-trace
-  fallback to *detect* (not prevent) them and warn.
-- **Remote CAS.** The `materialize` ladder's "restore from CAS" naturally extends
-  to a remote/shared cache (the DISTRIBUTE.md external-object fetch hints are the
-  precedent) so a clean checkout can download outputs instead of rebuilding.
-- **Compact deep traces.** A flat transitive leaf set can be large for deep
-  graphs. An alternative stores a merkle of per-dep deep-input hashes (composing
-  deep traces by reference) at the cost of a shallow walk on refresh — a
-  size/latency trade to revisit if flat traces bloat.
-- **GC policy.** Concrete roots, schedule, and LRU bounds for `trace/`, the CAS,
-  and the tree cache.
-- **Parallelism.** Independent `need`s can resolve concurrently; the memo table
-  must then carry futures (`in-progress` becomes "await") rather than a flag.
-- **Public surface.** Whether this lands as a libkit subsystem behind a public
-  header (the `kit/cas.h` precedent) with a thin `kit build` CLI, gated by a
-  `KIT_BUILD_ENABLED` flag.
+`src/main.c`, and consumes config `opt`. The request is under config `C0` (so
+`config-id = c0`).
+
+1. **Cold build.** No record. Phase 3 runs both recipes. `//lib:core` builds →
+   `tree L0`. `//app:server` logs `config opt`, `glob src/*.c`, `source src/main.c`,
+   `need //lib:core` (under `c0` → `(//lib:core, c0)`); yields `tree A0`. Both trace
+   kinds written for both targets. `build/cache/A0/` materialized, path returned.
+2. **No-op rebuild (same `C0`).** Phase 1: `//app:server`'s deep trace has
+   `root-config c0` (matches) and refreshes `opt` via config-id, `src/*.c`,
+   `src/main.c`, and the folded `lib/core.c` — all memoized, all match. `A0` is in
+   the tree cache. Returns instantly, nothing re-run, no graph walk.
+3. **Comment-only edit to `lib/core.c`.** `lib/core.c`'s blob-id moved → the deep
+   trace fails Phase 1 (a folded source leaf changed). Phase 2: direct leaves
+   (`opt`, `src/*.c`, `src/main.c`) unchanged, so probe the one dep —
+   `resolve(//lib:core, c0)`. Its recipe re-runs (its source moved) but emits a
+   byte-identical `L0` (comment stripped). `L0 == L0` ⇒ `//app:server`'s recipe is
+   **skipped**, `A0` restored, a refreshed deep trace written so step 2's fast path
+   returns next time.
+4. **Flip `opt` (now `C1`, `config-id = c1`).** Phase 1 is skipped outright:
+   `//app:server`'s deep trace has `root-config c0 ≠ c1`. Phase 2 under `c1`:
+   `opt` is a consumed key whose value differs between `c1` and the trace's built
+   map → no shallow trace holds → Phase 3 re-runs `//app:server`. Its `need //lib:core` carries no `opt` override,
+   so the dep resolves as `(//lib:core, c1)`; but `//lib:core` never consumes
+   `opt`, so its shallow trace under `c1` matches by consumed-keys (empty) and
+   direct leaves, and `L0` is reused without re-running. Only the one recipe that
+   actually depends on `opt` re-ran.
+
+## Decided / deferred / open
+
+**Decided.**
+
+- **Transport** is abstracted behind a host vtable the driver defines; defaults are
+  a unix socket or anonymous pipe, with a Windows named-pipe fallback and an
+  in-process variant. Supported hosts: Linux, macOS, Windows, FreeBSD.
+- **Parallelism** lives in the coordinator (the `jobs` semaphore); futures dedup
+  shared targets; the per-path chain detects cycles.
+- **Remote CAS** delegates fetching to a user-provided fetch recipe and verifies by
+  content id (trustless).
+- **Trace sharing** ships traces as signed `.kpkg` bundles over the DISTRIBUTE.md
+  trust model (trusted), because traces are claims, not self-verifying content.
+- **Public surface** is a public header `<kit/build.h>` plus a `kit build` driver
+  command, gated by `KIT_BUILD_ENABLED` (the `kit/cas.h` + `kit cas` precedent). A
+  thin CLI parses flags and supplies the transport/host vtables; the coordinator
+  and trace model live in the library.
+
+**Deferred.**
+
+- **Hermeticity enforcement.** No sandbox-deny or FS-trace at first; the protocol
+  defines the contract, enforcement comes later (deny undeclared reads, or
+  `strace`/FUSE detection that warns).
+- **GC.** Deferred. When built: mark-and-sweep rooted at live target records →
+  reachable trace bodies → referenced trees/blobs/config-maps/argv vectors; sweep
+  unreachable `trace/`, `cas/`, and `cache/`, with LRU size bounds on the
+  (re-derivable) tree cache and CAS.
+
+**Open — compact deep traces.** The flat deep leafset re-lists every transitive
+source in *every* ancestor: O(Σ closure sizes) bytes and O(closure) work to
+refresh. A **structural** representation fixes both and is the natural form of the
+"efficient set union" the closure already needs:
+
+- Store each target's deep leafset as a content-addressed object
+  `deep-set = { direct source/glob leaves, [child-deep-set-id…] }`, with
+  `deep-set-id = BLAKE2b(canonical(direct ‖ child-ids))`. The fully-flattened set
+  is the recursive union, but it is never materialized — an ancestor stores only
+  its own direct slice plus 32-byte pointers to its children's deep-sets. A leaf
+  shared by many ancestors is stored once (a Merkle DAG, structurally sharing like
+  the CAS itself). This *is* the union operation: building a parent's deep-set is
+  "concatenate sorted direct leaves with child-set pointers," not "merge N flat
+  arrays."
+- **Faster checking** falls out of the same structure. Refresh memoizes
+  `deep-set-id → valid?` for the whole build: a subtree checked once is reused by
+  every parent that points at it, and an *unchanged* subtree is recognized by
+  pointer/hash equality at its root without descending to leaves. Change detection
+  becomes O(changed frontier) instead of O(closure): recompute current deep-set-ids
+  bottom-up (memoized, so O(distinct subtrees) = O(nodes)), and where a recomputed
+  id equals the recorded one, the whole subtree is provably unchanged and skipped.
+- The cost is that refresh walks the DAG rather than streaming one flat list, and
+  the child deep-set objects must be present (another store dependency, GC-rooted
+  like traces). The flat form in this doc is the *semantic* definition; the
+  structural form is the recommended physical representation once flat traces bloat
+  — adopt it together with the in-memory union it mirrors.

	kit kit
	git clone https://git.ryansepassi.com/git/kit.git
	Log \| Files \| Refs \| README