kit
git clone https://git.ryansepassi.com/git/kit.git

Distribution as a library subsystem

Status: implemented. The migration below has landed (one commit): the dist subsystem moved to src/dist/ (+ top-level vendor/), exposed through <kit/cas.h> / <kit/package.h> (src/api/{cas,package}.c), gated by KIT_CAS_ENABLED / KIT_PKG_ENABLED; kit cas / kit pkg are thin CLIs over the public API via a KitCasHost vtable (driver/lib/dist_host.c), with operational errors flowing through ctx->diag. Verified green: test-driver-cas (41) + test-driver-pkg (182) under ASan/UBSan.

Deferred: the v2 deletion + 3-suffix rename (Stage 3, below) were not done — the dead v2 code was carried over unchanged. On inspection the deletion is more surgical than the line-ranges below imply: the v2 extern surface (DistManifest/DistArtifact/DistDependency, dist_manifest_*, dist_kpkg2_*, the v2 DistKpkg* structs + dist_kpkg_* v2 codecs) is safely unreferenced, but the v2 and v3 manifest parsers share the static helpers set_err / trim_lead / trim_trail / copy_field / kind_valid in src/dist/manifest.c (only parse_u64, the v2 finalize, and dist_manifest_path_valid are v2-only). The cleanup must keep the shared helpers — verify the same in src/dist/kpkg.c — and recompile + rerun the cas/pkg suites after.

Signed, content-addressed distribution (kit cas / kit pkg) is today the only major capability that lives entirely inside driver/ — its model, its vendored crypto/compression, and its create/verify/unpack pipelines all sit under driver/dist/ and driver/cmd/{cas,pkg}.c. Every other capability is a libkit subsystem behind include/kit/, with the CLI tool a thin arg-parser on top. This doc captures the plan to bring distribution into the same shape: move the implementation into the library, expose it through two public headers, and reduce cas.c/pkg.c to flag-parsing + host wiring. The design it realizes is in ../DISTRIBUTE.md; the precedent it follows is the ar subsystem (src/api/archive.c + include/kit/archive.h, gated by KIT_AR_ENABLED distinct from KIT_TOOL_AR_ENABLED).

Goal

libkit.a gains a content-store API and a signed-package API, gated by their own subsystem flags so a minimal embedding pays nothing for them. The kit cas and kit pkg tools become thin CLIs that translate flags into public calls and supply host vtables — exactly like ar, ld, objdump. An embedder can create, sign, verify, inspect, and unpack packages, and drive a CAS, without the driver and without linking host crypto/compression.

Why this is the right shape (not CLI-only logic)

Two layers are stacked under driver/dist/, and they have very different readiness:

The layering invariant forces the move: driver/ may include only <kit/*.h>, and src/api may not include driver/ headers — so a public boundary is impossible while the code sits in driver/dist/. Relocating to src/ is a precondition, not a cleanup.

Target tree layout

vendor/                        # top-level: pristine third-party trees
  monocypher/                  # (moved from driver/dist/vendor/monocypher)
  lz4/                         # (moved from driver/dist/vendor/lz4)
include/kit/cas.h            # content model: blob/tree hashing + CAS store
include/kit/package.h        # package model: manifest, sign/verify, create/unpack
src/api/cas.c                  # public handles <-> internal (archive.c precedent)
src/api/package.c
src/dist/                      # moved dist_* subsystem (private headers)
  dist.{c,h} blob.{c,h} tree.{c,h} cas.{c,h}
  manifest.{c,h} kpkg.{c,h} trust.{c,h}
  blake2b.{c,h} ed25519.{c,h} minisig.{c,h} b64.{c,h}
  deflate.{c,h} lz4.{c,h} tar.{c,h}      # kit-maintained shims/extracts

Vendor split, confirmed by inspection: only monocypher and lz4 are pristine third-party trees pulled in by #include — they move to a repo-root vendor/. deflate.c is a kit-maintained extract of miniz (already modified, not pristine), and b64.c / tar.c are self-contained — these stay in src/dist/. The shim includes that currently read "vendor/monocypher/..." (e.g. blake2b.h, ed25519.c) get rewritten to the new top-level path.

Config gating

Add subsystem flags to include/kit/config.h, separate from the tool flags (mirroring KIT_AR_ENABLED vs KIT_TOOL_AR_ENABLED):

#define KIT_CAS_ENABLED 1   /* content store: src/dist/{blob,tree,cas} + kit/cas.h */
#define KIT_PKG_ENABLED 1   /* signed packages: adds manifest/kpkg/minisig/crypto + kit/package.h */

KIT_PKG_ENABLED implies KIT_CAS_ENABLED (packages are built over the content model). KIT_TOOL_CAS_ENABLED / KIT_TOOL_PKG_ENABLED stay and assert their subsystem flag. Off → the units (and the vendored crypto) drop entirely, so a minimal embedding carries no Ed25519/BLAKE2b/DEFLATE/LZ4. The Makefile's LIB_SRCS_* gains a dist regime that pulls src/dist/*.c plus the enabled vendor/ trees.
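One way the flag relationships above could be enforced at preprocessing time is sketched here. The `#error` checks, the default values, and the `KIT_SKETCH_*` names are assumptions for illustration; the text only states that the four flags exist and how they relate.

```c
/* Hypothetical excerpt in the spirit of include/kit/config.h.
 * Defaults are assumed; an embedder would override them with -D. */
#ifndef KIT_CAS_ENABLED
#define KIT_CAS_ENABLED 1
#endif
#ifndef KIT_PKG_ENABLED
#define KIT_PKG_ENABLED 1
#endif
#ifndef KIT_TOOL_CAS_ENABLED
#define KIT_TOOL_CAS_ENABLED 1
#endif
#ifndef KIT_TOOL_PKG_ENABLED
#define KIT_TOOL_PKG_ENABLED 1
#endif

/* Packages are built over the content model. */
#if KIT_PKG_ENABLED && !KIT_CAS_ENABLED
#error "KIT_PKG_ENABLED requires KIT_CAS_ENABLED"
#endif

/* Each tool flag asserts its subsystem flag. */
#if KIT_TOOL_CAS_ENABLED && !KIT_CAS_ENABLED
#error "KIT_TOOL_CAS_ENABLED requires KIT_CAS_ENABLED"
#endif
#if KIT_TOOL_PKG_ENABLED && !KIT_PKG_ENABLED
#error "KIT_TOOL_PKG_ENABLED requires KIT_PKG_ENABLED"
#endif

/* Compile-time record of the resolved flags (illustrative names). */
enum { KIT_SKETCH_CAS = KIT_CAS_ENABLED, KIT_SKETCH_PKG = KIT_PKG_ENABLED };
```

An alternative with the same effect would be to have KIT_PKG_ENABLED force KIT_CAS_ENABLED on instead of rejecting the combination; which of the two the real config.h uses is not stated here.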

Public API surface

Two headers, mirroring DISTRIBUTE.md's content-model vs signed-package split. Model structs are exposed as POD (renamed to the Kit* convention); the vendored primitives and the kpkg wire codecs stay internal.

include/kit/cas.h — content model (self-verifying, no trust)

include/kit/package.h — package model (signed)

Kept internal (src/dist/ private headers)

All vendored code; the dist_blake2b / dist_ed25519 / dist_minisig / dist_b64 / dist_gz / dist_lz4 / dist_tar shims; and the kpkg wire codecs (header / descriptor / index encode-decode). Rationale: raw crypto and on-wire binary layout are implementation detail — exposing them invites misuse and an API-stability burden. The logical model and pipelines are the contract.

New host capabilities

The library reaches the host through KitContext.file_io (read/write, already present) plus one new vtable for the operations KitFileIO doesn't cover — every one of which the driver already implements:

typedef struct KitDistHost {
  int (*mkdir_p)(void* user, const char* path);
  int (*mark_executable)(void* user, const char* path);
  int (*walk_regular_files)(void* user, const char* root, /* callback */ ...);
  int (*fill_random)(void* user, uint8_t* out, size_t n); /* keygen only */
  void* user;
} KitDistHost;

DistCasHost already models mkdir_p + mark_executable; this generalizes it and adds the directory walk (driver_walk_regular_files) and CSPRNG (driver_random_bytes). Naming/placement TBD during Stage 2 (could fold the CAS-only subset into KitCas and keep fill_random package-side).
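To make the host side concrete, here is a minimal POSIX sketch of three of the four callbacks. It is not the driver's implementation: the `Sketch*` and `sketch_*` names are hypothetical, `walk_regular_files` is omitted because its callback signature is left open above, and reading `/dev/urandom` is only one assumed way to back `fill_random`.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Local copy of the vtable from the text, minus walk_regular_files. */
typedef struct SketchDistHost {
  int (*mkdir_p)(void* user, const char* path);
  int (*mark_executable)(void* user, const char* path);
  int (*fill_random)(void* user, uint8_t* out, size_t n);
  void* user;
} SketchDistHost;

static int sketch_is_dir(const char* path) {
  struct stat st;
  return stat(path, &st) == 0 && S_ISDIR(st.st_mode);
}

/* Create `path` and any missing parents. Returns 0 on success, -1 on error. */
static int sketch_mkdir_p(void* user, const char* path) {
  (void)user;
  char buf[4096];
  size_t len = strlen(path);
  if (len == 0 || len >= sizeof buf) return -1;
  memcpy(buf, path, len + 1);
  for (size_t i = 1; i <= len; i++) {
    if (buf[i] != '/' && buf[i] != '\0') continue;
    char saved = buf[i];
    buf[i] = '\0'; /* temporarily cut the path at this component */
    if (mkdir(buf, 0755) != 0 && !sketch_is_dir(buf)) return -1;
    buf[i] = saved;
  }
  return 0;
}

/* Add an execute bit wherever the matching read bit is already set. */
static int sketch_mark_executable(void* user, const char* path) {
  (void)user;
  struct stat st;
  if (stat(path, &st) != 0) return -1;
  mode_t mode = st.st_mode & 07777;
  mode |= (mode & 0444) >> 2; /* r -> x for user, group, other */
  return chmod(path, mode) == 0 ? 0 : -1;
}

/* Fill `out` with n bytes from the OS random device (keygen only). */
static int sketch_fill_random(void* user, uint8_t* out, size_t n) {
  (void)user;
  FILE* f = fopen("/dev/urandom", "rb");
  if (!f) return -1;
  size_t got = fread(out, 1, n, f);
  fclose(f);
  return got == n ? 0 : -1;
}

/* Bundle the callbacks into a vtable value the library could be handed. */
SketchDistHost sketch_posix_host(void) {
  SketchDistHost h = {sketch_mkdir_p, sketch_mark_executable,
                      sketch_fill_random, NULL};
  return h;
}
```

The `user` pointer is unused here, but keeping it in every signature lets a host carry state (a sandbox root, a mock filesystem in tests) without globals.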

Error reporting (decided)

Public dist calls return KitStatus and emit human-readable detail through ctx->diag — not through an err-buffer at the boundary. This is the established convention, not a new pattern:

Mechanics:
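As a rough illustration of the status-plus-diagnostic convention described in this section, the sketch below returns a status code and routes the human-readable detail through a sink on the context. Every name in it (`SketchStatus`, `SketchDiag`, `SketchContext`, `sketch_diagf`, `sketch_cas_check_root`) is a hypothetical stand-in; the real `KitStatus` values and the shape of `ctx->diag` are not given in this document.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for KitStatus and the ctx->diag sink. */
typedef enum { SKETCH_OK = 0, SKETCH_ERR_INVALID = 1 } SketchStatus;

typedef struct SketchDiag {
  void (*emit)(void* user, const char* message);
  void* user;
} SketchDiag;

typedef struct SketchContext {
  SketchDiag* diag;
} SketchContext;

/* Format a message and hand it to the context's diagnostic sink, if any. */
static void sketch_diagf(SketchContext* ctx, const char* fmt, ...) {
  char buf[256];
  va_list ap;
  va_start(ap, fmt);
  vsnprintf(buf, sizeof buf, fmt, ap);
  va_end(ap);
  if (ctx && ctx->diag && ctx->diag->emit) ctx->diag->emit(ctx->diag->user, buf);
}

/* Public-style entry point: status out, detail through diag, no err buffer. */
SketchStatus sketch_cas_check_root(SketchContext* ctx, const char* root) {
  if (root == NULL || root[0] == '\0') {
    sketch_diagf(ctx, "cas: store root must not be empty");
    return SKETCH_ERR_INVALID;
  }
  size_t len = strlen(root);
  if (len >= 4096) {
    sketch_diagf(ctx, "cas: store root too long (%zu bytes)", len);
    return SKETCH_ERR_INVALID;
  }
  return SKETCH_OK;
}

/* A sink an embedder or a unit test might install to capture messages. */
typedef struct SketchCapture {
  char last[256];
  int count;
} SketchCapture;

void sketch_capture_emit(void* user, const char* message) {
  SketchCapture* cap = (SketchCapture*)user;
  snprintf(cap->last, sizeof cap->last, "%s", message);
  cap->count++;
}
```

Because the caller owns the sink, a CLI can print to stderr while an embedder collects messages in memory, and the call signature stays the same in both cases.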

Versioning: latest-only (decided)

We support only the current on-disk format and drop all back-compat code. This is verified to be pure deletion with zero behavioral change: every v2 symbol (dist_manifest_*, the non-3 dist_kpkg_*, dist_kpkg2_*, and the DistManifest / DistArtifact / DistDependency / DistKpkgHeader / DistKpkgDescriptor / DistKpkgIndexRecord structs) is referenced only in its own definition files — never by pkg.c, cas.c, or any test. The driver already emits and reads v3 exclusively.

Dropping it pays off twice:

  1. Deletes the dead v2 structs / functions / constants from manifest.c and kpkg.c.
  2. Lets the survivors shed the 3 suffix as they go public: DistPackageManifest → KitPackageManifest, DistKpkg3Header → KitPackageHeader, internal dist_kpkg3_* → dist_kpkg_*. The versioned naming only existed to coexist with v2.

Precision: drop the v2 parse paths and C identifiers, but keep the on-disk wire magic at kpkg3\0 / kit-package 3 / kit-encoding 3. "Latest version" means v3 on disk; renumbering the wire format would itself break anything already produced. We stop accepting v2 input; we do not renumber.
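The distinction between renaming C identifiers and keeping the wire identifiers can be sketched as a reader that accepts only the v3 markers quoted above. This is an assumption-laden illustration: the function names are hypothetical, and it assumes the `kpkg3\0` magic sits at the start of the container and that the text markers appear as whole lines of the form `key 3`.

```c
#include <stddef.h>
#include <string.h>

/* Container magic quoted in the text; it stays at 3 after the C rename. */
static const unsigned char SKETCH_KPKG_MAGIC[6] = {'k', 'p', 'k', 'g', '3', '\0'};

/* Returns 1 if buf starts with the v3 container magic, else 0. */
int sketch_kpkg_magic_ok(const unsigned char* buf, size_t n) {
  return n >= sizeof SKETCH_KPKG_MAGIC &&
         memcmp(buf, SKETCH_KPKG_MAGIC, sizeof SKETCH_KPKG_MAGIC) == 0;
}

/* Returns 1 only when `line` is exactly "<key> 3", e.g. "kit-package 3". */
int sketch_version_line_ok(const char* line, const char* key) {
  size_t klen = strlen(key);
  return strncmp(line, key, klen) == 0 && strcmp(line + klen, " 3") == 0;
}
```

With latest-only support there is no version dispatch: anything that does not carry the v3 markers is rejected outright rather than routed to an older parser.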

Staged plan (each stage builds green)

  1. Vendor move. driver/dist/vendor/{monocypher,lz4} → top-level vendor/{monocypher,lz4}; rewrite the shim #include paths; update the Makefile. Pure relocation, no API change — lands first to isolate path churn.
  2. Lift-and-shift the content layer. Move dist.{c,h} blob tree cas to src/dist/; add src/api/cas.c + include/kit/cas.h wrapping blob/tree/CAS; add KIT_CAS_ENABLED; repoint driver/cmd/cas.c at the public header + a host vtable. Smallest behavioral slice; proves the boundary end to end.
  3. Drop v2 first, then extract the package pipelines. Delete the dead v2 code (see Versioning above) and shed the 3 suffix: a self-contained, zero-behavior-change cleanup that shrinks the surface before it moves. Then move manifest kpkg minisig trust b64 deflate lz4 tar + crypto shims to src/dist/; lift the pkg_create_* / pkg_verify_* / unpack / key-resolution logic out of driver/cmd/pkg.c into src/api/package.c behind kit_pkg_*, converting operational driver_errf → api_diagf (see Error reporting above) and driver_* fs/random → KitDistHost. pkg.c shrinks to arg parsing + host wiring + trust-path/env policy. This is the bulk of the work and the main risk.
  4. Tests. Keep test/cas/run.sh + test/pkg/run.sh as end-to-end CLI tests; optionally add unit tests that call the new public API directly (now possible — coverage was CLI-only before).
  5. Docs. Update ../DISTRIBUTE.md paths (the layering diagram's driver/dist/* rows become src/dist/* + the two public headers), the ../DESIGN.md layering box (the driver/dist/ callout moves), and the CLAUDE.md code map (add vendor/, src/dist/, kit/cas.h, kit/package.h).

Risks / watch items