plan: sketch RQL RSN - kit

commit b8f536ed61f21edb813cdb1fbc20e6ff7c435cfa
parent 1233daa9e483569e46de1aff05ef023cbe92dc14
Author: Ryan Sepassi <rsepassi@gmail.com>
Date:   Mon,  1 Jun 2026 17:33:42 -0700

plan: sketch RQL RSN

Diffstat:
A doc/plan/RQL.md  | 927 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
A doc/plan/RSN.md  | 627 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

2 files changed, 1554 insertions(+), 0 deletions(-)
diff --git a/doc/plan/RQL.md b/doc/plan/RQL.md
@@ -0,0 +1,927 @@
+# RQL — a typed relational query language (planned work)
+
+A replacement for SQL: declarative data **query and manipulation**, but *typed,
+consistent, and close to the relational algebra*. `RQL` is a working name (TBD).
+
+SQL conflates three layers that should be separate — a surface syntax, a logical
+algebra, and a physical plan — and its surface does not map cleanly onto its own
+algebra. That mismatch is the source of most of SQL's irregularity. RQL keeps the
+layers apart: a small, total, well-typed **core algebra** (the IR) and a
+**pipeline surface** that maps onto it transparently. Same discipline as a
+compiler: surface → IR → plan, where the IR is where the laws live.
+
+This doc is the living design and the ledger of decisions. It records *what* we
+decided and *why*; open questions sit at the end and graduate up into the ledger
+as we settle them.
+
+## What we're correcting
+
+The specific SQL faults RQL is designed to remove:
+
+- **Clause order ≠ logical order.** `SELECT` is written first but evaluated
+  nearly last, which breaks composition, top-to-bottom reading, and completion.
+- **Three-valued logic.** `NULL` poisons predicates; `NOT IN` with a null is a
+  silent footgun; aggregates skip nulls inconsistently.
+- **`GROUP BY` conflates partitioning with aggregation**, forcing the "every
+  select item must be grouped or aggregated" rule and a separate `HAVING`.
+- **Weak typing & implicit coercion**; no sum types, no row polymorphism, errors
+  surface at runtime.
+- **Bags pretending to be sets**, with `DISTINCT` bolted on, ordered/duplicate
+  column names, and `SELECT *`.
+- **Universal quantification is miserable** — Codd's division ("supplies *all*
+  parts") has no clean SQL spelling.
+
+## Locked decisions
+
+Each entry is a decision plus a one-line rationale. Numbers are stable IDs in
+decision order; the grouping is thematic, so IDs need not run contiguously within
+a group.
+
+### Semantics
+
+1. **A relation is a set of distinct records.** A record (tuple) is an unordered
+   set of named, typed fields; a relation is a set of records over one heading.
+   Boolean-semiring sets, not bags. — Clean algebraic laws ⇒ sound optimizer
+   rewrites (predicate/projection pushdown, join reordering are real equalities);
+   `count` is meaningful (distinct), not "however many dupes were present."
+2. **Base tables must declare a key.** Distinctness is a checked property, not a
+   hope. — Makes "set" honest, and prevents aggregating over a lossy projection.
+3. **Multiplicity is data, not a second semantics.** When duplicates must be
+   counted, carry an explicit `count: Int` (or `weight`) field and weight
+   aggregates explicitly (`sum(group.amount * group.count)`). — Buys the bag / K-relation
+   (provenance-semiring) expressiveness *as a value*, without two equalities, a
+   truncating `minus`, or weight-aware aggregate magic. Accepted cost: a synthetic
+   id at ingest for genuinely keyless multiset data. Revisit if heavy event/log
+   ingestion makes that friction too high.
+4. **Ordering is not part of a relation.** `sort` / `take` / `window` produce a
+   **`Seq`**, a distinct type. — Relations are unordered; ordering is a
+   presentation/boundary concern and shouldn't pollute relational laws.
+
+### Missing values
+
+5. **No `NULL`. Optionality lives in the type** (`T?` = `Option T`), introduced
+   *only* by outer joins. Predicates are total, 2-valued booleans; defaults via
+   `??`. — Removes three-valued logic and the `NOT IN` / aggregate-skipping
+   footguns; "missing" becomes explicit and local.
+26. **Option semantics: total equality, `some`-widening, `where`-narrowing.**
+    `==` / `!=` on `T?` are total Bool (structural: `none == some x` is `false`); a
+    non-option widens to an option implicitly (`T <: T?`, the only implicit
+    conversion); `where x is some` (or `x != none`) narrows `x` to `T` downstream.
+    Ordering, arithmetic, and value-use on `T?` need explicit resolution. — Keeps
+    option handling total and ergonomic without reintroducing 3-valued logic
+    (refines #5).
+
+### Surface
+
+6. **Pipeline / algebra surface:** `R |> op |> op …`, one stage per core
+   operator, written in evaluation order. A pipeline is just a relation
+   expression (there is no `from` keyword), so it nests anywhere a relation is
+   expected. — Composition, top-to-bottom reading, and a 1:1 correspondence with
+   the IR. A comprehension (calculus) surface over the same IR is deferred, not
+   rejected.
+7. **Grouping only partitions; aggregation is ordinary expressions over the
+   relation-valued group field.** No `HAVING` (it is `where` after `group`); no
+   "must appear in GROUP BY" rule (referencing a non-key scalar after grouping is
+   a type error). — One uniform mechanism; the awkward SQL special cases vanish.
+
+### Scope
+
+8. **Query + DML + constraints, unified by relational assignment.** A database is
+   a set of relation variables; insert / update / delete are all `R := <relational
+   expression>`; keys, foreign keys, and arbitrary predicates are declared
+   constraints checked on write. — One mechanism (Third Manifesto / Tutorial D
+   style); set semantics makes assignment idempotent and total.
+
+### Types
+
+*(First draft — see "Type system" below. The shape is settled; specific
+inference rules and the keys-in-type ambition dial are open.)*
+
+9. **Row-polymorphic structural records.** Headings are structural record types;
+   `extend` / `project` / `rename` / `join` are polymorphic over "the other
+   fields" via a row variable, with a field-absence ("lacks") predicate.
+   Duplicate labels are forbidden (a heading can't name a field twice).
+10. **Named scalar domains, no implicit cross-domain coercion.** `Money` is not
+    `Float`; conversions are explicit. — Supports "typed and consistent."
+11. **Keys/FDs live in a metadata + inference layer, not the type proper.** Keys
+    are declared on base tables and *inferred* through operators alongside the
+    type; they power the fan-out-must-aggregate check, optimizer hints, and
+    warnings, but do not participate in type equality. — Pragmatic sweet spot;
+    leaves a clean path to full type-level keys if signatures ever need them.
+
+### Joins
+
+12. **Explicit-predicate joins + key-inference are the foundation; the
+    relationship/FK layer is deferred, additive sugar.** `join R on <predicate>`
+    is the primitive — it covers equi, theta, range, and self joins. When the
+    condition equates a key of one side, key-inference (decision #11) bounds the
+    join to ≤1 match (no fan-out) and feeds the must-aggregate warning; a many-to-
+    many join is flagged. A relationship layer — declared references, reverse
+    navigation, path expressions (`order.customer.region.name`), and
+    cardinality-typed "many" sides that *force* aggregation — rides on top later.
+    It is purely additive and breaks nothing. — Ships a typed join with a
+    cardinality story over arbitrary / un-annotated data first, and defers the
+    schema-declaration machinery (named relationships, reverse naming, path
+    resolution) until the core is proven.
+25. **Join result disambiguation: source-qualified names.** Every field carries
+    the source relvar/alias it came from; a bare name is sugar for unambiguous
+    access. A `join` forms the *qualified union* of both sides (no auto-merge):
+    same bare names coexist as distinct qualified fields and must be referenced
+    `source.field`. `join R as a …` supplies a qualifier (needed for self-joins);
+    `select` / `extend` emit fresh unqualified fields, so qualification is
+    join-local. Natural join (merge common fields, equal types required) is opt-in
+    (`join R natural`). — Refines #9: labels are unique *qualified* names, bare
+    access must be unambiguous.
+
+### Aggregation
+
+13. **Grouping is partition + ordinary expressions; there is no "must-group"
+    rule.** The block form `group by K { … }` binds the per-group relation to the
+    keyword `group` and desugars to `group K into g |> extend … |> remove g`.
+    `group` is a relation; aggregates — written as function calls, `count(group)`,
+    `mean(group.salary)` — are the only relation→scalar bridge; `group.f` where a
+    scalar is expected is a plain type error. — Recovers SQL's grouping rule from
+    the type system; `having` is just `where` after the block.
+14. **Aggregates are commutative monoids; column extraction is a bag.**
+    `group.f` yields one value *per row* (a bag), not a dedup'd set, so sets +
+    keys make aggregation correct with no multiplicity bookkeeping. Order-
+    dependent reductions (first, in-order concat, lag/lead) are *not* aggregates
+    — they live on the `Seq` layer. — A clean dividing line that also explains
+    why windows are separate.
+15. **Empty / option / fan-out aggregation is explicit, never silent.** Empty
+    input → option (`mean`/`min`/`max` → `None`) or identity (`sum`/`count` →
+    `0`); aggregating an option column is a type error (resolve first);
+    duplicate-sensitive aggregates over a fanned-out field warn. — Replaces SQL's
+    silent NULLs, null-skipping, and double-counting with types plus one warning.
+
+### Manipulation (DML)
+
+16. **A database is a set of relvars; the only mutation primitive is relational
+    assignment `R := <expr>`.** Insert / delete / update are sugar over it,
+    reusing the query operators and expressions; assignment requires a *static
+    heading match* (no positional columns). Set semantics ⇒ insert/delete
+    idempotent, duplicate-key insert rejected (no silent upsert). Adds the `with`
+    operator (override an existing field) as the typed sibling of `extend` (add a
+    new field). — One mechanism; query and mutation share a language.
+17. **Constraints are boolean relational expressions that must hold after every
+    transaction.** Keys and FKs are named, incrementally-checked special cases;
+    tuple-level *and* database-level (multi-relvar) checks are unified and
+    first-class. Referential actions (restrict default / cascade / set-null) are
+    declared repairs applied within the transaction. — Cross-relation invariants
+    SQL cannot express become ordinary, checkable booleans.
+18. **The transaction is the checking and atomicity boundary.** A batch of
+    assignments is applied together with all constraints checked once at commit;
+    any violation rolls the batch back atomically; a bare assignment is a
+    singleton transaction. — Makes FKs and cascades coherent (parent + child
+    together) and gives one consistency boundary.
+
+### Recursion
+
+19. **Recursion is a restricted least fixpoint: monotone, range-restricted, with
+    stratified negation.** `recursive R = base |> union (… R …)`, terminating by
+    construction (semi-naive evaluation); the result is an ordinary relation;
+    `closure(...)` is sugar. — First-class reachability / ancestry / BOM
+    explosion without SQL's recursive-CTE pain or unrestricted-recursion
+    non-termination. Recursion *with aggregation* (shortest paths) is deferred.
+
+### Ordered data & windows
+
+20. **The `Seq` layer carries all order-dependent computation.** `sort by`
+    produces `Seq R`; scans (`Seq → Seq`: running/moving/`lag`/`lead`/rank) and
+    ordered folds (`Seq → scalar`: `first`/`last`/`concat`) live there, never in
+    relational aggregates. The `Rel`/`Seq` type boundary *enforces* decision #14.
+    — Order is a type, not a convention.
+21. **Windows reuse partitioning + sorting + `extend`, not a bespoke `OVER`.**
+    `partition by P order by S extend {…}` desugars to `group P into g |> with g =
+    (g |> sort by S |> extend <window-exprs>) |> ungroup g`; frames are relative
+    ranges (`rows[..current]`, `rows[-3..0]`); `lag`/`lead`/`rank` are `Seq`
+    built-ins; output is a `Seq`. — Windows fall out of pieces we already have.
+
+### Concrete syntax
+
+22. **`=` binds values, `:` ascribes types; a bare field name is punning; `--`
+    starts a comment.** Records/tuples are `{ f = e }`, headings and field decls
+    are `{ f: T }`, and `x` in a record / select / group block means `x = x`. —
+    Removes the value/type brace ambiguity in the early sketches.
+23. **`.` is type-directed.** `e.f` is field access on a record (→ scalar), column
+    extraction on a relation (→ `Col T`, a per-row bag), or option-propagating on
+    `T?` (→ `T'?`). A `Col` is legal only as an aggregate / per-row argument; used
+    as a scalar it is a type error — which *is* the no-must-group rule. — One
+    operator, three meanings, all from the operand's type.
+24. **Pipelines are bare relation expressions; aggregates are function calls; the
+    group block binds `group`.** No `from` keyword (a pipeline nests anywhere a
+    relation is expected; `from` survives only in `delete from` / `insert … from`).
+    Aggregates use call syntax (`count(group)`); the sugar `group by K {…}` binds
+    the keyword `group` (`group K into name` for nesting). `select {…}` is sugar
+    (extend + project), `project` / `remove` are pure π, and `let` names a relation
+    expression (replacing CTEs). — Maximal composability, one call syntax.
+
+## Type system (draft)
+
+### Type vocabulary
+
+- **Scalar domains:** `Int`, `Float`, `Bool`, `Text`, `Time`, … and user-defined
+  domains over them. No implicit coercion across domains.
+- **Option:** `T?` = `Option T`. Only outer joins introduce it.
+- **Record (heading):** `{ a: T1, b: T2, … }` — unordered, uniquely-named,
+  typed fields. Field order is irrelevant.
+- **Relation:** `Rel R`, a set of records over heading `R`, e.g.
+  `Rel {sensor: Id, at: Time, value: Float}`.
+- **Sequence:** `Seq R`, an *ordered* list of records (may repeat). Produced by
+  `sort`; the home of order-dependent operations (windows, top-n).
+- **Nested relation:** a field may itself be `Rel R'` (relation-valued attribute).
+  This is how `group` works; it appears only as an intermediate.
+
+### Core operators (the IR)
+
+Each is total and (mostly) `Rel → Rel`. Row variable `ρ` stands for "the other
+fields"; `f ∉ ρ` is the lacks predicate.
+
+```
+restrict  (p: {ρ} -> Bool)        : Rel {ρ}            -> Rel {ρ}
+extend    (f = e: T)              : Rel {ρ}            -> Rel {ρ, f: T}      [f ∉ ρ]
+with      (f = e: T')            : Rel {f: T | ρ}     -> Rel {f: T' | ρ}    [f ∈ ρ]
+project   F                        : Rel {F | ρ}        -> Rel {F}
+remove    F                        : Rel {F | ρ}        -> Rel {ρ}
+rename    a -> b                   : Rel {a: T | ρ}     -> Rel {b: T | ρ}    [b ∉ ρ]
+join                               : Rel R1 -> Rel R2   -> Rel (R1 ⋈ R2)
+union/minus/intersect              : Rel R  -> Rel R    -> Rel R
+group     K into g                 : Rel {K, A}         -> Rel {K, g: Rel {A}}
+ungroup   g                        : Rel {K, g: Rel A}  -> Rel {K, A}
+sort by e                          : Rel R              -> Seq R   (→ Seq layer)
+```
+
+Aggregates are ordinary function calls `Rel A -> scalar` (`count`, `sum`, `mean`,
+…), used inside `extend` after `group`; `count(group)` is set cardinality.
+
+### Join headings & disambiguation
+
+The default `join` (explicit predicate, decision #12) forms the **qualified
+union** of the two headings: every field keeps the source relvar/alias it came
+from, and same-named fields from the two sides coexist as *distinct* qualified
+fields rather than being merged. A **bare name is sugar for unambiguous access**;
+when a name appears on both sides you must qualify it `source.field`:
+
+```
+employees |> join departments on dept == departments.id
+-- both sides have `name`, so it stays qualified:
+|> select { who = employees.name, team = departments.name }
+```
+
+- **Aliases:** `join R as a on …` supplies the qualifier `a` — required for
+  self-joins (`join employees as mgr on manager_id == mgr.id`).
+- **Equi-join keys** are kept as the two distinct fields they are (`dept` and
+  `departments.id` above); a `select` or natural join collapses them if wanted.
+- **`select` / `extend` clean up:** the fields they emit are fresh and
+  unqualified, so qualification is a transient, join-local concern.
+- **Natural join is opt-in:** `join R natural` merges the fields common to both
+  (which must then have equal types) into single bare fields and joins on their
+  equality — the Tutorial-D behavior, now explicit rather than the default.
+
+This refines decision #9: a heading's labels are unique *qualified* names; bare
+access must resolve to exactly one of them, or it is a type error.
+
+### Keys and functional dependencies
+
+Keys are tracked beyond base-table declarations and **inferred through
+operators**, so the compiler can reason about cardinality:
+
+- `restrict`: keys preserved.
+- `project F`: a declared key survives iff it ⊆ F; the full projected heading is
+  always a trivial superkey (sets auto-dedup).
+- `group K into g`: `K` becomes the key of the result by construction.
+- `join` on fields `J`: if `J` is a key of one side, that side contributes at
+  most one match (no fan-out) and the other side's key is preserved; if `J` keys
+  neither side, the join may fan out and the result key is the union of both.
+
+**Payoff:** the compiler knows when a join fans out, so accidental row
+multiplication feeding a `sum`/`count` becomes *detectable* rather than silently
+wrong. This is the safety net behind decisions #1–#2.
+
+Resolved (decision #11): keys/FDs live in a **metadata + inference layer**, not
+the type proper — propagated for cardinality reasoning, the fan-out check, and
+optimizer hints, but not part of type equality.
+
+### Inference
+
+Pipeline stages are fully inferred (Hindley–Milner over rows); base tables and
+domains are annotated. Query result types are inferred and checkable. Basis:
+Rémy/Wand-style records with absence + row unification (closer to that than to
+Leijen's scoped labels, since relations forbid duplicate labels).
+
+## Aggregation & grouping
+
+### Grouping = partition + ordinary expressions
+
+`group by K { … }` is sugar. The primitive is `group K into g` (nest): it turns
+`Rel {K, A}` into `Rel {K, g: Rel {A}}` — one row per distinct `K`, with the
+matching rows collected into the relation-valued field `g`. The block form adds
+fields computed from `g` and drops `g`:
+
+```
+R |> group by k1, k2 { a = count(group), b = mean(group.salary) }
+
+-- desugars to (the block binds the keyword `group`):
+R |> group (k1, k2) into g
+  |> extend a = count(g), b = mean(g.salary)
+  |> remove g
+```
+
+`group` is in scope in the block; *output* it to keep the nested relation
+(`{ emps = group, n = count(group) }` is a department with its list of
+employees), or omit it and it's dropped.
+
+### Why there's no "must appear in GROUP BY" rule
+
+After grouping, `k1, k2` are scalar key fields and `group` is a *relation*. No
+special rule is needed — it falls out of types:
+
+- `group.salary` is a **column**: the `salary` values across the group's rows (a
+  bag — see below), not a scalar.
+- An **aggregate** is the only thing that turns a column/relation into a scalar
+  (`count(group) : Int`, `mean(group.salary) : Float`).
+- `group.salary` where a scalar is expected is a type error. That *is* SQL's
+  "grouped or aggregated" rule, recovered for free.
+
+### Columns vs. projection — why aggregation is correct
+
+`group.f` (column extraction) yields one value **per row** — a bag — while
+`project {f}` yields a **set** (dedup'd). They differ exactly on duplicates, and
+that is what makes aggregation correct without multiplicity bookkeeping: base
+tables have keys (#2), so a group's rows are distinct *real* rows, so
+`group.salary` has one entry per real row and `sum(group.salary)` is right.
+Decisions #1 + #2 buy footgun-free aggregation; the explicit-`count` hatch (#3) is
+only needed once you deliberately collapse to a keyless form.
+
+An aggregate's argument is a **per-row expression** over the group: `group.f`
+binds the current row's field, so `sum(group.qty * group.price)` reduces
+`qty*price` row by row. Equivalent comprehension form:
+`sum(r.qty * r.price for r in group)`.
+
+### The aggregate functions
+
+Relational aggregates must be **commutative monoids** (sets are unordered ⇒ the
+reduction must be order-independent): `count`, `sum`, `min`, `max`, `mean`,
+`and`/`or`, `count(unique c)`. Order-dependent reductions are *not* relational
+aggregates — they belong to the `Seq` layer. Empty-input behavior uses options,
+not a silent NULL:
+
+| aggregate       | empty input | type    |
+|-----------------|-------------|---------|
+| `count`         | `0`         | `Int`   |
+| `sum`           | `0`         | `Num`   |
+| `mean/min/max`  | `None`      | `T?`    |
+
+`group by`-produced groups are non-empty by construction, so the metadata layer
+narrows `mean`/`min`/`max` back to `T` there. Aggregating an **option column**
+(`Col T?`) is a **type error** — resolve first (`group.salary ?? 0`, or filter) —
+which removes SQL's inconsistent null-skipping.
+
+### Whole-relation aggregation
+
+`aggregate { … }` is `group by ()` — one group, one row out:
+
+```
+sales |> aggregate { total = sum(group.amount), n = count(group) }
+```
+
+### Nested / multi-level aggregation
+
+`g` is a relation, so any relational expression — including another `group by` —
+applies to it. Aggregating subtables composes:
+
+```
+orders
+|> group by region {
+     revenue  = sum(group.amount),
+     by_month = (group |> group by month { m_rev = sum(group.amount) }),  -- inner `group` shadows outer
+   }
+```
+
+### Fan-out detection (the safety net)
+
+A field `f` from base table `T` (key `KT`) is determined by `KT`. After a many-to-
+many join the relation's key grows to `KT ∪ KS`, so `f`'s values repeat across the
+extra dimension. Rule: a **duplicate-sensitive** aggregate (`sum`, `mean`,
+`count`) over a field whose determinant key is *coarser* than the relation's
+current key is flagged — its values are being double-counted; **duplicate-
+insensitive** aggregates (`min`, `max`) are exempt. This is the concrete cash-out
+of key-inference (#11): the double-count SQL does silently becomes a warning.
+
+## DML & constraints
+
+### Everything is relational assignment
+
+A database is a set of **relvars** (relation variables); `table T { … }` declares
+one. A relvar holds a relation *value*; the sole mutation primitive is
+**relational assignment**, `T := <relational expression>`, where `expr`'s heading
+must match `T`'s exactly (statically checked — no positional column matching). The
+three DML verbs are **sugar** over assignment, reusing the *same* operators and
+expressions as queries:
+
+```
+insert into R from <expr>       -->  R := R union <expr>
+insert into R { id = 7, … }     -->  R := R union (rel{ …one tuple… })
+delete from R where p           -->  R := R |> where not p
+update R where p set { f = e }  -->  R := (R |> where not p)
+                                          union (R |> where p |> with { f = e })
+```
+
+`with { f = e }` overrides an existing field (the override sibling of `extend`,
+which only adds); using the wrong one is a typed error, catching the `set`-typos
+SQL silently accepts. Set semantics makes these well-behaved: re-inserting an
+identical tuple is a **no-op** (`R ∪ {t}`, `t ∈ R`); inserting a *different* tuple
+with an existing key violates the key constraint and is **rejected** (no silent
+upsert); deleting absent rows is a no-op. Insert/delete are idempotent.
+
+### Constraints are booleans that must hold
+
+A constraint is a **boolean relational expression over the database that must be
+true after every transaction.** Keys and FKs are named, incrementally-checked
+special cases of that one idea:
+
+- **Key** (#2): `key (id)`, `key (a, b)` — the projection onto the key is
+  injective; multiple candidate keys allowed.
+- **Foreign key:** `dept: Id references departments(id)` — an inclusion
+  constraint, `R{dept} ⊆ departments{id}`; a nullable FK (`dept: Id?`) admits
+  `None`. Parent-side referential actions — **restrict** (default), **cascade**,
+  **set-null** — are declared repairs applied within the transaction before the
+  check.
+- **Check:** an arbitrary predicate. Tuple-level (`check salary > 0`) is the
+  common case; **database-level** constraints spanning relvars are first-class —
+  exactly what SQL cannot express:
+
+  ```
+  check count(employees |> where role == "CEO") <= 1
+  check (employees{dept} ⊆ departments{id})        -- an FK, spelled out
+  ```
+
+Any assignment whose result would falsify a constraint is rejected; the relvar is
+left unchanged.
+
+### Transactions are the checking boundary
+
+Because constraints span relvars, the unit of checking is the **transaction**: a
+batch of assignments applied together, all constraints checked once at commit, any
+violation rolling the whole batch back atomically. A bare assignment is a
+singleton transaction. This is what lets a parent and child be inserted together
+past an FK, and what makes cascades coherent.
+
+```
+transaction {
+  insert into departments { id = 3, name = "Eng" }
+  insert into employees   { id = 7, name = "Ann", dept = 3 }
+}        -- both visible together; constraints checked once
+```
+
+### Views
+
+A **view** is a named relation *expression*, not stored: `view active =
+employees |> where active`. Read-only to start; making views updatable (mapping a
+view-assignment back onto base relvars — the view-update problem) is deferred.
+
+### What this buys
+
+Mutation reuses the entire query language — no separate DML dialect (contrast
+SQL's bespoke `UPDATE … SET`). Three verbs collapse to one primitive, so they
+compose and can't interact surprisingly. Constraints unify to "invariants that
+hold after every transaction," with multi-relvar invariants first-class. Static
+heading-match kills positional-column bugs. The transaction is simultaneously the
+atomicity and the constraint boundary.
+
+## Recursion
+
+Restricted to a **least fixpoint of a monotone, range-restricted relational
+expression** — enough for reachability, ancestry, and BOM explosion, and
+guaranteed to terminate. A recursive relation is defined with `recursive`:
+
+```
+-- edges: Rel {src: Id, dst: Id, key (src, dst)}
+recursive reach =
+  edges
+  |> union (reach |> join edges on dst == edges.src
+                  |> select { src, dst = edges.dst })
+```
+
+What keeps it total:
+
+- **Monotone:** the recursive name may not appear under `minus`/`not` — negation
+  is **stratified** (it may only refer to a lower stratum), so each iteration only
+  *adds* tuples.
+- **Range-restricted:** every output value comes from the base/input, none are
+  invented (no unbounded arithmetic in the recursive arm). Monotone +
+  range-restricted + finite input ⇒ the least fixpoint is reached in finitely many
+  steps.
+
+The result is an ordinary relation, so it composes with everything else (filter /
+join / group the closure). The common case has sugar — `closure(edges: src ->
+dst)` expands to the fixpoint above. Recursion *with aggregation* (shortest paths,
+PageRank-style iteration) needs well-founded aggregate semantics and is deferred.
+
+## Windows & the `Seq` layer
+
+### Principle
+
+Relations are unordered (#4); **ordering produces a `Seq`**, and *all*
+order-dependent computation lives there — never in relational aggregates, which
+must be commutative monoids (#14). `sort by e : Rel R -> Seq R` is the gateway,
+and the `Rel`/`Seq` type boundary is what enforces #14: you cannot write a running
+total over a `Rel`, nor a duplicate-collapsing set op over a `Seq`, without
+crossing the boundary explicitly. A `Seq` supports two kinds of computation:
+
+- **Scans** (`Seq R -> Seq R'`): give every row a value from its position,
+  neighbors, or a frame — running totals, moving averages, `lag`/`lead`, ranks.
+  Row count preserved.
+- **Ordered folds** (`Seq R -> scalar`): reduce in order — `first`, `last`,
+  `concat(sep)`. (Order-*independent* reductions remain relational aggregates;
+  these are the ones that genuinely need order.)
+
+Plus slicers `take n`, `drop n`, `take while p`, which subsume top-N and
+pagination (`sort by x desc |> take 10`).
+
+### Window expressions (scans)
+
+A scan adds a column via `extend`, where the expression sees ordered context:
+
+- **Frame aggregate:** `agg(e) over <frame>`, an aggregate over a range relative
+  to the current row:
+  - `rows[..current]` — running (unbounded preceding → current)
+  - `rows[current..]` — current → end
+  - `rows[-3..0]`     — 3 preceding → current (a width-4 moving window)
+  - `rows[-1..1]`     — immediate neighbors
+- **Offsets:** `lag(e, k, default)`, `lead(e, k, default)` — value k rows away.
+- **Ranking:** `row_number()`, `rank()`, `dense_rank()`, `ntile(n)` — read the
+  sort order and ties.
+
+### Partitioning reuses `group`
+
+`PARTITION BY` is just "window within each group independently," and we already
+have grouping. So:
+
+```
+trades
+|> partition by symbol order by ts
+   extend running = sum(amount) over rows[..current],
+          prev    = lag(price, 1),
+          r       = rank()
+```
+
+is sugar for group / sort-within / position-aware `extend` / ungroup:
+
+```
+trades
+|> group symbol into g
+|> with g = (g |> sort by ts
+               |> extend running = sum(amount) over rows[..current],
+                         prev    = lag(price, 1),
+                         r       = rank())
+|> ungroup g
+```
+
+The result is a **`Seq`** carrying within-partition order; cross-partition order
+is unspecified until a final `sort by`. Rows stay distinct, so it can be taken
+back to a `Rel` with `as rel` if you want a set.
+
+### Idioms
+
+```
+-- first row per partition
+events
+|> partition by user order by ts extend rn = row_number()
+|> where rn == 1
+
+-- 7-day moving average
+… |> partition by sku order by day extend ma = mean(qty) over rows[-6..0]
+
+-- in-order audit trail per case (ordered fold)
+steps |> group by case { trail = (group |> sort by ts |> concat(name, " -> ")) }
+```
+
+### Why this beats SQL windows
+
+- No bespoke `OVER (PARTITION BY … ORDER BY … ROWS BETWEEN …)` sub-language:
+  partitioning is the `group` you already know, ordering is `sort by`, the window
+  value is an `extend`.
+- The `Rel`/`Seq` boundary makes order-dependence *visible in the type* and
+  enforces the aggregate/scan split (#14) instead of leaving it to convention.
+- Frames are terse, and ordered folds (`first`/`concat`) are just `Seq`
+  functions, not special cases.
+
+## Concrete syntax (grammar)
+
+A first concrete grammar — enough to parse the examples above and to pin the
+ambiguities. Notation: `(…)` grouping, `*` zero-or-more, `?` optional, `|` choice;
+UPPERCASE = tokens; `--` starts a line comment.
+
+### Module & declarations
+
+```
+module     ::= item*
+item       ::= domainDecl | tableDecl | viewDecl | letDecl | recDecl
+             | constraintDecl | stmt
+
+domainDecl ::= "domain" IDENT "=" type ("check" "(" expr ")")?
+tableDecl  ::= "table" IDENT "{" field ("," field)* ("," tableCon)* ","? "}"
+field      ::= IDENT ":" type ("references" IDENT "(" IDENT ")" refAction*)?
+refAction  ::= "on" ("delete" | "update") ("restrict" | "cascade" | "set" "null")
+tableCon   ::= "key" "(" IDENT ("," IDENT)* ")" | "check" "(" expr ")"
+viewDecl   ::= "view"      IDENT "=" relExpr
+letDecl    ::= "let"       IDENT "=" relExpr     -- names a relation (replaces CTEs)
+recDecl    ::= "recursive" IDENT "=" relExpr     -- monotone, range-restricted
+constraintDecl ::= "constraint" IDENT "check" "(" expr ")"
+```
+
+### Statements
+
+```
+stmt ::= IDENT ":=" relExpr                              -- relational assignment
+       | "insert" "into" IDENT (record | "from" relExpr)
+       | "delete" "from" IDENT "where" expr
+       | "update" IDENT "where" expr "set" record
+       | "transaction" "{" stmt* "}"
+       | relExpr                                          -- a query to evaluate
+```
+
+### Relation expressions (pipelines)
+
+```
+relExpr ::= relPrim ("|>" stage)*
+relPrim ::= IDENT | "(" relExpr ")" | relLit
+          | "closure" "(" IDENT ":" IDENT "->" IDENT ")"
+relLit  ::= "rel" "{" record ("," record)* ","? "}"
+
+stage ::= "where"   expr
+        | "extend"  bindings   | "with" bindings
+        | "select"  block      | "project" nameBlock | "remove" nameBlock
+        | "rename"  IDENT "->" IDENT
+        | "join"    relPrim ("as" IDENT)? ("on" expr | "natural")
+        | ("union" | "minus" | "intersect") relPrim
+        | "group" "by" nameList block?                   -- sugar: binds `group`
+        | "group" nameList "into" IDENT                  -- primitive nest
+        | "ungroup" IDENT
+        | "aggregate" block                              -- = group by ()
+        | "sort" "by" sortKey ("," sortKey)*
+        | "take" ("while" expr | expr) | "drop" expr
+        | "partition" "by" nameList "order" "by" sortKey ("," sortKey)*
+          "extend" bindings                              -- window; yields Seq
+
+sortKey  ::= expr ("asc" | "desc")?
+bindings ::= binding | block
+binding  ::= IDENT "=" expr
+block    ::= "{" entry ("," entry)* ","? "}"   ;  entry ::= binding | IDENT   -- punning
+nameBlock::= "{" nameList ","? "}"             ;  nameList ::= IDENT ("," IDENT)*
+record   ::= "{" entry ("," entry)* ","? "}"                  -- value tuple
+```
+
+### Types
+
+```
+type    ::= (IDENT | "Rel" heading | "Seq" heading) "?"?
+heading ::= "{" IDENT ":" type ("," IDENT ":" type)* ","? "}"
+```
+
+### Scalar expressions (low → high precedence)
+
+```
+expr  ::= or ("??" or)?                        -- option default
+or    ::= and ("or"  and)*
+and   ::= cmp ("and" cmp)*
+cmp   ::= add (cmpOp add | "is" ("some" | "none"))?   -- non-associative
+cmpOp ::= "==" | "!=" | "<" | "<=" | ">" | ">=" | "in" | "subset"
+add   ::= mul (("+" | "-") mul)*
+mul   ::= un  (("*" | "/" | "%") un)*
+un    ::= ("-" | "not") un | cast
+cast  ::= post ("as" type)?
+post  ::= prim trailer*
+trailer ::= "." IDENT | "(" callArgs? ")" | "over" frame
+prim  ::= LITERAL | IDENT | "(" expr ")" | quant
+quant ::= ("all" | "any") "(" expr "for" IDENT "in" relExpr ")"
+frame ::= "rows" "[" bound? ".." bound? "]"    ;  bound ::= "current" | INT
+callArgs ::= expr ("," expr)*                  -- function / aggregate args
+           | expr "for" IDENT "in" relExpr     -- aggregate comprehension form
+LITERAL ::= INT | FLOAT | TEXT | "true" | "false" | "none" | "some" "(" expr ")"
+```
+
+Numeric literals allow `_` digit separators (`50_000`); the `50k`-style shorthand
+in earlier prose was informal.
+
+### Semantic clarifications
+
+These resolve the ambiguities the prose left open; they are the canonical reading
+and supersede any looser earlier example.
+
+1. **`.` is type-directed (#23).** `e.f` is field access on a record (→ scalar),
+   **column extraction** on a relation (→ `Col T`, a per-row bag), or option-
+   propagating access on `T?` (→ `T'?`). A `Col` is only ever an aggregate
+   argument; used as a scalar it is a type error — that *is* the no-must-group
+   rule.
+2. **`=` vs `:` (#22).** `=` binds a value (`{ id = 7 }`); `:` ascribes a type
+   (`{ id: Id }`). A bare `x` in a block is punning for `x = x`.
+3. **Pipelines are expressions; no `from` (#24).** A query is a relation
+   expression, so `R |> …` nests anywhere a relation is expected (join operands,
+   nested groups, `let` / `view` bodies). `from` appears only in `delete from` /
+   `insert … from`.
+4. **`group` keyword; `into` for naming.** `group by K { … }` binds the per-group
+   relation to `group`; nest with the `group K into name` primitive to avoid
+   shadowing.
+5. **Aggregates are calls.** `count(group)`, `mean(group.salary)`,
+   `sum(group.qty * group.price)`; the arguments are per-row expressions over the
+   group that the aggregate reduces. The comprehension form `sum(e for r in R)` is
+   equivalent.
+6. **`select` vs `project`.** `project {a, b}` is pure π (existing names only);
+   `select { a, x = e }` builds an exact output record and desugars to
+   `extend` (computed) + `project` (to the listed fields).
+7. **Frames.** In `over rows[a..b]`: `current` ≡ 0, negative = preceding, positive
+   = following, an empty side = unbounded — `rows[..current]` running,
+   `rows[-3..0]` width-4 trailing, `rows[-1..1]` neighbors.
+8. **Option discipline.** `==` / `!=` on options are **total** Bool (structural:
+   `none == some x` is `false`), and a non-option widens to an option implicitly
+   (`T <: T?` — the only implicit conversion). Ordering, arithmetic, and value-use
+   on a `T?` need resolution first (`??`, or narrowing via `where x is some`);
+   there is no silent null propagation.
+9. **Qualified names (#25).** A field carries its source relvar/alias. A bare name
+   must resolve unambiguously; after a `join`, a name present on both sides is
+   written `source.field`, and `join R as a` aliases the source (needed for
+   self-joins). In row context, a leading identifier that names a source qualifier
+   is qualified field access; otherwise `.` is the type-directed access of
+   clarification 1.
+
+## Showcase
+
+One schema exercised end-to-end in canonical syntax: schema + constraints, then a
+battery of queries, windows, recursion, and DML. Everything here is intended to
+parse under the grammar above and type-check under the rules above.
+
+### Schema
+
+```
+domain Id    = Int
+domain Money = Int  check (value >= 0)        -- minor units; `value` is the domain value
+
+-- FK direction is declarative; declaration order and cycles (departments.head ⇄
+-- employees.dept) are fine — the schema is checked as a whole.
+
+table departments {
+  id:     Id,
+  name:   Text,
+  budget: Money,
+  head:   Id? references employees(id),       -- nullable FK: a dept may have no head
+  key (id),
+  key (name),                                  -- a second candidate key
+}
+
+table employees {
+  id:      Id,
+  name:    Text,
+  email:   Text,
+  dept:    Id  references departments(id) on delete restrict,
+  manager: Id? references employees(id),       -- self-ref, nullable (the CEO has none)
+  salary:  Money,
+  hired:   Time,
+  key (id),
+  key (email),
+  check (salary > 0),
+}
+
+table projects {
+  id:   Id,
+  name: Text,
+  dept: Id references departments(id),
+  key (id),
+}
+
+table assignments {                            -- employees ⋈ projects (many-to-many)
+  emp:     Id references employees(id) on delete cascade,
+  project: Id references projects(id) on delete cascade,
+  hours:   Int  check (hours >= 0),
+  key (emp, project),
+}
+
+-- a database-level invariant SQL cannot state directly:
+constraint one_ceo check (count(employees |> where manager == none) <= 1)
+
+let over_budget =
+  employees
+  |> group by dept { payroll = sum(group.salary) }
+  |> join departments on dept == departments.id
+  |> where payroll > budget
+constraint payroll_within_budget check (count(over_budget) == 0)
+```
+
+### Queries
+
+```
+-- Q1  average salary per dept, only well-paid teams, sorted
+employees
+|> where salary > 50_000
+|> group by dept { headcount = count(group), avg_pay = mean(group.salary) }
+|> where avg_pay > 70_000                       -- no HAVING; just `where` after group
+|> join departments on dept == departments.id
+|> select { team = departments.name, headcount, avg_pay }
+|> sort by avg_pay desc                          -- result is a Seq
+
+-- Q2  each employee with their manager (self-join; a None manager matches no row)
+employees
+|> join employees as mgr on employees.manager == mgr.id
+|> select { who = employees.name, boss = mgr.name }
+-- use `left join` instead to keep the CEO, with boss = none
+
+-- Q3  total hours and head-count per project
+assignments
+|> group by project { total_hours = sum(group.hours), people = count(group) }
+|> join projects on project == projects.id
+|> select { project = projects.name, total_hours, people }
+|> sort by total_hours desc
+
+-- Q4  fan-out: the compiler catches a double-count at compile time
+employees
+|> join assignments on id == assignments.emp     -- 1:many ⇒ each emp recurs
+|> group by dept { payroll = sum(group.salary) }  -- ⚠ salary double-counted per assignment
+-- correct (employee grain, no fan-out):
+employees |> group by dept { payroll = sum(group.salary) }
+
+-- Q5  top-3 earners per department (window ⇒ Seq)
+employees
+|> partition by dept order by salary desc extend rank = rank()
+|> where rank <= 3
+|> sort by dept, rank
+
+-- Q6  cumulative payroll as people are hired (window scan)
+employees
+|> partition by dept order by hired
+   extend cumulative_payroll = sum(salary) over rows[..current]
+|> select { dept, name, hired, cumulative_payroll }
+
+-- Q7  all transitive reports under employee #1 (recursion + qualified disambiguation)
+let reports = employees |> where manager is some       -- narrows `manager` to Id
+                        |> select { mgr = manager, emp = id }
+recursive chain =
+  reports
+  |> union (chain |> join reports on chain.emp == reports.mgr
+                  |> select { mgr = chain.mgr, emp = reports.emp })
+chain
+|> where mgr == 1
+|> join employees on emp == employees.id
+|> select { report = employees.name }
+```
+
+### Mutation
+
+```
+-- 10% raise for one department
+update employees where dept == 3 set { salary = salary * 11 / 10 }
+
+-- hire + first assignment atomically (the assignment's FK to the new employee
+-- holds because both land in one transaction, checked once at commit)
+transaction {
+  insert into employees {
+    id = 42, name = "Bo", email = "bo@co", dept = 3,
+    manager = 7, salary = 9_000_000, hired = date("2026-06-01")
+  }
+  insert into assignments { emp = 42, project = 5, hours = 0 }
+}
+
+-- retire a project; its assignments cascade away
+delete from projects where id == 5
+```
+
+### Callouts
+
+- **Q2** `employees.manager == mgr.id`: `manager: Id?` against `mgr.id: Id` widens
+  by implicit `some` (#26); equality is total, so a None manager equals no row and
+  the CEO drops out of the inner join. `manager` must be qualified (both sides are
+  `employees`).
+- **Q4** the `⚠` is a *compile-time* warning from key-inference (#11, #15), not a
+  runtime surprise: `salary` is determined by `employees.id` but the joined rows
+  are keyed by `(emp, project)`.
+- **Q5 / Q6** output is a `Seq` (#20); `rank()` and `sum(…) over` are `Seq`-layer
+  operations, unavailable on a `Rel`.
+- **Q7** `where manager is some` narrows `manager` from `Id?` to `Id` (#26); the
+  recursive arm needs qualifiers (`chain.emp`, `reports.mgr`) because both inputs
+  carry `mgr` and `emp` (#25); `union` type-checks because both arms have heading
+  `{mgr, emp}`.
+
+## Open questions
+
+- **Name.** `RQL` is a placeholder.
+- **Relationship/FK layer (deferred).** The additive sugar from decision #12:
+  declared references, reverse navigation, path expressions, cardinality-typed
+  many-sides. When to build it; named-relationship & ambiguity rules; path
+  resolution.
+- **Comprehension surface.** Add a calculus surface over the same IR later, or
+  never?
+- **Value-based & peer frames.** SQL `RANGE` / `GROUPS` (frames by order-key
+  *value* or tie-peers) deferred; positional `rows[…]` frames first.
+- **Recursion with aggregation** (shortest paths, fixpoint + `min`/sum). Deferred
+  — needs well-founded aggregate-in-recursion semantics.
+- **Updatable views.** The view-update problem — when can a view assignment map
+  back to base relvars? Deferred.
+- **Defaults & generators.** Column defaults (constants / pure exprs) vs. impure
+  generators (identity/autoincrement, `now()`), which must arrive as explicit
+  host effects. Design TBD.
+- **Upsert / merge sugar.** A declared-intent form over assignment.
+- **Concurrency / isolation.** Likely out of scope (RQL is a language, not an
+  engine), but the transaction boundary needs an isolation story if it ever runs
+  concurrently.
+- **Implementation substrate.** Could ride cfree's pipeline as a CG-API frontend
+  (cf. RSN) or stand alone. Out of scope until the language is nailed.
diff --git a/doc/plan/RSN.md b/doc/plan/RSN.md
@@ -0,0 +1,627 @@
+# RSN — Ryan's Scripting Notation (planned work)
+
+A small, **statically-typed** scripting language that lets cfree replace its
+remaining external build/test dependencies — **make, POSIX shell, and Python** —
+with code it owns and ships, and that runs on cfree's *own* compiler pipeline.
+RSN is the language; the build system that grows on top of it is a later layer
+(sketched at the end). This doc is the living design.
+
+## The architecture, in one line
+
+**RSN is another frontend on cfree's pipeline** (alongside C, Wasm, toy): RSN
+source → public CG API → IR → execution. To make a garbage-collected language
+fit, **cfree's IR gains a first-class managed-reference type**, so RSN runs on the
+existing IR interpreter (arena-allocated first, then *precisely* GC'd) and can
+compile to *fast native code* through the existing backends later. Static typing
+is what makes
+this pay off: typed operations lower to real IR ops (not opaque runtime calls),
+so the optimizer/backends optimize RSN, and the managed-ref type makes every GC
+root statically known.
+
+Because this turns the IR into a GC-aware IR, it is **staged**, and Stage 1 is
+itself split to keep the highest-blast-radius change (touching the shared CG type
+system) out of the critical path until the language has proven itself:
+
+- **Stage 1a (now):** typed RSN frontend + interp, **arena-allocated, no
+  collector** — managed values live in a run-scoped arena freed at exit. The
+  interp carries a per-function ref-bitmap as local metadata; the C compiler's CG
+  type system is untouched. Cooperative fibers land here. This is where the
+  language gets exercised on real build/test scripts.
+- **Stage 1b (when reclamation is actually needed):** the managed-ref IR type +
+  GC op dialect + the interp's root scan and mark-sweep collector, replacing the
+  arena. Precise by construction. Native/Wasm/C backends reject managed-ref
+  functions.
+- **Stage 2 (later, optional):** native/JIT codegen for managed code — safepoints
+  + GC stack maps in the optimizer and the aa64/x64/rv64 backends — so a build
+  tool written in RSN compiles to a native binary.
+
+## Why
+
+Three external tools carry three jobs today: **make** (~1.5k lines: the
+dependency DAG, host/config branching, `CFREE_*_ENABLED` gating, the cross-arch
+matrix), **shell** (~12k LOC under `test/`: the corpus harness engine, process
+orchestration, golden-diffing), and **Python** (parsing tool output and wrangling
+text/JSON/CSV). RSN's goal is one owned, typed language that does all three, so
+cfree's only external dependency is a seed C toolchain. 3 → ~0.
+
+## Locked decisions
+
+**Architecture & runtime**
+
+1. **Typed RSN as a CG-API frontend** (`lang/rsn/`, with a type checker); no
+   bespoke bytecode — the IR / interp `InterpInsn` *is* RSN's bytecode.
+2. **GC-aware IR**: a managed-reference type + a small GC op dialect in the CG
+   type system + IR, firewalled so the C path pays nothing.
+3. **Precise, non-moving mark-sweep GC**, precise by construction.
+4. **Cooperative single-threaded fibers from day one**, on the interp's explicit
+   swappable stack and its reserved `CFREE_INTERP_BLOCKED` yield state. Cores are
+   saturated by spawned child processes, not threads.
+5. **Staged delivery**: interp first — arena-allocated (1a), then precise GC (1b);
+   native managed codegen later (Stage 2).
+
+**Language**
+
+6. **Syntax**: brace / C-ish (Go/Wren family), strictly **LL(1)**, **significant
+   newline** termination.
+7. **Bindings**: `let` (immutable) + `var` (mutable). Object *contents* stay
+   mutable regardless — this governs only rebinding the name.
+8. **Algebraic data types**: `struct` (named product) + `enum` (sum / variants) +
+   `(T, U, …)` tuple (anonymous product), all destructured by the one `pattern`
+   grammar (`match`/`if let`/`let`/`for`).
+9. **Nullability**: refs are non-null; absence is `T?` (`Option<T>`), unwrapped
+   with `??` (coalesce) and `if let`. Distinct from `Result` (absence vs error).
+10. **Errors**: `Result<T, E>` + Zig-style `try` (propagate) / `catch` (recover),
+    not exceptions. `Result`/`Option` are sum types; `match` also destructures
+    them.
+11. **`any` + `match`**: `any` is the dynamic escape hatch (JSON, parsed text);
+    `match` narrows it (`string s`, `int n`, …). `match` is the single multi-way
+    construct and general destructurer — literals/strings, ADTs, tuples, `any`
+    type-narrows, and or-patterns (`a | b -> …`); no fallthrough, no `switch`.
+12. **Generics — Stage 1 ships only the builtin containers** (`list<T>`,
+    `map<K, V>`) as hand-written specializations (packed/unboxed `list<int>` /
+    `list<float>`, a shared boxed instantiation otherwise). The general
+    instantiation strategy (.NET-style hybrid vs. plain monomorphization, plus
+    constraints for user-defined generic fns) is **deferred** until the feature
+    exists — not locked ahead of need.
+13. **Expression `if`/`match`** (value-producing); blocks appear only in
+    control-flow slots, so a bare `{}` stays a map literal and `?` is reserved for
+    optionals (no ternary). **`match` is the *single* multi-way construct** — there
+    is no separate `switch`; literal/string arms plus or-patterns (`a | b -> …`)
+    cover value-dispatch, arms never fall through.
+14. **Modules**: file = module; `pub` exports (private by default); `import
+    "x.rsn" as x`, qualified access.
+15. **Patterns**: Lua-style string patterns, not a full regex engine.
+16. **Methods via UFCS**: `a.f(b)` ≡ `f(a, b)`. `a.name` is a field load when
+    `name` is a field of `a`'s type, otherwise `a.name(...)` resolves `name` as a
+    free function with `a` as its first argument. One rule covers `objs.push`,
+    `src.replace`, `xs.map` — no method-declaration form, no receiver concept.
+    Tie-break: a field shadows a free function of the same name.
+17. **String interpolation**: `"...${expr}..."` (a lexer feature; the interpolated
+    `expr` is a full expression, `\$` escapes a literal `$`). Argv is still built
+    with lists (`["cfree", "cc", src]`), never by interpolating into one command
+    string — that keeps shell-quoting bugs out by construction.
+18. **First-class tuples + one pattern grammar**: `(T, U, …)` (arity ≥ 2) is a
+    storable, nestable anonymous product; `()` is unit and `(e)` is grouping.
+    Destructure-only (no `.0`/`.1`). A single `pattern` concept serves
+    `let`/`var`/`for`/`match`/`if let` (tuples always written `(a, b)`);
+    `map` yields `(K, V)` so `for (k, v) in m` is just a tuple pattern. `(a, b) =
+    (b, a)` is parallel assignment (RHS fully evaluated first).
+
+Superseded by this design: the earlier **NaN-boxed dynamic value model** and
+**dynamic typing** — typing removes the need for a universal tagged value.
+
+## What we reuse
+
+- **The frontend→CG→opt→interp pipeline.** RSN attaches as the compiler's
+  **interp sink** (`cfree_interp_program_attach`) so each compiled RSN function is
+  lowered to `InterpInsn`, exactly as `cfree run --no-jit` does for C.
+- **The CG type system** (`CfreeCgTypeId`) and recording IR (`src/cg`) — extended
+  with the managed-ref type, not replaced.
+- **The IR interpreter** (`src/interp`): a register VM with direct-threaded
+  dispatch, an **explicit swappable stack** documented as a "swap-ready substrate
+  for fibers/virtual threads," and a **reserved `BLOCKED` yield status**. The
+  fiber runtime and a yielding `run()` complete a half-built design.
+- **`src/core/`** (arena/Heap/Slice/StrBuf/VEC/hashmap/diag) — the frontend's and
+  runtime's substrate.
+- **`rt/lib/`** — the home for RSN's runtime helpers (container growth, string
+  ops, the collector, `any`-boxing, the iterator protocol).
+- **The optimizer and native backends** (Stage 2) — typed RSN lowers to real IR.
+- **CAS** (`include/cfree/cas.h`) — the build layer's content-addressed cache.
+- **The host-vtable convention** (`CfreeCasHost`, `CfreeInterpHost`) — RSN's host
+  vtable supplies all I/O and the scheduler's event source; which fields are bound
+  is the capability gate.
+
+## The IR extension: managed references + GC
+
+The heart of the design, specified with its firewall discipline.
+
+**The type.** A new CG type kind — a **managed ref**: an opaque, GC-owned pointer
+to a heap object of a given *shape*. Distinct from a raw C pointer. RSN's
+`string`/`list`/`map`/`fn`/`fiber`/`chan`/record/`enum`/tuple/`any` values are
+managed refs; `int`/`float`/`bool` are unmanaged scalars.
+
+**The ops** (a small dialect): `managed_alloc(shape) -> ref`, `ref_field_load/
+store`, `ref_elem_load/store`, `ref_eq`, `ref_null`. Safepoints are implicit at
+allocations, calls, and loop back-edges (no explicit op in the interp; once the
+1b collector is on it can collect at any allocation because all roots are in
+registers).
+
+**Object shapes for precise tracing.** Every managed object's header points at a
+**shape descriptor** listing which fields are managed refs (or a trace function
+for containers). **Sum types use a *per-variant* shape**: the collector reads the
+active tag, then traces that variant's ref-map. Records use a single field-map.
+
+**Root finding in the interp.** Each `InterpFunc` carries a static **ref-bitmap**
+marking which PRegs hold managed refs. Roots = every live ref-PReg across all
+live fibers' frames + globals + the string-intern table + a small **C-roots**
+scratch for native functions mid-construction. Precise; no stack maps at the
+interp level.
+
+**Two cost-containing invariants (locked):**
+- **Non-moving** → no write/read barriers, no pointer rewriting in codegen; the
+  backend only ever *identifies* roots, never *updates* them.
+- **No interior pointers** → element/field access re-derives from the base ref, so
+  a stack map only ever needs base refs.
+
+**Firewall discipline.** The managed dialect is an *optional IR feature*: a
+function with no managed refs needs no GC support and pays nothing. The C frontend
+never emits managed refs, so C → IR → {native, Wasm, C} is untouched and
+zero-cost. A backend without managed support asserts out cleanly on a managed
+function. The managed-IR capability is its own flag (`CFREE_MANAGED_IR`), distinct
+from the `CFREE_RSN_ENABLED` frontend that consumes it. The standing test: a
+C-only build compiles and links with the managed dialect compiled out.
+
+**Stage 2 (native).** Precise native GC needs safepoints + **stack maps**: the
+optimizer's liveness + register allocator record which managed refs are live at
+each safepoint/call, per arch. This is the real cross-cutting work and the reason
+native is staged separately. The two invariants above keep it to *root
+identification* rather than the full moving-GC codegen problem.
+
+## Type system
+
+Static, with local inference so scripts stay terse.
+
+- **Primitives:** `int` (i64), `float` (f64), `bool`, `string`.
+- **Containers (builtin generics):** `list<T>`, `map<K, V>`.
+- **Functions:** `fn(A, B) -> R`; closures capture bindings by reference.
+- **Records (product):** `struct Name { field: T, … }` — named-field managed
+  objects with a single shape.
+- **Sums (variant):** `enum Name { Variant, Variant(T, …), … }` — tagged unions
+  with optional payloads; per-variant shape. Constructors are values
+  (`Leaf(5)`, `Empty`); destructured by `match`.
+- **Tuples (anonymous product):** `(T, U, …)`, arity ≥ 2 — a managed ref to an
+  anonymous positional record (reuses the record shape, trivially traced).
+  Storable and nestable (`list<(string, string)>`, `(int, (int, int))`).
+  `()` is unit and `(e)` is grouping, so 1-tuples don't exist. **Destructure-only**
+  — no `.0`/`.1` (consistent with variant payloads, and it dodges the `pair.0.1`
+  float-literal lexer hazard); pull one element with `let (_, v) = p`. Packed
+  all-scalar tuples (e.g. unboxed `(int, int)`) are a deferred niche opt, same
+  bucket as packed `list<int>`.
+- **Builtin sums:** `Option<T> = { Some(T), None }` (with `T?` sugar and `??` /
+  `if let`), `Result<T, E> = { Ok(T), Err(E) }` (with `try` / `catch`).
+- **Builtin error type:** `struct Error { code: int, message: string }` — the
+  default `E` for host/process operations (`run`, `file_io`); user code may use
+  any `E`.
+- **Methods are UFCS:** there is no method-declaration form. `a.f(args)` lowers to
+  `f(a, args)` when `f` is a free function in scope; `a.name` is a plain field
+  load when `name` is a field. A field shadows a free function of the same name.
+- **`any`:** a managed ref to a `{ tag, payload }` box for genuinely dynamic data;
+  narrowed by `match` type-patterns that name the real (lowercase) types
+  (`string s`, `int n`, `list xs`, …). Boxing is confined to `any` (and
+  `Option<scalar>`), not universal.
+- **Inference:** locals infer from initializers (`let n = 5` ⇒ `int`); function
+  parameters and return types are annotated. Explicit type arguments are *not*
+  written at call sites (inferred), so `<` is unambiguous in expressions — no
+  turbofish.
+- **Generics — Stage 1 is builtin containers only.** `list<T>` and `map<K, V>`
+  ship as hand-written specializations: `list<int>` / `list<float>` are
+  packed/unboxed, ref element types share one boxed instantiation. The general
+  scheme for user-defined generics (.NET-style value/ref hybrid vs. plain
+  monomorphization, and constraints) is deferred to its own phase — when it lands,
+  the value/ref split and ref-vs-scalar-parameterized shape descriptors are the
+  intended direction, but nothing here is locked yet.
+- **Modules:** each `.rsn` file is a module; `pub` exports (private by default);
+  `import "util.rsn" as util` then `util.name`. Qualified, clash-free. Selective
+  `from … import { … }` is a possible later sugar.
+
+## Value model
+
+Typed, so values are their **natural machine representations** — no universal
+tagged word:
+
+| RSN type                          | Representation                          |
+|-----------------------------------|-----------------------------------------|
+| `int` / `float` / `bool`          | i64 / f64 / i8 (unmanaged scalars)      |
+| `string`, `list`, `map`, `fn`, `fiber`, `chan`, record, `enum` | managed ref |
+| `(T, U, …)` tuple                 | managed ref (anonymous positional record); all-scalar packing deferred |
+| `Option<T>` / `T?`                | managed ref (a sum); `Option<scalar>` boxes |
+| `any`                             | managed ref to a `{tag, payload}` box   |
+
+`int` is a true 64-bit integer (the reason typing was attractive — no NaN-box, no
+bigint spill, no pointer masking). Mixed `int`/`float` arithmetic requires an
+explicit `float(x)`. Strings are immutable, so they are always safe to share
+across fibers; **interning is selective, not universal** — short strings and any
+string used as a map key are interned (cheap equality, CAS keys), while large
+strings (file bodies, captured subprocess output) are held by reference without
+hashing them into the intern table. `Option<scalar>` boxing is later
+niche-optimizable; builtin `list<int>` stays packed/unboxed.
+
+## Errors and results
+
+Values, typed, with Zig's `try` ergonomics — no exceptions, no unwind tables.
+
+- **`Result<T, E>`** (a sum): `Ok(v)` / `Err(e)`. `try expr` unwraps `T` on `Ok`,
+  or returns the `Err` from the enclosing function on `Err`; it is type-checked
+  (the enclosing return must be a `Result` with a compatible `E`). At the top
+  level, a `try` that hits `Err` halts the run nonzero. `expr catch handler`
+  recovers, with an optional `|e|` binder.
+- **Process primitives:** `run(cmd) -> Result<Proc, Error>` (`Ok` on exit 0, fail
+  fast with `try`); `exec(cmd) -> Proc` (raw `{code, out, err}`, for harnesses
+  that expect nonzero exits).
+- `match` destructures `Result`/`Option`/any sum generally; `try`/`catch`/`??`/
+  `if let` are the ergonomic shortcuts.
+
+```
+fn link(objs: list<string>) -> Result<string, Error> {
+  let a = try run(["cfree", "ar", "rcs", "lib.a"] + objs)   # propagate on failure
+  return Ok(a.out)
+}
+```
+
+## Surface syntax
+
+Brace-delimited, keyword-led, semicolon-free, mandatory braces on block bodies.
+Type annotations only in type positions. Comments are `#` to end-of-line (line or
+trailing); there are no block comments. Strings interpolate with `"${expr}"`.
+Method calls are UFCS (`xs.map(f)` ≡ `map(xs, f)`).
+
+```
+struct Target { name: string, inputs: list<string>, optimize: bool }
+
+enum Json {
+  Null
+  Bool(bool)
+  Num(float)
+  Str(string)
+  Arr(list<Json>)
+  Obj(map<string, Json>)
+}
+
+fn render(j: Json) -> string {
+  match j {                          # match is an expression
+    Null    -> "null"
+    Str(s)  -> quote(s)
+    Num(n)  -> str(n)
+    Arr(xs) -> "[" + join(map(xs, render), ",") + "]"
+    _       -> "..."
+  }
+}
+
+fn build(t: Builder) {
+  let objs: list<string> = []        # annotation: empty list element type
+  for src in glob("src/**/*.c") {
+    let obj = src.replace(".c", ".o")
+    t.target(obj, [src], fn() { try run(["cfree", "cc", "-c", src, "-o", obj]) })
+    objs.push(obj)                   # mutating contents (let binding is fine)
+  }
+  let mode = if t.release { "opt" } else { "dbg" }   # value-if; no ternary
+  t.target("libcfree.a", objs, fn() { try run(["cfree", "ar", "rcs", "libcfree.a"] + objs) })
+}
+```
+
+## Grammar (LL(1))
+
+LL(1) by construction; type syntax confined to type positions. The lexer
+suppresses the statement-terminating newline after an open bracket, a trailing
+binary operator, **or when the next non-blank token is a leading `.`** (so
+method/field chains may wrap across lines); the EBNF omits the newline token.
+Comments (`#` to end-of-line; no block form) and string interpolation
+(`"${expr}"`, `\$` escapes a literal `$`) are lexer-level; an interpolated `${…}`
+lexes as a parenthesized sub-expression.
+
+```ebnf
+program    = { item } ;
+item       = importDecl | [ "pub" ] decl | stmt ;
+importDecl = "import" STRING "as" IDENT ;
+decl       = fnDecl | structDecl | enumDecl ;
+
+fnDecl     = "fn" IDENT [ generics ] "(" [ params ] ")" [ "->" type ] block ;
+structDecl = "struct" IDENT [ generics ] "{" { IDENT ":" type [ "," ] } "}" ;
+enumDecl   = "enum" IDENT [ generics ] "{" { variant [ "," ] } "}" ;
+variant    = IDENT [ "(" type { "," type } ")" ] ;
+generics   = "<" IDENT { "," IDENT } ">" ;
+params     = IDENT ":" type { "," IDENT ":" type } [ "," ] ;
+
+stmt       = letDecl | varDecl | whileStmt | forStmt | returnStmt
+           | "break" | "continue" | exprStmt | block ;
+letDecl    = "let" pattern [ ":" type ] "=" expr ;   (* pattern irrefutable *)
+varDecl    = "var" pattern [ ":" type ] "=" expr ;   (* pattern irrefutable *)
+whileStmt  = "while" cond block ;
+forStmt    = "for" pattern "in" cond block ;          (* pattern irrefutable *)
+returnStmt = "return" [ expr ] ;
+exprStmt   = expr [ assignOp expr ] ;   (* tuple-of-lvalues LHS = parallel assign *)
+assignOp   = "=" | "+=" | "-=" | "*=" | "/=" ;
+block      = "{" { stmt } "}" ;     (* value = trailing expr-stmt, else unit *)
+
+type       = baseType { "?" } ;
+baseType   = IDENT [ "<" type { "," type } ">" ]
+           | "fn" "(" [ type { "," type } ] ")" "->" type
+           | "(" [ type "," type { "," type } ] ")" ;
+             (* left-factored: "()" = unit; "(" T "," U … ")" = tuple (arity >= 2).
+                No "(" T ")" form — types have no paren-grouping, so a lone "(int)"
+                is a syntax error; unit-vs-tuple is decided after "(" on ")" vs a type. *)
+
+expr       = coalesce [ "catch" [ "|" IDENT "|" ] expr ] ;
+coalesce   = orExpr  { "??" orExpr } ;
+orExpr     = andExpr { "||" andExpr } ;
+andExpr    = eqExpr  { "&&" eqExpr } ;
+eqExpr     = relExpr { ("==" | "!=") relExpr } ;
+relExpr    = addExpr { ("<" | "<=" | ">" | ">=") addExpr } ;
+addExpr    = mulExpr { ("+" | "-") mulExpr } ;
+mulExpr    = unary   { ("*" | "/" | "%") unary } ;
+unary      = "try" unary | ("!" | "-") unary | postfix ;
+postfix    = primary { callOp | indexOp | fieldOp } ;
+callOp     = "(" [ args ] ")" ;
+indexOp    = "[" expr "]" ;
+fieldOp    = "." IDENT ;
+args       = expr { "," expr } [ "," ] ;
+
+primary    = INT | FLOAT | STRING | "true" | "false"
+           | ifExpr | matchExpr | spawnExpr
+           | IDENT [ structLit ]          (* structLit suppressed in cond; see below *)
+           | parenExpr
+           | listLit | mapLit | fnLit ;
+parenExpr  = "(" expr [ "," expr { "," expr } ] ")" ;
+             (* left-factored: no comma => grouping (value is the inner expr);
+                >= 1 comma => tuple (arity >= 2). The grouping-vs-tuple choice is
+                made on the token after the first expr: ")" vs ",". No 1-tuples,
+                so "(a,)" is rejected. *)
+spawnExpr  = "spawn" block ;              (* yields a fiber handle *)
+ifExpr     = "if" ( "let" pattern "=" cond | cond ) block
+             [ "else" ( ifExpr | block ) ] ;
+cond       = expr ;   (* a bare `IDENT { … }` struct literal is NOT taken here; a
+                         struct literal in condition position must be
+                         parenthesized: `if (Foo{…}).ok { … }` *)
+matchExpr  = "match" expr "{" { matchArm } "}" ;
+matchArm   = pattern "->" ( block | expr ) [ "," ] ;
+pattern    = patternAtom { "|" patternAtom } ;   (* or-pattern; all alternatives
+                                                    must bind the same names+types.
+                                                    `|` here is pattern position,
+                                                    distinct from catch's `|e|`. *)
+patternAtom = "_" | literal
+           | IDENT IDENT                       (* type-narrow: type binder *)
+           | IDENT [ "(" pattern { "," pattern } ")" ]    (* Ctor / binding *)
+           | "(" pattern "," pattern { "," pattern } ")" ;  (* tuple, arity >= 2 *)
+structLit  = "{" [ field { "," field } [ "," ] ] "}" ;
+field      = IDENT ":" expr ;
+listLit    = "[" [ expr { "," expr } [ "," ] ] "]" ;
+mapLit     = "{" [ entry { "," entry } [ "," ] ] "}" ;
+entry      = ( STRING | "[" expr "]" ) ":" expr ;   (* string or computed key;
+                         bare `IDENT:` is a struct field, never a map key, so
+                         `{name: x}` is unambiguous and `map<int,V>` literals are
+                         written `{[k]: v}` *)
+fnLit      = "fn" "(" [ params ] ")" [ "->" type ] block ;
+literal    = INT | FLOAT | STRING | "true" | "false" ;
+```
+
+LL(1) tactics that constrain the language:
+
+- **`let`/`var` lead bindings; `pub` leads exports; `import` leads imports.**
+- **Assignment is statement-level** (`=` vs `==` unambiguous; no `if (a = b)`).
+- **The `{` rule.** In a statement or control-flow slot (`block`, after `if`/
+  `else`/`while`/`for`/`fn`/`spawn`, and a match arm's `->`), `{` is a block. In
+  expression position, `{` is a map literal and `Name{...}` is a struct literal.
+  To use a map where a block is expected, parenthesize: `-> ({...})`. This is what
+  lets value-`if`/`match` coexist with `{k: v}` maps under LL(1).
+- **Struct literals are suppressed in condition position.** Inside the `cond` of
+  `if`/`while`/`for`, a bare `Name { … }` is *not* a struct literal — otherwise
+  the `{` is eaten as a struct body and starves the required block (the classic
+  Rust/Go conflict). A struct literal needed in a condition must be parenthesized:
+  `if (Foo{…}).ok { … }`. This is what actually makes the grammar LL(1); without
+  it `IDENT [structLit]` and the control-flow `{`-block production collide.
+- **The `(` rule.** In expression position `(` is left-factored: parse one `expr`,
+  then a following `)` means it was grouping and `,` means it's a tuple — so
+  grouping and tuples never conflict at the open paren. In type position there is
+  no grouping, so `()` is unit and `(T, …)` is a tuple, decided right after `(`.
+  In pattern position a leading `(` is always a tuple pattern. (`spawn`/`fn`
+  before `(` are keywords, handled before this rule.)
+- **`fn` at item/stmt start is a named declaration**; anonymous `fn` only in
+  expression position.
+- **`try` is a unary prefix; `catch` a low-precedence tail** with an optional
+  `|e|` binder — LL(1) because `|` is not in FIRST(expr).
+- **`<` is a type-only token in type position** (generics, declared sites). No
+  explicit type args at call sites, so `<` is always less-than in expressions.
+- **Patterns** decide by one lookahead after the leading token: `_`/literal are
+  immediate; a leading `(` is a **tuple pattern**; an `IDENT` followed by another
+  `IDENT` is a type-narrow, by `(` is a constructor, else a binding
+  (constructor-vs-binding for a bare identifier is resolved in the checker against
+  known variants). **One `pattern` grammar serves `let`/`var`/`for`/`match`/`if
+  let`**; `let`/`var`/`for` additionally require it to be *irrefutable* (binders,
+  `_`, nested tuples thereof — never a literal/constructor/type-narrow that could
+  fail to match).
+- **Or-patterns join alternatives with `|`** (`"+" | "-" -> …`): after a
+  `patternAtom`, lookahead `|` continues the or-pattern, anything else ends it.
+  This pattern-position `|` does **not** clash with `catch`'s `|e|` binder (expr
+  position), so the "`|` ∉ FIRST(expr)" tactic still holds. Or-patterns are
+  refutable ⇒ they appear only in `match`/`if let`, and they are how `match`
+  subsumes C `switch` multi-`case`/fallthrough — **arms never fall through**.
+- **`if let`** vs `if`: peek `let` after `if`.
+- **Parallel assignment.** `(a, b) = (b, a)` is an assignment whose LHS is a
+  tuple of lvalues; the RHS tuple is fully evaluated before any target is written
+  (so swap works). Only `=` takes a tuple LHS — not `+=` and friends.
+
+## Semantics defaults
+
+Decided, revisitable — these were settled as defaults rather than formal forks:
+
+- **Integer overflow traps** (halts the run) — correctness over speed for a tool
+  language, where a silently wrapped size/offset is the worse failure. Explicit
+  `+%` / `-%` / `*%` perform two's-complement wrapping where it is actually
+  wanted.
+- **`/`** truncates toward zero; **`%`** sign follows the dividend (C). Mixed
+  `int`/`float` needs `float(x)`.
+- **`==`** compares scalars and strings by value, and lists/maps/records/sums/
+  tuples structurally by contents; identity is a separate `is`.
+- **Iteration:** `for x in list`, `for (k, v) in map`, `for i in range(n)`; one
+  iterator protocol (`next() -> T?`) so user types opt in — `map` yields `(K, V)`
+  tuples, so multi-binding is just a tuple pattern, not a special case.
+- **Unit return:** a function with no `-> T` returns unit (nothing), distinct from
+  `None`.
+- **Strings:** immutable, UTF-8; `.len` is bytes; codepoint/grapheme helpers and
+  patterns operate on bytes.
+- **Closures** capture bindings by reference (mutating a captured `var` is
+  visible). **Loop bindings are fresh per iteration** — each `for`/`while` turn
+  gets its own binding, so a closure created in the loop captures *that*
+  iteration's value (Go 1.22 / JS-`let` semantics). The build example's per-`src`
+  recipe closures depend on this.
+
+## Execution: RSN on the interp
+
+- **Pipeline:** source → lexer → LL(1) parser → AST → type check → CG-API
+  lowering → IR → (interp sink) `InterpInsn`. Typed ops are real IR ops; only
+  `any` and a few container/iterator primitives are `rt/lib/` helper calls.
+- **Register VM, direct-threaded dispatch** — the existing interp.
+- **Fibers** are the interp's explicit `InterpStack`s. A yielding op (`run`,
+  `read`, `sleep`, channel ops) returns `CFREE_INTERP_BLOCKED`; a **scheduler**
+  (living with the host, since readiness depends on host events) resumes the stack
+  when ready. Cooperative: no preemption.
+- **The one limit (like Lua):** a fiber may yield only at RSN/IR instruction
+  boundaries and blessed scheduler-aware host calls — never from inside a native C
+  frame reached via FFI.
+- **Surface:** `spawn { … }` is an *expression* yielding a fiber handle (`let h =
+  spawn { … }`); `parallel(list, limit, fn)` → bounded fan-out, results in input
+  order; `chan()` / `.send` / `.recv`.
+- **Determinism guard:** fibers are execution-phase; build-graph *definition* runs
+  single-fiber, and the build-file capability gate nulls `spawn` and writes
+  (read-only `glob`/`stat` stay live so the graph can discover its inputs).
+
+## Memory and GC
+
+- **Non-moving mark-sweep**, precise via the managed-ref type. Allocation funnels
+  through `rsn_alloc(shape, size)`; collection runs at an allocation threshold.
+- **Stop-the-world is trivial** (single-threaded, cooperative): collect only at
+  allocation safepoints, where all roots are in registers.
+- **Roots:** live ref-PRegs across all fibers (via each function's ref-bitmap),
+  globals, the string-intern table, the C-roots scratch. Tracing uses each
+  object's shape (per-variant for sums).
+- **Finalizers:** `handle`/`chan`/`fiber` and host-resource objects get an
+  optional `finalize` slot in their shape, run on collection and at teardown.
+- **Arena stage (1a):** managed values bump-allocate into a run-scoped arena and
+  the collector is deferred (free at run end); object headers and shapes are
+  designed in from the start, so turning the collector on in 1b is localized.
+
+## Embedding API and capability gating
+
+```c
+typedef struct CfreeRsnHost {
+  const CfreeFileIO* file_io;                 /* build-def: read/glob/stat live, writes gated */
+  int      (*spawn)(void* user, const CfreeRsnSpawn*, CfreeRsnProc** out);
+  int      (*proc_reap)(void* user, CfreeRsnProc*, int* exit_code); /* scheduler */
+  int64_t  (*clock_ns)(void* user);           /* gated */
+  int      (*mkdir_p)(void* user, const char* path);
+  void* user;
+} CfreeRsnHost;
+
+CFREE_API CfreeRsn*   cfree_rsn_new(CfreeContext*, const CfreeRsnHost*);
+CFREE_API CfreeStatus cfree_rsn_eval(CfreeRsn*, CfreeSlice src, const char* name);
+CFREE_API void        cfree_rsn_register(CfreeRsn*, const char* name,
+                                         CfreeRsnNativeFn, void* ud);
+CFREE_API void        cfree_rsn_free(CfreeRsn*);
+```
+
+- **Script mode** (`cfree rsn foo.rsn`): full vtable — replaces the shell glue.
+- **Build-definition mode**: `spawn`/`clock_ns`/write-`file_io` NULL, but
+  **read-only filesystem queries stay live** (`glob`, `stat`, read) — a
+  `build.rsn` must discover its inputs (`glob("src/**/*.c")`) to declare targets,
+  yet cannot `run()` processes, write files, or read the clock, so the graph is
+  hermetic and ad-hoc effects are a load-time error. The capability gate is
+  therefore per-operation (read vs. write vs. spawn), not a single wholesale
+  `file_io` switch. One frontend, two privilege levels, chosen by the caller.
+
+## The build layer (later)
+
+A library *in RSN* plus a thin engine: `target(name, inputs, outputs, recipe)`;
+staleness by **content hash via the CAS** (not mtimes); parallel execution via
+`parallel`/fibers bounded to `ncpu()`; `cfree build` loads `build.rsn` in
+build-definition mode for the graph, then executes it in script mode. Out of scope
+for the milestones below — it is what the language is *for*.
+
+## Implementation phases
+
+**Stage 1a — typed RSN, interpreted, arena-allocated (no collector yet):**
+
+1. `lang/rsn/` lexer + LL(1) parser → AST (full grammar above). Parse-corpus tests
+   under `test/rsn/`.
+2. Type checker: local inference, ADTs (`struct`/`enum`/tuple), builtin-container
+   generics, UFCS resolution, the uniform `pattern` grammar (irrefutability check
+   for `let`/`var`/`for`, exhaustiveness for `match`), `Result`/`Option`/`try`,
+   modules.
+3. RSN → CG-API lowering (typed ops → IR ops; `any`/containers/iterators via
+   `rt/lib/`). Managed values live in a **run-scoped arena, freed at run end** —
+   the interp carries a per-function ref-bitmap as interp-local metadata, but no
+   new CG type kind is added yet, so the C compiler's type system is untouched.
+4. Fibers: complete the `BLOCKED`/yield path + scheduler over `InterpStack`s +
+   host vtable (`spawn`/`run`/`read`/`clock`); `spawn`/`parallel`/`chan`.
+5. Stdlib: strings, list/map, iterators/`range`, Lua-style patterns, JSON (→
+   `any`), `run`/`exec`.
+6. CLI `cfree rsn` + per-operation capability gating.
+
+Goal of 1a: prove the language on real `build.rsn` / test scripts *before* the
+invasive IR surgery. Short-lived runs make arena-free-at-exit genuinely adequate.
+
+**Stage 1b — precise GC (added once runs are long-lived enough to need it):**
+
+7. **CG/IR managed-ref type + GC op dialect** (`src/cg`), gated `CFREE_MANAGED_IR`;
+   object headers + (per-variant) shape descriptors. Firewall test: C-only build
+   green with the dialect compiled out.
+8. Lowering emits managed ops for refs; the interp's root scan + mark-sweep
+   collector + finalizers replace the arena. (The 1a ref-bitmap becomes the
+   precise root map.)
+
+**Stage 2 — native managed codegen (optional, demand-driven):**
+
+9. Safepoints + GC stack maps in the optimizer (liveness/regalloc).
+10. Native emit for managed functions across aa64/x64/rv64.
+11. JIT/AOT RSN: `cfree rsn --jit`, and `cfree build` emitting native build tools.
+12. Wasm/C-backend managed support (furthest out).
+
+**Then:** the build layer.
+
+## Resolved decisions
+
+All of: the architecture (typed CG-API frontend; GC-aware IR; precise non-moving
+mark-sweep; cooperative fibers; staged interp → 1a arena → 1b GC → native);
+`let`/`var`; non-null + `T?`; `Result`/`try`/`catch`; ADTs (`struct`/`enum`) with
+`match`; `any` narrowing via lowercase type-patterns in `match`; **UFCS methods**;
+**string interpolation**; **`#` line comments**; expression `if`/`match`
+with the `{}`-block rule and the struct-literal-in-condition restriction;
+qualified module imports; Lua-style patterns; separate `int`/`float` with explicit
+conversion; **integer overflow traps**; **selective string interning**;
+**per-operation capability gating**; **first-class tuples with one uniform
+`pattern` grammar (Rust-style) + parallel assignment**; **`match` as the single
+multi-way construct (no `switch`) with or-patterns and no fallthrough**; Stage-1
+generics = builtin containers only; the semantics defaults above; NaN-box value
+model dropped.
+
+## Open decisions
+
+None block Stage 1a. Deferred until their phase: **selective imports** (`from …
+import { … }` sugar), **range syntax** (`a..b` sugar vs the `range()` builtin) —
+which would also unlock **range patterns** in `match` (`1..10 -> …`),
+**record-style enum payloads** (`Variant { … }`), and **user-defined generic
+functions** (instantiation strategy + constraints — see Type system). The earlier
+tuple gap is now resolved (first-class `(T, U)`, see Type system / Value model /
+Grammar); the remaining tuple sub-question deferred to its phase is **packed
+all-scalar tuple representation** (e.g. unboxed `(int, int)`), niche-optimized
+alongside packed `list<int>`. **Match guards** (`pattern if cond -> …`) are
+deliberately deferred — `match` stays structural + or-pattern dispatch, and
+conditional logic lives in `if`/`else if`; guards can be added later
+non-breakingly (`matchArm = pattern [ "if" expr ] "->" …`).
+
+## Naming
+
+Language: **RSN — Ryan's Scripting Notation.** Source extension `.rsn`. Frontend
+`lang/rsn/`, CLI `cfree rsn`, frontend flag `CFREE_RSN_ENABLED`; the GC-aware IR
+it depends on is gated separately (`CFREE_MANAGED_IR`). When RSN ships, the
+durable design moves up to `doc/RSN.md` and this roadmap shrinks to what remains
+open.

	kit kit
	git clone https://git.ryansepassi.com/git/kit.git
	Log \| Files \| Refs \| README

A	doc/plan/RQL.md	\|	927	+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
A	doc/plan/RSN.md	\|	627	+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++