commit be9be587a0b6e6599ed17c6764e1da2d93f042ec
parent 86d3164a20fead7c9b3e92932a861a271d915e42
Author: Ryan Sepassi <rsepassi@gmail.com>
Date: Mon, 8 Jun 2026 11:15:00 -0700
plan: refresh doc index
Diffstat:
3 files changed, 8 insertions(+), 369 deletions(-)
diff --git a/doc/plan/CGFP.md b/doc/plan/CGFP.md
@@ -1,182 +0,0 @@
-# Make the internal `CmpOp` FP-compare-lossless (a "disjoint FP block")
-
-## Context
-
-kit has two stacked codegen interfaces (see `doc/plan/CODEGEN.md`, Track 2). The public
-API (`include/kit/cg.h`) splits compares by class — `KitCgIntCmpOp` (10) and the full
-IEEE-complete `KitCgFpCmpOp` (12: OEQ, ONE, OLT, OLE, OGT, OGE, UEQ, UNE, ULT, ULE, UGT,
-UGE). The internal `CgTarget` vocabulary `CmpOp` (`src/cg/cgtarget.h:60-75`) merges them into
-14 members and is **lossy for floating point**: it has no unordered relationals, and
-`CMP_EQ`/`CMP_NE` are overloaded between int and float (on float, `CMP_EQ`=ordered-equal,
-`CMP_NE`=unordered-not-equal). `api_map_fp_cmp` (`src/cg/value.c:503`) collapses the 12
-public FP predicates down to 6, so the ordered/unordered distinction **cannot reach any
-backend**.
-
-**Why fix it (accurate motivation — there is no live C miscompile):**
-- **Advertise-but-ignore (Principle 2).** A direct cg-api caller — or any future frontend, or
- the C99 `isgreater`/`isless`/`islessgreater`/`isunordered` math builtins if/when the C
- frontend implements them — that requests `ULT/ULE/UGT/UGE/UEQ/ONE` is silently lowered to
- the wrong (ordered) predicate, producing wrong NaN behavior. The public surface promises a
- distinction the backend never receives.
-- **A masked frontend mislabel.** Both the C frontend (`pcg_fp_cmp`, `cg_adapter.c:343`) and
- toy (`expr.c:1432`) map `!=` to **ONE** (ordered-not-equal). C `!=` on floats is
- **unordered** (`NaN != x` is true). This is "correct by accident" today *only* because the
- lossy collapse maps ONE → `CMP_NE`, which every backend implements as UNE. Making the map
- lossless would expose this — so the frontend fix is **coupled** to this change.
-- **A latent inversion bug.** `api_invert_cmp` (`src/cg/fold.c:214`) inverts ordered FP
- relationals to ordered duals (`OLT→OGE`) — wrong around NaN (the correct inverse of ordered
- `a<b` is *unordered* `a>=b`). It is currently **unreachable** because `api_cg_cmp`
- materializes FP compares immediately (`arith.c:181-192`) rather than building a delayed
- `SV_CMP`; it becomes a live miscompile the moment anyone fuses/delays an FP cmp+branch.
-- **A second, parallel lossy path.** The f128/`long double` soft-float path
- (`kit_cg_fp_cmp`, `arith.c:602-651`) collapses ordered/unordered onto shared libcalls the
- same way and must be fixed in lockstep.
-- **It is the correctness content of Track 2, decoupled from the structural split.** This
- change does *not* split the `cmp` hook, double the IR opcodes, or touch binop/unop. It makes
- the merged enum as expressive as the public one already is (a clean disjoint int-block +
- fp-block, isomorphic to `KitCgIntCmpOp ⊎ KitCgFpCmpOp`), so the eventual full split is
- reduced to pure type/hook plumbing with no correctness left in it.
-
-## Design
-
-### 1. The enum: disjoint FP block (`src/cg/cgtarget.h`)
-Replace the FP portion of `CmpOp`. Keep the 10 integer members unchanged (already 1:1 with
-`KitCgIntCmpOp`); replace `CMP_LT_F/LE_F/GT_F/GE_F` and the FP overloading of `CMP_EQ/NE`
-with a clean 12-member FP block laid out **after** the int block, in the **same order** as
-`KitCgFpCmpOp`:
-
-```
-CMP_EQ, CMP_NE, CMP_LT_S, CMP_LE_S, CMP_GT_S, CMP_GE_S, CMP_LT_U, CMP_LE_U, CMP_GT_U, CMP_GE_U, // 0..9 (int, unchanged)
-CMP_OEQ_F, CMP_ONE_F, CMP_OLT_F, CMP_OLE_F, CMP_OGT_F, CMP_OGE_F, // 10..15 (FP ordered)
-CMP_UEQ_F, CMP_UNE_F, CMP_ULT_F, CMP_ULE_F, CMP_UGT_F, CMP_UGE_F // 16..21 (FP unordered)
-```
-
-`CMP_OEQ_F` (=10) is the new FP boundary. The old names `CMP_LT_F/LE_F/GT_F/GE_F` are
-**removed** so `-Werror` hard-errors every stale *name* reference. Update the header comment
-(`cgtarget.h:54-59`) to state the block is now IEEE-complete (drop the "does not encode the
-ordered/unordered distinction" disclaimer).
-
-### 2. Mappings (`src/cg/value.c`)
-- `api_map_int_cmp` — unchanged (already 1:1).
-- `api_map_fp_cmp` — make **1:1** (OEQ→`CMP_OEQ_F`, …, UGE→`CMP_UGE_F`). Since the FP block
- mirrors the public enum order, this is `return (CmpOp)(CMP_OEQ_F + op);` (keep as an
- explicit switch for clarity; the full Track 2 deletes it later).
-
-### 3. NaN-aware inversion (`src/cg/fold.c:214`, `api_invert_cmp`)
-Replace the 4 buggy FP arms with the full 12-member table. Rule: **flip ordered↔unordered and
-negate the relation** (eq↔ne, lt↔ge, le↔gt, gt↔le, ge↔lt):
-
-```
-OEQ→UNE ONE→UEQ OLT→UGE OLE→UGT OGT→ULE OGE→ULT
-UEQ→ONE UNE→OEQ ULT→OGE ULE→OGT UGT→OLE UGE→OLT
-```
-
-### 4. Dispatch simplification
-FP-ness is now self-describing from the opcode: replace the four `op >= CMP_LT_F ||
-((op==CMP_EQ||op==CMP_NE) && loc_is_fp(...))` sites with `op >= CMP_OEQ_F`, dropping the
-operand-sniffing clause (FP eq/ne are now distinct opcodes). Sites: `interp/engine.c:341`,
-`rv64/native.c:998`, `rv64/native.c:1148`, `x64/native.c:1112`.
-**Caveat:** aa64 does **not** use this boundary — `aa_emit_cmp_to_flags`
-(`aa64/native.c:2185-2191`) chooses FCMP-vs-SUBS from the register class via `loc_is_fp(lhs)`;
-that operand-sniff must **stay** (it's precision/class selection, not predicate decode).
-Single/double precision selection (rv64 `RV_FMT_D/S`, x64 `0x66` prefix) is type-driven
-everywhere and is unaffected.
-
-### 5. Backend emission — the 6 new unordered predicates
-Key identity used throughout: **`unordered-R = NOT(ordered-¬R)`** — `ULT=!(OGE)`, `ULE=!(OGT)`,
-`UGT=!(OLE)`, `UGE=!(OLT)`, `UEQ=!(ONE)`, `UNE=!(OEQ)`. Each backend renames its 4 ordered
-arms (`LT_F→OLT_F`, …) + the FP eq/ne arms (old `CMP_EQ/NE`-on-float → `CMP_OEQ_F`/`CMP_UNE_F`)
-and adds the unordered arms. Per-arch reality (cost ~moderate, **not** the 300-500 LOC a naive
-read suggests; rv64 is *not* the long pole once negation is used):
-
-| Backend | File:hook | Strategy + trap |
-|---|---|---|
-| interp | `engine.c:340 do_cmp` | trivial: direct C doubles w/ explicit `x!=x\|\|y!=y`; bump gate to `CMP_OEQ_F` |
-| rv64 | `native.c:991 rv_cmp`, `:1136 rv_cmp_branch` | ordered feq/flt/fle + `xori …,1` for unordered. `flt/fle` are **signaling** (raise NV on NaN) — pre-existing for ordered ops; document, boolean is correct. `default:` is `rv_panic` (traps, not silent) |
-| x64 | `native.c:1161 x64_cmp`, `:1334 cmp_branch`, `:1095 cmp_to_cc` | `ucomisd` sets ZF/PF/CF (PF=unordered). Ordered=AND-with-NP, unordered=OR-with-P. **Build each predicate's flag formula explicitly via De Morgan** — do not blindly `!(opposite)`; the existing GT/GE arms skip parity correction. `cmp_to_cc default:` returns E (**silent** — see methodology) |
-| aa64 | `native.c:634 cmp_cond`, `:2205 aa_cmp`, `:1668 cmp_branch` | keep quiet `FCMP` (`:482`, not FCMPE). Build full 12-entry FP cond table from the ARM FP-condition reference. **`UEQ` (equal-or-unordered) and `ONE` (ordered-and-not-equal) have no single AArch64 condition** → 2 instructions (e.g. fcmp + cset + csinc). `cmp_cond default:` returns 0x0=EQ (**silent, highest-risk**) |
-| c_target | `c_emit.c:1340 cmp_to_c`, `:1404 c_emit_cmp` | operator-negation in generated C: `ULT→!((a)>=(b))`, `ONE→((a)<(b)\|\|(a)>(b))`, `UEQ→(!((a)<(b))&&!((a)>(b)))`, `UNE→(a)!=(b)`; no `isnan`. Must restructure `c_emit_cmp` to emit compound expressions (not just one operator) and wrap `!(...)` around the full cast-bearing comparison. Don't let the host compiler use `-ffast-math` |
-| wasm | `emit.c:2493 cmp_kind` | **FP eq/ne arms are currently absent** — today f32/f64 eq/ne falls through to integer `WASM_INSN_*_EQ/NE` (a validator type error); add all 12 from scratch, not "rename". Unordered via `i32.eqz` of the opposite ordered compare; `ONE`/`UEQ` via or-combine |
-
-### 6. f128 / soft-float path (`src/cg/arith.c:602-651`)
-Independent of the enum (it uses integer compares on libcall results), but the same
-collapse. Make it lossless using `__unordtf2` (nonzero iff either operand is NaN):
-- ordered relationals: unchanged (`OLT→__lttf2 <0`, etc.).
-- unordered relationals: `ULT = (__lttf2(a,b) < 0) || (__unordtf2(a,b) != 0)`, and similarly
- for ULE/UGT/UGE.
-- `UEQ = (__eqtf2==0) || (__unordtf2!=0)`; `OEQ = (__eqtf2==0)`.
-- `ONE = (__netf2!=0) && (__unordtf2==0)`; `UNE = (__netf2!=0)` (note: `__netf2!=0` already
- means unordered-not-equal, so today's `ONE→__netf2+CMP_NE` actually computes UNE — same
- masking as the scalar path). Verify `__unordtf2` is provided by the runtime (`rt/`).
-
-### 7. Coupled frontend correctness fix (lands with the lossless map)
-- `lang/c/parse/cg_adapter.c:343` — `case CMP_NE: return KIT_CG_FP_UNE;` (was `ONE`).
-- `lang/toy/expr.c:1432` — `case TOK_NE: fp_cmp = KIT_CG_FP_UNE;` (was `ONE`).
-These touch only the frontends' own private enums/tokens → public mapping; they do **not**
-depend on or conflict with the internal-enum rename (the C frontend has its *own* `CmpOp` at
-`cg_adapter.h:88`, so the `-Werror` name removal in `cgtarget.h` must not be globally
-sed-applied into `lang/`).
-
-### 8. Dumper + IR storage
-- `src/cg/ir_dump.c:106 cg_ir_cmp_name` — rename/add the 12 FP name strings (no `default:`, so
- `-Wswitch` *will* flag omissions here).
-- IR storage is already wide enough: `CgIrCmpBranchAux.op` is a full `CmpOp` (`ir.h:112`),
- `extra.imm` is `i64`; recorder/lower pass the op through transparently — **no change**.
-
-### Confirmed non-issues (do not touch)
-`pass_native_emit.c` / `cg_ir_lower.c` pass the op through; `pass_simplify.c:simplify_cmp`
-bails on float via `simplify_width` (`:6`) before the switch; `pass_o2.c gvn_fold_cmp` is
-int-only and `gvn_commutative_cmp` is EQ/NE-only (FP eq/ne CSE is an optional future add);
-`pass_jump.c:invert_cmp` (`:177`) has `default: return 0` and stays conservatively safe for all
-FP. No array is indexed by `CmpOp`; no `CmpOp` consumers in `src/asm` or `src/debug`.
-
-## Methodology note — `-Werror` will NOT catch the dangerous sites
-`Makefile` uses `-Wall -Wextra -Werror` but not `-Wswitch-enum`; `-Wswitch` only fires on a
-switch with **no `default:`**. Removing the old names flags every stale *name* use (good), and
-flags missing enumerators in the no-`default` switches (`api_invert_cmp`, `cg_ir_cmp_name`,
-wasm `cmp_kind`, c_target `cmp_to_c`). But the `default:`-bearing dispatch tables **silently
-mishandle** the new ops: `aa64 cmp_cond`→EQ, `x64 cmp_to_cc`→E, `interp do_cmp`, `x64_cmp` FP
-switch. Treat the per-backend table in §5 as the authoritative checklist, and/or compile the
-touched files with `-Wswitch-enum` while developing.
-
-## Files to modify (grouped)
-- **Enum/map/fold/dump:** `src/cg/cgtarget.h`, `src/cg/value.c`, `src/cg/fold.c`,
- `src/cg/ir_dump.c`
-- **Soft-float:** `src/cg/arith.c` (f128 path)
-- **Backends:** `src/arch/{aa64,x64,rv64}/native.c`, `src/arch/c_target/c_emit.c`,
- `src/arch/wasm/emit.c`, `src/interp/engine.c`
-- **Frontends (coupled fix):** `lang/c/parse/cg_adapter.c`, `lang/toy/expr.c`
-- **Tests:** `test/interp/interp_smoke_test.c` (rename refs + extend), new
- `test/api/cg_fp_cmp_test.c`
-
-## Recommended sequencing (red-green, keep every commit green)
-1. **Red:** add the new public-API test (`test/api/cg_fp_cmp_test.c`) driving all 12
- `kit_cg_fp_cmp(KIT_CG_FP_*)` predicates through the in-process JIT/interp with NaN and
- ordinary operands; it fails today on `ULT/ULE/UGT/UGE/UEQ/ONE` (and on f128).
-2. **Core (atomic):** rewrite `CmpOp` (disjoint block), make `api_map_fp_cmp` 1:1, fix
- `api_invert_cmp`, update all 6 backends (interp included) via the §5 checklist, the 4 boundary
- checks (§4), `cg_ir_cmp_name`, and the internal-name refs in `interp_smoke_test.c`
- (`CMP_LT_F→CMP_OLT_F`, `CMP_GE_F→CMP_OGE_F`, float `CMP_EQ→CMP_OEQ_F`, `CMP_NE→CMP_UNE_F`).
- Single commit so a `make bootstrap` byte-identity check lands on one consistent state.
-3. **Frontend `!=` fix** (§7) — small, independent commit.
-4. **f128 lossless** (§6) — self-contained commit.
-5. Extend `spec_fp_cmp_nan` to all 12 internal predicates × {NaN-lhs, NaN-rhs, both-NaN,
- ordered, -0.0/0.0}.
-
-## Verification
-- **Unit (internal level):** `make test-isa test-opt`; the extended `spec_fp_cmp_nan` in
- `make test-...` for interp guards each backend's emission of the new internal ops. NaN is
- injected via `bitsd(0x7ff8000000000000ull)` (already in the test).
-- **End-to-end public API (the advertise-but-ignore guard):** `make test-cg-api` runs the new
- `cg_fp_cmp_test.c`. It must drive the **public** enum directly — the discriminating cases
- (`ULT/ULE/UGT/UGE/UEQ/ONE`) cannot be produced by any C/toy source, so they are only
- reachable through `kit_cg_fp_cmp`. Assert per-predicate results for `(NaN,1.0)`,
- `(1.0,NaN)`, `(NaN,NaN)`, `(1.0,2.0)`, `(-0.0,0.0)`. Add a **`long double` variant** to cover
- the f128 path (§6) and a **wasm** case to surface the missing FP eq/ne (§5).
-- **Per-arch execution:** run the cg-api/JIT cases on x64/aa64/rv64 via
- `test/lib/exec_target.sh` (`exec_target_queue`/`flush`).
-- **Frontend `!=`:** a C execution test `int g(float a,float b){return a!=b;}` with `a=NaN`
- must return 1 (UNE), guarding §7 against the now-unmasked mislabel.
-- **No regression:** `make test-asm test-lex test-parse test-toy test-smoke-x64
- test-smoke-rv64`; and `make bootstrap` must still reproduce **byte-identical at -O0 AND -O1**
- (a no-regression guard — it won't exercise unordered predicates itself).
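Reviewer note on the removed CGFP.md plan: the NaN-aware inversion rule in §3 and the `unordered-R = NOT(ordered-¬R)` identity in §5 can be checked on the host with a small sketch. This is only an illustration under stated assumptions: `FpCmp`, `fp_eval`, `fp_invert`, and `fp_invert_is_exact` are hypothetical names invented here, not kit's `CmpOp` or `api_invert_cmp`, and host `double` comparisons stand in for backend emission.

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical stand-in for the 12-member FP block (not kit's real enum). */
typedef enum {
    FP_OEQ, FP_ONE, FP_OLT, FP_OLE, FP_OGT, FP_OGE,
    FP_UEQ, FP_UNE, FP_ULT, FP_ULE, FP_UGT, FP_UGE
} FpCmp;

/* Evaluate one predicate directly on host doubles. */
static bool fp_eval(FpCmp op, double a, double b) {
    bool un = (a != a) || (b != b); /* true iff either operand is NaN */
    switch (op) {
    case FP_OEQ: return a == b;
    case FP_ONE: return a < b || a > b;
    case FP_OLT: return a < b;
    case FP_OLE: return a <= b;
    case FP_OGT: return a > b;
    case FP_OGE: return a >= b;
    case FP_UEQ: return un || a == b;
    case FP_UNE: return a != b;
    case FP_ULT: return un || a < b;
    case FP_ULE: return un || a <= b;
    case FP_UGT: return un || a > b;
    case FP_UGE: return un || a >= b;
    }
    return false;
}

/* Inversion table from §3: flip ordered<->unordered and negate the relation. */
static FpCmp fp_invert(FpCmp op) {
    switch (op) {
    case FP_OEQ: return FP_UNE;
    case FP_ONE: return FP_UEQ;
    case FP_OLT: return FP_UGE;
    case FP_OLE: return FP_UGT;
    case FP_OGT: return FP_ULE;
    case FP_OGE: return FP_ULT;
    case FP_UEQ: return FP_ONE;
    case FP_UNE: return FP_OEQ;
    case FP_ULT: return FP_OGE;
    case FP_ULE: return FP_OGT;
    case FP_UGT: return FP_OLE;
    case FP_UGE: return FP_OLT;
    }
    return op;
}

/* Returns 1 iff fp_invert(op) is the exact logical negation of op on
 * every pair drawn from the sample operands (NaN, ordinary, signed zero). */
static int fp_invert_is_exact(void) {
    const double s[] = { NAN, 1.0, 2.0, -0.0, 0.0 };
    for (int op = FP_OEQ; op <= FP_UGE; op++)
        for (int i = 0; i < 5; i++)
            for (int j = 0; j < 5; j++)
                if (fp_eval((FpCmp)op, s[i], s[j]) ==
                    fp_eval(fp_invert((FpCmp)op), s[i], s[j]))
                    return 0;
    return 1;
}
```

The operand set mirrors the cases the plan's verification section lists: NaN on either side, both NaN, an ordered pair, and `-0.0`/`0.0`. An ordered-to-ordered inversion such as `OLT→OGE` would fail this check on any NaN pair, which is the latent bug §3 describes.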
diff --git a/doc/plan/CG_OBJ_LIFECYCLE.md b/doc/plan/CG_OBJ_LIFECYCLE.md
@@ -1,184 +0,0 @@
-# CG / ObjBuilder Lifecycle
-
-This is the target lifecycle for semantic code generation and object building.
-It is motivated by LTO, but it should be true for ordinary one-TU compilation
-as well: `ObjBuilder` owns object lifetime, while `KitCg` borrows an object and
-finishes codegen into it.
-
-Status (2026-06-04): the borrowed CG/object lifecycle is implemented as the only
-public CG session interface. `kit_cg_free` aborts and detaches without flushing,
-lowering, debug-emitting, or finalizing the borrowed object. Shared-library LTO
-remains disabled until that output path is exercised.
-
-## Problem
-
-Historically `KitCg` had an object-shaped lifecycle:
-
-```c
-cg_begin_object(cg, ob, code_opts);
-frontend_compile_cg(..., cg);
-cg_end_object(cg);
-kit_obj_builder_finalize(ob);
-```
-
-That was the wrong ownership boundary. `KitCg` does not create, emit, link, or
-free the object; the caller does. In the borrowed lifecycle, `kit_cg_finish`
-finalizes the CG target and emits debug, while `kit_cg_detach` drops the
-borrowed object/target links. `kit_cg_free` follows the abort path and never
-finishes a partial object as a side effect of cleanup.
-
-The object-shaped lifecycle also makes LTO harder to finish cleanly. LTO needs to collect multiple source
-units into one object, then finish semantic codegen only after the driver/linker
-has enough information to provide preserved/export policy. That handoff should
-be a `KitCg` finish option, not a driver-owned pseudo-unit abstraction.
-
-## Ownership Model
-
-`ObjBuilder` owns object state:
-
-- symbol identity and the name-to-id index;
-- sections, atoms, relocations, data bodies, common symbols, and object metadata;
-- object-level finalization and emission;
-- object lifetime and cleanup.
-
-`KitCg` owns a semantic codegen session attached to an object:
-
-- the current target/recorder/backend;
-- codegen options and whole-module optimization state;
-- source-unit boundaries and provenance;
-- debug/codegen state that is produced by semantic lowering;
-- the final codegen flush into the borrowed object.
-
-The driver or API caller owns orchestration:
-
-- creating/freeing `ObjBuilder`;
-- deciding source order and which inputs are semantic vs opaque;
-- passing link-picture policy to codegen finish;
-- calling `kit_obj_builder_finalize` and then emitting/linking the object.
-
-## Target API Shape
-
-The exact names can change, but the shape should be explicit:
-
-```c
-KitObjBuilder* ob = NULL;
-KitCg* cg = NULL;
-
-kit_obj_builder_new(compiler, &ob);
-kit_cg_new(compiler, &cg);
-
-kit_cg_begin(cg, ob, &code_opts); /* borrow ob, attach backend */
-kit_cg_begin_unit(cg, &unit_opts); /* source contribution */
-frontend_compile_cg(..., cg);
-kit_cg_end_unit(cg);
-kit_cg_finish(cg, &finish_opts); /* flush/lower/debug into ob */
-kit_cg_detach(cg); /* drop borrowed links */
-
-kit_obj_builder_finalize(ob);
-```
-
-For multi-source LTO, only the unit loop grows:
-
-```c
-kit_obj_builder_new(compiler, &ob);
-kit_cg_new(compiler, &cg);
-kit_cg_begin(cg, ob, &code_opts);
-
-for each semantic source:
- kit_cg_begin_unit(cg, &unit_opts);
- frontend_compile_cg(..., cg);
- kit_cg_end_unit(cg);
-
-kit_cg_finish(cg, &finish_opts);
-kit_cg_detach(cg);
-kit_obj_builder_finalize(ob);
-```
-
-Opaque frontends do not attach to `KitCg`; they compile directly into their own
-`ObjBuilder` and enter link/archive/relocatable order as ordinary objects.
-
-## Object vs Unit
-
-An object is the emitted product. It may contain one source unit or many.
-
-A unit is one semantic source contribution inside the object. Unit boundaries
-are not object boundaries. They exist so codegen can track:
-
-- source name and source identity;
-- ODR/duplicate-definition provenance;
-- debug compilation-unit identity;
-- file-scope asm and file-scope language state boundaries;
-- future per-source codegen options or path-map state;
-- contribution tables for "symbol X was defined by unit N".
-
-## Finish Options
-
-`kit_cg_finish` is where link-picture-dependent policy enters semantic
-optimization. For LTO, finish options should eventually carry:
-
-- preserved symbols: entry, dynamic exports, opaque undefined references,
- `used`, init/fini, asm-named/address-significant symbols, IFUNC, etc.;
-- output policy: executable, shared library, relocatable, archive member;
-- interposition policy: default-visibility shared-library symbols are
- interposable unless hidden/version-script/`-Bsymbolic` policy says otherwise;
-- debug policy for cross-unit inlining.
-
-The finish operation may use internal `ObjSymId` sets when the linker/driver has
-already resolved names into the shared `ObjBuilder`. A public API can offer a
-name-based adapter if needed, but the core should prefer symbol ids once an
-object exists.
-
-`kit_cg_finish` must not call `kit_obj_builder_finalize`. The caller finalizes
-the object after CG has finished writing semantic output into it.
-
-## Failure Model
-
-Cleanup must not finalize by accident.
-
-- `kit_cg_finish` is the only operation that flushes/lowers/debug-emits CG state.
-- `kit_cg_abort` drops current CG-side state and detaches from the borrowed
- object without finalizing anything.
-- `kit_cg_free` never calls finish implicitly.
-- The caller decides whether to finalize or free the `ObjBuilder`.
-
-This fixes the old wart where freeing an open `KitCg` could finalize a partial
-object.
-
-## Boundary Rules
-
-Frontends should only see the `KitCg` semantic API or the object-only API they
-explicitly implement. A semantic frontend should not own `ObjBuilder`
-finalization, and an opaque frontend should not need a fake `KitCg`.
-
-`ObjBuilder` should remain the single source of truth for object symbol identity
-and storage. CG may ask it to declare/define/merge contributions, but CG should
-not own object lifetime.
-
-The driver should not implement symbol merge, semantic finalization, or
-internalization policy. It should gather sources, gather opaque inputs, compute
-or request preserved/export policy, and pass that policy to `kit_cg_finish`.
-
-## Migration Plan
-
-1. Introduce borrowed-lifecycle names as the public API:
- `kit_cg_begin`, `kit_cg_finish`, `kit_cg_detach`, and `kit_cg_abort`.
-2. Make one-TU semantic compilation use the same borrowed lifecycle that LTO
- uses: caller creates `ObjBuilder`, CG borrows it, CG finishes, caller
- finalizes the object.
-3. Add `begin_unit` / `end_unit` bookkeeping and use it in ordinary one-TU and
- multi-source LTO paths.
-4. Move output-kind and preserved/export input into `kit_cg_finish` options.
- The driver now passes output-kind/interposition policy for supported outputs;
- preserved-symbol computation, internalization, and shared-library LTO remain
- follow-up work, so global roots stay conservative.
-5. Move duplicate function/data contribution bookkeeping toward the
- `ObjBuilder`/CG contribution boundary so `src/opt` and `src/cg/data.c` do not
- each own fragments of LTO symbol-resolution policy.
-
-## Non-Goals
-
-- This does not introduce a separate public `LtoUnit` abstraction.
-- This does not require serialized IR objects.
-- This does not make frontends own object finalization.
-- This does not make opaque inputs semantic; asm and prebuilt objects remain
- ordinary object participants.
diff --git a/doc/plan/README.md b/doc/plan/README.md
@@ -11,15 +11,20 @@ shrinks to whatever remains open.
| [RELEASE.md](RELEASE.md) | Cross-cutting initial-release punchlist: release scope, deferred features, and per-subsystem completion/validation items. | — |
| [OPTIMIZER.md](OPTIMIZER.md) | Completing the O2 SSA mid-end, expanded inlining, -O0/-O1 performance work, machine register-constraint improvements. | [../OPT.md](../OPT.md) |
| [LINKER.md](LINKER.md) | Incremental linking: the file-based object-link redesign and remaining non-ELF format coverage. | [../LINK.md](../LINK.md) |
-| [RELOC.md](RELOC.md) | Genericizing the canonical-`RelocKind` half of the relocation layer. The arch-identity switches and the reloc-name table already landed (modularity waves); what remains is the structural denormalization — one per-arch `RelocDesc {width, flags}` table replacing the parallel width/GOT switches, and the byte-patcher's encoders moved into the arch backends behind the single public entry. | [../OBJ.md](../OBJ.md), [../LINK.md](../LINK.md) |
+| [RELOC.md](RELOC.md) | Genericizing the canonical-`RelocKind` half of the relocation layer. WS-B/C/E all landed (per-arch `RelocDesc` table, byte-patcher partitioned per-arch, FreeBSD IFUNC/IRELATIVE); only optional WS-A enum collapse remains. | [../OBJ.md](../OBJ.md), [../LINK.md](../LINK.md) |
| [JIT.md](JIT.md) | Function-level hot reload, Go-runtime-style codegen support, and remaining JIT host-portability work. | [../JIT.md](../JIT.md) |
| [DEBUG.md](DEBUG.md) | The Windows debugger host adapter, x64/rv64 displaced single-step, profiling, and DWARF gaps. | [../DBG.md](../DBG.md), [../DWARF.md](../DWARF.md) |
| [WASM.md](WASM.md) | Completing the Wasm object backend and remaining parser/validator coverage. | [../WASM.md](../WASM.md) |
| [ARCH.md](ARCH.md) | Remaining native-backend completeness for x64/rv64 relative to the aa64 reference, and per-call cost follow-ups. | [../ARCH.md](../ARCH.md) |
| [BOOTSTRAP.md](BOOTSTRAP.md) | The 3-stage self-build reproducibility goal and the open `-O1` issues blocking it. | [../BUILD.md](../BUILD.md) |
-| [IMAGE_INSPECT.md](IMAGE_INSPECT.md) | Extending object inspection to executables and shared libraries. | [../OBJ.md](../OBJ.md) |
+| [IMAGE_INSPECT.md](IMAGE_INSPECT.md) | Extending object inspection to executables and shared libraries. COFF/PE image reader is the main remaining gap. | [../OBJ.md](../OBJ.md) |
| [BUILD.md](BUILD.md) | A new content-addressed build coordinator (Bazel/Nix-style incremental builds layered on the CAS) — storage state machine, caching algorithm, recipe protocol. Distinct from `../BUILD.md` (kit's own Makefile build). | — (new subsystem) |
| [BUILD_COMMANDS.md](BUILD_COMMANDS.md) | The kit-native `build-exe`/`build-lib`/`build-obj` verbs that replace `compile`: polyglot, in-memory compile+link with `--group` flag scoping and full link-flag control. Distinct from `BUILD.md` (the CAS coordinator). | [../DRIVER.md](../DRIVER.md) |
| [LLGEN_IMPORT.md](LLGEN_IMPORT.md) | Importing the standalone LL(1)/Pratt parser and lexer generator into libkit, including public API renames, file moves, build gates, and a `kit llgen` command. | — |
-| [BACKTRACE.md](BACKTRACE.md) | Stack-trace support: GCC-compatible `__builtin_return_address`/`__builtin_frame_address` primitives, a freestanding `__kit_backtrace` capture helper, and symbolized backtrace printing. | [../FRONTENDS.md](../FRONTENDS.md), [../RUNTIME.md](../RUNTIME.md), [../DWARF.md](../DWARF.md) |
+| [BACKTRACE.md](BACKTRACE.md) | Stack-trace support: GCC-compatible `__builtin_return_address`/`__builtin_frame_address` primitives, a freestanding `__kit_backtrace` capture helper, and symbolized backtrace printing. L1–L3a/L3c shipped; L3b (in-process self-symbolization) deferred. | [../FRONTENDS.md](../FRONTENDS.md), [../RUNTIME.md](../RUNTIME.md), [../DWARF.md](../DWARF.md) |
+| [LTO.md](LTO.md) | Whole-program optimization: `symresolve` extraction, cross-TU inlining, internalization. Phase 0 (whole-TU opt) and Phase 1 (all-sources-up-front LTO) shipped; Phase 2 (serialized `.kit.ir` objects) open. | [../OPT.md](../OPT.md) |
+| [CODEGEN.md](CODEGEN.md) | CG API interface cleanup: PLACE/VALUE centerpiece, op/intrinsic taxonomy, atomic/order/AsmDir unification, multi-result API, i128/f128-as-VALUE. Tracks 1/3/4/5/6/7 landed; Track 2 (binop/cmp split) and Track 1c open. | [../CODEGEN.md](../CODEGEN.md) |
+| [RV32.md](RV32.md) | riscv32-none-elf backend: all workstreams (WS0–WS9) complete including 64-bit-value legalization at ilp32f/ilp32. Known gaps (`__int128`, i64 atomics, i64 varargs, TLS) are intentionally left red. | [../ARCH.md](../ARCH.md) |
+| [DIST_LIBRARY.md](DIST_LIBRARY.md) | Migrating the CAS/package distribution subsystem into libkit as a gated public API (`kit/cas.h`, `kit/package.h`). Main migration shipped; Stage 3 v2 dead-code deletion deferred. | [../DISTRIBUTE.md](../DISTRIBUTE.md) |
+| [FREEBSD.md](FREEBSD.md) | FreeBSD target support: VM harness, triple parsing, runtime variants, COMDAT/`STB_GNU_UNIQUE` fixes. Static link blocked on archive weak-alias cycle (needs `--start-group` semantics); dynamic link and full VM validation remaining. | — |
| [TODO.md](TODO.md) | Open deferred fixes and code smells only. Completed items are removed instead of checked off. Not a roadmap; a current backlog. | — |