commit bad3d6b438f65e991b1aaa1845c0aeb7651158a0
parent b499c2351e3bc118a107a8693d35413c6e8e0db3
Author: Ryan Sepassi <rsepassi@gmail.com>
Date: Fri, 5 Jun 2026 15:03:38 -0700
doc/plan: rescope RELOC.md after the modularity waves landed
The reloc-path identity switches are already gone (LinkArchDesc.tpoff64_reloc
for #25, obj_format_static_ifunc_via_rela_iplt for #18) and the reloc-name
table moved onto ObjElfArchOps.reloc_name (#24, still x86_64-ELF-gated for
objdump golden compat). Mark those as landed and rescope the open work to the
structural denormalization that remains: a per-arch RelocDesc {width, flags}
table replacing the generic reloc_width / reloc_uses_got / reloc_is_tls_got
switches (and the duplicating LinkArchDesc.is_* hooks), and partitioning
link_reloc_apply's encoders into the arch backends behind the single entry.
Diffstat:
2 files changed, 231 insertions(+), 260 deletions(-)
diff --git a/doc/plan/README.md b/doc/plan/README.md
@@ -11,7 +11,7 @@ shrinks to whatever remains open.
| [RELEASE.md](RELEASE.md) | Cross-cutting initial-release punchlist: release scope, deferred features, and per-subsystem completion/validation items. | — |
| [OPTIMIZER.md](OPTIMIZER.md) | Completing the O2 SSA mid-end, expanded inlining, -O0/-O1 performance work, machine register-constraint improvements. | [../OPT.md](../OPT.md) |
| [LINKER.md](LINKER.md) | Incremental linking: the file-based object-link redesign and remaining non-ELF format coverage. | [../LINK.md](../LINK.md) |
-| [RELOC.md](RELOC.md) | Genericizing the canonical-`RelocKind` half of the relocation layer: one per-arch `RelocDesc` table replacing the parallel width/GOT/name switches, the byte-patcher's encoders moved into the arch backends behind the single public entry, and the residual arch/OS identity gates removed. | [../OBJ.md](../OBJ.md), [../LINK.md](../LINK.md) |
+| [RELOC.md](RELOC.md) | Genericizing the canonical-`RelocKind` half of the relocation layer. The arch-identity switches and the reloc-name table already landed (modularity waves); what remains is the structural denormalization — one per-arch `RelocDesc {width, flags}` table replacing the parallel width/GOT switches, and the byte-patcher's encoders moved into the arch backends behind the single public entry. | [../OBJ.md](../OBJ.md), [../LINK.md](../LINK.md) |
| [JIT.md](JIT.md) | Function-level hot reload, Go-runtime-style codegen support, and remaining JIT host-portability work. | [../JIT.md](../JIT.md) |
| [DEBUG.md](DEBUG.md) | The Windows debugger host adapter, x64/rv64 displaced single-step, profiling, and DWARF gaps. | [../DBG.md](../DBG.md), [../DWARF.md](../DWARF.md) |
| [WASM.md](WASM.md) | Completing the Wasm object backend and remaining parser/validator coverage. | [../WASM.md](../WASM.md) |
diff --git a/doc/plan/RELOC.md b/doc/plan/RELOC.md
@@ -1,151 +1,153 @@
# Relocation-layer genericization (planned work)
-## Status — 2026-06-05 — proposed; nothing built yet
+## Status — 2026-06-05 — partially landed; descriptor + encoder-partition remain
This roadmap makes the **canonical-`RelocKind` half** of the relocation subsystem
as modular as the wire half already is. The goal is the project's standing
contract (see [../INTERFACES.md](../INTERFACES.md)): code that depends on a
pluggable item — here, the target **arch** — must never switch on its identity,
and adding or changing an arch's relocations must touch exactly **one place**.
-Today the wire-translation half meets that bar; the canonical half does not.
+
+The "modularity wave" commits (`9d905b3c..769d6ae1`) already closed the two
+identity *switches* in the reloc path and moved the reloc-name table onto a
+per-arch hook, all via the incremental capability-hook style (narrow fields/hooks
+on the existing `LinkArchDesc` / `ObjElfArchOps` vtables). **What remains is the
+structural denormalization**: the per-kind static facts (width, GOT/TLS class) are
+still re-enumerated in generic switches, and the byte-patcher's ISA encoders still
+live in the format-neutral obj layer. This revision marks the landed items as
+baseline and rescopes the open work accordingly.
Design docs this work feeds back into once shipped:
[../OBJ.md](../OBJ.md) ("Relocation model and the shared byte-patcher"),
[../LINK.md](../LINK.md) (the reloc passes), [../INTERFACES.md](../INTERFACES.md)
(the backend contract).
-## The thesis
-
-A relocation kind is a single logical entity with a handful of attributes — byte
-width, whether it is PC-relative, whether it loads a GOT slot, whether it is a
-TLS-GOT load, whether it is a branch needing a veneer, its display name, and how
-to patch its bytes. Today those attributes are **denormalized across five
-parallel `switch`/hook tables** that the compiler cannot keep in sync:
-
-| Attribute | Lives in | Form |
-|-----------|----------|------|
-| how to patch the bytes | `link_reloc_apply()` `src/obj/reloc_apply.c:83` | switch, 77 arms |
-| byte width | `reloc_width()` `src/link/link_reloc_layout.c:256` | switch |
-| uses GOT / is TLS-GOT | `reloc_uses_got()` / `reloc_is_tls_got()` `src/link/link_reloc_layout.c:392,380` | switch |
-| display name | `kit_obj_reloc_kind_name()` `src/api/object_file.c:358` | switch, keyed on (arch, fmt) |
-| branch / got-load / tlvp / direct-page | `LinkArchDesc.is_*` `src/link/link_arch.h:79-83` | per-arch hooks |
-
-Adding one relocation kind means editing up to four of these by hand, with no
-diagnostic if you miss one. Adding an **arch** means editing the three generic
-switches in `link_reloc_layout.c` and `api/object_file.c` even though a per-arch
-hook mechanism (`LinkArchDesc`) already exists right beside them. And because the
-canonical enum **arch-prefixes value-class kinds** that are byte-identical across
-arches (`R_X64_TPOFF64` vs `R_AARCH64_TPOFF64`), generic code is forced into a
-literal arch-identity switch to pick between them:
-
-```c
-/* src/link/link_reloc_layout.c:698 — the one arch-identity leak in the reloc layer */
-rrec.kind = (l->c->target.arch == KIT_ARCH_X86_64) ? R_X64_TPOFF64
- : R_AARCH64_TPOFF64;
-```
-
-This plan normalizes the model: **one per-arch descriptor table** is the single
-source for every static attribute of a relocation kind, the byte-patcher's
-implementation moves to the arch backends that own the ISA knowledge (behind the
-same single public entry), and the two residual OS/format identity gates in the
-reloc path move to obj-layer predicates.
+## Landed since this plan was first written (`9d905b3c..769d6ae1`)
+
+- **The one arch-identity switch is gone (was finding #25).** The
+ `(target.arch == KIT_ARCH_X86_64) ? R_X64_TPOFF64 : R_AARCH64_TPOFF64` ternary in
+ `link_emit_internal_tpoff64` is now `link_arch_desc_for(l->c)->tpoff64_reloc`, a
+ new per-arch `LinkArchDesc` field (`src/link/link_arch.h`, populated in
+ `src/arch/{aa64,x64,riscv}/link.c`). This is WS-A's *functional* fix via the
+ field route rather than the value-class collapse — the collapse remains an
+ optional cleanup (now WS-A below, downgraded).
+- **The FreeBSD static-IFUNC OS gate is gone (was finding #18).** `use_rela_iplt`
+ now calls `obj_format_static_ifunc_via_rela_iplt(c)` (`src/obj/obj.h:819`, impl
+ `src/obj/obj_secnames.c:371`) instead of `os == KIT_OS_FREEBSD && obj ==
+ KIT_OBJ_ELF`. WS-E item 1 is **done**.
+- **The reloc-name table moved to a per-arch hook (was finding #24, partially).**
+ `kit_obj_reloc_kind_name` no longer inlines an x86_64 table; it lowers the
+ canonical kind via `reloc_to` and calls the new `ObjElfArchOps.reloc_name`
+ (`src/obj/format.h:65`; impls `elf_{x86_64,aarch64,riscv}_reloc_name`). **But**
+ the dispatch is still gated `if (fmt != KIT_OBJ_ELF || arch != KIT_ARCH_X86_64)
+ return NULL;` (`src/api/object_file.c:384`): the aarch64/riscv `reloc_name`
+ functions exist but are deliberately *not* consulted, because the rv64/aa64
+ objdump golden corpus expects the arch-neutral spelling ("RV_CALL", not
+ "R_RISCV_CALL"). So the name *table* is now per-arch data, but a residual
+ two-axis identity gate remains, coupled to the test corpus. See WS-E item 3.
+
+Net: the reloc path now contains **no arch-identity branch**, but still
+denormalizes per-kind facts across generic switches (the structural work below).
+
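A minimal sketch of the field route described in the first bullet above. It uses trimmed-down stand-ins for `RelocKind` and `LinkArchDesc` (the real definitions carry many more members), and `sketch_internal_tpoff64_kind` is a hypothetical helper name. It shows generic code reading a per-arch field where it previously compared `target.arch`:

```c
/* Illustrative stand-ins only; not the real definitions from obj.h / link_arch.h. */
typedef enum RelocKind { R_NONE, R_X64_TPOFF64, R_AARCH64_TPOFF64 } RelocKind;

typedef struct LinkArchDesc {
  RelocKind tpoff64_reloc; /* kind used for the link-internal 64-bit tpoff fill */
} LinkArchDesc;

/* One descriptor per backend; each arch names its own kind. */
static const LinkArchDesc x64_desc   = { .tpoff64_reloc = R_X64_TPOFF64 };
static const LinkArchDesc aa64_desc  = { .tpoff64_reloc = R_AARCH64_TPOFF64 };
/* RISC-V reuses the AArch64-named kind, as noted under WS-A. */
static const LinkArchDesc riscv_desc = { .tpoff64_reloc = R_AARCH64_TPOFF64 };

/* Generic code: a field read, with no comparison against an arch constant. */
RelocKind sketch_internal_tpoff64_kind(const LinkArchDesc* d) {
  return d->tpoff64_reloc;
}
```

Adding a fourth arch in this model is one new descriptor initializer; the generic helper does not change.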
+## The thesis (what still stands)
+
+A relocation kind is a single logical entity. Its static attributes still live in
+parallel tables the compiler cannot keep in sync:
+
+| Attribute | Lives in | Status |
+|-----------|----------|--------|
+| how to patch the bytes | `link_reloc_apply()` `src/obj/reloc_apply.c:83` (switch, ~77 arms) | **open** — WS-C |
+| byte width | `reloc_width()` `src/link/link_reloc_layout.c:256` (switch) | **open** — WS-B |
+| uses GOT / is TLS-GOT | `reloc_uses_got()`/`reloc_is_tls_got()` `src/link/link_reloc_layout.c:392,380` (switch) | **open** — WS-B |
+| branch / got-load / tlvp / direct-page | `LinkArchDesc.is_*` `src/link/link_arch.h:79-82` (per-arch hooks) | duplicated by the above — WS-B |
+| display name | `ObjElfArchOps.reloc_name` `src/obj/format.h:65` (per-arch hook) | **landed** (with a residual gate — WS-E.3) |
+
+Two generic switch sites (`reloc_width`, `reloc_uses_got`/`reloc_is_tls_got`) still
+enumerate every arch's kinds, so adding an arch's reloc edits generic `link` code; and
+the GOT/branch classification is *answered twice* — once by those generic switches
+(consumed by the ELF/static GOT pass) and once by the per-arch `LinkArchDesc.is_*`
+hooks (consumed by the Mach-O linker). The byte-patcher's per-kind encoders — pure
+ISA knowledge — still sit in the format-neutral `src/obj/reloc_apply.c`.
## Baseline — already clean (context, not work)
-The **wire half is the model to imitate** and is out of scope except where noted:
-
-- **Per-(arch,format) wire translators.** `reloc_to`/`reloc_from` (and Mach-O's
- `reloc_pcrel`/`reloc_length`) live in `src/obj/{elf,macho,coff}/reloc_<arch>.c`
- and are reached only through the format sub-ops (`fmt->elf_arch(arch)->reloc_to`,
- etc., `src/obj/format.h:36-67`). Adding a format, or an arch's wire encoding for
- a format, is already a one-table change. These do **not** move.
-- **The single-entry byte-patcher boundary.** `link_reloc_apply(c, kind, P, S, A,
- P)` is reused verbatim by the static linker, the JIT linker, the assembler, and
- the emulator guest loader ([../OBJ.md](../OBJ.md): "one encoder, three loaders").
- That **one-entry, one-encoder invariant is load-bearing** — it is why the three
- loaders can never disagree on an encoding — and WS-D preserves it exactly: only
- the *implementation behind* the entry is partitioned, never the entry itself.
-- **`LinkArchDesc` per-arch PLT/IPLT geometry + stub emitters**
- (`src/link/link_arch.h`) is already the right shape and stays; WS-C/WS-D extend
- it, they do not replace it.
-- **The canonical `RelocKind` enum** (`src/obj/obj.h:108`) as a concept — one
- global enum, backends emit canonical kinds — is correct and stays. WS-A only
- removes the *spurious arch-prefixed duplicates*, not the per-arch families that
- are genuinely arch-specific (AArch64 ADRP/LDST immediates, RISC-V HI20/B-type
- scatter, etc.).
+- **Per-(arch,format) wire translators** (`reloc_to`/`reloc_from`/`reloc_pcrel`/
+ `reloc_length`, and now `reloc_name`) in `src/obj/{elf,macho,coff}/reloc_<arch>.c`,
+ reached only through the format sub-ops (`src/obj/format.h:55-81`). Adding a format
+ or an arch's wire encoding is a one-table change. These do **not** move; the
+ per-arch reloc *name* legitimately belongs here, not in the descriptor below.
+- **The single-entry byte-patcher boundary.** `link_reloc_apply(c, kind, P, S, A, P)`
+ is reused verbatim by the static linker, JIT linker, assembler, and emulator guest
+ loader ([../OBJ.md](../OBJ.md): "one encoder, three loaders"). That **one-entry,
+ one-encoder invariant is load-bearing** and WS-C preserves it: only the
+ implementation behind the entry is partitioned, never the entry.
+- **`LinkArchDesc`** already carries per-arch PLT/IPLT geometry, stub emitters, the
+ `is_*` classifiers, and now `tpoff64_reloc`. It is the proven home for per-arch
+ link facts; WS-B extends it (or a descriptor it points to), it does not replace it.
+- **The canonical `RelocKind` enum** (`src/obj/obj.h:108`) — one global enum,
+ backends emit canonical kinds — is correct and stays.
## The end state (ownership)
```
-src/obj/reloc.c neutral core: RelocDesc rows + byte encoders for
- the arch-independent data-word kinds (R_ABS*,
- R_REL*, R_PC*, R_TPOFF*, R_GOT32, R_PLT32), and
- the single public link_reloc_apply() dispatcher.
-src/arch/<arch>/reloc.c (NEW) that arch's RelocDesc rows (width + flags + name)
- AND its instruction-immediate byte encoders,
- registered on LinkArchDesc.
-src/obj/<fmt>/reloc_<arch>.c UNCHANGED — the per-(arch,fmt) wire translators.
-src/obj/coff/reloc.c COFF-specific kinds' RelocDesc rows (format, not arch).
+src/obj/reloc.c / reloc_apply.c neutral core: byte encoders for the
+ arch-independent data-word kinds (R_ABS*,
+ R_REL*, R_PC*, R_TPOFF*, R_GOT32, R_PLT32) and
+ the single public link_reloc_apply() dispatcher.
+src/arch/<arch>/reloc.c (NEW) that arch's RelocDesc rows (width + class flags)
+ AND its instruction-immediate byte encoders,
+ reached via LinkArchDesc.
+src/obj/<fmt>/reloc_<arch>.c UNCHANGED — the per-(arch,fmt) wire translators,
+ incl. the reloc_name spellings (already landed).
+src/obj/coff/reloc.c COFF-specific kinds' RelocDesc rows (format, not arch).
```
-After this, adding an arch's relocation is **one row** in that arch's `reloc.c`
-(plus its byte encoder and its wire translator, both already arch-local); adding
-an arch is one new `src/arch/<arch>/reloc.c`. No generic file in `src/link` or
-`src/api` enumerates relocation kinds any more.
+After this, adding an arch's relocation is **one row** (width + flags) in that
+arch's `reloc.c`, one byte encoder beside it, and one wire-translator entry — all
+arch-local. No generic file in `src/link` or `src/api` enumerates relocation kinds.
---
-## WS-A — Neutralize cross-arch duplicate value-class kinds (addresses **A**)
-
-**Problem.** `R_X64_TPOFF64` and `R_AARCH64_TPOFF64` are byte-identical in every
-table — same apply arm (`reloc_apply.c:97-107`, both `wr_u64_le(S+A)`), same
-width, same meaning ("64-bit TP-relative offset, written as a data word"). They
-are separate only because the names carry an arch prefix, and that split is what
-forces the arch-identity ternary at `link_reloc_layout.c:698`. `R_AARCH64_TPOFF64`
-is **link-internal only** (no wire mapping; it is minted solely by
-`link_emit_internal_tpoff64` to fill IE GOT slots), and `R_X64_TPOFF64` is the
-x86-64 wire kind that also doubles as that internal fill — so a single neutral
-kind serves both.
-
-**Change.** Introduce a neutral `R_TPOFF64` and delete both arch-prefixed forms.
-
-1. `src/obj/obj.h:108` — add `R_TPOFF64` to the neutral block (near `R_ABS64`);
- remove `R_X64_TPOFF64` and `R_AARCH64_TPOFF64`.
-2. `src/link/link_reloc_layout.c:698-699` — collapse the ternary to
- `rrec.kind = R_TPOFF64;`. **The arch-identity switch is gone.** Update the
- `link_emit_internal_tpoff64` banner comment (`:680-685`) to drop the per-arch
- spelling.
-3. `src/obj/reloc_apply.c:97-99` and `reloc_width()` `:272-279` — fold both old
- cases into the `R_TPOFF64` arm (unchanged body).
-4. `src/obj/elf/reloc_x86_64.c:50-51,109-110` — map `R_TPOFF64 ↔
- ELF_R_X86_64_TPOFF64` (the x64 table is the only one that serializes it; the
- aa64 table needs no entry, matching today).
-5. `src/obj/elf/link.c:352` and `src/obj/obj.c` (`_CASE(R_AARCH64_TPOFF64)` in
- `obj_reloc_kind_name`) — rename to `R_TPOFF64`.
-
-**Also audit** the rest of the enum for the same smell — arch-prefixed kinds whose
-apply/width/class are identical to a neutral kind — and neutralize any found. The
-known-good neutral set already present is `R_ABS32/64`, `R_REL32/64`, `R_PC32/64`,
-`R_GOT32`, `R_PLT32`; TPOFF64 is the one clear remaining duplicate. Kinds that are
-genuinely arch-specific (instruction-embedded immediates) **must not** be touched.
+## WS-A — Value-class kind collapse (addresses **A**) — *#25 done; collapse optional*
+
+**Status.** The identity switch (#25) is **fixed** via `LinkArchDesc.tpoff64_reloc`.
+What remains is the underlying naming smell, now *optional* and lower-value: the
+canonical enum still carries two byte-identical 64-bit-tpoff kinds, and RISC-V
+reuses the AArch64-named one cross-arch (`src/arch/riscv/link.c:131,149:
+.tpoff64_reloc = R_AARCH64_TPOFF64`).
+
+**Optional cleanup.** Collapse `R_X64_TPOFF64` + `R_AARCH64_TPOFF64` → a neutral
+`R_TPOFF64` (apply arm is shared already, `reloc_apply.c:98-99`). This additionally
+**retires the `tpoff64_reloc` field** — once all three arches name the same kind,
+`link_emit_internal_tpoff64` just writes `R_TPOFF64` and the per-arch field has no
+remaining variation. Touch-sites: `obj.h:198,284` (enum), `reloc_apply.c:98-99` +
+`reloc_width` (fold arms), `obj/elf/reloc_x86_64.c` (`R_TPOFF64 ↔
+ELF_R_X86_64_TPOFF64`; aa64 stays wire-less), `obj/elf/link.c:352,388` (the two
+arch-specific tpoff-classification helpers — verify the variant-I/II *coordinate*
+selection there keys on the ABI/arch context, not on the kind name, before
+merging), and `arch/{aa64,x64,riscv}/link.c` (drop `.tpoff64_reloc`).
+
+**Defer unless** doing WS-B/C anyway — it is pure tidiness now and best folded into
+that pass (the descriptor work touches the same enum + apply arms). No urgency:
+there is no remaining identity switch here.
**Oracle.** `make test-link test-elf test-smoke-x64 test-smoke-rv64
-test-aa64-inline`, then a TLS-exercising `test-toy` slice, then `make bootstrap`
-(IE-model TLS in the compiler's own source patches through `R_TPOFF64`).
-This WS removes confirmed audit finding **#25** and is self-contained — ship it
-first.
+test-aa64-inline` + a TLS `test-toy` slice + `make bootstrap` (IE-model TLS).
---
-## WS-B — A single per-arch `RelocDesc` table (addresses **C**, foundation for **B**)
+## WS-B — One per-arch `RelocDesc {width, flags}` table (addresses **B + C**)
-**Problem.** `reloc_width()`, the GOT classifiers, and `kit_obj_reloc_kind_name()`
-are three generic switches each re-enumerating every arch's kinds. `#24` (the
-reloc-name table gated on `fmt == ELF && arch == X86_64` at `object_file.c:358`)
-is the worst of them — an identity switch on *two* axes in format-neutral API code.
+**Problem.** `reloc_width()` and `reloc_uses_got()`/`reloc_is_tls_got()` are generic
+switches re-enumerating every arch's kinds, and the GOT/branch classification is
+answered *twice* (those switches vs the per-arch `LinkArchDesc.is_*` hooks). Adding
+an arch's reloc edits generic `link_reloc_layout.c`; the two classification
+mechanisms can silently disagree.
-**Change.** Introduce one descriptor, owned per-arch, as the single source of a
-kind's static facts.
+**Change.** One descriptor, owned per-arch, as the single source of a kind's static
+*structural* facts. **Name is excluded** — it already landed on the per-arch wire
+ops (`ObjElfArchOps.reloc_name`), which is its correct home; the descriptor carries
+only width + classification.
```c
/* src/obj/reloc.h (new) */
@@ -153,187 +155,156 @@ typedef enum RelocDescFlag {
RELOC_PCREL = 1u << 0,
RELOC_USES_GOT = 1u << 1,
RELOC_IS_TLS_GOT = 1u << 2,
- RELOC_IS_BRANCH = 1u << 3, /* needs a JIT/range veneer */
+ RELOC_IS_BRANCH = 1u << 3, /* needs a JIT/range veneer (== needs_jit_call_stub) */
RELOC_IS_TLVP = 1u << 4, /* Mach-O TLV page/pageoff */
RELOC_DIRECT_PAGE = 1u << 5, /* Mach-O ADRP-direct */
RELOC_MARKER = 1u << 6, /* RELAX/ALIGN/TPREL_ADD — no bytes */
RELOC_WIDTH_DYN = 1u << 7, /* ULEB128 — width read from bytes at apply */
} RelocDescFlag;
-typedef struct RelocDesc {
- u8 width; /* 0 only for R_NONE; markers/dyn use a sentinel */
- u8 flags; /* RelocDescFlag bitmask */
- const char* name; /* canonical spelling, e.g. "AARCH64_CALL26" */
-} RelocDesc;
+typedef struct RelocDesc { u8 width; u8 flags; } RelocDesc;
-/* Single lookup; caller always holds the target arch via c. */
-const RelocDesc* reloc_desc(const Compiler* c, RelocKind k);
+const RelocDesc* reloc_desc(const Compiler* c, RelocKind k); /* caller holds target arch */
```
-**Ownership / assembly.** `reloc_desc()` resolves the neutral core kinds from a
-table in `src/obj/reloc.c`; for arch-family kinds it dispatches to
-`link_arch_desc_for(c)->reloc_desc(k)` (a new `LinkArchDesc` hook returning that
-arch's slice); COFF-family kinds resolve from a COFF-format slice. To add an arch
-you add its slice in `src/arch/<arch>/reloc.c` — no generic edit.
+**Ownership / assembly.** `reloc_desc()` resolves neutral-core kinds from a table in
+`src/obj/reloc.c`; arch-family kinds dispatch to `link_arch_desc_for(c)->reloc_desc(k)`
+(a new `LinkArchDesc` hook returning that arch's slice, the same shape as the
+existing `is_*`/`tpoff64_reloc` entries); COFF-family kinds resolve from a COFF slice.
+Adding an arch is one slice in `src/arch/<arch>/reloc.c` — no generic edit.
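The lookup order above can be sketched as follows. This is a toy model under stated assumptions: the enums, the `ToyCompiler` stand-in, the widths and flags in each row, and every `sketch_`/`Toy` name are hypothetical, and the real arch slice would be reached through `link_arch_desc_for(c)` rather than a local function-pointer table.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-ins for the real enums and Compiler. */
typedef enum { TOY_ARCH_X64, TOY_ARCH_AA64, TOY_ARCH_COUNT } ToyArch;
typedef enum { K_ABS64, K_PC32, K_X64_PC8, K_AA64_CALL26 } ToyKind;
typedef struct { ToyArch arch; } ToyCompiler;

enum { RELOC_PCREL = 1u << 0, RELOC_IS_BRANCH = 1u << 3 };
typedef struct RelocDesc { uint8_t width; uint8_t flags; } RelocDesc;

/* Neutral core: data-word kinds shared by every arch. */
static const RelocDesc* neutral_slice(ToyKind k) {
  static const RelocDesc abs64 = {8, 0}, pc32 = {4, RELOC_PCREL};
  switch (k) {
    case K_ABS64: return &abs64;
    case K_PC32:  return &pc32;
    default:      return NULL;
  }
}

/* Per-arch slices: each backend lists only its own kinds. */
static const RelocDesc* x64_slice(ToyKind k) {
  static const RelocDesc pc8 = {1, RELOC_PCREL};
  return k == K_X64_PC8 ? &pc8 : NULL;
}
static const RelocDesc* aa64_slice(ToyKind k) {
  static const RelocDesc call26 = {4, RELOC_PCREL | RELOC_IS_BRANCH};
  return k == K_AA64_CALL26 ? &call26 : NULL;
}

typedef const RelocDesc* (*SliceFn)(ToyKind);
static const SliceFn arch_slice[TOY_ARCH_COUNT] = {
  [TOY_ARCH_X64] = x64_slice, [TOY_ARCH_AA64] = aa64_slice,
};

/* Single lookup: neutral rows first, then the target arch's slice. */
const RelocDesc* sketch_reloc_desc(const ToyCompiler* c, ToyKind k) {
  const RelocDesc* d = neutral_slice(k);
  return d ? d : arch_slice[c->arch](k);
}
```

In this model a kind that belongs to another arch resolves to `NULL`; whether the real lookup should do that or return a row is left open here.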
-**Migrate consumers to the descriptor:**
+**Migrate consumers, then delete the generic switches:**
- `reloc_width()` (`link_reloc_layout.c:256`) → delete; callers read
- `reloc_desc(c, k)->width`. Keep the `RELOC_WIDTH_DYN` sentinel + the existing
- ULEB128 offset-bounds guard (`link_reloc_layout.c:1118-1126`).
-- `kit_obj_reloc_kind_name()` (`object_file.c:358`) → delete the (arch,fmt) switch;
- return `reloc_desc(file's c, k)->name`. Folds **#24**. The pre-existing neutral
- `obj_reloc_kind_name` (`obj.h:559`) becomes the fallback for kinds with no
- per-arch spelling.
-
-**Exhaustiveness test (the red-green anchor).** Add `test/obj/reloc_desc` that
-iterates **every** `RelocKind` value for each enabled arch and asserts
-`reloc_desc()` returns a row (non-NULL, `width != 0` unless `MARKER`/`WIDTH_DYN`).
-This converts "forgot a row when adding a kind" from a silent runtime default into
-a failing test — the durable guard that keeps the table honest. Seed it red by
-writing the test before the table is complete.
-
-**Oracle.** The new exhaustiveness test, then `make test-link test-elf test-macho
-test-ar`, then `make bootstrap` (byte-identity catches any width/name drift).
-
----
-
-## WS-C — Route classification through the descriptor (addresses **B**)
-
-**Problem.** The generic GOT-layout pass hand-maintains `reloc_uses_got()` /
-`reloc_is_tls_got()` (`link_reloc_layout.c:380-404`) enumerating every arch's GOT
-relocs, while the Mach-O linker asks the per-arch `LinkArchDesc.is_got_load_reloc`
-/ `is_branch_reloc` / `is_tlvp_reloc` / `is_direct_page_reloc` hooks
-(`src/obj/macho/link.c`) for the *same* classification. Two mechanisms, one
-question, and the generic one leaks the arch enumeration into shared link code.
-
-**Change.** With WS-B's descriptor in place, classification is just a flag read:
-- `reloc_uses_got(k)` → `reloc_desc(c,k)->flags & RELOC_USES_GOT`.
-- `reloc_is_tls_got(k)` → `... & RELOC_IS_TLS_GOT`.
-- Delete the four `LinkArchDesc.is_*` hooks (`link_arch.h:79-82`) and their
- per-arch impls in `src/arch/{aa64,x64,riscv}/link.c`; the Mach-O linker callers
- (`macho/link.c:420,492,566,1483,1496,1505,1514,1563`) read the descriptor flags
- instead. `needs_jit_call_stub` (still used at `link_reloc_layout.c:594,1095`)
- becomes `RELOC_IS_BRANCH` (today it aliases `is_branch_reloc` on every arch).
-
-End state: **no generic file classifies relocations by enumerating arch kinds.**
-The per-arch knowledge that was split between the `is_*` hooks and the generic
-switches now lives once, as flags, in each arch's descriptor slice.
+ `reloc_desc(c,k)->width`. Keep the `RELOC_WIDTH_DYN` sentinel + the ULEB128
+ offset-bounds guard (`link_reloc_layout.c:1117-1126`).
+- `reloc_uses_got()`/`reloc_is_tls_got()` (`link_reloc_layout.c:392,380`) → delete;
+ the GOT pass tests `reloc_desc(c,k)->flags` against `RELOC_USES_GOT` / `RELOC_IS_TLS_GOT`.
+- The four `LinkArchDesc.is_*` hooks (`link_arch.h:79-82`) + their impls in
+ `src/arch/{aa64,x64,riscv}/link.c` → delete; the Mach-O linker callers
+ (`src/obj/macho/link.c:420,492,566,1483,1496,1505,1514,1563`) read descriptor
+ flags. `needs_jit_call_stub` (`link_reloc_layout.c:594,1095`) → `RELOC_IS_BRANCH`
+ (it aliases `is_branch_reloc` on every arch today).
+
+End state: **no generic file classifies or sizes relocations by enumerating arch
+kinds, and each fact has exactly one source** — width/flags in the descriptor,
+name on the wire ops.
+
+**Exhaustiveness test (the red-green anchor).** Add `test/obj/reloc_desc` iterating
+**every** `RelocKind` for each enabled arch, asserting `reloc_desc()` returns a row
+(`width != 0` unless `MARKER`/`WIDTH_DYN`). Cross-check that, for every kind the old
+`reloc_width()` covered, the descriptor returns the *same* width (a migration guard).
+This makes "forgot a row" a failing test instead of a silent default. Write it red
+first.
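A rough sketch of what the exhaustiveness and migration check could look like, assuming a toy kind enum and a flat row table for a single arch. `old_reloc_width`, `count_bad_rows`, and both tables are hypothetical stand-ins, not the real `test/obj/reloc_desc`:

```c
#include <stdint.h>

/* Toy kind set; the real test would walk every RelocKind per enabled arch. */
typedef enum { K_NONE, K_ABS32, K_ABS64, K_RELAX, K_ULEB128, K_COUNT } ToyKind;
enum { RELOC_MARKER = 1u << 6, RELOC_WIDTH_DYN = 1u << 7 };
typedef struct RelocDesc { uint8_t width; uint8_t flags; } RelocDesc;

/* Stand-in for the legacy generic switch; 0 means "not covered". */
static unsigned old_reloc_width(ToyKind k) {
  switch (k) {
    case K_ABS32: return 4;
    case K_ABS64: return 8;
    default:      return 0;
  }
}

static const RelocDesc complete_rows[K_COUNT] = {
  [K_ABS32] = {4, 0},            [K_ABS64]   = {8, 0},
  [K_RELAX] = {0, RELOC_MARKER}, [K_ULEB128] = {0, RELOC_WIDTH_DYN},
};
/* Same table with the K_ABS64 row forgotten (left zero-initialized). */
static const RelocDesc incomplete_rows[K_COUNT] = {
  [K_ABS32] = {4, 0},
  [K_RELAX] = {0, RELOC_MARKER}, [K_ULEB128] = {0, RELOC_WIDTH_DYN},
};

/* A row is bad if it has no size and no MARKER/WIDTH_DYN excuse, or if its
 * width drifted from what the legacy switch returned for the same kind. */
static int row_is_bad(const RelocDesc* rows, ToyKind k) {
  const RelocDesc* d = &rows[k];
  int sized = d->width != 0 || (d->flags & (RELOC_MARKER | RELOC_WIDTH_DYN));
  unsigned old = old_reloc_width(k);
  int drift = old != 0 && old != d->width;
  return !sized || drift;
}

/* Number of kinds (K_NONE excluded) whose row fails either check. */
int count_bad_rows(const RelocDesc* rows) {
  int bad = 0;
  for (int k = K_NONE + 1; k < K_COUNT; k++) bad += row_is_bad(rows, (ToyKind)k);
  return bad;
}
```

The incomplete table is the "red" state: the forgotten row is counted as a failure instead of silently reading back as width 0.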
-**Oracle.** `make test-link test-macho` (Mach-O exercises every `is_*` path),
-`test-smoke-x64 test-smoke-rv64`, `make bootstrap`. macOS/aa64 bootstrap is the
-strongest check here since it drives the Mach-O GOT/TLVP/branch classifiers.
+**Oracle.** The exhaustiveness/migration test, then `make test-link test-elf
+test-macho test-ar test-smoke-x64 test-smoke-rv64`, then `make bootstrap`
+(macOS/aa64 bootstrap drives the Mach-O GOT/TLVP/branch classifiers that the `is_*`
+deletion touches; byte-identity catches any width drift).
---
-## WS-D — Partition the byte-patcher per-arch behind the single entry (addresses **D**)
+## WS-C — Partition the byte-patcher per-arch behind the single entry (addresses **D**)
**Problem.** `src/obj/reloc_apply.c` lives in the format-neutral obj layer but
encodes pure ISA knowledge — AArch64 imm19/imm26/ADRP page math, RISC-V U/I/S/B/J
-immediate scatter and the 0x800 HI20 bias, x64 field writes. Adding an arch means
-editing this shared file; the encoders belong in the backends, alongside that
-arch's MC emitter and wire translator (consistent with `link_arch.h`: "each
-backend's descriptor lives under `src/arch/<arch>/`").
+immediate scatter and the 0x800 HI20 bias, x64 field writes. Adding an arch edits
+this shared file; the encoders belong in the backends, beside that arch's MC emitter
+and (post-WS-B) its `reloc.c` descriptor slice.
**Constraint (must not break).** `link_reloc_apply(c, kind, ...)` stays the **one
-public entry**, called unchanged by all four loaders (`asm.c:1296`, `emu/dl.c:15`,
-`link_jit.c`, `elf/macho/coff/link.c`). The "one encoder, three loaders" invariant
-in [../OBJ.md](../OBJ.md) is preserved — there is still exactly one encoder per
-kind; it just moves to the owning backend.
+public entry**, called unchanged by all four loaders (`src/asm/asm.c:1296`,
+`src/emu/dl.c:15`, `src/link/link_jit.c`, `src/obj/{elf,macho,coff}/link.c`). The
+"one encoder, three loaders" invariant ([../OBJ.md](../OBJ.md)) is preserved — there
+is still exactly one encoder per kind; it moves to the owning backend.
**Change.**
-1. Keep `link_reloc_apply` in `src/obj/reloc.c` as the dispatcher. It handles the
+1. Keep `link_reloc_apply` in `src/obj/reloc.c` as the dispatcher; it handles the
**arch-neutral data-word arms inline** (`R_ABS32/64`, `R_REL*/PC*`, `R_TPOFF*`,
- `R_GOT32`, `R_PLT32` data writes, the ULEB128 codec) — these are plain
- `wr_uN_le` with no ISA knowledge and have no reason to live per-arch.
-2. For instruction-embedded kinds, dispatch to a new
- `LinkArchDesc.reloc_apply_insn(c, k, P, S, A, P)` hook. Move the AArch64 arms
- to `src/arch/aa64/reloc.c`, the RISC-V arms to `src/arch/riscv/reloc.c`, and
- the x64 instruction arms (e.g. `R_X64_PC8`) to `src/arch/x64/reloc.c`. The
- dispatcher selects the hook via `link_arch_desc_for(c)` — `c` (hence
- `target.arch`) is available at every call site, verified across all callers.
-3. COFF-specific kinds (`R_COFF_*`) route to a COFF encoder slice.
-
-The arch `reloc.c` files created in WS-B (descriptor slices) become the natural
-home for these encoders too — each backend's `reloc.c` owns {desc rows, classifier
-flags, byte encoders} for its kinds, one file per arch.
-
-**Oracle.** This is the highest-blast-radius WS; lean on the exhaustiveness test +
-full matrix: `make test-link test-elf test-macho test-isa test-asm test-smoke-x64
-test-smoke-rv64 test-aa64-inline`, the JIT/emu reloc paths (`test-cg-api`, any
+ `R_GOT32`, `R_PLT32`, the ULEB128 codec) — plain `wr_uN_le`, no ISA knowledge.
+2. Instruction-embedded kinds dispatch to a new `LinkArchDesc.reloc_apply_insn(c, k,
+ P, S, A, P)` hook. Move the AArch64 arms to `src/arch/aa64/reloc.c`, RISC-V to
+ `src/arch/riscv/reloc.c`, x64 instruction arms (`R_X64_PC8`) to
+ `src/arch/x64/reloc.c`. `c` (hence `target.arch`) is available at every call site
+ (verified: all `link_reloc_apply` callers pass a `Compiler*`).
+3. COFF-specific kinds route to a COFF encoder slice.
+
+Each backend's `reloc.c` then owns {desc rows (WS-B), class flags (WS-B), byte
+encoders (WS-C)} for its kinds — one file per arch.
+
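The dispatcher shape in steps 1 and 2 can be sketched roughly as below. The signature here is an assumption made for the example (`at` is the patch location, `P` its address, `S` the symbol value, `A` the addend), the `ToyCompiler` carries the hook directly instead of going through `LinkArchDesc`, and all `sketch_`/`Toy` names are hypothetical. The only ISA fact relied on is that AArch64 `B`/`BL` hold a word offset in bits [25:0].

```c
#include <stdint.h>

typedef enum { K_ABS32, K_PC32, K_AA64_CALL26 } ToyKind;
typedef struct ToyCompiler ToyCompiler;
typedef int (*ApplyInsnFn)(const ToyCompiler*, ToyKind, uint8_t* at,
                           uint64_t P, uint64_t S, int64_t A);
struct ToyCompiler { ApplyInsnFn reloc_apply_insn; }; /* per-arch hook */

static void wr_u32_le(uint8_t* p, uint32_t v) {
  p[0] = (uint8_t)v;         p[1] = (uint8_t)(v >> 8);
  p[2] = (uint8_t)(v >> 16); p[3] = (uint8_t)(v >> 24);
}
static uint32_t rd_u32_le(const uint8_t* p) {
  return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
         (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Backend-owned encoder: patch imm26 of a B/BL, keeping the opcode bits. */
static int aa64_apply_insn(const ToyCompiler* c, ToyKind k, uint8_t* at,
                           uint64_t P, uint64_t S, int64_t A) {
  (void)c;
  if (k != K_AA64_CALL26) return -1;
  uint64_t off = S + (uint64_t)A - P; /* two's-complement byte offset */
  uint32_t insn = rd_u32_le(at);
  insn = (insn & 0xFC000000u) | ((uint32_t)(off >> 2) & 0x03FFFFFFu);
  wr_u32_le(at, insn);
  return 0;
}

/* Single entry: neutral data-word arms inline, everything else via the hook. */
int sketch_reloc_apply(const ToyCompiler* c, ToyKind k, uint8_t* at,
                       uint64_t P, uint64_t S, int64_t A) {
  switch (k) {
    case K_ABS32: wr_u32_le(at, (uint32_t)(S + (uint64_t)A));     return 0;
    case K_PC32:  wr_u32_le(at, (uint32_t)(S + (uint64_t)A - P)); return 0;
    default:
      return c->reloc_apply_insn ? c->reloc_apply_insn(c, k, at, P, S, A) : -1;
  }
}
```

Callers see one function in both the old and new layout, which is the property the constraint above protects; only the location of the instruction-immediate arm changes.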
+**Oracle.** Highest blast radius; lean on the WS-B exhaustiveness test + the full
+matrix: `make test-link test-elf test-macho test-isa test-asm test-smoke-x64
+test-smoke-rv64 test-aa64-inline`, the JIT/emu reloc paths (`test-cg-api`, a
`run`/`emu` smoke), then **both** bootstrap chains (`make bootstrap-debug
bootstrap-release`) — byte-identity over the compiler's own object output is the
-definitive proof no encoding shifted. Do WS-D last and in one arch at a time
-(neutral-core extraction first, then aa64, then x64, then rv), keeping the old
-switch arms live until each arch's hook is proven, so every step is bisectable.
+definitive proof no encoding shifted. Do this last, one arch at a time (neutral-core
+extraction first, then aa64, x64, rv), keeping old switch arms live until each arch's
+hook is proven, so every step bisects to one arch.
---
-## WS-E — Remove the residual OS/format identity gates in the reloc path (addresses **E**)
-
-Two non-arch identity checks remain in `link_reloc_layout.c`:
-
-1. **FreeBSD static-IFUNC mechanism** (`:833-834`, audit finding **#18**):
- ```c
- int use_rela_iplt = l->emit_static_exe && l->c->target.os == KIT_OS_FREEBSD &&
- l->c->target.obj == KIT_OBJ_ELF;
- ```
- "Does this OS's crt walk `[__rela_iplt_start, __rela_iplt_end)` before ctors"
- is an OS/crt-personality property (the in-code comment notes glibc shares it).
- Replace with an obj-layer predicate `obj_format_static_ifunc_via_rela_iplt(c)`
- living beside the existing `obj_format_*` policy family (`src/obj/obj.h:686`),
- where the `(FREEBSD, ELF)` knowledge legitimately resides. A future libc sharing
- the mechanism is then a one-line table change.
-
-2. **IRELATIVE wire type via hardcoded `KIT_OBJ_ELF`** (`link_elf_irelative_type`,
- `:808-813`): `obj_format_lookup(KIT_OBJ_ELF)->elf_arch(arch)->r_irelative`. Low
- priority — it is already `use_rela_iplt`-gated and `.rela.plt` is intrinsically
- ELF — but fold it under the WS-E predicate so the generic pass names no format
- constant directly: have the predicate return both the boolean and the resolver
- reloc, or query the resolved format rather than the literal `KIT_OBJ_ELF`.
-
-**Oracle.** `make test-link`, the FreeBSD IFUNC path via the FreeBSD VM lane
-(`scripts/freebsd_vm.sh` / `test-toy-freebsd-vm`; see
-[FREEBSD.md](FREEBSD.md)), and a musl/freestanding static-IFUNC case to confirm
-the ctor path is unchanged.
+## WS-E — Residual format gates (addresses **E**) — *item 1 done; items 2–3 open, low priority*
+
+1. **FreeBSD static-IFUNC mechanism (#18).** **Done** — now
+ `obj_format_static_ifunc_via_rela_iplt(c)` (`src/obj/obj_secnames.c:371`).
+2. **IRELATIVE wire type via hardcoded `KIT_OBJ_ELF`.** Still open:
+ `link_elf_irelative_type` (`src/link/link_reloc_layout.c:807`) does
+ `obj_format_lookup(KIT_OBJ_ELF)->elf_arch(arch)->r_irelative`. Low priority — it
+ is `use_rela_iplt`-gated and `.rela.plt` is intrinsically ELF — but fold it under
+ the WS-E.1 predicate so the generic pass names no format constant: have
+ `obj_format_static_ifunc_via_rela_iplt` (or a sibling) also surface the resolver
+ reloc, or query the *resolved* format rather than the literal `KIT_OBJ_ELF`.
+3. **`reloc_name` dispatch gate (#24 residual).** `kit_obj_reloc_kind_name`
+ (`src/api/object_file.c:384`) still guards `fmt != KIT_OBJ_ELF || arch !=
+ KIT_ARCH_X86_64`, suppressing the already-implemented `elf_aarch64_reloc_name` /
+ `elf_riscv_reloc_name` to preserve the rv64/aa64 objdump golden corpus
+ ("RV_CALL" vs "R_RISCV_CALL"). Closing it means: drop the gate to
+ `if (fmt != KIT_OBJ_ELF) return NULL;` (or remove it entirely once macho/coff
+ `reloc_name` exist), then **refresh the affected objdump golden files** to the
+ format-canonical spellings. Test-corpus-coupled, so schedule it deliberately;
+ note the coupling in `doc/plan/TODO.md` if not done with this pass.
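
As a rough illustration of item 3, the sketch below contrasts the current `(fmt, arch)` gate with the format-only gate the plan proposes. Everything prefixed `sk_`/`SK_` is a hypothetical stand-in: the real `kit_obj_reloc_kind_name` signature, the `KIT_OBJ_*` / `KIT_ARCH_*` enums, and the kind numbering are not shown in this plan, so treat this as a model of the control flow only.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the KIT_OBJ_* and KIT_ARCH_* enums. */
typedef enum { SK_OBJ_ELF, SK_OBJ_MACHO, SK_OBJ_COFF } SkObjFmt;
typedef enum { SK_ARCH_X86_64, SK_ARCH_AARCH64, SK_ARCH_RV64, SK_ARCH_COUNT } SkArch;

/* Assumed shape of a per-arch reloc_name hook: kind in, spelling out. */
typedef const char *(*SkRelocNameFn)(int kind);

/* Toy per-arch name tables; kind 0 is an arbitrary "call-like" kind. */
static const char *sk_x64_name(int kind)  { return kind == 0 ? "R_X86_64_PLT32"   : NULL; }
static const char *sk_aa64_name(int kind) { return kind == 0 ? "R_AARCH64_CALL26" : NULL; }
static const char *sk_rv_name(int kind)   { return kind == 0 ? "R_RISCV_CALL"     : NULL; }

static const SkRelocNameFn sk_elf_reloc_name[SK_ARCH_COUNT] = {
    [SK_ARCH_X86_64]  = sk_x64_name,
    [SK_ARCH_AARCH64] = sk_aa64_name,
    [SK_ARCH_RV64]    = sk_rv_name,
};

/* Current shape: only x86_64-ELF reaches the per-arch hook. */
static const char *sk_name_gated(SkObjFmt fmt, SkArch arch, int kind) {
    if (fmt != SK_OBJ_ELF || arch != SK_ARCH_X86_64) return NULL;
    return sk_elf_reloc_name[arch](kind);
}

/* Proposed shape: gate on format only, so every ELF arch is dispatched. */
static const char *sk_name_ungated(SkObjFmt fmt, SkArch arch, int kind) {
    if (fmt != SK_OBJ_ELF) return NULL;
    return sk_elf_reloc_name[arch](kind);
}
```

In this model the only observable difference is for non-x86_64 ELF targets, which start returning the format-canonical spelling; that is the golden churn the item expects.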
+
+**Oracle.** `make test-link`; for item 3, `make test-tools`/the objdump corpus
+(expect golden churn; confirm the diff is purely the reloc spelling); item 2 via the
+FreeBSD VM lane (`scripts/freebsd_vm.sh` / `test-toy-freebsd-vm`, see
+[FREEBSD.md](FREEBSD.md)).
---
## Sequencing & risk
-Execution order is dependency-sound and each step is independently shippable:
-
-1. **WS-A** — small, self-contained, removes finding #25. No dependency. Ship first.
-2. **WS-B** — builds the descriptor + the exhaustiveness test (the safety net the
- rest leans on). Folds #24.
-3. **WS-C** — consumes WS-B's flags; deletes the generic classifiers and the
- `LinkArchDesc.is_*` hooks. Folds #18's sibling smell.
-4. **WS-D** — the deep refactor; gated behind WS-B's test, done one arch at a time.
-5. **WS-E** — independent of A–D; can land any time, grouped here for topicality.
-
-**Risk controls.** Every WS is red-green: WS-B's exhaustiveness test is written
-first and fails until each arch's table is complete. The **bootstrap** is the
-load-bearing oracle throughout — it patches every relocation kind the compiler
-emits for its own source, so a byte-identical stage2/stage3 is proof the encoding
-path is unchanged. Per CLAUDE.md, prefer targeted runs (specific arch/format
-suites) during iteration and redirect output to a file; reserve full `make
-bootstrap` for end-of-WS gates. Keep old code paths live beside new ones within a
-WS (especially WS-D, per-arch) so any regression bisects to a single arch's hook.
+1. **WS-B** — the central remaining change: the `RelocDesc {width, flags}` table +
+ exhaustiveness test, deleting both generic switches and the duplicating `is_*`
+ hooks. This is now the highest-value open item (the identity switches are already
+ gone). Fold WS-A's value-class collapse in here since it touches the same enum/arms.
+2. **WS-C** — encoder partition; gated behind WS-B's test, one arch at a time.
+3. **WS-E.2 / WS-E.3** — independent, low priority; land WS-E.3 (golden refresh) on
+ its own so the corpus diff is reviewed in isolation.
+
+**Risk controls.** Every WS is red-green: WS-B's exhaustiveness + width-migration test
+is written first and fails until each arch's slice is complete. The **bootstrap** is
+the load-bearing oracle — it patches every relocation kind the compiler emits for its
+own source, so a byte-identical stage2/stage3 proves the encoding path is unchanged.
+Per CLAUDE.md, prefer targeted suites during iteration (redirect output to a file);
+reserve `make bootstrap` for end-of-WS gates. Keep old paths live beside new ones within
+a WS (especially WS-C, per-arch) so any regression bisects to one arch's hook.
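
To make the red-green idea concrete, here is a minimal sketch of a per-arch `RelocDesc {width, flags}` slice and an exhaustiveness check that fails while rows are missing. The kind list, flag names, field types, and the "width 0 means no row" convention are all assumptions made for illustration (prefixed `sk_`/`SK_`); the real `RelocKind` enum is larger, and the real test may only require rows for the kinds an arch actually emits.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical canonical kinds; a stand-in for the real RelocKind enum. */
typedef enum {
    SK_R_ABS64,
    SK_R_PC32,
    SK_R_GOT_PC32,
    SK_R_TLS_GOT_PC32,
    SK_R_KIND_COUNT
} SkRelocKind;

/* Class flags standing in for what reloc_uses_got / reloc_is_tls_got answer. */
enum {
    SK_RF_GOT     = 1u << 0,
    SK_RF_TLS_GOT = 1u << 1,
};

/* Assumed layout of RelocDesc {width, flags}; width == 0 marks a missing row. */
typedef struct {
    uint8_t width; /* bytes patched at the relocation site */
    uint8_t flags; /* SK_RF_* class bits */
} SkRelocDesc;

/* A complete slice for one (hypothetical) arch. */
static const SkRelocDesc sk_x64_relocs[SK_R_KIND_COUNT] = {
    [SK_R_ABS64]        = { 8, 0 },
    [SK_R_PC32]         = { 4, 0 },
    [SK_R_GOT_PC32]     = { 4, SK_RF_GOT },
    [SK_R_TLS_GOT_PC32] = { 4, SK_RF_GOT | SK_RF_TLS_GOT },
};

/* A deliberately incomplete slice: the "red" state of the test. */
static const SkRelocDesc sk_partial_relocs[SK_R_KIND_COUNT] = {
    [SK_R_ABS64] = { 8, 0 },
    [SK_R_PC32]  = { 4, 0 },
};

/* Returns the first kind with no row, or -1 when the slice is complete. */
static int sk_first_missing(const SkRelocDesc *tbl) {
    for (int k = 0; k < SK_R_KIND_COUNT; k++)
        if (tbl[k].width == 0) return k;
    return -1;
}

/* Table-driven replacements for the generic switch-based classifiers. */
static unsigned sk_reloc_width(const SkRelocDesc *tbl, SkRelocKind k) {
    return tbl[k].width;
}
static int sk_reloc_uses_got(const SkRelocDesc *tbl, SkRelocKind k) {
    return (tbl[k].flags & SK_RF_GOT) != 0;
}
```

Because designated initializers zero-fill unlisted rows, adding an enum arm without a matching row leaves `width == 0`, which is what lets the check stay red until each arch's slice is filled in.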
## Done criteria
-- No file under `src/link/` or `src/api/` enumerates `RelocKind` arms or switches
- on `target.arch` / `target.obj` in the relocation path. (`rg "case R_(AARCH64|X64|RV)_"
- src/link src/api` returns nothing; the `link_reloc_layout.c:698` ternary is gone.)
-- Every relocation static attribute (width, name, GOT/TLS/branch/tlvp class) has
- exactly one source: the per-arch `RelocDesc` slice. The `reloc_width`,
- `reloc_uses_got`, `reloc_is_tls_got`, and `LinkArchDesc.is_*` enumerations are
- deleted; `kit_obj_reloc_kind_name`'s (arch,fmt) switch is deleted.
+- No file under `src/link/` enumerates `RelocKind` arms: `reloc_width`,
+ `reloc_uses_got`, `reloc_is_tls_got`, and the `LinkArchDesc.is_*` hooks are
+ deleted; their consumers read the per-arch `RelocDesc`. (`rg "case R_(AARCH64|X64|RV)_"
+ src/link` returns nothing.)
+- Every relocation static fact has exactly one source: width + class flags in the
+ per-arch `RelocDesc` slice, wire encoding + name in `src/obj/<fmt>/reloc_<arch>.c`.
- `link_reloc_apply` remains the single public byte-patcher entry; its
instruction-encoding arms live in `src/arch/<arch>/reloc.c`, the obj layer keeps
only the arch-neutral data-word arms.
- Adding a hypothetical new arch's relocation touches only that arch's
- `src/arch/<arch>/reloc.c` and its `src/obj/<fmt>/reloc_<arch>.c` wire table —
- verified by the `test/obj/reloc_desc` exhaustiveness test failing until the new
- rows exist, and by no generic file needing edits.
-- `make bootstrap` (debug + release) reaches the byte-identical fixed point; the
- full link/elf/macho/coff/isa/asm/smoke matrix passes.
+ `src/arch/<arch>/reloc.c` and its `src/obj/<fmt>/reloc_<arch>.c` — verified by the
+ `test/obj/reloc_desc` exhaustiveness test failing until the new rows exist, and by
+ no generic file needing edits.
+- (Optional/low-pri) the `tpoff64_reloc` field is retired by the `R_TPOFF64`
+ collapse; the `object_file.c` `reloc_name` gate is removed and the objdump golden
+ corpus refreshed; `link_elf_irelative_type` names no literal format.
+- `make bootstrap` (debug + release) reaches the byte-identical fixed point; the full
+ link/elf/macho/coff/isa/asm/smoke matrix passes.