commit 65545dc2cc2be1f6fb2de2149e2702a267aa7960
parent 7a985cce08172385f81d3fa7e74540ffc3ed037a
Author: Ryan Sepassi <rsepassi@gmail.com>
Date: Fri, 5 Jun 2026 16:31:53 -0700
Implement per-arch RelocDesc table + residual gates (RELOC WS-B, WS-E.2/E.3)
WS-B — make the canonical-RelocKind half of the relocation subsystem as
modular as the wire half. Replace the generic reloc_width / reloc_uses_got /
reloc_is_tls_got switches in link_reloc_layout.c, link_jit's
jit_reloc_width_local, and the five per-arch LinkArchDesc.is_* hooks with a
single per-arch RelocDesc {u8 width; u8 flags} resolved arch-aware by
reloc_desc(c, k):
- neutral data-word kinds in src/obj/reloc.{h,c} (pure obj-core);
- per-arch slices in src/arch/{aa64,x64,riscv}/reloc.c via a new
  LinkArchDesc.reloc_desc hook;
- dispatcher + reloc_kind_* predicates in src/link/link_reloc_desc.{h,c}
  (kept in src/link, not obj-core, since resolving the slice needs
  link_arch_desc_for()).
All consumers (the GOT / JIT-stub / width passes and the nine Mach-O is_*
call sites) now go through the one descriptor. Adding a relocation to an arch
is now one row in that arch's slice. Migration guard:
test/link/reloc_desc_test.c pins behavioural equivalence to frozen oracle_*
snapshots of the deleted code across every kind × every backend arch
(3016 checks).
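
The lookup order and the migration guard described above can be sketched in a
few lines of C. This is a toy model, not the code in this commit: the Kind and
Arch enums, the F_* flags, row_find, desc_for and parity_mismatches are all
hypothetical stand-ins, and the row sets are cut down to four kinds. Only the
shape is taken from the change: a {width, flags} descriptor, an arch slice
consulted before the neutral table, and a frozen oracle compared against the
table for every kind and arch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint8_t u8;

/* Hypothetical, cut-down kind and arch sets; the real enums are far larger. */
typedef enum { K_ABS64, K_PLT32, K_X64_GOTPCREL, K_RV_CALL, K_COUNT } Kind;
typedef enum { ARCH_X64, ARCH_AA64, ARCH_RV64, ARCH_COUNT } Arch;

enum { F_BRANCH = 1u << 0, F_USES_GOT = 1u << 1 }; /* illustrative flag bits */

typedef struct { u8 width; u8 flags; } Desc;
typedef struct { Kind kind; Desc desc; } Row;

#define NROWS(a) (sizeof(a) / sizeof((a)[0]))

/* Linear scan; an assumption about how a row lookup could work. */
static const Desc* row_find(const Row* rows, size_t n, Kind k) {
  for (size_t i = 0; i < n; i++)
    if (rows[i].kind == k) return &rows[i].desc;
  return NULL;
}

/* Neutral table: PLT32 is 4 bytes wide and carries no flags. */
static const Row neutral_rows[] = {{K_ABS64, {8, 0}}, {K_PLT32, {4, 0}}};
/* Arch slices: x86-64 and RISC-V both refine PLT32 into a branch. */
static const Row x64_rows[] = {{K_PLT32, {4, F_BRANCH}},
                               {K_X64_GOTPCREL, {4, F_USES_GOT}}};
static const Row rv_rows[] = {{K_RV_CALL, {8, F_BRANCH}},
                              {K_PLT32, {4, F_BRANCH}}};

static const Desc* arch_slice(Arch a, Kind k) {
  switch (a) {
    case ARCH_X64: return row_find(x64_rows, NROWS(x64_rows), k);
    case ARCH_RV64: return row_find(rv_rows, NROWS(rv_rows), k);
    default: return NULL; /* the toy AArch64 slice owns none of these kinds */
  }
}

/* The arch slice wins; otherwise fall back to the neutral table. */
const Desc* desc_for(Arch a, Kind k) {
  const Desc* d;
  assert(k < K_COUNT);
  d = arch_slice(a, k);
  return d ? d : row_find(neutral_rows, NROWS(neutral_rows), k);
}

/* Width 0 means "no descriptor", i.e. an unsupported kind for this arch. */
u8 kind_width(Arch a, Kind k) {
  const Desc* d = desc_for(a, k);
  return d ? d->width : 0u;
}

int kind_is_branch(Arch a, Kind k) {
  const Desc* d = desc_for(a, k);
  return d && (d->flags & F_BRANCH) ? 1 : 0;
}

/* Frozen copy of a pre-refactor style predicate, kept only as an oracle. */
static int oracle_is_branch(Arch a, Kind k) {
  switch (a) {
    case ARCH_X64: return k == K_PLT32;
    case ARCH_RV64: return k == K_RV_CALL || k == K_PLT32;
    default: return 0;
  }
}

/* Compare table and oracle over every kind and every arch. */
int parity_mismatches(void) {
  int bad = 0;
  for (int a = 0; a < ARCH_COUNT; a++)
    for (int k = 0; k < K_COUNT; k++)
      if (kind_is_branch((Arch)a, (Kind)k) != oracle_is_branch((Arch)a, (Kind)k))
        bad++;
  return bad;
}
```

In this model kind_is_branch(ARCH_X64, K_PLT32) is 1 while
kind_is_branch(ARCH_AA64, K_PLT32) is 0, yet both report a width of 4. That is
the reason for letting the slice override the neutral row instead of merging
the two tables.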
WS-E.2 — the static-IFUNC __rela_iplt IRELATIVE wire type no longer names a
literal KIT_OBJ_ELF: link_elf_irelative_type is gone, replaced by
obj_format_static_ifunc_irelative_type(c) (sibling of the WS-E.1 predicate),
which resolves through the target object format.
WS-E.3 — drop the arch axis from kit_obj_reloc_kind_name's gate
(it now tests only fmt != KIT_OBJ_ELF), so aarch64/riscv ELF relocs print via their
ObjElfArchOps.reloc_name tables (binutils-faithful: R_AARCH64_CALL26,
R_RISCV_CALL). One objdump golden refreshed; the -d disasm annotation keeps
the arch-neutral spelling (separate disassembler path).
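
A minimal sketch of the relaxed gate and its fallback, assuming a two-kind toy
enum. Every identifier here (Fmt, Arch, Kind, format_reloc_name,
display_reloc_name, is_format_spelling) is hypothetical and does not mirror the
real kit API. "RV_CALL", "R_RISCV_CALL" and "R_AARCH64_CALL26" are the
spellings quoted in this message; the neutral "AARCH64_CALL26" string is an
assumption made by analogy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { FMT_ELF, FMT_MACHO } Fmt;
typedef enum { ARCH_X64, ARCH_AA64, ARCH_RV64 } Arch;
typedef enum { K_CALL26, K_RV_CALL, K_COUNT } Kind; /* hypothetical subset */

/* Arch-neutral spellings, indexed by Kind. The first entry is assumed. */
static const char* const neutral_names[K_COUNT] = {"AARCH64_CALL26", "RV_CALL"};

/* Stand-in for a per-arch ELF name table; NULL means "no per-arch name". */
static const char* elf_name(Arch a, Kind k) {
  if (a == ARCH_AA64 && k == K_CALL26) return "R_AARCH64_CALL26";
  if (a == ARCH_RV64 && k == K_RV_CALL) return "R_RISCV_CALL";
  return NULL;
}

/* The gate tests the object format only; there is no arch condition. */
const char* format_reloc_name(Arch a, Fmt f, Kind k) {
  if (f != FMT_ELF) return NULL;
  return elf_name(a, k);
}

/* Callers fall back to the neutral spelling when the format has no name. */
const char* display_reloc_name(Arch a, Fmt f, Kind k) {
  const char* n;
  assert(k < K_COUNT);
  n = format_reloc_name(a, f, k);
  return n ? n : neutral_names[k];
}

/* Toy check for a format-canonical ELF spelling: it starts with "R_". */
int is_format_spelling(const char* name) {
  return strncmp(name, "R_", 2) == 0;
}
```

With the arch condition gone, the RISC-V ELF case resolves to "R_RISCV_CALL",
while a format without a name table still receives the neutral string from the
caller's fallback.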
rg "case R_(AARCH64|X64|RV)_" src/link now returns no matches. Verified:
reloc_desc parity 3016 checks, 0 failures; the
test-link/elf/macho/ar/isa/aa64-inline/driver-objdump suites pass; make
bootstrap-debug output is byte-identical.
Diffstat:
24 files changed, 805 insertions(+), 337 deletions(-)
diff --git a/doc/plan/RELOC.md b/doc/plan/RELOC.md
@@ -1,6 +1,6 @@
# Relocation-layer genericization (planned work)
-## Status — 2026-06-05 — partially landed; descriptor + encoder-partition remain
+## Status — 2026-06-05 — WS-B (descriptor table) + WS-E.2/E.3 (residual gates) landed; only the WS-C byte-patcher partition remains
This roadmap makes the **canonical-`RelocKind` half** of the relocation subsystem
as modular as the wire half already is. The goal is the project's standing
@@ -57,9 +57,9 @@ parallel tables the compiler cannot keep in sync:
| Attribute | Lives in | Status |
|-----------|----------|--------|
| how to patch the bytes | `link_reloc_apply()` `src/obj/reloc_apply.c:83` (switch, ~77 arms) | **open** — WS-C |
-| byte width | `reloc_width()` `src/link/link_reloc_layout.c:256` (switch) | **open** — WS-B |
-| uses GOT / is TLS-GOT | `reloc_uses_got()`/`reloc_is_tls_got()` `src/link/link_reloc_layout.c:392,380` (switch) | **open** — WS-B |
-| branch / got-load / tlvp / direct-page | `LinkArchDesc.is_*` `src/link/link_arch.h:79-82` (per-arch hooks) | duplicated by the above — WS-B |
+| byte width | `RelocDesc.width` (per-arch `src/arch/<arch>/reloc.c` + neutral `src/obj/reloc.c`) | **landed** — WS-B |
+| uses GOT / is TLS-GOT | `RelocDesc.flags` `RELOC_USES_GOT`/`RELOC_IS_TLS_GOT` | **landed** — WS-B |
+| branch / got-load / tlvp / direct-page | `RelocDesc.flags` `RELOC_IS_BRANCH`/`USES_GOT`/`IS_TLVP`/`DIRECT_PAGE` | **landed** — WS-B |
| display name | `ObjElfArchOps.reloc_name` `src/obj/format.h:65` (per-arch hook) | **landed** (with a residual gate — WS-E.3) |
Two generic switches (`reloc_width`, `reloc_uses_got`/`is_tls_got`) still enumerate
@@ -136,9 +136,32 @@ test-aa64-inline` + a TLS `test-toy` slice + `make bootstrap` (IE-model TLS).
---
-## WS-B — One per-arch `RelocDesc {width, flags}` table (addresses **B + C**)
-
-**Problem.** `reloc_width()` and `reloc_uses_got()`/`reloc_is_tls_got()` are generic
+## WS-B — One per-arch `RelocDesc {width, flags}` table (addresses **B + C**) — *LANDED*
+
+**Status (landed).** `RelocDesc {u8 width; u8 flags}` resolved arch-aware by
+`reloc_desc(c, k)`:
+- neutral data-word kinds → `src/obj/reloc.c` (`reloc_desc_neutral`, pure obj-core);
+- per-arch slices → `src/arch/{aa64,x64,riscv}/reloc.c`, reached through a new
+ `LinkArchDesc.reloc_desc` hook that replaces the five `is_*` hooks;
+- dispatcher + `reloc_kind_*` predicates → `src/link/link_reloc_desc.{h,c}`.
+
+Placement note: the **dispatcher** lives in `src/link`, not the plan's
+`src/obj/reloc.c`, because resolving the per-arch slice needs `link_arch_desc_for()`
+— housing it in obj-core would invert the obj→link boundary (CLAUDE.md). The neutral
+descriptor *data* is still pure obj-core (`src/obj/reloc.c`). The arch slice wins over
+neutral so `R_PLT32` can be a branch on x86-64/RISC-V but flag-free on AArch64 while
+sharing the neutral width.
+
+Deleted: `reloc_width` / `reloc_uses_got` / `reloc_is_tls_got` (link_reloc_layout) and
+`jit_reloc_width_local` (link_jit). Migrated consumers (GOT/stub/width passes,
+`link_jit`, and the Mach-O `is_*` call sites) read `reloc_kind_*`. Migration guard:
+`test/link/reloc_desc_test.c` — frozen-oracle parity over every kind × every backend
+arch (3016 checks). `rg "case R_(AARCH64|X64|RV)_" src/link` is now empty; full
+link/elf/macho/ar/isa/aa64-inline suites + `make bootstrap` (debug+release,
+byte-identical) pass. WS-A's enum collapse stays deferred — `tpoff64_reloc` remains a
+per-arch field.
+
+**Problem (original).** `reloc_width()` and `reloc_uses_got()`/`reloc_is_tls_got()` are generic
switches re-enumerating every arch's kinds, and the GOT/branch classification is
answered *twice* (those switches vs the per-arch `LinkArchDesc.is_*` hooks). Adding
an arch's reloc edits generic `link_reloc_layout.c`; the two classification
@@ -242,31 +265,32 @@ hook is proven, so every step bisects to one arch.
---
-## WS-E — Residual format gates (addresses **E**) — *item 1 done; items 2–3 open, low priority*
+## WS-E — Residual format gates (addresses **E**) — *all items LANDED*
1. **FreeBSD static-IFUNC mechanism (#18).** **Done** — now
`obj_format_static_ifunc_via_rela_iplt(c)` (`src/obj/obj_secnames.c:371`).
-2. **IRELATIVE wire type via hardcoded `KIT_OBJ_ELF`.** Still open:
- `link_elf_irelative_type` (`src/link/link_reloc_layout.c:807`) does
- `obj_format_lookup(KIT_OBJ_ELF)->elf_arch(arch)->r_irelative`. Low priority — it
- is `use_rela_iplt`-gated and `.rela.plt` is intrinsically ELF — but fold it under
- the WS-E.1 predicate so the generic pass names no format constant: have
- `obj_format_static_ifunc_via_rela_iplt` (or a sibling) also surface the resolver
- reloc, or query the *resolved* format rather than the literal `KIT_OBJ_ELF`.
-3. **`reloc_name` dispatch gate (#24 residual).** `kit_obj_reloc_kind_name`
- (`src/api/object_file.c:384`) still guards `fmt != KIT_OBJ_ELF || arch !=
- KIT_ARCH_X86_64`, suppressing the already-implemented `elf_aarch64_reloc_name` /
- `elf_riscv_reloc_name` to preserve the rv64/aa64 objdump golden corpus
- ("RV_CALL" vs "R_RISCV_CALL"). Closing it means: drop the gate to
- `if (fmt != KIT_OBJ_ELF) return NULL;` (or remove it entirely once macho/coff
- `reloc_name` exist), then **refresh the affected objdump golden files** to the
- format-canonical spellings. Test-corpus-coupled, so schedule it deliberately;
- note the coupling in `doc/plan/TODO.md` if not done with this pass.
-
-**Oracle.** `make test-link`; for item 3, `make test-tools`/the objdump corpus
-(expect golden churn — review the diff is purely the reloc spelling); item 2 via the
-FreeBSD VM lane (`scripts/freebsd_vm.sh` / `test-toy-freebsd-vm`, see
-[FREEBSD.md](FREEBSD.md)).
+2. **IRELATIVE wire type via hardcoded `KIT_OBJ_ELF`.** **Done.** The generic
+ `link_elf_irelative_type` is deleted; the iplt pass now calls
+ `obj_format_static_ifunc_irelative_type(l->c)` (sibling of the WS-E.1 predicate in
+ `src/obj/obj_secnames.c`), which resolves the resolver reloc through the *target*
+ object format (`c->target.obj`) rather than the literal `KIT_OBJ_ELF`. The generic
+ link pass names no format constant.
+3. **`reloc_name` dispatch gate (#24 residual).** **Done.** `kit_obj_reloc_kind_name`
+ (`src/api/object_file.c`) now guards only `if (fmt != KIT_OBJ_ELF) return NULL;` —
+ the `arch != KIT_ARCH_X86_64` axis is gone, so aarch64/riscv ELF relocs print via
+ their `ObjElfArchOps.reloc_name` tables (matching binutils objdump:
+ `R_AARCH64_CALL26`, `R_RISCV_CALL`). One golden refreshed
+ (`test/objdump/rv64/cases/03-reloc-annotations`: `RV_CALL` → `R_RISCV_CALL` in the
+ `-r` records; the `-d` disasm annotation keeps the arch-neutral `[RV_CALL]`, which
+ comes from the disassembler's `reloc_kind_name`, a separate path). Mach-O/COFF have
+ no `reloc_name` table yet, so they still fall back to the neutral spelling.
+
+**Oracle.** `make test-link test-elf test-driver-objdump` — all pass (item 3's golden
+churn was the single rv64 reloc-annotations case, purely the reloc spelling). Item 2's
+FreeBSD static-IFUNC path is unexercised on the macOS host but the change is a
+behaviour-preserving refactor (same per-arch `r_irelative`, resolved format == ELF
+wherever `use_rela_iplt` is true); deeper coverage is the FreeBSD VM lane
+(`scripts/freebsd_vm.sh` / `test-toy-freebsd-vm`, see [FREEBSD.md](FREEBSD.md)).
---
diff --git a/doc/plan/TODO.md b/doc/plan/TODO.md
@@ -5,6 +5,10 @@ fixed, remove it instead of checking it off or keeping a closed entry.
Add new deferred fixes below as they are discovered.
+## MISC
+
+- [ ] test-toy failures: 141_threadlocal_mutate/X-O0:rv64 141_threadlocal_mutate/X-O1:rv64
+
## aarch64-windows: `118_decl_extra_attrs` fails to link (ADRP out of range)
`kit cc -target aarch64-windows` on `test/toy/cases/118_decl_extra_attrs.toy`
diff --git a/mk/test.mk b/mk/test.mk
@@ -94,6 +94,7 @@ TEST_TARGETS = \
test-libc-musl-rv64 \
test-link \
test-link-reloc-uleb128 \
+ test-link-reloc-desc \
test-macho \
test-native-direct-target \
test-opt \
@@ -159,6 +160,7 @@ DEFAULT_TEST_TARGETS = \
test-opt \
test-asm-symmetry \
test-link-reloc-uleb128 \
+ test-link-reloc-desc \
test-dbg \
test-disasm-complete \
test-macho \
@@ -372,6 +374,15 @@ RELOC_ULEB128_TEST_BIN = build/test/reloc_uleb128_unit
test-link-reloc-uleb128: $(RELOC_ULEB128_TEST_BIN)
$(RELOC_ULEB128_TEST_BIN)
+# Relocation-descriptor migration guard (doc/plan/RELOC.md, WS-B): pins
+# reloc_desc()/reloc_kind_* to the frozen pre-refactor width + classification
+# behaviour across every backend arch. Internal-surface, so links the raw lib
+# objects like the other internal unit tests.
+RELOC_DESC_TEST_BIN = build/test/reloc_desc_test
+
+test-link-reloc-desc: $(RELOC_DESC_TEST_BIN)
+ $(RELOC_DESC_TEST_BIN)
+
# test-emu-unit: white-box unit tests for the emulator's INTERNAL units (rv64
# decoder, EmuAddrSpace, Linux syscall handler) that have no public API. Reaches
# internal symbols -> links $(LIB_OBJS) (mirrors test-interp), not the archive.
diff --git a/mk/test_unit.mk b/mk/test_unit.mk
@@ -49,7 +49,7 @@ x64_inline_test_SRC := test/arch/x64_inline_test.c
UNIT_TESTS_INTERNAL := \
dwarf_test debug_roundtrip_unit debug_cfi_unit \
aa64_isa_test rv64_decode_test rv32_decode_test aa64_sweep_gen \
- reloc_uleb128_unit emu_rv64_unit_test interp_smoke_test \
+ reloc_uleb128_unit reloc_desc_test emu_rv64_unit_test interp_smoke_test \
rv64_interp_smoke_test abi_classify_test ir_recorder_test \
native_direct_target_test x64_dbg_test cg_ir_lower_test tiny_inline_test
dwarf_test_SRC := test/dwarf/dwarf_test.c
@@ -60,6 +60,7 @@ rv64_decode_test_SRC := test/arch/rv64_decode_test.c
rv32_decode_test_SRC := test/arch/rv32_decode_test.c
aa64_sweep_gen_SRC := test/arch/aa64_sweep_gen.c
reloc_uleb128_unit_SRC := test/link/reloc_uleb128_unit.c
+reloc_desc_test_SRC := test/link/reloc_desc_test.c
emu_rv64_unit_test_SRC := test/emu/rv64_vm_unit_test.c
interp_smoke_test_SRC := test/interp/interp_smoke_test.c
rv64_interp_smoke_test_SRC := test/emu/rv64_interp_smoke_test.c
diff --git a/src/api/object_file.c b/src/api/object_file.c
@@ -363,25 +363,24 @@ KitStatus kit_obj_dynreliter_new(KitObjFile* f, KitObjRelocIter** out) {
return reliter_make(f, 1, out);
}
-/* Format-specific canonical spelling of a reloc kind (e.g. "R_X86_64_PLT32"),
- * or NULL when the format/arch has no per-arch name table (callers fall back
- * to the arch-neutral reloc_kind_name).
+/* Format-specific canonical spelling of a reloc kind (e.g. "R_X86_64_PLT32",
+ * "R_AARCH64_CALL26", "R_RISCV_CALL"), or NULL when the format has no per-arch
+ * name table (callers fall back to the arch-neutral reloc_kind_name).
*
- * The per-arch ELF reloc-name table lives on ObjElfArchOps.reloc_name; the
- * kit-canonical RelocKind is lowered to its ELF wire type via reloc_to first.
- * Only the x86_64 ELF table is consulted here, matching the historical
- * behavior exactly: other ELF arches (aarch64 / riscv) and the Mach-O / COFF
- * formats keep the arch-neutral spelling their callers and tests expect (e.g.
- * the rv64 objdump corpus prints "RV_CALL", not "R_RISCV_CALL"). reloc_to maps
- * unsupported kinds to wire type 0; only R_NONE legitimately names that slot,
- * so anything else falling through to 0 is reported as "no per-arch name"
- * (NULL) rather than the format's NONE spelling. */
+ * Consulted for every ELF arch whose ObjElfArchOps carries a reloc_name table
+ * (x86_64 / aarch64 / riscv): the kit-canonical RelocKind is lowered to its
+ * ELF wire type via reloc_to, then named — matching binutils objdump's
+ * spelling. The Mach-O / COFF formats have no reloc_name table yet and keep
+ * the arch-neutral spelling. reloc_to maps unsupported kinds to wire type 0;
+ * only R_NONE legitimately names that slot, so anything else falling through
+ * to 0 is reported as "no per-arch name" (NULL) rather than the format's NONE
+ * spelling. */
static const char* kit_obj_reloc_kind_name(KitArchKind arch, KitObjFmt fmt,
u32 kind) {
const ObjFormatImpl* impl;
const ObjElfArchOps* ops;
u32 wire;
- if (fmt != KIT_OBJ_ELF || arch != KIT_ARCH_X86_64) return NULL;
+ if (fmt != KIT_OBJ_ELF) return NULL;
impl = obj_format_lookup(fmt);
if (!impl || !impl->elf_arch) return NULL;
ops = impl->elf_arch(arch);
diff --git a/src/arch/aa64/link.c b/src/arch/aa64/link.c
@@ -172,34 +172,9 @@ void aa64_emit_macho_stub(u8* out, u64 stub_vaddr, u64 got_slot_vaddr) {
wr_u32_le(out + 8, aa64_br(AA64_PLT_SCRATCH_X16));
}
-static int aa64_is_branch_reloc(RelocKind kind) {
- return kind == R_AARCH64_CALL26 || kind == R_AARCH64_JUMP26;
-}
-
-static int aa64_is_got_load_reloc(RelocKind kind) {
- return kind == R_AARCH64_ADR_GOT_PAGE || kind == R_AARCH64_LD64_GOT_LO12_NC;
-}
-
-static int aa64_is_tlvp_reloc(RelocKind kind) {
- return kind == R_AARCH64_TLVP_LOAD_PAGE21 ||
- kind == R_AARCH64_TLVP_LOAD_PAGEOFF12;
-}
-
-static int aa64_is_direct_page_reloc(RelocKind kind) {
- switch (kind) {
- case R_AARCH64_ADR_PREL_PG_HI21:
- case R_AARCH64_ADR_PREL_PG_HI21_NC:
- case R_AARCH64_ADD_ABS_LO12_NC:
- case R_AARCH64_LDST8_ABS_LO12_NC:
- case R_AARCH64_LDST16_ABS_LO12_NC:
- case R_AARCH64_LDST32_ABS_LO12_NC:
- case R_AARCH64_LDST64_ABS_LO12_NC:
- case R_AARCH64_LDST128_ABS_LO12_NC:
- return 1;
- default:
- return 0;
- }
-}
+/* Width + classification rows for AArch64's relocation kinds; defined in
+ * src/arch/aa64/reloc.c and consulted through the .reloc_desc hook. */
+const RelocDesc* aa64_reloc_desc(RelocKind);
/* AArch64 __chkstk for PE/COFF: probes `x15 * 16` bytes of stack one page at a
* time, then returns. Mirrors the LLVM compiler-rt implementation (chkstk.S in
@@ -225,11 +200,7 @@ const LinkArchDesc link_arch_aa64 = {
.emit_plt_entry = aa64_emit_plt_entry,
.emit_iplt_stub = aa64_emit_iplt_stub,
- .is_branch_reloc = aa64_is_branch_reloc,
- .is_got_load_reloc = aa64_is_got_load_reloc,
- .is_tlvp_reloc = aa64_is_tlvp_reloc,
- .is_direct_page_reloc = aa64_is_direct_page_reloc,
- .needs_jit_call_stub = aa64_is_branch_reloc,
+ .reloc_desc = aa64_reloc_desc,
/* AAPCS64 variant I: GOT TLS-IE slots hold (X - tls_vaddr) + TCB. */
.tpoff64_reloc = R_AARCH64_TPOFF64,
diff --git a/src/arch/aa64/reloc.c b/src/arch/aa64/reloc.c
@@ -0,0 +1,50 @@
+/* AArch64 relocation descriptors (width + classification).
+ *
+ * One row per relocation kind this backend applies. Reached through
+ * LinkArchDesc.reloc_desc (wired in link.c) and the arch-aware reloc_desc()
+ * dispatcher. The wire encoding + diagnostic name live in
+ * src/obj/<fmt>/reloc_aarch64.c; the instruction byte encoders live in the
+ * shared byte-patcher (src/obj/reloc_apply.c) until WS-C moves them here.
+ *
+ * Kinds with no row (the dynamic-only GLOB_DAT/JUMP_SLOT/RELATIVE/COPY, the
+ * MCEmitter-only INTRA_LABEL_ADDR, the internal slot-fill TPOFF64, and the
+ * unused TLSLE LDST variants) are never applied through the static reloc
+ * record path and intentionally carry no descriptor. */
+
+#include "obj/reloc.h"
+
+static const RelocDescRow aa64_rows[] = {
+ {R_AARCH64_ABS16, {2, 0}},
+ {R_AARCH64_PREL16, {2, 0}},
+ {R_AARCH64_JUMP26, {4, RELOC_IS_BRANCH}},
+ {R_AARCH64_CALL26, {4, RELOC_IS_BRANCH}},
+ {R_AARCH64_CONDBR19, {4, 0}},
+ {R_AARCH64_TSTBR14, {4, 0}},
+ {R_AARCH64_LD_PREL_LO19, {4, 0}},
+ {R_AARCH64_ADR_PREL_LO21, {4, 0}},
+ {R_AARCH64_ADR_PREL_PG_HI21, {4, RELOC_DIRECT_PAGE}},
+ {R_AARCH64_ADR_PREL_PG_HI21_NC, {4, RELOC_DIRECT_PAGE}},
+ {R_AARCH64_ADD_ABS_LO12_NC, {4, RELOC_DIRECT_PAGE}},
+ {R_AARCH64_LDST8_ABS_LO12_NC, {4, RELOC_DIRECT_PAGE}},
+ {R_AARCH64_LDST16_ABS_LO12_NC, {4, RELOC_DIRECT_PAGE}},
+ {R_AARCH64_LDST32_ABS_LO12_NC, {4, RELOC_DIRECT_PAGE}},
+ {R_AARCH64_LDST64_ABS_LO12_NC, {4, RELOC_DIRECT_PAGE}},
+ {R_AARCH64_LDST128_ABS_LO12_NC, {4, RELOC_DIRECT_PAGE}},
+ {R_AARCH64_ADR_GOT_PAGE, {4, RELOC_USES_GOT}},
+ {R_AARCH64_LD64_GOT_LO12_NC, {4, RELOC_USES_GOT}},
+ {R_AARCH64_TLSIE_ADR_GOTTPREL_PAGE21, {4, RELOC_IS_TLS_GOT}},
+ {R_AARCH64_TLSIE_LD64_GOTTPREL_LO12_NC, {4, RELOC_IS_TLS_GOT}},
+ {R_AARCH64_TLSLE_ADD_TPREL_HI12, {4, 0}},
+ {R_AARCH64_TLSLE_ADD_TPREL_LO12_NC, {4, 0}},
+ {R_AARCH64_TLVP_LOAD_PAGE21, {4, RELOC_IS_TLVP}},
+ {R_AARCH64_TLVP_LOAD_PAGEOFF12, {4, RELOC_IS_TLVP}},
+ /* COFF AArch64 TLS SECREL imm12 pair: ADD-imm12 instruction relocs,
+ * AArch64-only, applied only into PE/COFF output. */
+ {R_COFF_AARCH64_SECREL_LOW12A, {4, 0}},
+ {R_COFF_AARCH64_SECREL_HIGH12A, {4, 0}},
+};
+
+const RelocDesc* aa64_reloc_desc(RelocKind k) {
+ return reloc_desc_row_find(aa64_rows,
+ (u32)(sizeof aa64_rows / sizeof aa64_rows[0]), k);
+}
diff --git a/src/arch/riscv/link.c b/src/arch/riscv/link.c
@@ -105,16 +105,13 @@ static u32 rv32_emit_iplt_stub(u8* dst, u64 stub_vaddr, u64 slot_vaddr,
return 0u;
}
-/* A direct rv64 call (R_RV_CALL = AUIPC+JALR) reaches only ±2GiB. In the JIT,
- * an external SK_ABS target (a host libc symbol resolved to an arbitrary
- * address) can lie farther than that from the JIT-allocated code region, where
- * link_reloc_apply would panic "RV CALL out of range". Reporting these as
- * branch relocs routes them through the JIT call-stub pass, which reuses
- * emit_iplt_stub (AUIPC+LD+JR) to reach an arbitrary address held in an
- * in-image slot — the same safety net aa64 and x64 already wire. */
-static int rv64_is_branch_reloc(RelocKind kind) {
- return kind == R_RV_CALL || kind == R_PLT32;
-}
+/* Width + classification rows for RISC-V's relocation kinds (shared by rv64
+ * and rv32); defined in src/arch/riscv/reloc.c and consulted through the
+ * .reloc_desc hook. R_RV_CALL / R_PLT32 carry RELOC_IS_BRANCH: a direct
+ * AUIPC+JALR reaches only ±2GiB, so a too-far target (e.g. a JIT-resolved
+ * host libc symbol) routes through the call-stub pass, the same safety net
+ * aa64 and x64 wire. */
+const RelocDesc* rv_reloc_desc(RelocKind);
const LinkArchDesc link_arch_rv64 = {
.plt0_size = RV64_PLT0_SIZE,
@@ -125,7 +122,7 @@ const LinkArchDesc link_arch_rv64 = {
.emit_plt0 = rv64_emit_plt0,
.emit_plt_entry = rv64_emit_plt_entry,
.emit_iplt_stub = rv64_emit_iplt_stub,
- .needs_jit_call_stub = rv64_is_branch_reloc,
+ .reloc_desc = rv_reloc_desc,
/* RISC-V variant I shares the internal raw-64-bit variant-I tpoff with
* AArch64 ((X - tls_vaddr) + TCB); there is no R_RV_TPOFF64. */
.tpoff64_reloc = R_AARCH64_TPOFF64,
@@ -144,7 +141,7 @@ const LinkArchDesc link_arch_rv32 = {
.emit_plt0 = rv64_emit_plt0,
.emit_plt_entry = rv32_emit_plt_entry,
.emit_iplt_stub = rv32_emit_iplt_stub,
- .needs_jit_call_stub = rv64_is_branch_reloc,
+ .reloc_desc = rv_reloc_desc,
/* See rv64: shares R_AARCH64_TPOFF64 (variant-I internal tpoff). */
.tpoff64_reloc = R_AARCH64_TPOFF64,
};
diff --git a/src/arch/riscv/reloc.c b/src/arch/riscv/reloc.c
@@ -0,0 +1,62 @@
+/* RISC-V relocation descriptors (width + classification), shared by the
+ * rv64 and rv32 backends (their reloc kinds are identical).
+ *
+ * One row per relocation kind this backend applies. Reached through
+ * LinkArchDesc.reloc_desc (wired in link.c for both rv64 and rv32) and the
+ * arch-aware reloc_desc() dispatcher. Wire encoding + name live in
+ * src/obj/<fmt>/reloc_riscv{32,64}.c.
+ *
+ * R_RV_CALL and the arch-neutral R_PLT32 (the kit-canonical kind
+ * R_RISCV_CALL_PLT maps onto) are both branches: too-far targets route
+ * through the JIT/range call-stub pass. R_RV_CALL patches an 8-byte
+ * AUIPC+JALR pair; R_PLT32 keeps its neutral 4-byte width (a gate value —
+ * the apply path re-derives the real span from the kind).
+ *
+ * RELAX / TPREL_ADD are relaxation markers (no bytes); SET/SUB_ULEB128 are
+ * variable-width (the apply path reads the true field length from the bytes
+ * — the width here is the nominal gate value). R_RV_ALIGN is skipped
+ * before the reloc record is built and carries no descriptor. */
+
+#include "obj/reloc.h"
+
+static const RelocDescRow rv_rows[] = {
+ {R_RV_HI20, {4, 0}},
+ {R_RV_LO12_I, {4, 0}},
+ {R_RV_LO12_S, {4, 0}},
+ {R_RV_BRANCH, {4, 0}},
+ {R_RV_JAL, {4, 0}},
+ {R_RV_PCREL_HI20, {4, 0}},
+ {R_RV_PCREL_LO12_I, {4, 0}},
+ {R_RV_PCREL_LO12_S, {4, 0}},
+ {R_RV_GOT_HI20, {4, RELOC_USES_GOT}},
+ {R_RV_TLS_GOT_HI20, {4, RELOC_IS_TLS_GOT}},
+ {R_RV_TPREL_HI20, {4, 0}},
+ {R_RV_TPREL_LO12_I, {4, 0}},
+ {R_RV_TPREL_LO12_S, {4, 0}},
+ {R_RV_CALL, {8, RELOC_IS_BRANCH}},
+ {R_PLT32, {4, RELOC_IS_BRANCH}},
+ {R_RV_RVC_BRANCH, {2, 0}},
+ {R_RV_RVC_JUMP, {2, 0}},
+ {R_RV_RELAX, {4, RELOC_MARKER}},
+ {R_RV_TPREL_ADD, {4, RELOC_MARKER}},
+ {R_RV_ADD8, {1, 0}},
+ {R_RV_SUB8, {1, 0}},
+ {R_RV_SUB6, {1, 0}},
+ {R_RV_SET6, {1, 0}},
+ {R_RV_SET8, {1, 0}},
+ {R_RV_ADD16, {2, 0}},
+ {R_RV_SUB16, {2, 0}},
+ {R_RV_SET16, {2, 0}},
+ {R_RV_ADD32, {4, 0}},
+ {R_RV_SUB32, {4, 0}},
+ {R_RV_SET32, {4, 0}},
+ {R_RV_ADD64, {8, 0}},
+ {R_RV_SUB64, {8, 0}},
+ {R_RV_SET_ULEB128, {1, RELOC_WIDTH_DYN}},
+ {R_RV_SUB_ULEB128, {1, RELOC_WIDTH_DYN}},
+};
+
+const RelocDesc* rv_reloc_desc(RelocKind k) {
+ return reloc_desc_row_find(rv_rows, (u32)(sizeof rv_rows / sizeof rv_rows[0]),
+ k);
+}
diff --git a/src/arch/x64/link.c b/src/arch/x64/link.c
@@ -56,14 +56,9 @@ static u32 x64_emit_iplt_stub(u8* dst, u64 stub_vaddr, u64 slot_vaddr,
return 0;
}
-static int x64_is_branch_reloc(RelocKind kind) {
- return kind == R_X64_PLT32 || kind == R_PLT32;
-}
-
-static int x64_is_got_load_reloc(RelocKind kind) {
- return kind == R_X64_GOTPCREL || kind == R_X64_GOTPCRELX ||
- kind == R_X64_REX_GOTPCRELX;
-}
+/* Width + classification rows for x86-64's relocation kinds; defined in
+ * src/arch/x64/reloc.c and consulted through the .reloc_desc hook. */
+const RelocDesc* x64_reloc_desc(RelocKind);
/* PE/COFF IAT stub for x86_64 (6 B):
*
@@ -89,9 +84,7 @@ const LinkArchDesc link_arch_x64 = {
.emit_plt_entry = x64_emit_plt_entry,
.emit_iplt_stub = x64_emit_iplt_stub,
- .is_branch_reloc = x64_is_branch_reloc,
- .is_got_load_reloc = x64_is_got_load_reloc,
- .needs_jit_call_stub = x64_is_branch_reloc,
+ .reloc_desc = x64_reloc_desc,
/* x86_64 variant II: GOT TLS-IE slots hold (X - tls_memsz). */
.tpoff64_reloc = R_X64_TPOFF64,
diff --git a/src/arch/x64/reloc.c b/src/arch/x64/reloc.c
@@ -0,0 +1,38 @@
+/* x86-64 relocation descriptors (width + classification).
+ *
+ * One row per relocation kind this backend applies. Reached through
+ * LinkArchDesc.reloc_desc (wired in link.c) and the arch-aware reloc_desc()
+ * dispatcher. Wire encoding + name live in src/obj/<fmt>/reloc_x86_64.c.
+ *
+ * R_PLT32 is the arch-neutral canonical PLT call kind; x86-64 classifies it
+ * as a branch (it shares x64's branch handling with R_X64_PLT32), so it
+ * gets a slice row that overrides the neutral table's flag-free entry while
+ * keeping the same 4-byte width.
+ *
+ * The general-/local-dynamic TLS kinds (TLSGD/TLSLD/DTP*), GOTOFF64, and
+ * COPY are never applied through the static reloc record path and carry no
+ * descriptor. */
+
+#include "obj/reloc.h"
+
+static const RelocDescRow x64_rows[] = {
+ {R_X64_PC8, {1, 0}},
+ {R_X64_32S, {4, 0}},
+ {R_X64_PLT32, {4, RELOC_IS_BRANCH}},
+ {R_PLT32, {4, RELOC_IS_BRANCH}},
+ {R_X64_GOTPCREL, {4, RELOC_USES_GOT}},
+ {R_X64_GOTPCRELX, {4, RELOC_USES_GOT}},
+ {R_X64_REX_GOTPCRELX, {4, RELOC_USES_GOT}},
+ {R_X64_GOTPC32, {4, 0}},
+ {R_X64_GOTTPOFF, {4, RELOC_IS_TLS_GOT}},
+ {R_X64_TPOFF32, {4, 0}},
+ {R_X64_TPOFF64, {8, 0}},
+ {R_X64_GLOB_DAT, {8, 0}},
+ {R_X64_JUMP_SLOT, {8, 0}},
+ {R_X64_RELATIVE, {8, 0}},
+};
+
+const RelocDesc* x64_reloc_desc(RelocKind k) {
+ return reloc_desc_row_find(x64_rows,
+ (u32)(sizeof x64_rows / sizeof x64_rows[0]), k);
+}
diff --git a/src/link/link_arch.h b/src/link/link_arch.h
@@ -16,6 +16,7 @@
#include "core/core.h"
#include "obj/obj.h"
+#include "obj/reloc.h"
/* IPLT relocation slot reported by emit_iplt_stub. Some arches
* (aarch64) cannot encode the stub->slot displacement inline and need
@@ -75,12 +76,14 @@ typedef struct LinkArchDesc {
u32 (*emit_iplt_stub)(u8* dst, u64 stub_vaddr, u64 slot_vaddr,
LinkArchIPltReloc out[2]);
- /* Relocation classification used by format-specific linker passes. */
- int (*is_branch_reloc)(RelocKind);
- int (*is_got_load_reloc)(RelocKind);
- int (*is_tlvp_reloc)(RelocKind);
- int (*is_direct_page_reloc)(RelocKind);
- int (*needs_jit_call_stub)(RelocKind);
+ /* This arch's relocation-descriptor slice: width + classification flags
+ * for each kind it applies, or NULL for kinds it does not own. The
+ * arch-aware reloc_desc() dispatcher (link_reloc_desc.c) consults this
+ * before the neutral table, and the reloc_kind_* predicates read the
+ * flags — replacing the former is_branch / is_got_load / is_tlvp /
+ * is_direct_page / needs_jit_call_stub hooks and the generic
+ * reloc_width / reloc_uses_got / reloc_is_tls_got switches. */
+ const RelocDesc* (*reloc_desc)(RelocKind);
/* ---- TLS Initial-Exec GOT slot fill ----
* The internal raw-64-bit local-exec tpoff written into a TLS GOT slot
diff --git a/src/link/link_jit.c b/src/link/link_jit.c
@@ -23,6 +23,7 @@
#include "jit/tlv_thunk.h"
#include "link/link.h"
#include "link/link_internal.h"
+#include "link/link_reloc_desc.h"
#include "obj/obj.h"
/* Defined in src/api/objfile.c — exposes the underlying ObjBuilder of a
@@ -37,7 +38,8 @@ void kit_objfile_internal_free(KitObjFile*);
static const KitJitHost* jit_host_from_linker(Linker* l, Compiler* c) {
const KitJitHost* host = l ? l->jit_host : NULL;
if (!host)
- compiler_panic(c, SRCLOC_NONE, "kit_jit: link jit host is required for JIT");
+ compiler_panic(c, SRCLOC_NONE,
+ "kit_jit: link jit host is required for JIT");
return host;
}
@@ -356,7 +358,8 @@ static void jit_patch_tlv_descriptors(KitJit* jit) {
u8* write = (u8*)vaddr_to_write(img, jit->segs, desc_vaddr);
if (!write)
- compiler_panic(c, SRCLOC_NONE, "kit_jit: TLV descriptor vaddr does not map");
+ compiler_panic(c, SRCLOC_NONE,
+ "kit_jit: TLV descriptor vaddr does not map");
wr_u64_le(write + 0u, (u64)thunk_addr);
wr_u64_le(write + 8u, (u64)(uintptr_t)ctx);
wr_u64_le(write + 16u, offset_in_image);
@@ -429,7 +432,8 @@ KitJit* kit_jit_from_image(LinkImage* img) {
metrics_scope_begin(c, "jit.reserve");
if (mem->reserve(mem->user, (size_t)master_size, master_prot, &master) !=
0) {
- compiler_panic(c, SRCLOC_NONE, "kit_jit_from_image: execmem.reserve failed");
+ compiler_panic(c, SRCLOC_NONE,
+ "kit_jit_from_image: execmem.reserve failed");
}
metrics_scope_end(c, "jit.reserve");
}
@@ -440,7 +444,8 @@ KitJit* kit_jit_from_image(LinkImage* img) {
_Alignof(KitExecMemRegion));
if (!segs) {
mem->release(mem->user, &master);
- compiler_panic(c, SRCLOC_NONE, "kit_jit_from_image: oom on segment table");
+ compiler_panic(c, SRCLOC_NONE,
+ "kit_jit_from_image: oom on segment table");
}
memset(segs, 0, sizeof(*segs) * img->nsegments);
}
@@ -580,7 +585,8 @@ KitJit* kit_jit_from_image(LinkImage* img) {
perms_for(seg->flags)) != 0) {
mem->release(mem->user, &master);
if (segs) heap->free(heap, segs, sizeof(*segs) * img->nsegments);
- compiler_panic(c, SRCLOC_NONE, "kit_jit_from_image: execmem.protect failed");
+ compiler_panic(c, SRCLOC_NONE,
+ "kit_jit_from_image: execmem.protect failed");
}
}
if (append_total) {
@@ -807,29 +813,6 @@ typedef struct JitAppendSec {
SEGVEC_DEFINE(JitAppendSecs, JitAppendSec, 4);
-static u8 jit_reloc_width_local(RelocKind k) {
- switch (k) {
- case R_ABS64:
- case R_REL64:
- case R_PC64:
- case R_X64_TPOFF64:
- return 8;
- case R_AARCH64_ABS16:
- case R_AARCH64_PREL16:
- case R_RV_RVC_BRANCH:
- case R_RV_RVC_JUMP:
- return 2;
- case R_RV_ADD8:
- case R_RV_SUB8:
- case R_RV_SUB6:
- case R_RV_SET6:
- case R_RV_SET8:
- return 1;
- default:
- return 4;
- }
-}
-
static InputMap jit_input_map_alloc(KitJit* jit, ObjBuilder* ob) {
InputMap m;
ObjSymIter* it;
@@ -1325,7 +1308,7 @@ static void jit_append_obj_inner(KitJit* jit, ObjBuilder* ob) {
rec.section_id = r->section_id;
rec.link_section_id = ls_id;
rec.offset = r->offset - (u32)ls->obj_offset;
- rec.width = jit_reloc_width_local((RelocKind)r->kind);
+ rec.width = reloc_kind_width(jit->c, (RelocKind)r->kind);
rec.write_vaddr = ls->vaddr + rec.offset;
rec.write_file_offset = rec.write_vaddr;
rec.kind = (RelocKind)r->kind;
diff --git a/src/link/link_reloc_desc.c b/src/link/link_reloc_desc.c
@@ -0,0 +1,20 @@
+/* Arch-aware relocation-descriptor dispatcher.
+ *
+ * Stitches the per-arch slices (src/arch/<arch>/reloc.c, reached through
+ * LinkArchDesc.reloc_desc) onto the arch-neutral table (src/obj/reloc.c).
+ * The arch slice wins so an arch can refine the classification of an
+ * otherwise-neutral kind (e.g. R_PLT32 is a branch on x86-64 / RISC-V but
+ * not on AArch64) while still sharing the neutral width. */
+
+#include "link/link_reloc_desc.h"
+
+#include "link/link_arch.h"
+
+const RelocDesc* reloc_desc(const Compiler* c, RelocKind k) {
+ const LinkArchDesc* d = link_arch_desc_for(c);
+ if (d && d->reloc_desc) {
+ const RelocDesc* r = d->reloc_desc(k);
+ if (r) return r;
+ }
+ return reloc_desc_neutral(k);
+}
diff --git a/src/link/link_reloc_desc.h b/src/link/link_reloc_desc.h
@@ -0,0 +1,68 @@
+#ifndef KIT_LINK_RELOC_DESC_H
+#define KIT_LINK_RELOC_DESC_H
+
+#include "core/core.h"
+#include "obj/obj.h"
+#include "obj/reloc.h"
+
+/* Arch-aware resolution of a relocation kind's static descriptor.
+ *
+ * The per-arch slice (via link_arch_desc_for(c)->reloc_desc) takes
+ * precedence; it falls back to the arch-neutral table. Returns NULL when k
+ * is not a relocation this target applies. This is the single source of a
+ * kind's width + classification — the wire encoding and the diagnostic name
+ * live on the per-(arch,format) ops. See doc/plan/RELOC.md (WS-B).
+ *
+ * The thin reloc_kind_* predicates below replace the former generic
+ * switches (reloc_width / reloc_uses_got / reloc_is_tls_got) and the
+ * per-arch LinkArchDesc.is_* hooks; each reads one descriptor flag. */
+const RelocDesc* reloc_desc(const Compiler* c, RelocKind k);
+
+/* Patched-field width in bytes, or 0 when k has no descriptor — the
+ * "unsupported relocation kind" gate the linker enforces. Nominal (gate)
+ * value for RELOC_WIDTH_DYN kinds, whose true span is read at apply time. */
+static inline u8 reloc_kind_width(const Compiler* c, RelocKind k) {
+ const RelocDesc* d = reloc_desc(c, k);
+ return d ? d->width : 0u;
+}
+
+/* Needs a GOT slot (the ELF / static GOT pass's notion): a direct GOT load
+ * OR a TLS-IE GOT slot holding a TP-relative offset. */
+static inline int reloc_kind_uses_got(const Compiler* c, RelocKind k) {
+ const RelocDesc* d = reloc_desc(c, k);
+ return d && (d->flags & (RELOC_USES_GOT | RELOC_IS_TLS_GOT)) ? 1 : 0;
+}
+
+/* GOT slot is filled with the symbol's TP-relative offset (TLS Initial-Exec)
+ * rather than its address. */
+static inline int reloc_kind_is_tls_got(const Compiler* c, RelocKind k) {
+ const RelocDesc* d = reloc_desc(c, k);
+ return d && (d->flags & RELOC_IS_TLS_GOT) ? 1 : 0;
+}
+
+/* A direct GOT-load instruction reloc (the Mach-O linker's notion): non-TLS
+ * GOT load. Distinct from reloc_kind_uses_got, which also counts TLS-IE. */
+static inline int reloc_kind_is_got_load(const Compiler* c, RelocKind k) {
+ const RelocDesc* d = reloc_desc(c, k);
+ return d && (d->flags & RELOC_USES_GOT) ? 1 : 0;
+}
+
+/* Range-limited call/jump that may need a JIT/range call stub or veneer. */
+static inline int reloc_kind_is_branch(const Compiler* c, RelocKind k) {
+ const RelocDesc* d = reloc_desc(c, k);
+ return d && (d->flags & RELOC_IS_BRANCH) ? 1 : 0;
+}
+
+/* Mach-O TLV descriptor page / pageoff reloc. */
+static inline int reloc_kind_is_tlvp(const Compiler* c, RelocKind k) {
+ const RelocDesc* d = reloc_desc(c, k);
+ return d && (d->flags & RELOC_IS_TLVP) ? 1 : 0;
+}
+
+/* Mach-O ADRP-direct (non-GOT) page / pageoff reloc. */
+static inline int reloc_kind_is_direct_page(const Compiler* c, RelocKind k) {
+ const RelocDesc* d = reloc_desc(c, k);
+ return d && (d->flags & RELOC_DIRECT_PAGE) ? 1 : 0;
+}
+
+#endif
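Reviewer sketch (not part of the patch): the difference between the ELF-side "uses GOT" notion and the Mach-O-side "GOT load" notion is easy to misread, so here is a minimal standalone model. The Sk* names and the sample descriptors are assumptions made for illustration; only the two flag bit positions are copied from the RelocDescFlag enum in this patch.

```c
#include <assert.h>
#include <stddef.h>

/* Bit positions mirror RELOC_USES_GOT / RELOC_IS_TLS_GOT; names are local. */
enum { SK_USES_GOT = 1u << 0, SK_IS_TLS_GOT = 1u << 1 };

typedef struct {
  unsigned char width;
  unsigned char flags;
} SkDesc;

/* Sample descriptors (hypothetical rows, not taken from any arch slice). */
const SkDesc sk_plain = {4, 0};
const SkDesc sk_got_load = {4, SK_USES_GOT};
const SkDesc sk_tls_ie = {4, SK_IS_TLS_GOT};

/* Width gate: 0 means "no descriptor", i.e. an unsupported kind. */
unsigned sk_width(const SkDesc* d) { return d ? d->width : 0u; }

/* ELF / static GOT pass: a direct GOT load OR a TLS-IE slot needs a slot. */
int sk_uses_got(const SkDesc* d) {
  return d && (d->flags & (SK_USES_GOT | SK_IS_TLS_GOT)) ? 1 : 0;
}

/* The slot holds a TP-relative offset rather than an address. */
int sk_is_tls_got(const SkDesc* d) {
  return d && (d->flags & SK_IS_TLS_GOT) ? 1 : 0;
}

/* Mach-O notion: direct (non-TLS) GOT load only. */
int sk_is_got_load(const SkDesc* d) {
  return d && (d->flags & SK_USES_GOT) ? 1 : 0;
}
```

A TLS-IE descriptor therefore answers yes to sk_uses_got and sk_is_tls_got but no to sk_is_got_load, and a NULL descriptor answers no (or width 0) everywhere, which is the property the call sites in this patch rely on when they drop their explicit NULL-hook checks.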
diff --git a/src/link/link_reloc_layout.c b/src/link/link_reloc_layout.c
@@ -23,14 +23,9 @@
#include "link/link.h"
#include "link/link_arch.h"
#include "link/link_internal.h"
+#include "link/link_reloc_desc.h"
#include "obj/format.h"
-/* Nominal (non-zero) width reported for the variable-length ULEB128
- * RISC-V relocs. See the comment in reloc_width(): this value only has
- * to be non-zero to pass the "supported kind" gate — the byte-exact
- * width is determined at apply time from the field bytes themselves. */
-#define RELOC_RV_ULEB128_NOMINAL_WIDTH 1u
-
/* ---- pass 3: assign symbol vaddrs ---- */
void link_assign_symbol_vaddrs(Linker* l, LinkImage* img) {
@@ -251,158 +246,6 @@ void link_emit_encoding_section_boundaries(Linker* l, LinkImage* img) {
}
}
-/* ---- pass 4: reloc records ---- */
-
-static u8 reloc_width(RelocKind k) {
- switch (k) {
- case R_ABS32:
- case R_REL32:
- case R_PC32:
- case R_GOT32:
- case R_PLT32:
- case R_X64_PLT32:
- case R_X64_32S:
- case R_X64_TPOFF32:
- case R_X64_GOTPCREL:
- case R_X64_GOTPCRELX:
- case R_X64_REX_GOTPCRELX:
- case R_X64_GOTPC32:
- case R_X64_GOTTPOFF:
- return 4;
- case R_ABS64:
- case R_REL64:
- case R_PC64:
- case R_X64_TPOFF64:
- case R_X64_GLOB_DAT:
- case R_X64_JUMP_SLOT:
- case R_X64_RELATIVE:
- return 8;
- case R_AARCH64_ABS16:
- case R_AARCH64_PREL16:
- return 2;
- case R_X64_PC8:
- return 1;
- case R_AARCH64_JUMP26:
- case R_AARCH64_CALL26:
- case R_AARCH64_CONDBR19:
- case R_AARCH64_TSTBR14:
- case R_AARCH64_LD_PREL_LO19:
- case R_AARCH64_ADR_PREL_LO21:
- case R_AARCH64_ADR_PREL_PG_HI21:
- case R_AARCH64_ADR_PREL_PG_HI21_NC:
- case R_AARCH64_ADD_ABS_LO12_NC:
- case R_AARCH64_LDST8_ABS_LO12_NC:
- case R_AARCH64_LDST16_ABS_LO12_NC:
- case R_AARCH64_LDST32_ABS_LO12_NC:
- case R_AARCH64_LDST64_ABS_LO12_NC:
- case R_AARCH64_LDST128_ABS_LO12_NC:
- case R_AARCH64_ADR_GOT_PAGE:
- case R_AARCH64_LD64_GOT_LO12_NC:
- case R_AARCH64_TLSIE_ADR_GOTTPREL_PAGE21:
- case R_AARCH64_TLSIE_LD64_GOTTPREL_LO12_NC:
- case R_AARCH64_TLSLE_ADD_TPREL_HI12:
- case R_AARCH64_TLSLE_ADD_TPREL_LO12_NC:
- case R_AARCH64_TLVP_LOAD_PAGE21:
- case R_AARCH64_TLVP_LOAD_PAGEOFF12:
- return 4;
- case R_RV_HI20:
- case R_RV_LO12_I:
- case R_RV_LO12_S:
- case R_RV_BRANCH:
- case R_RV_JAL:
- case R_RV_PCREL_HI20:
- case R_RV_PCREL_LO12_I:
- case R_RV_PCREL_LO12_S:
- case R_RV_GOT_HI20:
- case R_RV_TLS_GOT_HI20:
- case R_RV_TPREL_HI20:
- case R_RV_TPREL_LO12_I:
- case R_RV_TPREL_LO12_S:
- return 4;
- case R_RV_CALL:
- return 8;
- case R_RV_RVC_BRANCH:
- case R_RV_RVC_JUMP:
- return 2;
- case R_RV_RELAX:
- case R_RV_TPREL_ADD:
- return 4;
- case R_RV_ADD8:
- case R_RV_SUB8:
- case R_RV_SUB6:
- case R_RV_SET6:
- case R_RV_SET8:
- return 1;
- case R_RV_ADD16:
- case R_RV_SUB16:
- case R_RV_SET16:
- return 2;
- case R_RV_ADD32:
- case R_RV_SUB32:
- case R_RV_SET32:
- return 4;
- case R_RV_ADD64:
- case R_RV_SUB64:
- return 8;
- case R_RV_SET_ULEB128:
- case R_RV_SUB_ULEB128:
- /* ULEB128 fields are variable-length: the true width is the number
- * of bytes the assembler reserved at the reloc offset, which is
- * data-dependent and only knowable from the section bytes at the
- * site. reloc_width() is keyed solely on RelocKind and has no view
- * of those bytes, and the width it returns is consumed ONLY as a
- * non-zero "is this kind supported?" gate in link_emit_relocations
- * (LinkRelocApply.width is never read by any apply or output path —
- * link_reloc_apply is dispatched on RelocKind and re-reads the
- * encoded ULEB128 length straight from P_bytes). So we return a
- * fixed sentinel here purely to pass that gate; the byte-exact
- * width is established at apply time in link_reloc_apply.
- * RELOC_RV_ULEB128_NOMINAL_WIDTH is the common 1-byte case for the
- * small DWARF symbol differences these encode. */
- return RELOC_RV_ULEB128_NOMINAL_WIDTH;
- case R_COFF_SECREL:
- case R_COFF_ADDR32NB:
- return 4;
- case R_COFF_SECTION:
- return 2;
- case R_COFF_AARCH64_SECREL_LOW12A:
- case R_COFF_AARCH64_SECREL_HIGH12A:
- return 4;
- default:
- return 0;
- }
-}
-
-/* TLS Initial-Exec relocs that load a GOT slot holding the symbol's
- * TP-relative offset (rather than its address). They take an ordinary GOT
- * slot, but the slot is filled with the tpoff value at link time -- see the
- * slot_is_tls handling in link_layout_got. */
-static int reloc_is_tls_got(u16 kind) {
- switch (kind) {
- case R_X64_GOTTPOFF:
- case R_AARCH64_TLSIE_ADR_GOTTPREL_PAGE21:
- case R_AARCH64_TLSIE_LD64_GOTTPREL_LO12_NC:
- case R_RV_TLS_GOT_HI20:
- return 1;
- default:
- return 0;
- }
-}
-
-static int reloc_uses_got(u16 kind) {
- switch (kind) {
- case R_AARCH64_ADR_GOT_PAGE:
- case R_AARCH64_LD64_GOT_LO12_NC:
- case R_X64_GOTPCREL:
- case R_X64_GOTPCRELX:
- case R_X64_REX_GOTPCRELX:
- case R_RV_GOT_HI20:
- return 1;
- default:
- return reloc_is_tls_got(kind);
- }
-}
-
/* ---- iplt alloc helpers (used by layout_jit_call_stubs too) ---- */
u32 link_iplt_alloc_segments(LinkImage* img, u32 nseg) {
@@ -572,7 +415,7 @@ void link_layout_jit_stubs(Linker* l, LinkImage* img, u32 map_size,
*stub_map_out = NULL;
arch = link_arch_desc_for(l->c);
if (l->emit_static_exe) return;
- if (!arch || !arch->needs_jit_call_stub) return;
+ if (!arch) return;
stub_map = (LinkSymId*)h->alloc(h, sizeof(*stub_map) * map_size,
_Alignof(LinkSymId));
@@ -591,7 +434,7 @@ void link_layout_jit_stubs(Linker* l, LinkImage* img, u32 map_size,
const LinkSymbol* tgt;
if (!s || !link_section_kept(s)) continue;
if (link_input_reloc_section(m, r, k) == LINK_SEC_NONE) continue;
- if (!arch->needs_jit_call_stub(r->kind)) continue;
+ if (!reloc_kind_is_branch(l->c, r->kind)) continue;
if (r->sym == OBJ_SYM_NONE || r->sym >= m->nsym) continue;
target = m->sym[r->sym];
if (target == LINK_SYM_NONE) continue;
@@ -734,14 +577,15 @@ void link_layout_got(Linker* l, LinkImage* img, u32 map_size,
LinkSymId target;
if (!s || !link_section_kept(s)) continue;
if (link_input_reloc_section(m, r, k) == LINK_SEC_NONE) continue;
- if (!reloc_uses_got(r->kind)) continue;
+ if (!reloc_kind_uses_got(l->c, r->kind)) continue;
if (r->sym == OBJ_SYM_NONE || r->sym >= m->nsym) continue;
target = m->sym[r->sym];
if (target == LINK_SYM_NONE) continue;
if (got_map[target] != LINK_SYM_NONE) {
/* A later reloc on the same target may reveal it is a TLS slot
* even if the slot was created by a non-TLS reference first. */
- if (reloc_is_tls_got(r->kind)) slot_is_tls[got_map[target] - 1u] = 1u;
+ if (reloc_kind_is_tls_got(l->c, r->kind))
+ slot_is_tls[got_map[target] - 1u] = 1u;
continue;
}
if (VEC_GROW(h, slot_targets, slot_cap, nslot + 1u))
@@ -749,7 +593,7 @@ void link_layout_got(Linker* l, LinkImage* img, u32 map_size,
if (VEC_GROW(h, slot_is_tls, tls_cap, nslot + 1u))
compiler_panic(img->c, SRCLOC_NONE, "link: oom on got slot tls map");
slot_targets[nslot] = target;
- slot_is_tls[nslot] = reloc_is_tls_got(r->kind) ? 1u : 0u;
+ slot_is_tls[nslot] = reloc_kind_is_tls_got(l->c, r->kind) ? 1u : 0u;
got_map[target] = (LinkSymId)(nslot + 1u);
nslot++;
}
@@ -803,14 +647,6 @@ void link_layout_got(Linker* l, LinkImage* img, u32 map_size,
/* ---- pass 3d: STT_GNU_IFUNC trampoline ---- */
-/* The arch's R_*_IRELATIVE wire type, for the static __rela_iplt table. */
-static u32 link_elf_irelative_type(Compiler* c) {
- const ObjFormatImpl* fmt = obj_format_lookup(KIT_OBJ_ELF);
- const ObjElfArchOps* ao =
- (fmt && fmt->elf_arch) ? fmt->elf_arch(c->target.arch) : NULL;
- return ao ? ao->r_irelative : 0u;
-}
-
void link_layout_iplt(Linker* l, LinkImage* img) {
Heap* h = img->heap;
u32 i;
@@ -835,7 +671,8 @@ void link_layout_iplt(Linker* l, LinkImage* img) {
LinkSectionId rela_iplt_sec_id = 0;
u64 rela_iplt_vaddr = 0, rela_iplt_size = 0;
u8* rela_iplt_bytes = NULL;
- u32 irelative_type = use_rela_iplt ? link_elf_irelative_type(l->c) : 0u;
+ u32 irelative_type =
+ use_rela_iplt ? obj_format_static_ifunc_irelative_type(l->c) : 0u;
LinkSymId ifunc_init_sym = LINK_SYM_NONE;
Sym ifunc_init_name = 0;
const LinkArchDesc* arch = link_arch_desc_for(l->c);
@@ -1056,7 +893,6 @@ void link_resolve_entry(Linker* l, LinkImage* img) {
void link_emit_relocations(Linker* l, LinkImage* img, const LinkSymId* got_map,
const LinkSymId* stub_map) {
- const LinkArchDesc* arch = link_arch_desc_for(l->c);
u32 ii;
for (ii = 0; ii < LinkInputs_count(&l->inputs); ++ii) {
ObjBuilder* ob = LinkInputs_at(&l->inputs, ii)->obj;
@@ -1083,15 +919,14 @@ void link_emit_relocations(Linker* l, LinkImage* img, const LinkSymId* got_map,
if (target == LINK_SYM_NONE)
compiler_panic(l->c, SRCLOC_NONE,
"link: reloc references unmapped symbol");
- if (got_map && reloc_uses_got(r->kind)) {
+ if (got_map && reloc_kind_uses_got(l->c, r->kind)) {
LinkSymId slot = got_map[target];
if (slot == LINK_SYM_NONE)
compiler_panic(l->c, SRCLOC_NONE,
"link: GOT slot missing for symbol");
target = slot;
}
- if (stub_map && arch && arch->needs_jit_call_stub &&
- arch->needs_jit_call_stub(r->kind)) {
+ if (stub_map && reloc_kind_is_branch(l->c, r->kind)) {
LinkSymId stub = stub_map[target];
if (stub != LINK_SYM_NONE) target = stub;
}
@@ -1105,7 +940,7 @@ void link_emit_relocations(Linker* l, LinkImage* img, const LinkSymId* got_map,
rec.section_id = r->section_id;
rec.link_section_id = ls->id;
rec.offset = r->offset - (u32)ls->obj_offset;
- rec.width = reloc_width((RelocKind)r->kind);
+ rec.width = reloc_kind_width(l->c, (RelocKind)r->kind);
rec.write_vaddr = ls->vaddr + rec.offset;
rec.write_file_offset = ls->file_offset + rec.offset;
rec.kind = (RelocKind)r->kind;
diff --git a/src/obj/macho/link.c b/src/obj/macho/link.c
@@ -50,6 +50,7 @@
#include "core/vec.h"
#include "link/link_arch.h"
#include "link/link_internal.h"
+#include "link/link_reloc_desc.h"
#include "obj/format.h"
#include "obj/macho/macho.h"
@@ -417,9 +418,7 @@ static void collect_imports(MCtx* x) {
/* Back-classify: any CALL26/JUMP26 reloc target -> function. */
for (u32 i = 0; i < LinkRelocs_count(&img->relocs); ++i) {
LinkRelocApply* r = LinkRelocs_at(&img->relocs, i);
- if (!x->link_arch->is_branch_reloc ||
- !x->link_arch->is_branch_reloc(r->kind))
- continue;
+ if (!reloc_kind_is_branch(x->c, r->kind)) continue;
if (r->target == LINK_SYM_NONE || r->target >= x->sym_to_imp_size) continue;
u32 idx = x->sym_to_imp[r->target];
if (!idx) {
@@ -489,9 +488,7 @@ static void collect_imports(MCtx* x) {
* post-ASLR. */
for (u32 i = 0; i < LinkRelocs_count(&img->relocs); ++i) {
LinkRelocApply* r = LinkRelocs_at(&img->relocs, i);
- if (!x->link_arch->is_got_load_reloc ||
- !x->link_arch->is_got_load_reloc(r->kind))
- continue;
+ if (!reloc_kind_is_got_load(x->c, r->kind)) continue;
if (r->target == LINK_SYM_NONE || r->target >= x->sym_to_imp_size) continue;
if (x->sym_to_imp[r->target]) continue;
LinkSymbol* t = sym_at(img, r->target);
@@ -563,8 +560,7 @@ static void collect_tlv(MCtx* x) {
u32 cap = 0;
for (u32 i = 0; i < LinkRelocs_count(&img->relocs); ++i) {
LinkRelocApply* r = LinkRelocs_at(&img->relocs, i);
- if (!x->link_arch->is_tlvp_reloc || !x->link_arch->is_tlvp_reloc(r->kind))
- continue;
+ if (!reloc_kind_is_tlvp(x->c, r->kind)) continue;
if (r->target == LINK_SYM_NONE || r->target >= x->sym_to_tlv_size) continue;
/* Resolve through canonical so multiple per-input duplicate undefs
* collapse onto one __thread_ptrs slot. */
@@ -793,7 +789,8 @@ static void plan_layout(MCtx* x) {
if (x->nimports) {
x->got_size = x->nimports * MZ_GOT_SIZE;
x->got_bytes = (u8*)h->alloc(h, x->got_size, 8);
- if (!x->got_bytes) compiler_panic(x->c, SRCLOC_NONE, "link_macho: oom on got");
+ if (!x->got_bytes)
+ compiler_panic(x->c, SRCLOC_NONE, "link_macho: oom on got");
memset(x->got_bytes, 0, x->got_size);
MSec* m = &x->secs[x->nsecs++];
memset(m, 0, sizeof(*m));
@@ -1201,7 +1198,8 @@ static void plan_layout(MCtx* x) {
}
u32 cap = x->nsecs + 1u;
x->outs = (OutSec*)h->alloc(h, sizeof(OutSec) * cap, _Alignof(OutSec));
- if (!x->outs) compiler_panic(x->c, SRCLOC_NONE, "link_macho: oom on OutSec");
+ if (!x->outs)
+ compiler_panic(x->c, SRCLOC_NONE, "link_macho: oom on OutSec");
memset(x->outs, 0, sizeof(OutSec) * cap);
x->nouts = 0;
for (u32 i = 0; i < x->nsecs; ++i) {
@@ -1480,7 +1478,7 @@ static void apply_relocs(MCtx* x, FixList* fl) {
* before the import / internal split because an imported TLV
* descriptor doesn't use the __got slot (its address lives in
* __thread_ptrs with its own chained bind). */
- if (x->link_arch->is_tlvp_reloc && x->link_arch->is_tlvp_reloc(r->kind)) {
+ if (reloc_kind_is_tlvp(x->c, r->kind)) {
u32 tlv_idx =
(r->target < x->sym_to_tlv_size) ? x->sym_to_tlv[r->target] : 0u;
if (!tlv_idx)
@@ -1493,8 +1491,7 @@ static void apply_relocs(MCtx* x, FixList* fl) {
if (is_imp) {
MachImp* mi = (imp_idx > 0) ? &x->imports[imp_idx - 1] : NULL;
- if (x->link_arch->is_branch_reloc &&
- x->link_arch->is_branch_reloc(r->kind)) {
+ if (reloc_kind_is_branch(x->c, r->kind)) {
if (!mi || !mi->stub_idx)
compiler_panic(x->c, SRCLOC_NONE,
"link_macho: import has no stub for branch");
@@ -1502,8 +1499,7 @@ static void apply_relocs(MCtx* x, FixList* fl) {
link_reloc_apply(x->c, r->kind, P_bytes, stub_v, r->addend, P);
continue;
}
- if (x->link_arch->is_got_load_reloc &&
- x->link_arch->is_got_load_reloc(r->kind)) {
+ if (reloc_kind_is_got_load(x->c, r->kind)) {
if (!mi)
compiler_panic(x->c, SRCLOC_NONE,
"link_macho: GOT reloc for unknown import");
@@ -1511,8 +1507,7 @@ static void apply_relocs(MCtx* x, FixList* fl) {
link_reloc_apply(x->c, r->kind, P_bytes, got_v, r->addend, P);
continue;
}
- if (x->link_arch->is_direct_page_reloc &&
- x->link_arch->is_direct_page_reloc(r->kind)) {
+ if (reloc_kind_is_direct_page(x->c, r->kind)) {
/* Direct page/lo12 against an import: route through __got. */
if (!mi)
compiler_panic(x->c, SRCLOC_NONE,
@@ -1560,8 +1555,7 @@ static void apply_relocs(MCtx* x, FixList* fl) {
* for any extern global, even if the def is in-image). imp_idx
* was populated by collect_imports' internal-GOT pass; redirect
* the page/lo12 reloc to the GOT slot's vaddr. */
- if (imp_idx > 0 && x->link_arch->is_got_load_reloc &&
- x->link_arch->is_got_load_reloc(r->kind)) {
+ if (imp_idx > 0 && reloc_kind_is_got_load(x->c, r->kind)) {
MachImp* mi = &x->imports[imp_idx - 1];
u64 got_v = x->got_vaddr + (mi->got_idx - 1u) * MZ_GOT_SIZE;
link_reloc_apply(x->c, r->kind, P_bytes, got_v, r->addend, P);
@@ -1969,7 +1963,8 @@ static void build_exports_trie(MCtx* x) {
uleb128(out, leaf_pos);
/* leaf node */
if (out->len != leaf_pos)
- compiler_panic(x->c, SRCLOC_NONE, "macho: exports trie leaf offset mismatch");
+ compiler_panic(x->c, SRCLOC_NONE,
+ "macho: exports trie leaf offset mismatch");
/* terminal_size byte then payload */
mbuf_u8(out, (u8)leaf_payload_len);
uleb128(out, flags);
diff --git a/src/obj/obj.h b/src/obj/obj.h
@@ -450,7 +450,8 @@ ObjSymId obj_symbol_ex(ObjBuilder*, Sym name, SymBind, SymVis, SymKind,
u64 common_align);
/* Allocate a stable symbol id for data that may be discarded before emission.
* The returned symbol is tombstoned and not entered in the name index; callers
- * must publish it with obj_symbol_define_live if the data is actually emitted. */
+ * must publish it with obj_symbol_define_live if the data is actually emitted.
+ */
ObjSymId obj_symbol_defer(ObjBuilder*, Sym name, SymBind, SymVis, SymKind,
u64 size);
ObjSymId obj_symbol_find(ObjBuilder*, Sym name);
@@ -825,6 +826,12 @@ int obj_format_weak_extern_underscore_alias(const Compiler*);
* knowledge lives. */
int obj_format_static_ifunc_via_rela_iplt(const Compiler*);
+/* The R_*_IRELATIVE resolver reloc wire type for the active target's
+ * __rela_iplt table (paired with the predicate above), resolved through the
+ * target object format so the generic iplt pass names no format literal.
+ * Returns 0 when the format has no such reloc. */
+u32 obj_format_static_ifunc_irelative_type(const Compiler*);
+
/* Per-arch variant-I TP bias for the active target's ELF arch: distance
* from the TLS image start to where `tp` points in kit's freestanding
* layout (16 for AArch64/RISC-V, 0 for x86_64 variant-II). Returns 0
diff --git a/src/obj/obj_secnames.c b/src/obj/obj_secnames.c
@@ -76,8 +76,7 @@ const char* obj_macho_canon_secname(SecKind kind) {
* rules of the Mach-O reader (sec_kind_from_seg_sect in macho/read.c)
* for the canonical names, but is name-only (no S_TYPE flags) so a
* format-neutral caller can classify without the raw section header. */
-int obj_macho_seckind_for_secname(const char* name, size_t len,
- SecKind* kind) {
+int obj_macho_seckind_for_secname(const char* name, size_t len, SecKind* kind) {
const char* comma;
size_t seg_len, sect_off, sect_len;
if (!name || len == 0) return 0;
@@ -354,7 +353,8 @@ int obj_format_supports_symbol_feature(const Compiler* c, int symfeat) {
/* The only format-divergent feature axis today is TLS access: only ELF and
* Mach-O can represent the ELF/Mach-O TLS-access features the CG layer mints.
* COFF (Windows TEB model) and Wasm cannot. Every other (non-TLS) feature is
- * representable by every format. The per-format answer lives on the vtable. */
+ * representable by every format. The per-format answer lives on the vtable.
+ */
switch (symfeat) {
case KIT_CG_SYMFEAT_TLS_LOCAL_EXEC:
case KIT_CG_SYMFEAT_TLS_INITIAL_EXEC:
@@ -376,6 +376,19 @@ int obj_format_static_ifunc_via_rela_iplt(const Compiler* c) {
return c && c->target.os == KIT_OS_FREEBSD && c->target.obj == KIT_OBJ_ELF;
}
+u32 obj_format_static_ifunc_irelative_type(const Compiler* c) {
+ /* The R_*_IRELATIVE resolver wire type for the __rela_iplt table the
+ * predicate above selects. Resolves through the *target* format rather
+ * than a literal KIT_OBJ_ELF so the generic iplt pass names no format
+ * constant; non-ELF formats have no elf_arch and yield 0. */
+ const ObjFormatImpl* fmt;
+ const ObjElfArchOps* ao;
+ if (!c) return 0u;
+ fmt = obj_format_lookup(c->target.obj);
+ ao = (fmt && fmt->elf_arch) ? fmt->elf_arch(c->target.arch) : NULL;
+ return ao ? ao->r_irelative : 0u;
+}
+
u32 obj_format_elf_tls_tp_bias(const Compiler* c) {
const ObjFormatImpl* fmt;
const ObjElfArchOps* arch;
diff --git a/src/obj/reloc.c b/src/obj/reloc.c
@@ -0,0 +1,43 @@
+/* Arch-neutral relocation descriptors + the shared row-table lookup.
+ *
+ * This is the format-neutral obj-core half of the relocation descriptor:
+ * the arch-independent data-word kinds whose width and (absent)
+ * classification are the same on every target. The arch-family kinds live
+ * in src/arch/<arch>/reloc.c and are stitched in by reloc_desc() in the
+ * link layer. See doc/plan/RELOC.md (WS-B). */
+
+#include "obj/reloc.h"
+
+const RelocDesc* reloc_desc_row_find(const RelocDescRow* rows, u32 n,
+                                     RelocKind k) {
+  u32 i;
+  for (i = 0; i < n; ++i)
+    if (rows[i].kind == (u16)k) return &rows[i].desc;
+  return NULL;
+}
+
+/* Arch-independent kinds: same width / byte encoding on every target.
+ *
+ * The COFF section-relative kinds (SECREL / SECTION / ADDR32NB) are
+ * likewise arch-neutral data words and belong here; the COFF *instruction*
+ * relocs (R_COFF_AARCH64_*) are AArch64 ISA encoders and live in that
+ * arch's slice.
+ *
+ * R_PLT32 is neutral for width (4) but its IS_BRANCH classification is
+ * arch-local — true on x86_64 / RISC-V, false on AArch64. The branch flag
+ * is therefore set in the x64 / riscv slices (which reloc_desc() consults
+ * first); this row supplies both the width fallback and the correct
+ * "not a branch" answer on AArch64. */
+static const RelocDescRow neutral_rows[] = {
+ {R_ABS32, {4, 0}}, {R_ABS64, {8, 0}},
+ {R_REL32, {4, 0}}, {R_REL64, {8, 0}},
+ {R_PC32, {4, 0}}, {R_PC64, {8, 0}},
+ {R_GOT32, {4, 0}}, {R_PLT32, {4, 0}},
+ {R_COFF_SECREL, {4, 0}}, {R_COFF_SECTION, {2, 0}},
+ {R_COFF_ADDR32NB, {4, 0}},
+};
+
+const RelocDesc* reloc_desc_neutral(RelocKind k) {
+ return reloc_desc_row_find(
+ neutral_rows, (u32)(sizeof neutral_rows / sizeof neutral_rows[0]), k);
+}
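Reviewer sketch (not part of the patch): the claim that adding an arch's relocation is "one row in that arch's slice" can be pictured with a standalone row table and the same linear scan. The per-arch slice files are not shown in this part of the diff, so the table below is an assumption about their general shape; the Demo* names, kind numbers, widths, and flags are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
  unsigned char width;
  unsigned char flags;
} DemoDesc;

typedef struct {
  unsigned short kind; /* relocation kind narrowed to 16 bits */
  DemoDesc desc;
} DemoRow;

/* Hypothetical kinds and flags for the sketch. */
enum { DEMO_CALL = 100, DEMO_GOT_HI = 101, DEMO_MISSING = 102 };
enum { DEMO_USES_GOT = 1u << 0, DEMO_IS_BRANCH = 1u << 2 };

/* Linear scan over a static table; NULL when the kind has no row. */
const DemoDesc* demo_row_find(const DemoRow* rows, size_t n, unsigned kind) {
  size_t i;
  for (i = 0; i < n; ++i)
    if ((unsigned)rows[i].kind == kind) return &rows[i].desc;
  return NULL;
}

/* A per-arch slice is just data: supporting a new kind means one more row. */
static const DemoRow demo_arch_rows[] = {
    {DEMO_CALL, {8, DEMO_IS_BRANCH}},
    {DEMO_GOT_HI, {4, DEMO_USES_GOT}},
};

const DemoDesc* demo_arch_desc(unsigned kind) {
  return demo_row_find(demo_arch_rows,
                       sizeof demo_arch_rows / sizeof demo_arch_rows[0], kind);
}
```

Returning NULL for an absent row is what lets a dispatcher fall through to a shared table, so a slice only has to list the kinds it owns or refines.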
diff --git a/src/obj/reloc.h b/src/obj/reloc.h
@@ -0,0 +1,52 @@
+#ifndef KIT_OBJ_RELOC_H
+#define KIT_OBJ_RELOC_H
+
+#include "core/core.h"
+#include "obj/obj.h"
+
+/* Static, structural facts about a relocation kind: how wide the patched
+ * field is and how the linker must classify it. This is the arch-owned
+ * "width + classification" half of a relocation kind; the wire encoding
+ * and the diagnostic name live on the per-(arch,format) wire ops in
+ * src/obj/<fmt>/reloc_<arch>.c. See doc/plan/RELOC.md (WS-B).
+ *
+ * A kind's descriptor is resolved arch-aware via reloc_desc() in
+ * src/link/link_reloc_desc.h: each arch owns a slice (src/arch/<arch>/
+ * reloc.c) reached through LinkArchDesc.reloc_desc, falling back to the
+ * arch-neutral table here. Adding an arch's relocation is one row in that
+ * arch's slice (plus its wire-translator entry) — no generic switch. */
+typedef enum RelocDescFlag {
+ RELOC_USES_GOT = 1u << 0, /* direct GOT load: needs a (non-TLS) GOT slot */
+ RELOC_IS_TLS_GOT = 1u << 1, /* GOT slot holds a TP-relative offset (TLS-IE) */
+ RELOC_IS_BRANCH = 1u << 2, /* range-limited call/jump; may need a veneer */
+ RELOC_IS_TLVP = 1u << 3, /* Mach-O TLV descriptor page / pageoff */
+ RELOC_DIRECT_PAGE = 1u << 4, /* Mach-O ADRP-direct (non-GOT) page / pageoff */
+ RELOC_MARKER = 1u << 5, /* no bytes patched (RELAX / TPREL_ADD) */
+ RELOC_WIDTH_DYN = 1u << 6, /* width read from the bytes at apply (ULEB128) */
+} RelocDescFlag;
+
+typedef struct RelocDesc {
+ u8 width; /* patched-field width in bytes; nominal when RELOC_WIDTH_DYN */
+ u8 flags; /* RelocDescFlag bitset */
+} RelocDesc;
+
+/* One row of a per-arch / neutral descriptor table. `kind` holds a
+ * RelocKind narrowed to u16 (the enum fits comfortably). */
+typedef struct RelocDescRow {
+ u16 kind;
+ RelocDesc desc;
+} RelocDescRow;
+
+/* Linear lookup over a static row table; returns the matching row's
+ * descriptor or NULL. Per-arch slices and the neutral table are small and
+ * looked up off the hot path, so a scan keeps each slice a plain data
+ * table with no parallel index to maintain. */
+const RelocDesc* reloc_desc_row_find(const RelocDescRow* rows, u32 n,
+ RelocKind k);
+
+/* Descriptor for an arch-independent (data-word / format-neutral) kind, or
+ * NULL. Arch-family kinds resolve through the per-arch slice instead — use
+ * reloc_desc() (src/link/link_reloc_desc.h), not this directly. */
+const RelocDesc* reloc_desc_neutral(RelocKind k);
+
+#endif
diff --git a/test/link/reloc_desc_test.c b/test/link/reloc_desc_test.c
@@ -0,0 +1,296 @@
+/* Relocation-descriptor migration guard (doc/plan/RELOC.md, WS-B).
+ *
+ * The per-arch RelocDesc table replaced the generic reloc_width /
+ * reloc_uses_got / reloc_is_tls_got switches and the per-arch
+ * LinkArchDesc.is_branch / is_got_load / is_tlvp / is_direct_page /
+ * needs_jit_call_stub hooks. This test pins behavioural equivalence: for
+ * every RelocKind, under every backend arch, reloc_desc() and the
+ * reloc_kind_* predicates must reproduce the frozen pre-refactor behaviour
+ * captured by the oracle_* functions below.
+ *
+ * The oracle_* bodies are VERBATIM snapshots of the deleted code — do not
+ * "improve" them; they are the spec the descriptor table must match.
+ *
+ * This unit guard pins three things: the width of every row that exists,
+ * the classification of every kind, and that no previously-sized kind lost
+ * its descriptor. It does not check per-arch *ownership* of non-classifier
+ * kinds (a kind sized under the wrong arch); that is left to the bootstrap /
+ * smoke link oracles, which patch every kind a backend actually emits. */
+
+#include <kit/cg.h>
+#include <kit/core.h>
+
+#include "core/core.h"
+#include "lib/kit_unit.h"
+#include "link/link_reloc_desc.h"
+#include "obj/obj.h"
+#include "obj/reloc.h"
+
+static KitUnit g_u;
+#define EXPECT(cond, ...) CU_EXPECT(&g_u, cond, __VA_ARGS__)
+
+/* ============================================================
+ * Frozen pre-refactor behaviour (the migration guard)
+ * ============================================================ */
+
+#define ORACLE_RV_ULEB128_NOMINAL_WIDTH 1u
+
+/* Verbatim snapshot of the former reloc_width() (arch-independent). */
+static u8 oracle_width(RelocKind k) {
+ switch (k) {
+ case R_ABS32:
+ case R_REL32:
+ case R_PC32:
+ case R_GOT32:
+ case R_PLT32:
+ case R_X64_PLT32:
+ case R_X64_32S:
+ case R_X64_TPOFF32:
+ case R_X64_GOTPCREL:
+ case R_X64_GOTPCRELX:
+ case R_X64_REX_GOTPCRELX:
+ case R_X64_GOTPC32:
+ case R_X64_GOTTPOFF:
+ return 4;
+ case R_ABS64:
+ case R_REL64:
+ case R_PC64:
+ case R_X64_TPOFF64:
+ case R_X64_GLOB_DAT:
+ case R_X64_JUMP_SLOT:
+ case R_X64_RELATIVE:
+ return 8;
+ case R_AARCH64_ABS16:
+ case R_AARCH64_PREL16:
+ return 2;
+ case R_X64_PC8:
+ return 1;
+ case R_AARCH64_JUMP26:
+ case R_AARCH64_CALL26:
+ case R_AARCH64_CONDBR19:
+ case R_AARCH64_TSTBR14:
+ case R_AARCH64_LD_PREL_LO19:
+ case R_AARCH64_ADR_PREL_LO21:
+ case R_AARCH64_ADR_PREL_PG_HI21:
+ case R_AARCH64_ADR_PREL_PG_HI21_NC:
+ case R_AARCH64_ADD_ABS_LO12_NC:
+ case R_AARCH64_LDST8_ABS_LO12_NC:
+ case R_AARCH64_LDST16_ABS_LO12_NC:
+ case R_AARCH64_LDST32_ABS_LO12_NC:
+ case R_AARCH64_LDST64_ABS_LO12_NC:
+ case R_AARCH64_LDST128_ABS_LO12_NC:
+ case R_AARCH64_ADR_GOT_PAGE:
+ case R_AARCH64_LD64_GOT_LO12_NC:
+ case R_AARCH64_TLSIE_ADR_GOTTPREL_PAGE21:
+ case R_AARCH64_TLSIE_LD64_GOTTPREL_LO12_NC:
+ case R_AARCH64_TLSLE_ADD_TPREL_HI12:
+ case R_AARCH64_TLSLE_ADD_TPREL_LO12_NC:
+ case R_AARCH64_TLVP_LOAD_PAGE21:
+ case R_AARCH64_TLVP_LOAD_PAGEOFF12:
+ return 4;
+ case R_RV_HI20:
+ case R_RV_LO12_I:
+ case R_RV_LO12_S:
+ case R_RV_BRANCH:
+ case R_RV_JAL:
+ case R_RV_PCREL_HI20:
+ case R_RV_PCREL_LO12_I:
+ case R_RV_PCREL_LO12_S:
+ case R_RV_GOT_HI20:
+ case R_RV_TLS_GOT_HI20:
+ case R_RV_TPREL_HI20:
+ case R_RV_TPREL_LO12_I:
+ case R_RV_TPREL_LO12_S:
+ return 4;
+ case R_RV_CALL:
+ return 8;
+ case R_RV_RVC_BRANCH:
+ case R_RV_RVC_JUMP:
+ return 2;
+ case R_RV_RELAX:
+ case R_RV_TPREL_ADD:
+ return 4;
+ case R_RV_ADD8:
+ case R_RV_SUB8:
+ case R_RV_SUB6:
+ case R_RV_SET6:
+ case R_RV_SET8:
+ return 1;
+ case R_RV_ADD16:
+ case R_RV_SUB16:
+ case R_RV_SET16:
+ return 2;
+ case R_RV_ADD32:
+ case R_RV_SUB32:
+ case R_RV_SET32:
+ return 4;
+ case R_RV_ADD64:
+ case R_RV_SUB64:
+ return 8;
+ case R_RV_SET_ULEB128:
+ case R_RV_SUB_ULEB128:
+ return ORACLE_RV_ULEB128_NOMINAL_WIDTH;
+ case R_COFF_SECREL:
+ case R_COFF_ADDR32NB:
+ return 4;
+ case R_COFF_SECTION:
+ return 2;
+ case R_COFF_AARCH64_SECREL_LOW12A:
+ case R_COFF_AARCH64_SECREL_HIGH12A:
+ return 4;
+ default:
+ return 0;
+ }
+}
+
+/* Per-arch classification snapshots. The deleted reloc_uses_got /
+ * reloc_is_tls_got were arch-independent switches; they partition cleanly
+ * by arch (each case belongs to exactly one backend), so the equivalent
+ * arch-scoped predicate is the per-arch slice below. branch / got_load /
+ * tlvp / direct_page mirror the former per-arch LinkArchDesc hooks (NULL
+ * hook == always 0). */
+
+/* RELOC_USES_GOT set (direct GOT load): the Mach-O is_got_load hook for
+ * aa64/x64, and the GOT-allocating subset of reloc_uses_got for rv. */
+static int oracle_aa64_got_use(RelocKind k) {
+ return k == R_AARCH64_ADR_GOT_PAGE || k == R_AARCH64_LD64_GOT_LO12_NC;
+}
+static int oracle_x64_got_use(RelocKind k) {
+ return k == R_X64_GOTPCREL || k == R_X64_GOTPCRELX ||
+ k == R_X64_REX_GOTPCRELX;
+}
+static int oracle_rv_got_use(RelocKind k) { return k == R_RV_GOT_HI20; }
+
+/* RELOC_IS_TLS_GOT set (TLS Initial-Exec GOT slot). */
+static int oracle_aa64_tls_got(RelocKind k) {
+ return k == R_AARCH64_TLSIE_ADR_GOTTPREL_PAGE21 ||
+ k == R_AARCH64_TLSIE_LD64_GOTTPREL_LO12_NC;
+}
+static int oracle_x64_tls_got(RelocKind k) { return k == R_X64_GOTTPOFF; }
+static int oracle_rv_tls_got(RelocKind k) { return k == R_RV_TLS_GOT_HI20; }
+
+/* needs_jit_call_stub / is_branch_reloc. */
+static int oracle_aa64_branch(RelocKind k) {
+ return k == R_AARCH64_CALL26 || k == R_AARCH64_JUMP26;
+}
+static int oracle_x64_branch(RelocKind k) {
+ return k == R_X64_PLT32 || k == R_PLT32;
+}
+static int oracle_rv_branch(RelocKind k) {
+ return k == R_RV_CALL || k == R_PLT32;
+}
+
+/* is_tlvp_reloc (Mach-O aa64 only). */
+static int oracle_aa64_tlvp(RelocKind k) {
+ return k == R_AARCH64_TLVP_LOAD_PAGE21 || k == R_AARCH64_TLVP_LOAD_PAGEOFF12;
+}
+
+/* is_direct_page_reloc (Mach-O aa64 only). */
+static int oracle_aa64_direct_page(RelocKind k) {
+ switch (k) {
+ case R_AARCH64_ADR_PREL_PG_HI21:
+ case R_AARCH64_ADR_PREL_PG_HI21_NC:
+ case R_AARCH64_ADD_ABS_LO12_NC:
+ case R_AARCH64_LDST8_ABS_LO12_NC:
+ case R_AARCH64_LDST16_ABS_LO12_NC:
+ case R_AARCH64_LDST32_ABS_LO12_NC:
+ case R_AARCH64_LDST64_ABS_LO12_NC:
+ case R_AARCH64_LDST128_ABS_LO12_NC:
+ return 1;
+ default:
+ return 0;
+ }
+}
+
+static int zero_oracle(RelocKind k) {
+ (void)k;
+ return 0;
+}
+
+typedef struct ArchOracle {
+ const char* name;
+ KitArchKind arch;
+ int (*got_use)(RelocKind);
+ int (*tls_got)(RelocKind);
+ int (*branch)(RelocKind);
+ int (*tlvp)(RelocKind);
+ int (*direct_page)(RelocKind);
+} ArchOracle;
+
+static const ArchOracle kArchOracles[] = {
+ {"aarch64", KIT_ARCH_ARM_64, oracle_aa64_got_use, oracle_aa64_tls_got,
+ oracle_aa64_branch, oracle_aa64_tlvp, oracle_aa64_direct_page},
+ {"x86_64", KIT_ARCH_X86_64, oracle_x64_got_use, oracle_x64_tls_got,
+ oracle_x64_branch, zero_oracle, zero_oracle},
+ {"rv64", KIT_ARCH_RV64, oracle_rv_got_use, oracle_rv_tls_got,
+ oracle_rv_branch, zero_oracle, zero_oracle},
+ {"rv32", KIT_ARCH_RV32, oracle_rv_got_use, oracle_rv_tls_got,
+ oracle_rv_branch, zero_oracle, zero_oracle},
+};
+
+/* Last RelocKind enum value; the enum is contiguous from R_NONE = 0. */
+#define RELOC_KIND_LAST R_COFF_ADDR32NB
+
+static KitCompiler* new_compiler(KitArchKind arch) {
+ KitTargetSpec t = kit_unit_target(arch, KIT_OS_LINUX, KIT_OBJ_ELF);
+ KitCompiler* c = NULL;
+ if (arch == KIT_ARCH_RV32) {
+ t.ptr_size = 4;
+ t.ptr_align = 4;
+ }
+ if (kit_unit_compiler_new(&g_u, t, &c) != KIT_OK || !c) {
+ fprintf(stderr, "compiler_new failed for arch=%d\n", (int)arch);
+ exit(2);
+ }
+ return c;
+}
+
+int main(void) {
+ size_t a;
+ int covered[RELOC_KIND_LAST + 1];
+ int kk;
+
+ kit_unit_init(&g_u);
+ for (kk = 0; kk <= (int)RELOC_KIND_LAST; ++kk) covered[kk] = 0;
+
+ for (a = 0; a < sizeof kArchOracles / sizeof kArchOracles[0]; ++a) {
+ const ArchOracle* ao = &kArchOracles[a];
+ KitCompiler* c = new_compiler(ao->arch);
+ for (kk = 0; kk <= (int)RELOC_KIND_LAST; ++kk) {
+ RelocKind k = (RelocKind)kk;
+ const RelocDesc* d = reloc_desc(c, k);
+
+ /* Width parity wherever a row exists; coverage tracked separately. */
+ if (d) {
+ covered[kk] = 1;
+ EXPECT(d->width == oracle_width(k), "%s: width(%d) = %u, want %u",
+ ao->name, kk, (unsigned)d->width, (unsigned)oracle_width(k));
+ }
+
+ /* Classification parity, strict per arch (arch-scoped both sides). */
+ EXPECT(reloc_kind_is_got_load(c, k) == ao->got_use(k),
+ "%s: is_got_load(%d) mismatch", ao->name, kk);
+ EXPECT(reloc_kind_is_tls_got(c, k) == ao->tls_got(k),
+ "%s: is_tls_got(%d) mismatch", ao->name, kk);
+ EXPECT(reloc_kind_uses_got(c, k) == (ao->got_use(k) || ao->tls_got(k)),
+ "%s: uses_got(%d) mismatch", ao->name, kk);
+ EXPECT(reloc_kind_is_branch(c, k) == ao->branch(k),
+ "%s: is_branch(%d) mismatch", ao->name, kk);
+ EXPECT(reloc_kind_is_tlvp(c, k) == ao->tlvp(k),
+ "%s: is_tlvp(%d) mismatch", ao->name, kk);
+ EXPECT(reloc_kind_is_direct_page(c, k) == ao->direct_page(k),
+ "%s: is_direct_page(%d) mismatch", ao->name, kk);
+ }
+ }
+
+ /* No previously-sized kind lost its descriptor: every kind the old
+ * reloc_width() sized must resolve under at least one backend arch. */
+ for (kk = 0; kk <= (int)RELOC_KIND_LAST; ++kk) {
+ if (oracle_width((RelocKind)kk) != 0)
+ EXPECT(covered[kk], "kind %d had width %u but resolves under no arch", kk,
+ (unsigned)oracle_width((RelocKind)kk));
+ }
+
+ kit_unit_summary(&g_u, "reloc_desc_test");
+ return kit_unit_status(&g_u);
+}
diff --git a/test/objdump/rv64/cases/03-reloc-annotations.expected b/test/objdump/rv64/cases/03-reloc-annotations.expected
@@ -1,7 +1,7 @@
== reloc records ==
RELOCATION RECORDS FOR [.text]:
OFFSET TYPE VALUE
-RV_CALL helper
+R_RISCV_CALL helper
RELOCATION RECORDS FOR [.eh_frame]:
OFFSET TYPE VALUE
== call site annotation ==
diff --git a/test/objdump/rv64/cases/03-reloc-annotations.sh b/test/objdump/rv64/cases/03-reloc-annotations.sh
@@ -1,6 +1,9 @@
# Golden: relocation records + inline disasm annotation for an rv64
# call site. Asserts the auipc/jalr pair carries the symbol annotation
-# AND the relocation table prints the canonical kind name (RV_CALL).
+# AND the relocation table prints the ELF-canonical kind name
+# (R_RISCV_CALL, via the per-arch ObjElfArchOps.reloc_name table — matching
+# binutils objdump). The inline disasm annotation keeps the arch-neutral
+# spelling ([RV_CALL]); it comes from the disassembler, not the obj reader.
cat > t.c <<'EOF'
extern int helper(int);