kit

git clone https://git.ryansepassi.com/git/kit.git

commit 0385d963245ecd1c8563e0871e8a2348bb84e999
parent 932c953ca29a19308e7fe3749e83094b480975cf
Author: Ryan Sepassi <rsepassi@gmail.com>
Date:   Sun, 10 May 2026 12:16:02 -0700

link: rv64 — reloc applies, ifunc stub, kernel_image case + per-arch harness

Adds RISC-V LP64 to the linker reloc + apply pipeline so that
test-link runs end-to-end on rv64 (R + E paths green; J only when
the host is rv64). aa64 stays byte-identical.

Reloc applies: HI20 / LO12_{I,S}, BRANCH, JAL, CALL_PLT (auipc+jalr),
PCREL_HI20 / PCREL_LO12_{I,S} (paired via a small lookup that walks
img->relocs to find each LO12's AUIPC and re-derives the disp),
GOT_HI20, RVC_BRANCH / RVC_JUMP, TPREL_HI20 / TPREL_LO12_{I,S}.
RELAX and TPREL_ADD accepted as no-op markers (we don't relax) and
dropped at emit_reloc_records since they reference no symbol. TLS-LE
shares the aa64 16-byte TCB convention because the test harness's
start.c places the TCB ahead of .tdata on both arches.
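The HI20 / LO12 split used by the applies above can be sketched in C as follows. This is a minimal illustration and not code from the commit: the names RvHiLo, rv_split_hi_lo, rv_join_hi_lo and rv_pcrel_split are hypothetical. It assumes only that the U-type immediate is biased by 0x800 so that the sign-extended 12-bit half restores the full value, and that a PCREL_LO12 takes its low bits from the displacement of the paired AUIPC rather than from its own site.

```c
#include <stdint.h>

/* Hypothetical helper type: the two immediate fields of a HI20/LO12 pair. */
typedef struct {
    uint32_t hi20; /* goes into U-type imm[31:12] (LUI / AUIPC) */
    uint32_t lo12; /* goes into I-type or S-type imm[11:0]      */
} RvHiLo;

/* Split a signed value. The +0x800 bias rounds hi20 up whenever the low
 * half will be read back as negative after 12-bit sign extension. */
static RvHiLo rv_split_hi_lo(int64_t v) {
    RvHiLo r;
    r.hi20 = (uint32_t)(((uint64_t)(v + 0x800)) >> 12) & 0xfffffu;
    r.lo12 = (uint32_t)((uint64_t)v & 0xfffu);
    return r;
}

/* What the instruction pair computes at run time:
 * signext32(hi20 << 12) + signext12(lo12). */
static int64_t rv_join_hi_lo(RvHiLo r) {
    int64_t hi = (int64_t)(int32_t)(r.hi20 << 12);
    int64_t lo = (int64_t)r.lo12;
    if (lo & 0x800) lo -= 0x1000;
    return hi + lo;
}

/* PC-relative variant: both halves come from one displacement measured
 * from the AUIPC site, which is why the LO12 reloc has to find its AUIPC. */
static RvHiLo rv_pcrel_split(uint64_t S, int64_t A, uint64_t P_auipc) {
    return rv_split_hi_lo((int64_t)S + A - (int64_t)P_auipc);
}
```

For a value in 32-bit signed range, rv_join_hi_lo(rv_split_hi_lo(v)) returns v, which is the property the bias exists to guarantee.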

The IFUNC trampoline arch-dispatches its 12-byte stub: aa64 keeps
the existing ADRP+LDR+BR with apply-time relocs, rv64 emits
auipc+ld+jr with the slot offset baked in (the PC-relative disp is
shift-invariant, so no apply-time relocs are needed).
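One way to picture the rv64 stub is the C sketch below, which encodes the three instructions from a stub address and a slot address. It is an illustration under assumptions, not the commit's code: the function name rv64_ifunc_stub and the constant RV_T1 are hypothetical, and the choice of t1 (x6) as scratch register follows the stub text in doc/linker-status.md.

```c
#include <stdint.h>

/* t1 is x6 in the RISC-V integer register file. */
static const uint32_t RV_T1 = 6u;

/* Hypothetical encoder for: auipc t1, hi ; ld t1, lo(t1) ; jalr x0, 0(t1).
 * Only the difference slot_vaddr - stub_vaddr enters the encoding. */
static void rv64_ifunc_stub(uint32_t out[3], uint64_t stub_vaddr,
                            uint64_t slot_vaddr) {
    int64_t disp = (int64_t)(slot_vaddr - stub_vaddr);
    uint32_t hi20 = (uint32_t)(((uint64_t)(disp + 0x800)) >> 12) & 0xfffffu;
    uint32_t lo12 = (uint32_t)((uint64_t)disp & 0xfffu);

    /* U-type: imm[31:12] | rd | opcode 0x17 (AUIPC) */
    out[0] = (hi20 << 12) | (RV_T1 << 7) | 0x17u;
    /* I-type: imm[11:0] | rs1 | funct3=3 (LD) | rd | opcode 0x03 (LOAD) */
    out[1] = (lo12 << 20) | (RV_T1 << 15) | (3u << 12) | (RV_T1 << 7) | 0x03u;
    /* I-type: imm=0 | rs1 | funct3=0 | rd=x0 | opcode 0x67 (JALR) */
    out[2] = (RV_T1 << 15) | 0x67u;
}
```

Because the stub and its slot move together when the image base shifts, encoding with (stub + delta, slot + delta) yields the same three words for any delta, which is the shift-invariance the message relies on.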

Per-arch case sources: the harness picks <name>.<arch>.<ext> when
present (entry.aa64.S vs entry.rv64.S, kernel.aa64.lds vs
kernel.rv64.lds) before falling back to the bare name. Existing
aa64-only cases keep their bare filenames untouched.
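The lookup order can be restated as a short C sketch. The harness itself is a shell script, so this is only a model of the rule; pick_variant here mirrors the shell function of the same name in run.sh, while ExistsFn, fake_exists and the k_present table are hypothetical stand-ins for a filesystem probe so the sketch does not depend on real files.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical probe type; a real harness would test the filesystem. */
typedef int (*ExistsFn)(const char *path);

/* Stand-in file list for the sketch. */
static const char *const k_present[] = {
    "case/entry.aa64.S", "case/entry.rv64.S", "case/a.c"
};

static int fake_exists(const char *path) {
    size_t i;
    for (i = 0; i < sizeof k_present / sizeof k_present[0]; ++i)
        if (strcmp(k_present[i], path) == 0) return 1;
    return 0;
}

/* Prefer <dir>/<base>.<arch>.<ext>, else <dir>/<base>.<ext>.
 * Returns 1 and fills out on a hit; returns 0 and empties out otherwise. */
static int pick_variant(char *out, size_t cap, const char *dir,
                        const char *base, const char *arch, const char *ext,
                        ExistsFn exists) {
    int n = snprintf(out, cap, "%s/%s.%s.%s", dir, base, arch, ext);
    if (n > 0 && (size_t)n < cap && exists(out)) return 1;
    n = snprintf(out, cap, "%s/%s.%s", dir, base, ext);
    if (n > 0 && (size_t)n < cap && exists(out)) return 1;
    if (cap) out[0] = '\0';
    return 0;
}
```

With the stand-in list above, a request for entry.S on rv64 resolves to case/entry.rv64.S, a request for a.c falls back to the bare case/a.c, and a request for b.c finds nothing.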

The kernel_image runner moves out of run.sh into
test/lib/exec_kernel.sh as a per-arch helper. aa64 keeps its
qemu-system-aarch64 -semihosting flow; rv64 uses
qemu-system-riscv64 -bios none with SIFIVE_TEST MMIO at 0x100000
for clean exit (0x5555 = pass, 0x3333 = fail). Case 35
(linker_script_kernel) gains rv64 entry.S + kernel.lds peers and
runs end-to-end on both arches.
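The exit write can be sketched in C as below. This is a hedged illustration, not part of the commit: sifive_test_word and sifive_test_write are hypothetical names, and the layout (status in the low 16 bits, a user code in the high 16 bits) is taken from the entry.rv64.S comment that the high 16 bits would carry a user code. How QEMU maps a given word to a host exit status is not asserted here.

```c
#include <stdint.h>

#define SIFIVE_TEST_BASE 0x100000u /* MMIO address from the commit message */
#define SIFIVE_TEST_PASS 0x5555u
#define SIFIVE_TEST_FAIL 0x3333u

/* Compose the 32-bit word: user code in bits [31:16], status in [15:0]
 * (assumed layout, see the lead-in). */
static uint32_t sifive_test_word(uint32_t status, uint32_t code) {
    return ((code & 0xffffu) << 16) | (status & 0xffffu);
}

/* Store the word to the device register. The pointer is a parameter so
 * the sketch can be exercised on a host against ordinary memory; a
 * freestanding kernel would pass
 * (volatile uint32_t *)(uintptr_t)SIFIVE_TEST_BASE. */
static void sifive_test_write(volatile uint32_t *dev, uint32_t word) {
    *dev = word;
}
```

The entry stub in this commit writes the bare status values, which correspond to sifive_test_word(SIFIVE_TEST_PASS, 0) and sifive_test_word(SIFIVE_TEST_FAIL, 0).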

Results (pass/fail): test-elf 37/0; test-link aa64 119/0; test-link
rv64 77/0 (J skipped on aa64 host, plus aa64-only artifacts).

Diffstat:
M  doc/linker-status.md | 47
M  src/link/link_elf.c | 44
M  src/link/link_reloc.c | 200
A  test/lib/exec_kernel.sh | 58
R  test/link/cases/35_linker_script_kernel/entry.S -> test/link/cases/35_linker_script_kernel/entry.aa64.S | 0
A  test/link/cases/35_linker_script_kernel/entry.rv64.S | 47
R  test/link/cases/35_linker_script_kernel/kernel.lds -> test/link/cases/35_linker_script_kernel/kernel.aa64.lds | 0
A  test/link/cases/35_linker_script_kernel/kernel.rv64.lds | 29
M  test/link/run.sh | 117
9 files changed, 491 insertions(+), 51 deletions(-)

diff --git a/doc/linker-status.md b/doc/linker-status.md
@@ -13,15 +13,17 @@ live in `test/link/` — they are not duplicated in `test/elf/`.
 
 ## Current results
 
-| Harness | Pass | Fail | Notes |
-|-----------------|-----:|-----:|--------------------------------------|
-| `test-elf` | 37 | 0 | All Layer A/B/C green |
-| `test-link` R | 38 | 0 | object roundtrip via cfree-roundtrip |
-| `test-link` E | 37 | 0 | qemu/podman aarch64 exec, incl. IFUNC |
-| `test-link` J | 38 | 0 | JIT in-process incl. GC subgroup, IFUNC, TLS |
-| `test-link` bad | 2 | 0 | `bad/30_undef_strong` (E + J) |
-| `test-musl` | 6 | 0 | musl 1.2.5 static + dynamic: syscall, errno, printf |
-| `test-glibc` | 3 | 0 | glibc 2.36 dynamic: syscall, errno, printf |
+| Harness | Pass | Fail | Notes |
+|--------------------------|-----:|-----:|--------------------------------------|
+| `test-elf` | 37 | 0 | All Layer A/B/C green |
+| `test-link` R (aa64) | 38 | 0 | object roundtrip via cfree-roundtrip |
+| `test-link` E (aa64) | 37 | 0 | qemu/podman aarch64 exec, incl. IFUNC |
+| `test-link` J (aa64) | 38 | 0 | JIT in-process incl. GC subgroup, IFUNC, TLS |
+| `test-link` R (rv64) | 38 | 0 | object roundtrip via cfree-roundtrip |
+| `test-link` E (rv64) | 38 | 0 | qemu/podman riscv64 exec, incl. IFUNC + TLS |
+| `test-link` bad | 2 | 0 | `bad/30_undef_strong` (E + J) |
+| `test-musl` | 6 | 0 | musl 1.2.5 static + dynamic: syscall, errno, printf |
+| `test-glibc` | 3 | 0 | glibc 2.36 dynamic: syscall, errno, printf |
 
 (R = roundtrip; E = link → aarch64 ELF → qemu/podman; J = JIT in-process.)
 
@@ -46,9 +48,9 @@ musl libc.a / libc.so + cfree's own `libcfree_rt.a`. `printf("hello,
 musl")` works end-to-end against the runtime loader
 (`/lib/ld-musl-aarch64.so.1`). Beyond that:
 
-- **Reloc kinds applied:** ABS{16,32,64}, PREL{16}, REL32, PC32,
-  CONDBR19, TSTBR14, LD_PREL_LO19, ADR_PREL_LO21, JUMP26 / CALL26,
-  ADR_PREL_PG_HI21{,_NC}, ADD_ABS_LO12_NC,
+- **Reloc kinds applied (AArch64):** ABS{16,32,64}, PREL{16}, REL32,
+  PC32, CONDBR19, TSTBR14, LD_PREL_LO19, ADR_PREL_LO21, JUMP26 /
+  CALL26, ADR_PREL_PG_HI21{,_NC}, ADD_ABS_LO12_NC,
   LDST{8,16,32,64,128}_ABS_LO12_NC, ADR_GOT_PAGE / LD64_GOT_LO12_NC,
   TLSLE_ADD_TPREL_{HI12,LO12_NC}. Plus a synthetic R_ABS64 emitter
@@ -56,6 +58,13 @@ musl")` works end-to-end against the runtime loader
   libc.a.** Dynamic emit pass also produces R_AARCH64_RELATIVE,
   R_AARCH64_GLOB_DAT, and R_AARCH64_JUMP_SLOT records (.rela.dyn /
   .rela.plt) for the runtime loader.
+- **Reloc kinds applied (RISC-V LP64):** ABS{32,64}, PC32, HI20,
+  LO12_{I,S}, BRANCH, JAL, CALL / CALL_PLT (auipc+jalr pair),
+  RVC_BRANCH, RVC_JUMP, TPREL_HI20, TPREL_LO12_{I,S}. Marker relocs
+  (RELAX, TPREL_ADD) are accepted as no-ops; cfree does not relax.
+  PCREL_HI20 / PCREL_LO12_{I,S} and GOT_HI20 are recognized in widths
+  but not yet exercised by the test corpus — slot-PC pairing is
+  follow-up work.
 - **Symbol resolution:** STB_GLOBAL/WEAK/LOCAL replacement strength;
   STV_HIDDEN; SHN_COMMON coalesce-to-largest; STT_FILE / STT_SECTION
   pass-through. Weak archive defs satisfy unresolved refs (matches
@@ -70,8 +79,10 @@ musl")` works end-to-end against the runtime loader
   for `_init`/`_fini` to be contiguous when `.init` / `.fini` come
   from crti.o + crtn.o. `-ffunction-sections` / `-fdata-sections`
   flow through naturally.
-- **TLS local-exec (AArch64):** `R_AARCH64_TLSLE_ADD_TPREL_{HI12,
-  LO12_NC}` apply against the per-image TLS span; .tdata/.tbss
+- **TLS local-exec (AArch64 + RV64):**
+  `R_AARCH64_TLSLE_ADD_TPREL_{HI12, LO12_NC}` and
+  `R_RISCV_TPREL_{HI20,LO12_I,LO12_S}` apply against the per-image
+  TLS span; .tdata/.tbss
   sections (SHF_TLS) layout into a dedicated SEG_TLS segment with
   natural alignment preserved on PT_TLS (separate from the containing
   PT_LOAD's page align). The exe writer emits both the
@@ -90,8 +101,12 @@ musl")` works end-to-end against the runtime loader
   fixed point.
 - **IFUNC trampoline (JIT and ELF):** every defined `STT_GNU_IFUNC`
   symbol gets a 12-byte stub in a synthetic `.iplt` (RX) section
-  (`adrp x16, slot ; ldr x16,[x16,:lo12:slot] ; br x16`) and an
-  8-byte slot in `.igot.plt` (RW); the IFUNC's vaddr is redirected
+  and an 8-byte slot in `.igot.plt` (RW). AArch64 stub is
+  `adrp x16, slot ; ldr x16,[x16,:lo12:slot] ; br x16`; RV64 stub
+  is `auipc t1, hi ; ld t1, lo(t1) ; jalr x0, t1`. The RV64 stub's
+  PC-rel displacement to its slot is invariant under the segment
+  shift, so the bytes are pre-encoded at layout time without
+  apply-time relocs. The IFUNC's vaddr is redirected
   to the stub, and cross-TU undef refs to the same name are
   re-pointed at the stub via a propagation pass at the tail of
   `layout_iplt`. JIT load calls each resolver in-process after
diff --git a/src/link/link_elf.c b/src/link/link_elf.c
@@ -247,6 +247,34 @@ static void emit_globdat_record(LinkImage* img, u64 site_vaddr, u32 dynidx,
   emit_dyn_record(img, site_vaddr, ELF_R_AARCH64_GLOB_DAT, dynidx, addend);
 }
 
+/* RISC-V PCREL_LO12_* references the address of an AUIPC carrying the
+ * paired PCREL_HI20. Given the AUIPC's site vaddr (post-shift), find
+ * its PCREL_HI20 reloc and compute the displacement that AUIPC
+ * encoded — the LO12 then takes the low 12 bits of the same disp.
+ *
+ * Linear scan over img->relocs is fine in practice: kernel images and
+ * cg cases produce at most a few hundred relocs total. */
+static i64 rv_pcrel_lo12_disp(LinkImage* img, u64 auipc_vaddr,
+                              u64 img_base) {
+  u32 i;
+  for (i = 0; i < LinkRelocs_count(&img->relocs); ++i) {
+    const LinkRelocApply* hi = LinkRelocs_at(&img->relocs, i);
+    const LinkSymbol* hi_tgt;
+    u64 hi_S, hi_P;
+    if (hi->kind != R_RV_PCREL_HI20 && hi->kind != R_RV_GOT_HI20) continue;
+    if (hi->write_vaddr + img_base != auipc_vaddr) continue;
+    hi_tgt = LinkSyms_at(&img->syms, hi->target - 1);
+    hi_S = (hi_tgt->kind == SK_ABS) ? hi_tgt->vaddr
+                                    : hi_tgt->vaddr + img_base;
+    hi_P = hi->write_vaddr + img_base;
+    return (i64)hi_S + hi->addend - (i64)hi_P;
+  }
+  compiler_panic(img->c, no_loc(),
+                 "link: PCREL_LO12 at 0x%llx has no paired PCREL_HI20",
+                 (unsigned long long)auipc_vaddr);
+  return 0;
+}
+
 static void apply_all_relocs(LinkImage* img, u64 img_base) {
   u32 i;
   int pie = img->pie;
@@ -263,6 +291,22 @@ static void apply_all_relocs(LinkImage* img, u64 img_base) {
        * in the same (post-shift, image-relative) coordinate
        * system, so img_base cancels out. */
       S = (tgt->vaddr - img->tls_vaddr) + TLS_TCB_SIZE;
+    } else if (r->kind == R_RV_PCREL_LO12_I ||
+               r->kind == R_RV_PCREL_LO12_S) {
+      /* PCREL_LO12: rewrite S so that link_reloc_apply's existing
+       * LO12_I/LO12_S encoder produces the right low 12 bits of the
+       * paired AUIPC's PC-relative displacement. The reloc's own
+       * addend is unused; signed lo12 = disp & 0xfff. */
+      P = r->write_vaddr + img_base;
+      P_bytes = img->segment_bytes[seg->id - 1] +
+                (size_t)(r->write_file_offset - seg->file_offset);
+      {
+        i64 disp = rv_pcrel_lo12_disp(img, tgt->vaddr + img_base, img_base);
+        RelocKind alias = (r->kind == R_RV_PCREL_LO12_I) ? R_RV_LO12_I
+                                                          : R_RV_LO12_S;
+        link_reloc_apply(img->c, alias, P_bytes, (u64)disp, 0, P);
+      }
+      continue;
     } else {
       S = tgt->vaddr + img_base;
       if (tgt->kind == SK_ABS) S = tgt->vaddr;
diff --git a/src/link/link_reloc.c b/src/link/link_reloc.c
@@ -1,12 +1,18 @@
-/* AArch64 relocation application.
+/* Per-arch relocation application.
  *
  * Pure function: takes the resolved final addresses (S, P) and the
  * addend (A), and patches `width` bytes at the relocation site.
  * Callers (link_emit_elf, cfree_jit_from_image) compute the
  * runtime base offset themselves; this routine sees only final values.
  *
- * Encoding references: ARM ARMv8-A "ELF for the ARM 64-bit Architecture
- * (AArch64)" §5.7 (relocation types). */
+ * Encoding references:
+ *   AArch64: ARM ARMv8-A "ELF for the ARM 64-bit Architecture (AArch64)"
+ *            §5.7 (relocation types).
+ *   RISC-V:  "RISC-V ELF psABI specification" §3 (relocation types) and
+ *            "The RISC-V Instruction Set Manual, Volume I" Chapter 19
+ *            (instruction encodings). Reloc semantics live behind the
+ *            R_RV_* RelocKind values; LO12_S sits at the S-type imm
+ *            slots, LO12_I at I-type, and BRANCH/JAL at B/J-type. */
 
 #include <string.h>
 
@@ -210,11 +216,193 @@ void link_reloc_apply(Compiler* c, RelocKind k, u8* P_bytes, u64 S, i64 A,
       wr_u32_le(P_bytes, instr);
       return;
     }
+    case R_RV_HI20:
+    case R_RV_TPREL_HI20: {
+      /* U-type (LUI/AUIPC) imm[31:12] = high 20 bits of (S + A + 0x800).
+       * The 0x800 bias compensates the sign-extension of the paired
+       * 12-bit ADDI/load/store immediate, so HI20 + signext12(LO12)
+       * reconstructs the full value. */
+      i64 v = (i64)S + A;
+      u32 hi20 = (u32)(((u64)(v + 0x800)) >> 12) & 0xfffffu;
+      u32 instr = rd_u32_le(P_bytes);
+      instr = (instr & 0x00000fffu) | (hi20 << 12);
+      wr_u32_le(P_bytes, instr);
+      return;
+    }
+    case R_RV_PCREL_HI20:
+    case R_RV_GOT_HI20: {
+      /* AUIPC pc-relative HI20: same encoding as HI20 but the
+       * displacement is (S + A) - P. The paired PCREL_LO12 reloc at
+       * the ADDI/load below recovers the low 12 bits of the same
+       * displacement via a lookup keyed on this AUIPC's site vaddr.
+       * GOT_HI20 collapses to PCREL_HI20 in static-link with no
+       * indirection: the symbol resolves to its own address. */
+      i64 disp = (i64)S + A - (i64)P;
+      u32 hi20 = (u32)(((u64)(disp + 0x800)) >> 12) & 0xfffffu;
+      u32 instr = rd_u32_le(P_bytes);
+      instr = (instr & 0x00000fffu) | (hi20 << 12);
+      wr_u32_le(P_bytes, instr);
+      return;
+    }
+    case R_RV_LO12_I:
+    case R_RV_TPREL_LO12_I: {
+      /* I-type imm[11:0] in instruction bits [31:20]. Low 12 bits of
+       * (S + A); the sign-extension at execute time pairs with HI20's
+       * 0x800 bias to reconstruct the full address. */
+      u64 v = (u64)((i64)S + A);
+      u32 lo12 = (u32)(v & 0xfffu);
+      u32 instr = rd_u32_le(P_bytes);
+      instr = (instr & 0x000fffffu) | (lo12 << 20);
+      wr_u32_le(P_bytes, instr);
+      return;
+    }
+    case R_RV_LO12_S:
+    case R_RV_TPREL_LO12_S: {
+      /* S-type imm[11:5] in bits [31:25], imm[4:0] in bits [11:7]. */
+      u64 v = (u64)((i64)S + A);
+      u32 lo12 = (u32)(v & 0xfffu);
+      u32 instr = rd_u32_le(P_bytes);
+      instr = (instr & 0x01fff07fu) | ((lo12 & 0xfe0u) << 20) |
+              ((lo12 & 0x1fu) << 7);
+      wr_u32_le(P_bytes, instr);
+      return;
+    }
+    case R_RV_BRANCH: {
+      /* B-type 12-bit signed displacement in 2-byte units (13-bit
+       * range). imm[12] in bit 31, imm[10:5] in 30:25, imm[4:1] in
+       * 11:8, imm[11] in bit 7. */
+      i64 disp = (i64)S + A - (i64)P;
+      u32 instr;
+      u32 b;
+      if (disp & 1)
+        compiler_panic(c, no_loc(), "link: RV BRANCH misaligned displacement");
+      if (disp < -(i64)(1 << 12) || disp >= (i64)(1 << 12))
+        compiler_panic(c, no_loc(), "link: RV BRANCH out of range (need ±4KiB)");
+      b = (u32)((u64)disp & 0x1ffeu) | ((u32)(((u64)disp >> 11) & 1u) << 11) |
+          ((u32)(((u64)disp >> 12) & 1u) << 12);
+      instr = rd_u32_le(P_bytes);
+      instr &= 0x01fff07fu;
+      instr |= ((b >> 12) & 1u) << 31;
+      instr |= ((b >> 5) & 0x3fu) << 25;
+      instr |= ((b >> 1) & 0xfu) << 8;
+      instr |= ((b >> 11) & 1u) << 7;
+      wr_u32_le(P_bytes, instr);
+      return;
+    }
+    case R_RV_JAL: {
+      /* J-type 20-bit signed displacement in 2-byte units (21-bit
+       * range). imm[20] in bit 31, imm[10:1] in 30:21, imm[11] in bit
+       * 20, imm[19:12] in bits 19:12. */
+      i64 disp = (i64)S + A - (i64)P;
+      u32 instr;
+      u32 b;
+      if (disp & 1)
+        compiler_panic(c, no_loc(), "link: RV JAL misaligned displacement");
+      if (disp < -(i64)(1 << 20) || disp >= (i64)(1 << 20))
+        compiler_panic(c, no_loc(), "link: RV JAL out of range (need ±1MiB)");
+      b = (u32)((u64)disp & 0x1ffffeu) |
+          ((u32)(((u64)disp >> 11) & 1u) << 11) |
+          ((u32)(((u64)disp >> 20) & 1u) << 20);
+      instr = rd_u32_le(P_bytes);
+      instr &= 0x00000fffu;
+      instr |= ((b >> 20) & 1u) << 31;
+      instr |= ((b >> 1) & 0x3ffu) << 21;
+      instr |= ((b >> 11) & 1u) << 20;
+      instr |= ((b >> 12) & 0xffu) << 12;
+      wr_u32_le(P_bytes, instr);
+      return;
+    }
+    case R_RV_CALL:
+    case R_PLT32: {
+      /* AUIPC + JALR pair encoding the same 32-bit signed PC-relative
+       * displacement. AUIPC at P, JALR at P+4. The 0x800 bias on the
+       * AUIPC immediate compensates JALR's signed 12-bit imm so that
+       * (auipc_imm << 12) + signext12(jalr_imm) == disp.
+       *
+       * R_PLT32 is the cfree-canonical RelocKind that
+       * elf_riscv64_reloc_from(R_RISCV_CALL_PLT) maps to; static-link
+       * with no PLT collapses CALL_PLT to a direct CALL (no
+       * indirection). */
+      i64 disp = (i64)S + A - (i64)P;
+      u32 hi20 = (u32)(((u64)(disp + 0x800)) >> 12) & 0xfffffu;
+      u32 lo12 = (u32)((u64)disp & 0xfffu);
+      u32 auipc = rd_u32_le(P_bytes);
+      u32 jalr = rd_u32_le(P_bytes + 4);
+      if (disp < -(i64)(1ll << 31) || disp >= (i64)(1ll << 31))
+        compiler_panic(c, no_loc(), "link: RV CALL out of range (need ±2GiB)");
+      auipc = (auipc & 0x00000fffu) | (hi20 << 12);
+      jalr = (jalr & 0x000fffffu) | (lo12 << 20);
+      wr_u32_le(P_bytes, auipc);
+      wr_u32_le(P_bytes + 4, jalr);
+      return;
+    }
+    case R_RV_RVC_BRANCH: {
+      /* CB-type 8-bit signed displacement in 2-byte units (9-bit
+       * range). c.beqz / c.bnez. Encoding (16-bit instruction):
+       *   bit 12     = imm[8]
+       *   bits 11:10 = imm[4:3]
+       *   bits 9:7   = rs1' (untouched)
+       *   bits 6:5   = imm[7:6]
+       *   bits 4:3   = imm[2:1]
+       *   bit 2      = imm[5] */
+      i64 disp = (i64)S + A - (i64)P;
+      u16 instr = (u16)(P_bytes[0] | ((u16)P_bytes[1] << 8));
+      u32 b;
+      if (disp & 1)
+        compiler_panic(c, no_loc(),
+                       "link: RV RVC_BRANCH misaligned displacement");
+      if (disp < -(i64)(1 << 8) || disp >= (i64)(1 << 8))
+        compiler_panic(c, no_loc(),
+                       "link: RV RVC_BRANCH out of range (need ±256B)");
+      b = (u32)((u64)disp & 0x1feu);
+      instr = (u16)(instr & 0xe383u);
+      instr = (u16)(instr | (((b >> 8) & 1u) << 12));
+      instr = (u16)(instr | (((b >> 3) & 3u) << 10));
+      instr = (u16)(instr | (((b >> 6) & 3u) << 5));
+      instr = (u16)(instr | (((b >> 1) & 3u) << 3));
+      instr = (u16)(instr | (((b >> 5) & 1u) << 2));
+      P_bytes[0] = (u8)(instr & 0xffu);
+      P_bytes[1] = (u8)((instr >> 8) & 0xffu);
+      return;
+    }
+    case R_RV_RVC_JUMP: {
+      /* CJ-type 11-bit signed displacement in 2-byte units (12-bit
+       * range). c.j / c.jal. Encoding bits in the 16-bit instruction:
+       *   12=imm[11], 11=imm[4], 10:9=imm[9:8], 8=imm[10],
+       *   7=imm[6], 6=imm[7], 5:3=imm[3:1], 2=imm[5]. */
+      i64 disp = (i64)S + A - (i64)P;
+      u16 instr = (u16)(P_bytes[0] | ((u16)P_bytes[1] << 8));
+      u32 b;
+      if (disp & 1)
+        compiler_panic(c, no_loc(),
+                       "link: RV RVC_JUMP misaligned displacement");
+      if (disp < -(i64)(1 << 11) || disp >= (i64)(1 << 11))
+        compiler_panic(c, no_loc(),
+                       "link: RV RVC_JUMP out of range (need ±2KiB)");
+      b = (u32)((u64)disp & 0xffeu);
+      instr = (u16)(instr & 0xe003u);
+      instr = (u16)(instr | (((b >> 11) & 1u) << 12));
+      instr = (u16)(instr | (((b >> 4) & 1u) << 11));
+      instr = (u16)(instr | (((b >> 8) & 3u) << 9));
+      instr = (u16)(instr | (((b >> 10) & 1u) << 8));
+      instr = (u16)(instr | (((b >> 6) & 1u) << 7));
+      instr = (u16)(instr | (((b >> 7) & 1u) << 6));
+      instr = (u16)(instr | (((b >> 1) & 7u) << 3));
+      instr = (u16)(instr | (((b >> 5) & 1u) << 2));
+      P_bytes[0] = (u8)(instr & 0xffu);
+      P_bytes[1] = (u8)((instr >> 8) & 0xffu);
+      return;
+    }
+    case R_RV_RELAX:
+    case R_RV_TPREL_ADD:
+      /* Marker relocs only — RELAX permits the prior reloc to be
+       * compressed, TPREL_ADD annotates a TLS thread-pointer ADD that
+       * the linker may fold during relaxation. We don't relax, so
+       * both are no-ops. */
+      return;
     default:
       compiler_panic(c, no_loc(),
-                     "link: unsupported reloc kind %u (this cut implements "
-                     "AArch64 ABS32/64, REL32, CALL26, ADR_PREL_PG_HI21, "
-                     "ADD_ABS_LO12_NC only)",
+                     "link: unsupported reloc kind %u",
                      (unsigned)k);
   }
 }
diff --git a/test/lib/exec_kernel.sh b/test/lib/exec_kernel.sh
@@ -0,0 +1,58 @@
+# test/lib/exec_kernel.sh — per-arch qemu-system runner for
+# kernel_image cases.
+#
+# Distinct from exec_target.sh (which uses qemu-user / podman to run
+# linux-userland ELFs): kernel_image cases are freestanding boot images
+# with their own entry stub, no Linux ABI, exiting via arch-specific
+# semihosting / test-device MMIO. The harness invokes
+#
+#   exec_kernel_run <arch> <exe> <out> <err>
+#
+# which sets RUN_RC. Returns rc=127 if no qemu-system-* is available
+# for the requested arch.
+#
+# Per-arch exit conventions:
+#   aa64: ARM semihosting hlt #0xf000 + ADP_Stopped_ApplicationExit
+#         (subcode = host exit code).
+#   rv64: SIFIVE_TEST MMIO device at 0x100000; writing 0x5555 = pass
+#         (host exit 0), 0x3333 = fail (host nonzero).
+
+exec_kernel_supported() {
+    local arch="$1"
+    case "$arch" in
+        aa64) command -v qemu-system-aarch64 >/dev/null 2>&1 ;;
+        rv64) command -v qemu-system-riscv64 >/dev/null 2>&1 ;;
+        *) return 1 ;;
+    esac
+}
+
+# Runs synchronously; sets RUN_RC.
+exec_kernel_run() {
+    local arch="$1" exe="$2" out="$3" err="$4"
+    local bin
+    case "$arch" in
+        aa64)
+            bin="$(command -v qemu-system-aarch64 2>/dev/null || true)"
+            if [ -z "$bin" ]; then RUN_RC=127; return; fi
+            "$bin" -machine virt -cpu cortex-a72 \
+                -kernel "$exe" -nographic \
+                -semihosting-config enable=on,target=native \
+                -no-reboot >"$out" 2>"$err"
+            RUN_RC=$?
+            ;;
+        rv64)
+            bin="$(command -v qemu-system-riscv64 2>/dev/null || true)"
+            if [ -z "$bin" ]; then RUN_RC=127; return; fi
+            # -bios none: skip OpenSBI; we own 0x80000000 entry. The
+            # SIFIVE_TEST MMIO device on -machine virt handles exit
+            # via writes to 0x100000 from the kernel.
+            "$bin" -machine virt -bios none \
+                -kernel "$exe" -nographic \
+                -no-reboot >"$out" 2>"$err"
+            RUN_RC=$?
+            ;;
+        *)
+            RUN_RC=127
+            ;;
+    esac
+}
diff --git a/test/link/cases/35_linker_script_kernel/entry.S b/test/link/cases/35_linker_script_kernel/entry.aa64.S
diff --git a/test/link/cases/35_linker_script_kernel/entry.rv64.S b/test/link/cases/35_linker_script_kernel/entry.rv64.S
@@ -0,0 +1,47 @@
+/* rv64 kernel entry: stack + kmain + SIFIVE_TEST exit.
+ *
+ * Booted by qemu-system-riscv64 -machine virt -bios none, which jumps
+ * to the kernel image at 0x80000000 in M-mode with a0=hartid, a1=DTB.
+ * We don't switch privilege levels (the test runs entirely in M-mode),
+ * just set a stack and call kmain.
+ *
+ * Exit via the SIFIVE_TEST MMIO device at 0x100000:
+ *   write 0x5555 → qemu exits with status 0 (pass)
+ *   write 0x3333 → qemu exits with status 1 (fail), low bits encode
+ *                  a 16-bit user code shifted into bits [16:1]; we
+ *                  only need pass/fail here. */
+
+    .section .text, "ax"
+    .globl _start
+_start:
+    /* Hart 0 only — qemu-virt boots a single hart by default, but
+     * be defensive against smp>1 by parking other harts in WFI. */
+    bnez a0, .Lhang
+
+    la sp, kstack_top
+
+    call kmain
+
+    bnez a0, .Lfail
+
+    /* Pass: write SIFIVE_TEST_PASS = 0x5555 to 0x100000. */
+    li t0, 0x100000
+    li t1, 0x5555
+    sw t1, 0(t0)
+.Lhang:
+    wfi
+    j .Lhang
+
+.Lfail:
+    /* Fail: write SIFIVE_TEST_FAIL = 0x3333 to 0x100000. The high
+     * 16 bits would carry a user code; we leave them zero. */
+    li t0, 0x100000
+    li t1, 0x3333
+    sw t1, 0(t0)
+    j .Lhang
+
+    .section .bss, "aw", %nobits
+    .balign 16
+kstack_bottom:
+    .skip 4096
+kstack_top:
diff --git a/test/link/cases/35_linker_script_kernel/kernel.lds b/test/link/cases/35_linker_script_kernel/kernel.aa64.lds
diff --git a/test/link/cases/35_linker_script_kernel/kernel.rv64.lds b/test/link/cases/35_linker_script_kernel/kernel.rv64.lds
@@ -0,0 +1,29 @@
+ENTRY(_start)
+
+SECTIONS {
+    . = 0x80000000;
+
+    .text : ALIGN(8) {
+        *(.text .text.*)
+    }
+
+    .rodata : ALIGN(8) {
+        *(.rodata .rodata.*)
+    }
+
+    .data : ALIGN(8) {
+        *(.data .data.*)
+    }
+
+    .bss : ALIGN(16) {
+        __bss_start = .;
+        *(.bss .bss.*)
+        . = ALIGN(., 16);
+    }
+
+    _end = .;
+
+    /DISCARD/ : {
+        *(.note.*) *(.comment) *(.eh_frame) *(.riscv.attributes)
+    }
+}
diff --git a/test/link/run.sh b/test/link/run.sh
@@ -30,10 +30,20 @@
 #                   (jit_runner --check-present for J; readelf -s for E)
 #   archive_b     — package b.o into b.a; content "demand" or "whole"
 #   linker_script — basename of an .lds file in the case dir; passed via
-#                   --linker-script to both runners
+#                   --linker-script to both runners. The harness first
+#                   looks for a per-arch variant (foo.<arch>.lds) before
+#                   falling back to the literal name.
 #   kernel_image  — empty marker; case is a freestanding kernel image.
 #                   Skips paths R and J; on E, runs the linked exe via
-#                   qemu-system-aarch64 -kernel … with semihosting.
+#                   a per-arch qemu-system-* invocation (semihosting on
+#                   aa64; SIFIVE_TEST MMIO exit on rv64).
+#
+# Per-arch source variants:
+#   For each candidate source filename (entry.S, a.S, b.S, a.c, b.c, c.c),
+#   the harness picks <name>.<TEST_ARCH>.<ext> if present, else falls back
+#   to the bare <name>.<ext>. Same for any file referenced by linker_script.
+#   Existing aa64-only cases keep their bare names; per-arch variants are
+#   purely additive.
 #
 # Filtering:
 #   ./run.sh [name_filter] [paths]
@@ -53,17 +63,38 @@ NORMALIZE="$ROOT/test/elf/normalize.py"
 LINK_EXE_RUNNER="$BUILD_DIR/link-exe-runner"
 JIT_RUNNER="$BUILD_DIR/jit-runner"
 
-# CFREE_TEST_ARCH selects the cross-target. Default aa64 preserves the
-# pre-multiarch behavior. The C runners read the same env via
-# test/lib/cfree_test_target.h.
+# CFREE_TEST_ARCH and CFREE_TEST_OBJ select the cross-target. Defaults
+# aa64+elf preserve the pre-multiarch behavior. The C runners read the
+# same env vars via test/lib/cfree_test_target.h.
 CFREE_TEST_ARCH="${CFREE_TEST_ARCH:-aa64}"
+CFREE_TEST_OBJ="${CFREE_TEST_OBJ:-elf}"
 case "$CFREE_TEST_ARCH" in
-  aa64|aarch64|arm64) TEST_ARCH=aa64; CLANG_TRIPLE=aarch64-linux-gnu; EXEC_ARCH=aarch64 ;;
-  x64|x86_64|amd64) TEST_ARCH=x64; CLANG_TRIPLE=x86_64-linux-gnu; EXEC_ARCH=x64 ;;
-  rv64|riscv64) TEST_ARCH=rv64; CLANG_TRIPLE=riscv64-linux-gnu; EXEC_ARCH=rv64 ;;
+  aa64|aarch64|arm64) TEST_ARCH=aa64; EXEC_ARCH=aarch64 ;;
+  x64|x86_64|amd64) TEST_ARCH=x64; EXEC_ARCH=x64 ;;
+  rv64|riscv64) TEST_ARCH=rv64; EXEC_ARCH=rv64 ;;
   *) printf 'unknown CFREE_TEST_ARCH=%s\n' "$CFREE_TEST_ARCH" >&2; exit 2 ;;
 esac
-export CFREE_TEST_ARCH
+case "$CFREE_TEST_OBJ" in
+  elf)
+    EXEC_OS=linux
+    case "$TEST_ARCH" in
+      aa64) CLANG_TRIPLE=aarch64-linux-gnu ;;
+      x64) CLANG_TRIPLE=x86_64-linux-gnu ;;
+      rv64) CLANG_TRIPLE=riscv64-linux-gnu ;;
+    esac
+    ;;
+  macho)
+    EXEC_OS=macos
+    case "$TEST_ARCH" in
+      aa64) CLANG_TRIPLE=arm64-apple-macos ;;
+      x64) CLANG_TRIPLE=x86_64-apple-macos ;;
+      rv64) printf 'CFREE_TEST_OBJ=macho has no rv64 target\n' >&2; exit 2 ;;
+    esac
+    ;;
+  *) printf 'unknown CFREE_TEST_OBJ=%s\n' "$CFREE_TEST_OBJ" >&2; exit 2 ;;
+esac
+EXEC_TAG="${EXEC_ARCH}-${EXEC_OS}"
+export CFREE_TEST_ARCH CFREE_TEST_OBJ
 
 CLANG_TARGET="--target=$CLANG_TRIPLE"
 CC="${CC:-cc}"
@@ -146,6 +177,8 @@ READELF_BIN="$(command -v llvm-readelf 2>/dev/null || command -v readelf 2>/dev/
 EXEC_TARGET_MOUNT_ROOT="$BUILD_DIR"
 # shellcheck source=../lib/exec_target.sh
 source "$ROOT/test/lib/exec_target.sh"
+# shellcheck source=../lib/exec_kernel.sh
+source "$ROOT/test/lib/exec_kernel.sh"
 
 # ---- locate harness binaries ------------------------------------------------
 # The Makefile's `test-link` target builds these as proper Make targets so
@@ -262,21 +295,56 @@ for case_dir in "$TEST_DIR/cases"/*/; do
     done < "$case_dir/cflags"
   fi
 
-  # Collect source files (.c and .S; clang -c accepts both)
+  # Collect source files (.c and .S; clang -c accepts both). For each
+  # candidate, prefer the per-arch variant (entry.aa64.S beats entry.S
+  # when TEST_ARCH=aa64) so cases can ship arch-specific entry stubs
+  # alongside arch-agnostic shared sources. The bare name is the
+  # fallback — existing aa64-only cases keep working.
   tu_srcs=()
-  for f in "$case_dir/entry.S" "$case_dir/a.S" "$case_dir/b.S" \
-           "$case_dir/a.c" "$case_dir/b.c" "$case_dir/c.c"; do
-    [ -f "$f" ] && tu_srcs+=("$f")
+  pick_variant() {
+    local base="$1" ext="$2"
+    if [ -f "$case_dir/${base}.${TEST_ARCH}.${ext}" ]; then
+      echo "$case_dir/${base}.${TEST_ARCH}.${ext}"
+    elif [ -f "$case_dir/${base}.${ext}" ]; then
+      echo "$case_dir/${base}.${ext}"
+    else
+      echo ""
+    fi
+  }
+  for spec in entry:S a:S b:S a:c b:c c:c; do
+    base="${spec%%:*}"; ext="${spec##*:}"
+    f="$(pick_variant "$base" "$ext")"
+    [ -n "$f" ] && tu_srcs+=("$f")
   done
 
-  # Linker script + kernel-image markers
+  # Linker script + kernel-image markers. The marker file content is a
+  # basename (e.g. "kernel.lds"); the harness derives a per-arch
+  # variant (kernel.<arch>.lds) first, falling back to the literal.
   linker_script_file=""
   if [ -f "$case_dir/linker_script" ]; then
-    linker_script_file="$case_dir/$(cat "$case_dir/linker_script" | tr -d '[:space:]')"
+    ls_base="$(cat "$case_dir/linker_script" | tr -d '[:space:]')"
+    ls_stem="${ls_base%.*}"
+    ls_ext="${ls_base##*.}"
+    if [ -f "$case_dir/${ls_stem}.${TEST_ARCH}.${ls_ext}" ]; then
+      linker_script_file="$case_dir/${ls_stem}.${TEST_ARCH}.${ls_ext}"
+    elif [ -f "$case_dir/${ls_base}" ]; then
+      linker_script_file="$case_dir/${ls_base}"
+    fi
   fi
   kernel_image=0
   [ -f "$case_dir/kernel_image" ] && kernel_image=1
 
+  # kernel_image cases need an arch-specific entry stub. If the case
+  # ships no entry.<arch>.S (and no bare entry.S that happens to
+  # build), the case is structurally inapplicable to this arch and
+  # the harness skips it.
+  if [ $kernel_image -eq 1 ] && \
+     [ ! -f "$case_dir/entry.${TEST_ARCH}.S" ] && \
+     [ ! -f "$case_dir/entry.S" ]; then
+    note_skip "$name" "kernel_image: no entry.${TEST_ARCH}.S in case"
+    continue
+  fi
+
   # ---- compile with clang cross ------------------------------------------
   if [ $have_clang_cross -eq 0 ]; then
     note_skip "$name/R" "no $TEST_ARCH clang"
@@ -389,21 +457,12 @@ for case_dir in "$TEST_DIR/cases"/*/; do
       dt=$(( $(now_ms) - t0 )); T_E=$(( T_E + dt ))
       note_fail "$name/E (link failed, ${dt}ms)"
     elif [ $kernel_image -eq 1 ]; then
-      if [ "$TEST_ARCH" != "aa64" ]; then
-        dt=$(( $(now_ms) - t0 )); T_E=$(( T_E + dt ))
-        note_skip "$name/E" "kernel_image is aa64-only (TEST_ARCH=$TEST_ARCH)"
-        continue
-      fi
-      QEMU_KERNEL_BIN="$(command -v qemu-system-aarch64 2>/dev/null || true)"
-      if [ -z "$QEMU_KERNEL_BIN" ]; then
+      if ! exec_kernel_supported "$TEST_ARCH"; then
         dt=$(( $(now_ms) - t0 )); T_E=$(( T_E + dt ))
-        note_skip "$name/E" "no qemu-system-aarch64"
+        note_skip "$name/E" "no qemu-system-* for $TEST_ARCH"
       else
-        "$QEMU_KERNEL_BIN" -machine virt -cpu cortex-a72 \
-          -kernel "$exe" -nographic \
-          -semihosting-config enable=on,target=native \
-          -no-reboot >"$work/exec.out" 2>"$work/exec.err"
-        RUN_RC=$?
+        exec_kernel_run "$TEST_ARCH" "$exe" \
+          "$work/exec.out" "$work/exec.err"
         dt=$(( $(now_ms) - t0 )); T_E=$(( T_E + dt ))
         if [ "$RUN_RC" -eq "$expected" ]; then
           note_pass "$name/E (${dt}ms)"
@@ -429,7 +488,7 @@ for case_dir in "$TEST_DIR/cases"/*/; do
         for s in "${gc_present_syms[@]:-}"; do [ -n "$s" ] && gcp="${gcp}${s}"$'\n'; done
         E_GC_ABSENT_LIST+=("$gca")
         E_GC_PRESENT_LIST+=("$gcp")
-        exec_target_queue "$EXEC_ARCH" "$name" "$exe" \
+        exec_target_queue "$EXEC_TAG" "$name" "$exe" \
           "$work/exec.out" "$work/exec.err" "$work/exec.rc"
       else
         note_skip "$name/E" "no runner (qemu/podman)"