kit

git clone https://git.ryansepassi.com/git/kit.git

commit 1d81b481da21ff373bfe7d861167245307bb7f74
parent e6a04f06bd66c77b397468a5749785163adb18fa
Author: Ryan Sepassi <rsepassi@gmail.com>
Date:   Sun, 10 May 2026 13:23:59 -0700

link/test: Mach-O read + JIT — test-link path R+J green on aa64-macho

Linker-side Mach-O ingestion (doc/MULTIOBJ.md Phase 2.5).
link_add_obj_bytes / link_add_archive_bytes dispatch on
cfree_detect_fmt so Mach-O .o / .a members route through read_macho
exactly like an ELF input does. C-symbol mangling lives at the linker
API boundary (link_intern_c_name, cfree_jit_lookup, the undef
diagnostic) — Mach-O on-disk names stay byte-for-byte verbatim (the
round-trip oracle is unaffected), and callers see the source-level
form uniformly across formats.
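
A minimal sketch of the two boundary behaviours described above: dispatch on the input's magic bytes, and mapping a source-level C name to its on-disk form. The `sketch_*` functions and the `SketchFmt` enum are hypothetical stand-ins, not cfree's API. The sketch assumes only ELF and 64-bit little-endian Mach-O inputs, and follows the prefix rule visible in this commit's link.c hunk (prepend `_` on Mach-O unless the caller already supplied one).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical format tag; stands in for the real format enum. */
typedef enum { FMT_UNKNOWN, FMT_ELF, FMT_MACHO } SketchFmt;

/* Classify an object by its first four bytes: "\x7fELF" for ELF,
 * MH_MAGIC_64 (0xfeedfacf, stored little-endian) for 64-bit Mach-O.
 * Assumption: no fat binaries, no 32-bit or big-endian Mach-O. */
SketchFmt sketch_detect_fmt(const unsigned char* data, size_t len) {
    static const unsigned char elf[4] = {0x7f, 'E', 'L', 'F'};
    static const unsigned char macho64[4] = {0xcf, 0xfa, 0xed, 0xfe};
    if (!data || len < 4) return FMT_UNKNOWN;
    if (memcmp(data, elf, 4) == 0) return FMT_ELF;
    if (memcmp(data, macho64, 4) == 0) return FMT_MACHO;
    return FMT_UNKNOWN;
}

/* Map a source-level C name to the on-disk name for `fmt`.
 * Mach-O gets a leading '_' unless one is already present; every
 * other format is copied verbatim. Returns 0 on success, -1 if
 * `out` (capacity `cap`, including the NUL) is too small. */
int sketch_c_name_to_disk(SketchFmt fmt, const char* name,
                          char* out, size_t cap) {
    size_t n;
    int add;
    assert(name != NULL && out != NULL);
    n = strlen(name);
    add = (fmt == FMT_MACHO && name[0] != '_');
    if (n + (size_t)add + 1 > cap) return -1;
    if (add) out[0] = '_';
    memcpy(out + add, name, n + 1); /* n + 1 copies the terminator */
    return 0;
}

/* Inverse mapping for diagnostics: drop one leading '_' on Mach-O so
 * the message shows the source-level name. */
const char* sketch_disk_name_for_display(SketchFmt fmt, const char* disk) {
    assert(disk != NULL);
    if (fmt == FMT_MACHO && disk[0] == '_') return disk + 1;
    return disk;
}
```

Under these assumptions, "test_main" maps to "_test_main" on Mach-O and stays "test_main" on ELF. One consequence of the pass-through rule: a caller-supplied name that already begins with `_` is treated as an on-disk name and is not prefixed again.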

test/link harness — start.c on macOS calls libc exit(), since Apple
does not expose a stable syscall ABI; tls_init is a no-op on Darwin
(tpidr_el0 is libsystem-owned). run.sh gains a j_targets per-path
applicability marker mirroring targets; four cases ship one because
the J pass/fail criterion exercises an ELF-specific ABI feature with
no Mach-O analogue (.fini_array destructors, -ffunction-sections
per-fn dead-strip).
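
The j_targets rule can be sketched in C as a plain membership test. This is an illustration only: the real check is the shell loop in test/link/run.sh, and `sketch_j_applicable` is a hypothetical name. A NULL list models a case directory with no j_targets file, where J applies to every tuple.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if path J should run for `cur_tuple`, 0 if it should be
 * reported as SKIP-NA. `list` is the whitespace-separated contents of
 * a j_targets file, or NULL when the case ships no such file. */
int sketch_j_applicable(const char* list, const char* cur_tuple) {
    static const char ws[] = " \t\r\n";
    size_t want;
    const char* p;
    assert(cur_tuple != NULL);
    if (!list) return 1; /* no file: J applies everywhere */
    want = strlen(cur_tuple);
    p = list;
    while (*p) {
        size_t n;
        p += strspn(p, ws);  /* skip separators */
        n = strcspn(p, ws);  /* length of the next tuple */
        if (n > 0 && n == want && memcmp(p, cur_tuple, n) == 0) return 1;
        p += n;
    }
    return 0;
}
```

With the three-line list this commit adds (aa64-elf, rv64-elf, x64-elf), the function returns 0 for aa64-macho, which corresponds to the SKIP-NA outcome, and 1 for each listed ELF tuple.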

Path E remains paused (doc/MULTIOBJ.md Phase 3): link_emit_macho is
the ~1.5–2 kLOC dyld-loadable MH_EXECUTE writer with LC_LOAD_DYLIB
libSystem, chained fixups, and an ad-hoc LC_CODE_SIGNATURE. Path-E
failures stay visible — no skips paper them over.

Counts: aa64-elf 119/119 (unchanged); aa64-macho R 32/32, J 35 + 4
SKIP-NA + 0 fail, E 0/40 (paused).

Diffstat:
M doc/MULTIOBJ.md | 42 ++++++++++++++++++++++++++++++++++++++++++
M src/link/link.c | 83 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-------------
M src/link/link_jit.c | 20 +++++++++++++++++++-
M src/link/link_layout.c | 7 +++++++
M src/obj/macho_emit.c | 11 +++++------
M src/obj/macho_read.c | 6 ++++--
A test/link/cases/21_fini_array/j_targets | 3 +++
A test/link/cases/22_init_fini_both/j_targets | 3 +++
A test/link/cases/25a_gc_basic/j_targets | 3 +++
A test/link/cases/25d_gc_chain/j_targets | 3 +++
M test/link/harness/start.c | 21 ++++++++++++++++++++-
M test/link/run.sh | 25 ++++++++++++++++++++++++-
12 files changed, 203 insertions(+), 24 deletions(-)

diff --git a/doc/MULTIOBJ.md b/doc/MULTIOBJ.md @@ -51,7 +51,49 @@ matrix. per-case `*.targets` applicability (`test/elf/cases/18_bti_note.targets`) - [ ] Clang-emitted Mach-O round-trip — needs section-relative reloc and `__compact_unwind` handling (deferred) +- [x] **Phase 2.5** — Linker-side Mach-O read + JIT + - [x] `link_add_obj_bytes` / `link_add_archive_bytes` dispatch on + `cfree_detect_fmt`, so a Mach-O `.o` (or `.a` member) parses + through `read_macho` exactly like an ELF input does (§3.3) + - [x] C-symbol mangling lives at the linker API boundary + (`link_intern_c_name`, `cfree_jit_lookup`, the undef diag) — + Mach-O on-disk names stay byte-for-byte verbatim (round-trip + intact), and callers see the source-level form across both + formats: `link_set_entry("test_main")` and + `cfree_jit_lookup(jit, "test_main")` work uniformly. + - [x] test-link paths R + J both green on `aa64-macho` + (35 R / 31 J passing, including bad/30_undef_strong; + path E is the remaining gap) + - Four cases ship a `j_targets` file restricting their J path to + ELF tuples (R + E still run on every tuple). Path J's pass/fail + criterion in these cases depends on an ELF-specific ABI feature + with no Mach-O analogue: + * `21_fini_array` / `22_init_fini_both` — Mach-O destructors flow + through `__StaticInit` + `__cxa_atexit` registration, not the + ELF `.fini_array` shape the test and + `test/link/harness/start.c` walk. Without a `__cxa_atexit` + runtime in the JIT, the destructor is never invoked. + * `25a_gc_basic` / `25d_gc_chain` — `clang -ffunction-sections` + on Mach-O still emits a single `__TEXT,__text` per `.o` + (`.subsections_via_symbols` is the per-symbol dead-strip + granularity, not `-ffunction-sections`). `--gc-sections` can + drop whole sections but not individual functions, so the + `gc_absent unreachable_fn` check fails. 
- [ ] **Phase 3** — Mach-O linker (`link_emit_macho`) + - [ ] Paused — design is option #1 from `doc/MULTIOBJ.md §3.3` (full + dyld-loadable MH_EXECUTE with `LC_DYLD_CHAINED_FIXUPS` or + `LC_DYLD_INFO_ONLY`, `__stubs` / `__got` synthesis, + `LC_LOAD_DYLIB` against `libSystem.B.dylib`, `LC_MAIN`, and an + ad-hoc `LC_CODE_SIGNATURE`). Until it lands, path E of + `test/link/run.sh` panics on every case via + `link_emit_image_writer`'s per-format "not yet implemented" + diagnostic — the failures are visible (not skipped) so the + outstanding work stays on the dashboard. + - [ ] `test/link/harness/start.c` on macOS already calls `extern void + exit(int)` via libc rather than emitting `svc #0x80` directly + (Apple does not expose a stable syscall ABI); `link_emit_macho` + is responsible for resolving the `_exit` import against + libSystem and emitting the corresponding bind / fixup records. - [ ] Ad-hoc codesigning in `link_macho.c` (LC_CODE_SIGNATURE) so kernel will exec the binary on macOS 11+ diff --git a/src/link/link.c b/src/link/link.c @@ -114,25 +114,33 @@ LinkInputId link_add_obj_bytes(Linker* l, const char* name, const u8* data, size_t len) { /* Detect format from magic and dispatch to the matching reader. * The returned ObjBuilder is owned by the linker and freed via the - * input cleanup. ELF only this cut. */ + * input cleanup. ELF and Mach-O are supported. */ ObjBuilder* ob; LinkInput* in; LinkInputId id; CfreeBinFmt fmt; + const char* reader_name; if (!l || !data || !len) return LINK_INPUT_NONE; fmt = cfree_detect_fmt(data, len); switch (fmt) { case CFREE_BIN_ELF: ob = read_elf(l->c, name, data, len); + reader_name = "read_elf"; + break; + case CFREE_BIN_MACHO: + ob = read_macho(l->c, name, data, len); + reader_name = "read_macho"; break; default: compiler_panic(l->c, no_loc(), - "link_add_obj_bytes: only ELF is supported in this cut"); + "link_add_obj_bytes: unsupported object format " + "(fmt=%u) for '%s'", + (u32)fmt, name ? 
name : "(unnamed)"); } if (!ob) compiler_panic(l->c, no_loc(), - "link_add_obj_bytes: read_elf returned NULL for '%s'", - name ? name : "(unnamed)"); + "link_add_obj_bytes: %s returned NULL for '%s'", + reader_name, name ? name : "(unnamed)"); in = inputs_push(l, &id); in->kind = LINK_INPUT_OBJ_BYTES; in->obj = ob; /* re-uses the ObjBuilder slot for ownership */ @@ -214,9 +222,11 @@ LinkInputId link_add_archive_bytes(Linker* l, const char* name, const u8* data, compiler_panic(l->c, no_loc(), "link: oom on archive members"); if (n) memset(ar->members, 0, sizeof(*ar->members) * n); - /* Pass 2: parse each member as ELF. ar.c's iterator skips the + /* Pass 2: parse each member as object. ar.c's iterator skips the * symbol-index ('/' and '__.SYMDEF') and long-name ('//') members - * for us, so every member returned here is a real object file. */ + * for us, so every member returned here is a real object file. + * Format is detected per-member so a single archive could in + * principle hold mixed formats (in practice it never does). */ if (!cfree_ar_iter_init(&it, &in_arc)) compiler_panic(l->c, no_loc(), "link_add_archive_bytes: ar_iter_init failed on '%s' " @@ -224,10 +234,26 @@ LinkInputId link_add_archive_bytes(Linker* l, const char* name, const u8* data, name ? name : "(unnamed)"); n = 0; while (cfree_ar_iter_next(&it, &mem) && n < ar->nmembers) { - ObjBuilder* ob = read_elf(l->c, mem.name, mem.data, mem.size); + ObjBuilder* ob = NULL; + CfreeBinFmt mfmt = cfree_detect_fmt(mem.data, mem.size); + switch (mfmt) { + case CFREE_BIN_ELF: + ob = read_elf(l->c, mem.name, mem.data, mem.size); + break; + case CFREE_BIN_MACHO: + ob = read_macho(l->c, mem.name, mem.data, mem.size); + break; + default: + compiler_panic(l->c, no_loc(), + "link_add_archive_bytes: unsupported member " + "format (fmt=%u) for '%s' in archive '%s'", + (u32)mfmt, + mem.name ? mem.name : "(unnamed)", + name ? 
name : "(unnamed)"); + } if (!ob) compiler_panic(l->c, no_loc(), - "link_add_archive_bytes: read_elf failed for " + "link_add_archive_bytes: object read failed for " "member '%s' of archive '%s'", mem.name ? mem.name : "(unnamed)", name ? name : "(unnamed)"); @@ -240,16 +266,42 @@ LinkInputId link_add_archive_bytes(Linker* l, const char* name, const u8* data, &l->archives); /* opaque non-zero handle */ } +/* Intern a C-source-level symbol name in the format the input objects + * use on the wire. Mach-O prepends `_` to every C symbol on disk, so + * a caller-supplied "test_main" must become `_test_main` to match what + * read_macho produced. ELF / COFF / Wasm intern verbatim. */ +Sym link_intern_c_name(Linker* l, const char* name) { + Sym sym; + if (!l || !name) return 0; + if (l->c->target.obj == CFREE_OBJ_MACHO) { + /* Skip the prefix if the caller already supplied one. */ + if (name[0] == '_') return pool_intern_cstr(l->c->global, name); + { + size_t n = strlen(name); + char* buf = (char*)l->heap->alloc(l->heap, n + 2, 1); + if (!buf) + compiler_panic(l->c, no_loc(), + "link_intern_c_name: oom prefixing '%s'", name); + buf[0] = '_'; + memcpy(buf + 1, name, n); + buf[n + 1] = 0; + sym = pool_intern(l->c->global, buf, (u32)(n + 1)); + l->heap->free(l->heap, buf, n + 2); + return sym; + } + } + return pool_intern_cstr(l->c->global, name); +} + void link_set_entry(Linker* l, const char* name) { if (!l || !name) return; - l->entry_name = pool_intern_cstr(l->c->global, name); + l->entry_name = link_intern_c_name(l, name); } void link_set_script(Linker* l, const CfreeLinkScript* script) { if (!l || !script) return; l->script = script; - if (script->entry) - l->entry_name = pool_intern_cstr(l->c->global, script->entry); + if (script->entry) l->entry_name = link_intern_c_name(l, script->entry); } void link_set_extern_resolver(Linker* l, LinkExternResolver fn, void* user) { @@ -432,8 +484,13 @@ void link_emit_image_writer(LinkImage* img, Writer* w) { return; case 
CFREE_OBJ_MACHO: compiler_panic(img->c, no_loc(), - "link_emit_image_writer: Mach-O linker emit not yet " - "implemented (see doc/MULTIOBJ.md Phase 3)"); + "link_emit_image_writer: Mach-O exe emit not yet " + "implemented (paused; see doc/MULTIOBJ.md Phase 3). " + "Path R (round-trip) and path J (in-process JIT) " + "work today on aa64-macho via read_macho + the " + "format-agnostic LinkImage; path E needs the full " + "dyld-loadable MH_EXECUTE linker (LC_LOAD_DYLIB " + "libSystem, chained fixups, ad-hoc code signature)."); case CFREE_OBJ_COFF: compiler_panic(img->c, no_loc(), "link_emit_image_writer: COFF/PE linker emit not yet " diff --git a/src/link/link_jit.c b/src/link/link_jit.c @@ -295,7 +295,25 @@ void* cfree_jit_lookup(CfreeJit* jit, const char* name) { LinkSymId id; const LinkSymbol* s; if (!jit || !name) return NULL; - sym = pool_intern_cstr(jit->c->global, name); + /* C-symbol mangling: Mach-O on-disk names carry a leading `_` for + * every C source-level symbol (read_macho preserves it verbatim). + * Match that convention so a caller looking up "test_main" finds + * the `_test_main` defined by clang-emitted input. An explicit + * underscore-prefix in the caller-supplied name is left alone so + * raw on-disk names still resolve directly. 
*/ + if (jit->c->target.obj == CFREE_OBJ_MACHO && name[0] != '_') { + size_t n = strlen(name); + Heap* heap = (Heap*)jit->c->env->heap; + char* buf = (char*)heap->alloc(heap, n + 2, 1); + if (!buf) return NULL; + buf[0] = '_'; + memcpy(buf + 1, name, n); + buf[n + 1] = 0; + sym = pool_intern(jit->c->global, buf, (u32)(n + 1)); + heap->free(heap, buf, n + 2); + } else { + sym = pool_intern_cstr(jit->c->global, name); + } id = symhash_get(&jit->image->globals, sym); if (id == LINK_SYM_NONE) return NULL; s = LinkSyms_at(&jit->image->syms, id - 1); diff --git a/src/link/link_layout.c b/src/link/link_layout.c @@ -352,6 +352,13 @@ static void resolve_undefs(Linker* l, LinkImage* img) { size_t namelen; const char* nm = s->name ? pool_str(l->c->global, s->name, &namelen) : (namelen = 0, ""); + /* On Mach-O the on-disk name carries a leading `_` C-mangle + * byte; strip it for display so the diagnostic surface matches + * the source-level symbol name across formats. */ + if (l->c->target.obj == CFREE_OBJ_MACHO && namelen >= 1 && nm[0] == '_') { + ++nm; + --namelen; + } compiler_panic(l->c, no_loc(), "link: undefined reference to '%.*s'", (int)namelen, nm); } diff --git a/src/obj/macho_emit.c b/src/obj/macho_emit.c @@ -346,12 +346,11 @@ void emit_macho(Compiler* c, ObjBuilder* ob, Writer* w) { const char* nm = pool_str(c->global, s->name, &nlen); /* Mach-O symbol names are stored on disk verbatim — including * the leading `_` Apple toolchains use for C-source-level - * symbols ("_main" for `int main()`). The cfree path treats - * that prefix as part of the on-disk name, not a transform - * applied at emit; a future Mach-O codegen frontend can - * prepend the underscore itself the same way LLVM's MCSymbol - * does via target.MCAsmInfo. Round-tripping is then byte-for- - * byte: emit writes what read sees. */ + * symbols ("_main" for `int main()`). cfree treats the prefix + * as part of the on-disk name, not a transform applied at emit. 
+ * Name-canonicalization for API callers (cfree_jit_lookup, + * link_set_entry) lives one layer up at the linker boundary + * (link.c), so emit/read stay byte-for-byte stable. */ if (nlen && nm) { u32 off = buf_pos(&strtab); buf_write(&strtab, nm, nlen); diff --git a/src/obj/macho_read.c b/src/obj/macho_read.c @@ -238,8 +238,10 @@ ObjBuilder* read_macho(Compiler* c, const char* name, const u8* data, } /* Mach-O names round-trip verbatim — the leading `_` Apple * toolchains apply to C symbols is part of the on-disk name as - * far as ObjBuilder is concerned. Mirrors the no-transform - * decision in emit_macho. */ + * far as ObjBuilder is concerned. Name-canonicalization (the + * `test_main` ↔ `_test_main` mapping for API callers) happens + * one layer up at the linker API boundary (link_intern_c_name + * in link.c); the on-disk shape stays byte-for-byte stable. */ Sym sn = nlen ? pool_intern(c->global, nm, nlen) : 0; u8 type_field = (u8)(n_type & N_TYPE); diff --git a/test/link/cases/21_fini_array/j_targets b/test/link/cases/21_fini_array/j_targets @@ -0,0 +1,3 @@ +aa64-elf +rv64-elf +x64-elf diff --git a/test/link/cases/22_init_fini_both/j_targets b/test/link/cases/22_init_fini_both/j_targets @@ -0,0 +1,3 @@ +aa64-elf +rv64-elf +x64-elf diff --git a/test/link/cases/25a_gc_basic/j_targets b/test/link/cases/25a_gc_basic/j_targets @@ -0,0 +1,3 @@ +aa64-elf +rv64-elf +x64-elf diff --git a/test/link/cases/25d_gc_chain/j_targets b/test/link/cases/25d_gc_chain/j_targets @@ -0,0 +1,3 @@ +aa64-elf +rv64-elf +x64-elf diff --git a/test/link/harness/start.c b/test/link/harness/start.c @@ -65,8 +65,20 @@ void __cfree_ifunc_init(void) { } } +#if defined(__APPLE__) +/* macOS doesn't expose a stable syscall ABI — all syscalls must go + * through libSystem.dylib. 
start.c on macOS therefore calls libc + * `exit` rather than emitting `svc #0x80` inline; the cfree Mach-O + * exe linker resolves the import via LC_LOAD_DYLIB libSystem.B.dylib + * and the dyld bind info / chained-fixups stream. */ +extern void exit(int) __attribute__((noreturn)); +#endif + __attribute__((noreturn)) static void do_exit(int code) { -#if defined(__aarch64__) +#if defined(__APPLE__) + exit(code); + __builtin_unreachable(); +#elif defined(__aarch64__) register long x8 __asm__("x8") = 94; /* sys_exit_group */ register long x0 __asm__("x0") = code; __asm__ volatile("svc #0" ::"r"(x8), "r"(x0) : "memory"); @@ -85,6 +97,12 @@ __attribute__((noreturn)) static void do_exit(int code) { } static void tls_init(void) { +#if defined(__APPLE__) + /* On Darwin, tpidr_el0 is owned by libsystem/dyld; freestanding + * tests don't synthesize TLS roots (31_tls_local_exec is N/A on + * Mach-O), so the prologue is a no-op. */ + return; +#else unsigned long td_n = (unsigned long)(__tdata_end - __tdata_start); unsigned long bs_n = (unsigned long)(unsigned long long)__tbss_size; unsigned long i; @@ -126,6 +144,7 @@ static void tls_init(void) { #else #error "start.c: unsupported architecture" #endif +#endif /* !__APPLE__ */ } void _start(void) { diff --git a/test/link/run.sh b/test/link/run.sh @@ -37,6 +37,12 @@ # Skips paths R and J; on E, runs the linked exe via # a per-arch qemu-system-* invocation (semihosting on # aa64; SIFIVE_TEST MMIO exit on rv64). +# j_targets — per-path applicability for J only. Listed tuples +# (one per line) run J; others print SKIP-NA for J +# and continue running R/E. For cases whose JIT +# success criterion depends on an ABI feature with +# no Mach-O analogue (ELF .fini_array destructors, +# -ffunction-sections per-fn dead-strip). 
# # Per-arch source variants: # For each candidate source filename (entry.S, a.S, b.S, a.c, b.c, c.c), @@ -294,6 +300,20 @@ for case_dir in "$TEST_DIR/cases"/*/; do expected=0; [ -f "$case_dir/expected" ] && expected="$(cat "$case_dir/expected" | tr -d '[:space:]')" jit_only=0; [ -f "$case_dir/jit_only" ] && jit_only=1 use_resolver=0; [ -f "$case_dir/use_resolver" ] && use_resolver=1 + # Per-path applicability — `j_targets` lists tuples on which the + # J path can run. Used for cases that exercise an ELF-specific + # ABI feature inapplicable to Mach-O at the test level (e.g. ELF + # `.fini_array` destructors vs Mach-O `__StaticInit` + `atexit`; + # `-ffunction-sections` per-function dead-strip vs Mach-O's single + # `__TEXT,__text`). R and E still run — they don't depend on the + # ABI feature the J path's pass/fail criterion does. + j_applicable=1 + if [ -f "$case_dir/j_targets" ]; then + j_applicable=0 + for tuple in $(cat "$case_dir/j_targets"); do + [ "$tuple" = "$CUR_TUPLE" ] && j_applicable=1 + done + fi archive_mode="none" if [ -f "$case_dir/archive_b" ]; then archive_mode="$(cat "$case_dir/archive_b" | tr -d '[:space:]')" @@ -547,7 +567,10 @@ for case_dir in "$TEST_DIR/cases"/*/; do fi # ---- Path J: JIT -------------------------------------------------------- - if [ $RUN_J -eq 1 ] && [ $have_jit_runner -eq 1 ] && [ $kernel_image -eq 0 ]; then + if [ $RUN_J -eq 1 ] && [ $j_applicable -eq 0 ] && [ $kernel_image -eq 0 ]; then + printf ' %s %s/J — N/A on %s\n' \ + "$(color_yel SKIP-NA)" "$name" "$CUR_TUPLE" + elif [ $RUN_J -eq 1 ] && [ $have_jit_runner -eq 1 ] && [ $kernel_image -eq 0 ]; then t0=$(now_ms) jit_cmd=("$JIT_RUNNER" "${extra_flags[@]}") [ $use_resolver -eq 1 ] && jit_cmd+=(--use-resolver)