commit a2abf09a805ee4916fbcdd926a9d444cbc34229d
parent cd128bb839e5be267adabb140be3944872123cb1
Author: Ryan Sepassi <rsepassi@gmail.com>
Date: Mon, 11 May 2026 13:52:09 -0700
stage2: standalone link probe + STAGE2/MACHO notes
Adds scripts/stage2_link.sh, an out-of-tree end-to-end probe. It
compiles every src/**/*.c and driver/*.c that cfree can ingest with
cfree-stage1, falls back to clang for driver/env.c and driver/ld.c
(A2-blocked), and then links the combined object set with `cfree ld`
against libSystem.B.tbd. cfree-stage1 compiles 105 of 107 files
cleanly; the link now reaches the chained-fixup emit pass and trips on
TLV slot routing (D2-emit).
Also:
- STAGE2.md: snapshot bumped to 105/107, D2-read marked done, D2-emit
  recorded.
- driver/ld.c: treats LIB_RESOLVE_KIND_TBD the same as SHARED for `-l`
  resolution, so `-lSystem` lands on the .tbd stub.
- MACHO.md: drops the long-resolved sections 1–3 so only the live TLV
  section remains.
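
To reproduce, something like the sketch below may help. It is a
hypothetical wrapper, not part of this commit: `describe_probe_status`
is an illustrative name, and the status mapping is read off the exit
paths in scripts/stage2_link.sh (the probe forwards the `cfree ld` exit
status on link failure, so 1 and 2 are ambiguous).

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the tree): translate the exit status of
# scripts/stage2_link.sh into a one-line description.
describe_probe_status() {
  case "$1" in
    0) echo "link succeeded" ;;
    1) echo "compile failures present, or cfree ld exited 1" ;;
    2) echo "missing build/cfree or libSystem stub, or cfree ld exited 2" ;;
    *) echo "cfree ld failed (exit $1)" ;;
  esac
}

# Usage, assuming `make` has already produced build/cfree:
#   scripts/stage2_link.sh; describe_probe_status "$?"
```

Per-file compiler output lands under build/stage2-probe/log/, which is
the first place to look when the status is 1.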
Diffstat:
4 files changed, 156 insertions(+), 133 deletions(-)
diff --git a/doc/MACHO.md b/doc/MACHO.md
@@ -21,137 +21,6 @@ CFREE_OBJ_MACHO`.
---
-## 1. Common symbols (case `17_common_coalesce`) — RESOLVED
-
-**Symptom.** clang on Mach-O emits `int shared_val;` (tentative def) as
-`N_UNDF | N_EXT` with `n_value = size`. cfree's `read_macho` translates
-that to `SK_COMMON` correctly. After `layout_commons`, the symbol is
-placed at a vaddr just past the RW segment's `file_size`:
-
- s->vaddr = bss_cursor; /* image-relative, in the RW seg's trailer */
- s->kind = SK_OBJ; /* no longer COMMON */
- s->section_id = LINK_SEC_NONE; /* never set */
-
-`link_macho.c::shift_sections` only re-bases LinkSymbols whose
-`section_id` matches a planned `MSec`. A common symbol has no
-section_id, so its vaddr stays at the pre-Mach-O layout coordinate
-(typically `0x4000`-ish) and the GOT slot pointing at it carries that
-stale value. At runtime the load reads the wrong address and the
-dereference faults.
-
-**Fix shape:** Two options, both straightforward:
-
-1. **Synthesize a `SSEM_NOBITS` LinkSection in `layout_commons`** to
- wrap every common symbol, set each `s->section_id` to that
- LinkSection's id, and let `plan_layout` pick it up as a regular
- writable-zerofill section (it'll land in `__DATA,__bss` via the
- existing `pick_macho_names` flags path). Cleanest and fixes both
- ELF and Mach-O uniformly.
-
-Witness: `otool -tV build/test/link/17_common_coalesce/linked.exe`
-shows the LDR pair targeting GOT slot 1 (`0x100004008`), which
-contains the literal `0x4000` rebase target — that's the stale
-common-symbol vaddr.
-
-**Resolution.** `layout_commons` (option 1 above) now appends a
-synthetic `.bss.common` NOBITS LinkSection that wraps every common
-symbol, with each common's `section_id` pointing at it and `value`
-set to the per-symbol offset. `link_symbols_to_sections` then
-recomputes `s->vaddr` from `section.vaddr + value` after Mach-O's
-`shift_sections` rebases the synthetic section into `__DATA,__bss`.
-`emit_segment_bytes` skips synthetic sections (input_id == NONE).
-ELF behavior is unchanged — the segment's `mem_size` extension and
-the symbols' final vaddrs are the same as before, only the
-intervening representation has a backing section_id.
-
-## 2. Init / fini ctors (case `23_init_order`) — RESOLVED
-
-**Symptom.** With `-ffreestanding -O1 -fno-inline`, clang on Mach-O
-*does not emit* `__mod_init_func` for `__attribute__((constructor))`
-functions. A minimal reproducer with no `-ffreestanding` does emit
-the section; the freestanding flag suppresses it.
-
-Even past that, cfree's `start.c` walks `__init_array_start/end`
-(linker-synthesized boundary syms). On Mach-O the canonical init
-array is `__DATA,__mod_init_func`, with `__StaticInit` + `dyld` doing
-the iteration. Bridging the two:
-
-- Have the Mach-O writer synthesize a `__mod_init_func` from the
- collected `__init_array` entries so dyld runs them as part of normal
- image startup, then make `start.c`'s init walk a no-op on Mach-O.
-
-Needs the clang-emit issue solved first,
-otherwise the input objects don't carry the ctor pointers.
-
-**Resolution.** Two-line fix:
-
-1. `link_macho.c::plan_layout` now stamps the section-type byte to
- `S_MOD_INIT_FUNC_POINTERS` (0x9) on every `__mod_init_func` MSec
- (and `S_MOD_TERM_FUNC_POINTERS` 0xa on `__mod_term_func`). Without
- the right type byte, dyld walks past the section silently — the
- section was structurally present in the output, but its entries
- never ran. Pass-through of the clang-emitted `__mod_init_func`
- from input objects is sufficient; no synthesis from `__init_array`
- is needed since Mach-O inputs carry `__mod_init_func` directly.
-2. `test/link/harness/start.c` short-circuits the `__init_array` /
- `__fini_array` walks under `__APPLE__`. Boundary symbols on
- Mach-O land in the `__got` region (no real init-array section);
- dyld already invokes `__mod_init_func` entries before `_start`,
- so the harness loop would otherwise fault on the synthesized
- boundaries.
-
-The clang-emit observation is real but turned out not to block the
-test: at `-O1` clang pre-evaluates the constructor's effect into the
-initial values of static data (e.g. `g_pos = 1`, `g_seq[0] = 1` in
-this case), so the test still observes the expected end state once
-the remaining ctor (the cross-TU one whose effect can't be
-pre-evaluated) actually runs.
-
----
-
-## 3. Path J on `aa64-macho` — RESOLVED
-
-`make test-link CFREE_TEST_OBJ=macho` Path J is now 100/100 (88 J
-cases + the §3.3 IFUNC trio excluded via `j_targets`). Each
-sub-issue is summarized below; see `doc/JIT.md` §"Reloc-apply gaps"
-for the canonical implementation pointers.
-
-- **§3.1 Cross-TU data via ADRP/ADD/LDR — RESOLVED.** Cases:
- `11_data_cross_tu/J`, `14_weak_present/J`, `17_common_coalesce/J`,
- `34_ifunc_addr_taken/J` (now Mach-O-skipped via §3.3). Fixed by
- enabling the ELF-shaped `layout_got` synthesis on the Mach-O JIT
- path (`src/link/link_layout.c`, gated on `!l->emit_static_exe`).
- The exe path keeps its `link_macho.c::collect_imports` scheme.
-
-- **§3.2 Weak-undef proximity — RESOLVED.** Case: `16_weak_undef/J`.
- Fixed by allocating the JIT image as a single contiguous mapping
- (`src/link/link_jit.c::cfree_jit_from_image`): one `mem->reserve`
- for the full image span, segments are subdivisions, inter-segment
- displacements are always within ±4 GiB. Weak-undef now flows
- through a GOT slot whose `R_ABS64` writes 0.
-
-- **§3.3 IFUNC under Mach-O JIT — RESOLVED via exclusion.**
- Cases: `32_ifunc/J`, `33_ifunc_in_init/J`, `34_ifunc_addr_taken/J`.
- IFUNC is ELF-only at the format level, so Mach-O has no
- __mod_init_func equivalent for the iplt synthesis. Excluded via
- `j_targets` on all three cases (ELF tuples only), matching the
- pre-existing `e_targets` shape on `33_ifunc_in_init`. Revisit
- only if a Mach-O-shaped iplt scheme inside the JIT mapping
- becomes a requirement.
-
-- **§3.4 Extern resolver — RESOLVED.** Case: `28_extern_resolver/J`.
- Fixed by a new layout pass `layout_jit_call_stubs`
- (`src/link/link_layout.c`) that synthesizes a 12-byte
- `ADRP+LDR+BR` stub per resolver-supplied / weak-undef SK_ABS
- target hit by CALL26/JUMP26. The stub lives in its own RX
- subsegment of the contiguous JIT mapping; its slot is filled by
- an `R_ABS64` against a synthetic resolver-pointer LinkSymbol
- carrying the original (host) vaddr. `emit_reloc_records` redirects
- CALL26/JUMP26 to the stub. End-to-end: `cfree run` can now call
- libc directly (verified with `write` and `printf`).
-
----
-
## 4. TLV (thread-local variables) — PARTIALLY RESOLVED
Adds `ARM64_RELOC_TLVP_LOAD_PAGE21` / `PAGEOFF12` support and
diff --git a/doc/STAGE2.md b/doc/STAGE2.md
@@ -3,11 +3,17 @@
What's missing to make `make self` produce a stage-2 `cfree` built by stage-1
cfree itself. Companion to `DESIGN.md`.
-Latest snapshot: **104 / 106 files compile clean** (92/92 `src/**/*.c`,
+Latest snapshot: **105 / 107 files compile clean** (93/93 `src/**/*.c`,
12/14 `driver/*.c`). The two remaining driver failures (`env.c`, `ld.c`)
are both blocked by A2 — system-header ingest. Everything in `src/` builds
under stage 1.
+A standalone link probe (`scripts/stage2_link.sh`) drives the full
+sequence end-to-end: cfree-stage1 compiles the 105 clean files, clang
+compiles `env.c` / `ld.c`, and `cfree ld` then attempts to link the
+combined object set against `libSystem.B.tbd`. As of the latest run the
+link reaches the chained-fixup emit pass and trips D2-emit below.
+
## Build configuration
Stage 2 currently invokes:
@@ -124,6 +130,19 @@ not been switched back on.
turn shells out to the host linker. Once stage 2 builds, verify the
produced binary is genuinely a stage-1-emitted object linked through
cfree's own ld path, not falling back to clang/ld silently.
+- [x] **D2-read.** Mach-O reader rejected `ARM64_RELOC_TLVP_LOAD_PAGE21`
+ (8) and `ARM64_RELOC_TLVP_LOAD_PAGEOFF12` (9). Clang emits these for
+ TLS references in `driver/env.c` (errno-style access); without them
+ the standalone link probe couldn't ingest `env.o`. Reader now maps
+ both to TLV reloc kinds.
+- [ ] **D2-emit.** Chained-fixup emit doesn't know how to locate the
+ byte slot for the new TLV pointer region — `cfree ld` aborts with
+ `link_macho: chained-fixup slot for vaddr 0x… not in any segment
+ buffer` at `src/link/link_macho.c:1564`. The lookup at
+ `link_macho.c:1543` currently routes only segidx 2 (`__DATA_CONST`
+ __got) and segidx 3 (`__DATA` __thread_ptrs / MSec walk); the new
+ TLV section/segment added by the TLV ingest work isn't covered.
+ Blocks the standalone link probe past compile.
### Hosted libc shim
diff --git a/driver/ld.c b/driver/ld.c
@@ -411,7 +411,7 @@ static int ld_parse(int argc, char** argv, LdOptions* o) {
driver_errf(LD_TOOL, "cannot find -l%s", name);
return 1;
}
- if (kind == LIB_RESOLVE_KIND_SHARED) {
+ if (kind == LIB_RESOLVE_KIND_SHARED || kind == LIB_RESOLVE_KIND_TBD) {
ld_push_dso(o, resolved, 1, resolved_size);
} else {
ld_push_archive(o, resolved, 1, resolved_size);
diff --git a/scripts/stage2_link.sh b/scripts/stage2_link.sh
@@ -0,0 +1,135 @@
+#!/usr/bin/env bash
+# Stage-2 standalone link probe.
+#
+# 1. Compile every src/**/*.c with cfree-stage1.
+# 2. Compile every driver/*.c with cfree-stage1; fall back to clang for the
+# files cfree still can't ingest (env.c, ld.c, per doc/STAGE2.md A2).
+# 3. Link all the resulting objects with `cfree ld` against libSystem.B.tbd.
+#
+# Runs out-of-tree under build/stage2-probe/ so the Makefile's build/ tree
+# is left alone.
+set -u
+
+ROOT="$(cd "$(dirname "$0")/.." && pwd)"
+cd "$ROOT"
+
+BIN="$ROOT/build/cfree"
+if [ ! -x "$BIN" ]; then
+ echo "missing $BIN — run \`make\` first" >&2
+ exit 2
+fi
+
+SDK="$(xcrun --show-sdk-path)"
+OUT="$ROOT/build/stage2-probe"
+LIB_OUT="$OUT/lib"
+DRV_OUT="$OUT/driver"
+LOG="$OUT/log"
+mkdir -p "$LIB_OUT" "$DRV_OUT" "$LOG"
+
+CFREE_FLAGS="-isystem $ROOT/rt/include -isystem $ROOT/rt/include/libc -Iinclude -Isrc"
+DRIVER_FLAGS="-isystem $ROOT/rt/include -isystem $ROOT/rt/include/libc -Iinclude"
+
+# Driver files cfree still cannot parse (doc/STAGE2.md A2). Compile with
+# clang in stage-2 flag style so the resulting objects are still arm64
+# Mach-O at a matching SDK level.
+CLANG_FALLBACK=("env.c" "ld.c")
+
+cfree_objs=()
+clang_objs=()
+fail_src=()
+fail_driver=()
+
+compile_with_cfree() {
+ local src="$1" obj="$2" flags="$3"
+ mkdir -p "$(dirname "$obj")"
+ if "$BIN" cc $flags -c "$src" -o "$obj" >"$LOG/$(basename "$obj").log" 2>&1; then
+ return 0
+ fi
+ return 1
+}
+
+echo "=== compiling src/ with cfree ==="
+while IFS= read -r src; do
+ rel="${src#src/}"
+ obj="$LIB_OUT/${rel%.c}.o"
+ if compile_with_cfree "$src" "$obj" "$CFREE_FLAGS"; then
+ cfree_objs+=("$obj")
+ printf ' ok %s\n' "$src"
+ else
+ fail_src+=("$src")
+ head -1 "$LOG/$(basename "$obj").log" | sed "s|^| FAIL $src: |"
+ fi
+done < <(find src -name '*.c' | sort)
+
+echo
+echo "=== compiling driver/ ==="
+for src in driver/*.c; do
+ base="$(basename "$src")"
+ obj="$DRV_OUT/${base%.c}.o"
+ use_clang=0
+ for skip in "${CLANG_FALLBACK[@]}"; do
+ if [ "$base" = "$skip" ]; then use_clang=1; fi
+ done
+ if [ "$use_clang" = 1 ]; then
+ # -fno-{,asynchronous-}unwind-tables: cfree's macho_read doesn't yet
+ # ingest the section-relative UNSIGNED relocs that clang emits in
+ # __LD,__compact_unwind. Suppress the section entirely.
+ if clang -arch arm64 -isysroot "$SDK" -Iinclude \
+ -fno-unwind-tables -fno-asynchronous-unwind-tables \
+ -c "$src" -o "$obj" >"$LOG/$base.log" 2>&1; then
+ clang_objs+=("$obj")
+ printf ' CLG %s\n' "$src"
+ else
+ fail_driver+=("$src (clang)")
+ head -1 "$LOG/$base.log" | sed "s|^| FAIL $src: |"
+ fi
+ continue
+ fi
+ if compile_with_cfree "$src" "$obj" "$DRIVER_FLAGS"; then
+ cfree_objs+=("$obj")
+ printf ' ok %s\n' "$src"
+ else
+ fail_driver+=("$src (cfree)")
+ head -1 "$LOG/$(basename "$obj").log" | sed "s|^| FAIL $src: |"
+ fi
+done
+
+echo
+echo "=== compile summary ==="
+echo " cfree objects: ${#cfree_objs[@]}"
+echo " clang objects: ${#clang_objs[@]}"
+echo " src failures: ${#fail_src[@]}"
+echo " driver failures: ${#fail_driver[@]}"
+
+if [ "${#fail_src[@]}" -gt 0 ] || [ "${#fail_driver[@]}" -gt 0 ]; then
+ echo
+ echo "compile failures present; skipping link" >&2
+ exit 1
+fi
+
+echo
+echo "=== linking with cfree ld ==="
+BIN_OUT="$OUT/cfree-stage2"
+LIBSYS_DIR="$SDK/usr/lib"
+if [ ! -f "$LIBSYS_DIR/libSystem.B.tbd" ] && \
+ [ ! -f "$LIBSYS_DIR/libSystem.tbd" ]; then
+ echo "libSystem stub not found under $LIBSYS_DIR" >&2
+ exit 2
+fi
+
+set -x
+"$BIN" ld -o "$BIN_OUT" -pie \
+ "${cfree_objs[@]}" "${clang_objs[@]}" \
+ -L "$LIBSYS_DIR" -lSystem
+status=$?
+set +x
+
+if [ "$status" -ne 0 ]; then
+ echo "cfree ld failed (exit $status)" >&2
+ exit "$status"
+fi
+
+echo
+echo "=== link succeeded ==="
+file "$BIN_OUT"
+ls -l "$BIN_OUT"