boot2

Playing with the bootstrap
git clone https://git.ryansepassi.com/git/boot2.git

commit e21919d9ee8fa9e95220ff98f3fb11f49f35cee3
parent b797882f164a5440762926256347aed75997d2ef
Author: Ryan Sepassi <rsepassi@gmail.com>
Date:   Mon,  4 May 2026 14:12:53 -0700

Fix LP64 constants for boot4 musl

Diffstat:
M docs/MUSL.md | 402 ++++++++++++++++++++-----------------------------------------------------------
A scripts/simple-patches/tcc-0.9.26/lp64-long-constant.after | 6 ++++++
A scripts/simple-patches/tcc-0.9.26/lp64-long-constant.before | 6 ++++++
M scripts/stage1-flatten.sh | 6 ++++++
A tests/cc/339-lp64-unsigned-long-constant.c | 19 +++++++++++++++++++
5 files changed, 136 insertions(+), 303 deletions(-)

diff --git a/docs/MUSL.md b/docs/MUSL.md
@@ -1,319 +1,115 @@
-# boot4 — building musl with the boot3 tcc
+# boot4 musl spec
 
-Working doc. boot3 produces a self-host fixed-point tcc (`tcc2 == tcc3`).
-boot4 takes that compiler and uses it to build [musl
-1.2.5](https://musl.libc.org/) from upstream source plus a small set of
-tcc-compatibility patches, then links and runs a static hello world.
-The harness is wired for amd64, aarch64, and riscv64; **amd64** and
-**riscv64** are verified end-to-end (aarch64 still blocks at link due
-to tcc 0.9.26 codegen bugs). Modeled on the in-image build in
-[/Users/ryan/tmp/musltcc](file:///Users/ryan/tmp/musltcc) but driven
-from this repo's bootstrap and constrained to scratch+busybox.
+`scripts/boot4.sh <arch>` builds a static musl 1.2.5 libc with the
+verified boot3 tcc for the same architecture, then links and runs a
+static hello-world smoke binary. Supported architectures are `amd64`,
+`aarch64`, and `riscv64`; all three are verified end-to-end.
 
-## Pipeline
-
-```
-build/$ARCH/boot3/tcc3            (verified by boot3)
-   │
-   │  scripts/boot4.sh amd64
-   │    • libtcc1.a: tcc compiles libtcc1.c, alloca86_64.{S,bt.S},
-   │                 va_list.c into a tcc -ar archive
-   │    • patch:     apply musl-1.2.5-tcc.patch + scrub deleted
-   │                 src/complex/, src/{fenv,signal}/x86_64/,
-   │                 src/math/x86_64/*.c
-   │    • configure: CC=tcc AR=true sh ./configure
-   │                 --target=x86_64-linux-musl --disable-shared
-   │    • headers:   sed mkalltypes.sed; build syscall.h, version.h
-   │    • compile:   tcc -c every src/<dir>/*.{c,s,S}, skip-on-fail
-   │    • crt:       Scrt1.o, crt1.o, rcrt1.o, crti.o, crtn.o
-   │    • libc.a:    tcc -ar rcs all .o files
-   │    • hello:     tcc -static crt1.o hello.c -lc -ltcc1 -lc
-   │
-   ▼
-build/$ARCH/boot4/{libtcc1.a, libc.a, crt1.o, crti.o, crtn.o, hello}
-```
-
-The container is the same `boot2-scratch:$ARCH` boot3 uses (FROM scratch
-+ busybox, no libc, no /etc).
-
-## Multi-arch status
-
-| arch | dispatch | musl patch | va_list shim | end-to-end |
-|---------|:--:|:--:|:--:|:--:|
-| amd64 | ✓ | ✓ | ✓ | ✓ verified |
-| aarch64 | ✓ | aarch64-targeted patches landed (syscall trampoline, atomic CAS via single-fn LL/SC, get_tp helper, replacement crt_arch.h, replacement __set_thread_area, deletion sweep) + arm64-gen.c VT_CONST\|VT_LVAL store/load fix. Compile reaches **1263/1271**; 8 skips. | mirrors tcc 0.9.26 AAPCS register-save struct, gets past the `va_list` typename | hello starts, runs first printf, then segfaults — open mystery (see below) |
-| riscv64 | ✓ | riscv64-targeted patches landed (mirrors aarch64: syscall trampoline, atomic externs, get_tp helper, replacement crt_arch.h, patched __set_thread_area, deletion sweep). Compile reaches **1268/1271**; 3 skips. | mirrors tcc 0.9.26 riscv64 stdarg.h: `__builtin_va_list = char *`, va_arg as the lp64 pointer-arithmetic macro | ✓ verified |
-
-### aarch64 status after patch round 1 + tcc fix
-
-The first round of musl patches dropped the skip count from 153 to 20.
-Of those 20, 17 were tccgen-level `store(0, (1011, 5130, 0)) / assert
-fail: 0` failures across `__libc_start_main`, `__init_tls`, `abort`,
-the entire `mallocng/*`, `oldmalloc/malloc.c`, `pthread_join`,
-`asctime_r`, etc. — the load-bearing files hello needs to link.
-
-Root cause **found and fixed in tcc**:
-[`scripts/simple-patches/tcc-0.9.26/arm64-{store,load}-const-lvalue.{before,after}`](../scripts/simple-patches/tcc-0.9.26/).
-`arm64-gen.c`'s `store` and `load` handle `VT_CONST | VT_LVAL | VT_SYM`
-(store/load via symbol address) but never plain `VT_CONST | VT_LVAL`
-(via integer address). Trips on `*(volatile T *)addr = v;` patterns
-that fall out of musl's weak-hidden-extern code paths after constant
-folding. x86_64 routes through generic `gen_modrm`, riscv64 has an
-explicit `fr == VT_CONST` branch, arm64 just ran into `printf + assert`.
-The patch mirrors the existing `|VT_SYM` case but materializes the
-address with `arm64_movimm` instead of `arm64_sym`. Regression test:
-`tests/cc/338-literal-addr-deref.c`. After this fix the skip count
-drops from 20 to 8.
-
-### aarch64 status after patch round 2 (current)
-
-Patches added in round 2 to push past the residual asm-shaped issues:
-
-- **`arch/aarch64/atomic_arch.h` redesigned** to expose only `a_cas` /
-  `a_cas_p` as externs (plus `a_barrier` / `a_ctz_64` / `a_clz_64`).
-  An earlier attempt with extern `a_ll` / `a_sc` deadlooped: the
-  function-call boundary between `ldaxr` and `stlxr` clears the
-  exclusive monitor on real hardware and on QEMU/Apple Silicon, so
-  the LL/SC retry loop never made progress. musl's
-  `src/internal/atomic.h` derives `a_swap` / `a_fetch_add` / `a_or` /
-  `a_and` / `a_inc` / `a_dec` / `a_store` from `a_cas`.
-- **`src/internal/aarch64/atomic.s`** holds the entire LL/SC pair
-  inside one call. Two arm64-asm.c phase-2 quirks shape the layout:
-  - **forward `b.cond` / `cbz` / `cbnz` to a same-file label** errors
-    with `"CONDBR19 reloc unsupported"`,
-  - **forward unconditional `b` to a same-file label** silently
-    assembles as `b +0` (branch-to-self) — no error, but the function
-    becomes an infinite loop.
-  Backward branches resolve correctly; branches to extern symbols
-  (CALL26/JUMP26) work in either direction. So each function defines
-  its exit block BEFORE the function entry, making every conditional
-  branch backward.
-- **`src/thread/aarch64/__set_thread_area.s` restored** as a
-  replacement (not deletion). Stock musl uses `msr tpidr_el0, x0`;
-  arm64-asm.c phase 1+2 doesn't recognize the `msr` mnemonic, so the
-  encoding is emitted as a raw `.long`. Without this file,
-  `__init_tls` calls undefined `__set_thread_area` whose static-link
-  reference gets silently resolved by tcc, then jumps to garbage
-  before main runs.
-- **`arch/aarch64/crt_arch.h` simplified** to just `mov x0, sp; b _start_c`
-  — drops the `adrp`/`:lo12:_DYNAMIC` sequence (unused for static
-  builds), the `and sp, x0, #-16` alignment (`bic` rejects bitmask-
-  immediate; Linux/AAPCS already 16-byte-aligns sp at process entry),
-  and the `mov x29, #0` / `mov x30, #0` register zeroing (arm64-asm.c
-  encodes `mov xN, #imm` as the 32-bit `MOVZ wN, #imm` form, leaving
-  upper 32 bits unset — kernel zeroes GPRs at process entry anyway).
-- Two compounding `.word` → `.long` fixes: tcc's `.word` is **2 bytes**
-  (gas-style for x86), not 4. Every raw-encoding line in `atomic.s`,
-  `get_tp.s`, and `__set_thread_area.s` would have emitted only half
-  the instruction, misaligning subsequent function symbols and tripping
-  `R_AARCH64_(JUMP|CALL)26 relocation failed (val=…, addr=…)` at
-  link.
-
-Result: **1263/1271 compile, 8 skips, libc.a archives at ~2.95 MB,
-hello links at 87 KB.** The 8 skips are the same long-double
-constant-folding files as amd64 (`__cosl.c`, `__sinl.c`, `__tanl.c`,
-`exp2l.c`, `fmaf.c`, `j1f.c`, `pow_data.c`) plus
-`src/thread/__unmapself.c` (inline asm with output operand —
-phase-3-blocked).
-
-### Hello segfault — open mystery
-
-Hello starts up, prints `hello from boot4 (tcc-built musl); argc=4`,
-then segfaults before the second printf. Isolating shows
-`malloc(8)` returns NULL deterministically in some link closures and
-succeeds in others. So far:
-
-- Direct `__syscall(SYS_brk, 0)` works (returns valid break).
-- Direct `__syscall(SYS_mmap, ...)` works (returns valid page).
-- `int main(void) { malloc(8); }` — NULL.
-- `int main(void) { malloc(8); printf(...); }` — still NULL.
-- Same with `extern int *__errno_location(void);` declared — still NULL.
-- Same with `putchar` before malloc — still NULL.
-- Same with extern `__syscall` reference — still NULL.
-- BUT: same after explicitly calling `__syscall(SYS_brk, 0)` once
-  before malloc — succeeds. Repeated mallocs after that all succeed.
-
-The malloc trampoline path is right (verified in isolation); the asm
-primitives are right (verified). The trigger isn't an unresolved
-weak-alias `___errno_location` either — adding strong references to
-it doesn't change the behavior. Looks like an actual bug in mallocng's
-first-call init that depends on something subtle about call ordering
-or the kernel's brk-state-on-first-call semantics under QEMU emulation.
-
-**Pursuing root cause; not papering over with a `brk(0)` warm-up call.**
-
-### aarch64 skip taxonomy (pre-patch snapshot — 153 skipped sources)
-
-Compiling each skipped file in isolation and bucketing the first error:
-
-| count | bucket | category |
-|------:|--------|----------|
-| **79** | `pthread_arch.h:4` / `atomic_arch.h:5,73` "ARM64 inline asm operands not implemented yet" | arm64-asm.c phase 3 not started — input/output operand constraint plumbing (`subst_asm_operand`, `asm_compute_constraints`). Same root cause as `pthread_arch.h`'s `__get_tp` (uses `"=r"(tp)`), atomic primitives (`a_ll`/`a_sc`/`a_cas`), and the entire `arch/aarch64/syscall_arch.h` surface (every `__syscallN`). |
-| **30** | `arch/aarch64/<math>.c` "invalid operand reference after %" | parser doesn't accept the `%w0` width-modifier form for 32-bit views of x-registers; phase-3-adjacent. |
-| **17** | `atomic_arch.h:20` "dsb/dmb/isb: expected #imm option" | mnemonic is recognized but the parser wants `#imm`; musl writes `dmb ish` (the named option form). |
-| **17** | `store(0, (1011, 5130, 0))` and `assert fail: f == VT_FLOAT \|\| ...` / `assert fail: 0` | tcc internal codegen / assertion failures on aarch64 — NOT asm-related. Files: TLS init, abort, malloc, locale, errno-via-TLS. Pre-existing tcc 0.9.26 aarch64 codegen bugs. |
-| **5** | "known instruction expected" — scattered: `crt_arch.h`, `clone.s`, `__set_thread_area.s`, `memset.S`, `fenv.s` | mnemonics outside phase 1+2. `crt_arch.h:15` is the load-bearing one (uses `adrp` + `:lo12:` reloc). |
-| **3** | `CONDBR19`/`TSTBR14` reloc unsupported (`b.cond`/`cbz`/`cbnz`/`tbz` to extern targets) | per `docs/TCC-ARM64-ASM.md` phase 2: in-section only; extern targets need entries in `arm64-link.c`. |
-| **2** | `ldp/stp: expected register` (setjmp/longjmp) | parsing of pre/post-indexed forms not yet covered. |
-| **1** | `pow_data.c` "initializer element is not constant" | same long-double constant-folding bug as on amd64. |
-| **1** | crt step: `arch/aarch64/crt_arch.h:15: known instruction expected` | (separate from the 153 — kills the `crt1.o` build.) |
-### riscv64 status after patch round 1
+The build runs in `boot2-scratch:$ARCH` (scratch + busybox, no libc, no
+`/etc`) and produces only static artifacts. Dynamic linking and `ldso/`
+are intentionally out of scope.
 
-riscv64 mirrors the aarch64 strategy: tcc 0.9.26's `riscv64-asm.c`
-has a real upstream assembler for the base ISA, but
-`subst_asm_operand` is a stub (`tcc_error("RISCV64 asm not
-implemented.")`), so every musl inline-asm site with output operands
-fails. The lr/sc atomics, the named `fence rw,rw` form, and a handful
-of pseudo-instructions (`tail`, `j`, `ret`) are also absent.
+## Usage
 
-Patches added (mirrors the aarch64 set):
-
-- `arch/riscv64/syscall_arch.h` — static `__inline` wrappers calling
-  one variadic `__syscall` trampoline
-- `src/internal/riscv64/syscall.s` — C-ABI → kernel-ABI shuffle, `ecall`
-- `arch/riscv64/pthread_arch.h` — `__get_tp` extern
-- `src/internal/riscv64/get_tp.s` — `mv a0, tp; jalr x0, x1, 0`
-- `arch/riscv64/atomic_arch.h` — `a_barrier` / `a_cas` / `a_cas_p` as externs
-- `src/internal/riscv64/atomic.s` — `lr.w/d.aqrl`, `sc.w/d.aqrl`,
-  `bne ±12`, `fence rw, rw` as raw `.word` encodings; control flow as
-  plain mnemonics
-- `arch/riscv64/crt_arch.h` — drop `tail` / `.option push/norelax/pop`
-  / `lla gp, __global_pointer$` (static-only, GP relaxation
-  unnecessary); pass `_DYNAMIC = NULL`; tail-call via `jal x0, _start_c`
-- `src/thread/riscv64/__set_thread_area.s` — replace the `ret`
-  pseudo with `jalr x0, x1, 0` (other two instructions are stock).
-  This file is on the `__init_tp` startup path; without it the
-  generic C fallback returns -ENOSYS (no `SYS_set_thread_area` on
-  riscv64) and `a_crash()` fires before main runs.
-
-Plus the per-arch va_list shim. tcc 0.9.26's stock riscv64 `stdarg.h`
-spells `__builtin_va_list` as `char *` and implements `va_arg` as the
-lp64 pointer-arithmetic macro (no helper-call required, unlike the
-amd64 path). The shim mirrors that exactly so musl's `<stdarg.h>` /
-`bits/alltypes.h` typedefs and macros resolve under `-nostdinc`.
-
-Plus `boot4.sh` deletion sweep: `src/math/riscv64/*.c` (FPU inline asm
-with `"=f"` constraints — portable C in `src/math/` takes over),
-`src/fenv/riscv64/*`, `src/setjmp/riscv64/*.S`, `src/signal/riscv64/*.s`,
-the remaining `src/thread/riscv64/*.s` (clone, syscall_cp,
-__unmapself), `src/process/riscv64/vfork.s` — all use displacement
-load/store syntax (`sd rs, off(rd)`), `csr*` mnemonics, or the
-missing pseudos (`j`, `ret`). libc.a will lack clone, syscall_cp,
-setjmp/longjmp/sigsetjmp, vfork, fenv, restore — fine for hello.
-
-Result: **1268/1271 sources compile**, libc.a 2.77 MB, hello 69 KB,
-runs in scratch+busybox container. The 3 remaining skips are:
-
-- `src/math/log.c`, `src/math/pow_data.c` — long-double constant-
-  initializer folding (same tcc 0.9.26 bug that skips 11 files on
-  amd64).
-- `src/thread/__unmapself.c` — `arch/riscv64/reloc.h`'s `CRTJMP`
-  macro is `__asm__("mv sp, %1 ; jr %0" : : "r"(pc), "r"(sp) :
-  "memory")`, which needs `subst_asm_operand`. Not on the hello
-  path; called only when threads exit.
-
-Cleaner residual than amd64 (3 vs 11) and dramatically cleaner than
-aarch64 (3 vs 20) — riscv64 has no equivalent of aarch64's tcc
-codegen-bug bucket, so all asm-shaped failures cleared with the
-patch round.
+```sh
+scripts/boot3.sh <amd64|aarch64|riscv64>
+scripts/boot4.sh <amd64|aarch64|riscv64>
+```
 
 ## Inputs
 
-| Path | Contents |
-|------|----------|
-| `build/amd64/boot3/tcc3` | boot3's verified self-host tcc |
-| `build/tcc/X86_64/tcc-0.9.26-1147-gee75a10c/{lib,include}` | tcc lib + headers, staged by `stage1-flatten.sh` |
-| `vendor/upstream/musl-1.2.5.tar.gz` | pristine upstream musl tarball |
-| `vendor/upstream/musl-1.2.5-tcc.patch` | tcc-compat patch (3145 lines, 93 files) |
-| `scripts/boot4-musl-shim.h` | `__builtin_va_list` shim (see below) |
-
-## The musl patch
-
-The patch is the unified diff between upstream musl-1.2.5 and the
-pre-modified tree under `/Users/ryan/tmp/musltcc/musl-1.2.5/`. Most of
-it is deletions; the meaningful additions/modifications are:
+| Path | Purpose |
+|------|---------|
+| `build/$ARCH/boot3/tcc3` | fixed-point self-host tcc from boot3 |
+| `build/tcc/$TCC_TARGET/tcc-0.9.26-1147-gee75a10c/{include,lib}` | staged tcc headers and libtcc1 sources |
+| `vendor/upstream/musl-1.2.5.tar.gz` | pristine upstream musl source |
+| `vendor/upstream/musl-1.2.5-tcc.patch` | tcc-compat musl patch |
+| `scripts/boot4-musl-shim-$ARCH.h` | per-arch `__builtin_va_list` bridge |
 
-| File | Change |
-|------|--------|
-| `arch/x86_64/syscall_arch.h` | replace inline-asm syscalls with calls to a pure-asm trampoline (tcc lacks GCC's register-asm-variable extension) |
-| `src/internal/x86_64/syscall.s` | new SysV-ABI → kernel-ABI shim called by the new syscall_arch.h |
-| `src/include/features.h` | redefine `weak_alias()` as `.weak`/`.set` directives + an extern decl, since tcc ignores `__attribute__((alias(...)))` |
-| `src/internal/syscall.h`, `src/network/lookup*.{h,c}` | drop C99 `[static N]` array-parameter qualifiers (tcc 0.9.26 doesn't parse them) |
-| `include/complex.h` | stub out — tcc has no `_Complex` |
-| `src/complex/*` (deleted) | empty header makes them irrelevant |
-| `src/{fenv,signal}/x86_64/*.s`, `src/math/x86_64/*.c` (deleted) | drop x86_64 inline-asm overrides — tcc rejects SSE/x87 constraints, `stmxcsr`, x87 tbyte ops; the portable C fallbacks take over |
+Architecture mapping:
 
-## boot4-musl-shim.h
-
-Pre-included on every musl `.c` translation unit. musl's `stdarg.h` and
-generated `bits/alltypes.h` spell varargs the GCC way:
-
-```c
-typedef __builtin_va_list va_list;
-#define va_start(v,l) __builtin_va_start(v,l)
-```
+| `ARCH` | container platform | `TCC_TARGET` | musl target |
+|--------|--------------------|--------------|-------------|
+| `amd64` | `linux/amd64` | `X86_64` | `x86_64-linux-musl` |
+| `aarch64` | `linux/arm64` | `ARM64` | `aarch64-linux-musl` |
+| `riscv64` | `linux/riscv64` | `RISCV64` | `riscv64-linux-musl` |
 
-tcc 0.9.26 has no `__builtin_va_list` typename; its own `<stdarg.h>`
-spells the same shape `__va_list_struct[1]`. The shim aliases
-`__builtin_va_list` to that array type and routes the four
-`__builtin_va_*` macros to tcc's intrinsics (`__va_start`, `__va_arg`,
-`__builtin_frame_address`, `__builtin_va_arg_types`). libtcc1's
-`va_list.c` provides `__va_start` and `__va_arg` at link time.
-
-## Two tcc 0.9.26 traps
-
-1. **`-include` corrupts assembler input.** tcc 0.9.26 prepends the
-   contents of `-include` files to `.s`/`.S` inputs as well as `.c`
-   inputs, choking the assembler. It still emits a 620-byte ELF with
-   no defined symbols and no error — the build looks green and the link
-   fails with "undefined symbol memset". boot4 splits CFLAGS into
-   `CFLAGS_C` (with `-include`) and `CFLAGS_ASM` (without).
-2. **No `__builtin_va_list`.** Solved by the shim above. Without it,
-   the first musl source to pull in `<stdio.h>` errors with `';'
-   expected (got "va_list")`.
-
-## Skipped sources
+## Outputs
 
-11 files are skipped (compile-on-fail in the loop, never reached at
-link time by hello). All of them lean on long-double constant folding
-that tcc 0.9.26 can't do — e.g. `static const long double toint =
-1.5/LDBL_EPSILON;` in `__rem_pio2l.c`:
+`scripts/boot4.sh` writes final artifacts to `build/$ARCH/boot4/`:
 
-```
-src/math/__rem_pio2l.c src/math/__sinl.c src/math/__tanl.c
-src/math/erfl.c src/math/lgammal.c src/math/modfl.c
-src/math/pow_data.c src/math/powl.c src/math/rintl.c
-src/math/roundl.c src/math/tgammal.c
-```
+| File | Purpose |
+|------|---------|
+| `libtcc1.a` | tcc runtime archive used when linking musl-built programs |
+| `libc.a` | static musl libc archive |
+| `crt1.o`, `crti.o`, `crtn.o` | static startup and init/fini CRT objects |
+| `hello` | static smoke-test ELF linked by boot4 |
 
-Anything that calls `sinl`, `cosl`, `tanl`, `erfl`, `lgammal`, `powl`,
-or other long-double trig/special functions will fail to link. hello.c
-doesn't use any of them. The musltcc demo (which uses tcc-mob 0.9.28rc)
-does not skip these.
+The staging copy under `build/$ARCH/.boot4-stage/` is disposable.
 
-## Outputs
+## Pipeline
 
+1. Copy boot3 `tcc3`, tcc headers, tcc runtime sources, musl tarball,
+   musl patch, and the per-arch shim into `build/$ARCH/.boot4-stage/in`.
+2. Build `libtcc1.a` with the boot3 tcc.
+3. Extract musl, apply `musl-1.2.5-tcc.patch`, and remove unsupported
+   arch-specific override files so portable C fallbacks are selected
+   where possible.
+4. Configure musl with `CC=$TCC AR=true RANLIB=true`, `--disable-shared`,
+   and `--disable-wrapper`.
+5. Generate `bits/alltypes.h`, `bits/syscall.h`, and `version.h`.
+6. Compile all selected musl sources. Sources that fail to compile are
+   skipped and reported; boot4 requires the remaining closure to archive,
+   link, and run hello.
+7. Build CRT objects, archive `libc.a`, link static `hello`, and execute
+   it inside the target container.
+
+Assembler inputs must not receive the va-list shim. tcc 0.9.26 applies
+`-include` to `.s`/`.S` as well as `.c`, so boot4 keeps separate
+`CFLAGS_C` and `CFLAGS_ASM`.
+
+## Compatibility Surface
+
+The musl patch keeps upstream musl mostly intact and replaces only the
+surfaces tcc 0.9.26 cannot compile:
+
+| Area | Rule |
+|------|------|
+| syscalls | replace GCC register-asm-variable wrappers with per-arch asm trampolines |
+| atomics / thread pointer | replace inline asm operands with extern asm helpers on aarch64/riscv64 |
+| weak aliases | implement `weak_alias` via assembler `.weak`/`.set` directives |
+| C99 array parameters | remove `[static N]` qualifiers tcc does not parse |
+| `_Complex` | stub `complex.h` and remove complex sources |
+| arch asm overrides | delete unsupported fenv, signal, setjmp, thread, string, math overrides as needed |
+| varargs | pre-include `scripts/boot4-musl-shim-$ARCH.h` for C translation units |
+
+Required tcc fixes live under `scripts/simple-patches/tcc-0.9.26/`.
+The musl build depends on the aarch64 literal-address load/store fixes
+and the LP64 `L`-suffix constant fix.
+
+## Status
+
+| arch | result | skipped sources |
+|------|--------|-----------------|
+| `amd64` | verified | 11 |
+| `aarch64` | verified | 8 |
+| `riscv64` | verified | 3 |
+
+Skipped sources are outside the boot4 hello closure. They fall into two
+categories:
+
+- long-double constant-folding files that tcc 0.9.26 cannot compile;
+- thread exit / low-level asm files needing inline-asm operand support.
+
+Anything that references a skipped function may fail to link. The boot4
+contract is a static libc sufficient to link and run the included hello
+smoke program, not full musl conformance.
+
+## Smoke Output
+
+Successful boot4 ends by running:
+
+```text
+hello from boot4 (tcc-built musl); argc=4
+strdup: works, strlen: 5
 ```
-libtcc1.a   ~7 KB     tcc runtime: libtcc1.o + alloca86_64.{o,bt.o} + va_list.o
-libc.a      ~2.4 MB   static musl libc, 1258 .o members
-crt1.o      ~1.2 KB   static-link entry stub
-crti.o      ~830 B    _init/_fini head
-crtn.o      ~770 B    _init/_fini tail
-hello       ~55 KB    static ELF, runs in container, prints argc + strdup demo
-```
-
-## Caveats / not done
-
-- musl is built **static-only**: `ldso/` is excluded from libc.a — it's
-  for the dynamic linker and defines `__init_array_start` which collides
-  with what tcc's internal linker synthesizes for `-static` binaries.
-- `compat/time32` skipped — 32-bit time_t aliases, irrelevant on
-  x86_64 and produces duplicate-symbol errors.
-- aarch64: blocks at link — see "aarch64 status after patch round 1"
-  for the remaining tcc 0.9.26 codegen bugs.
-- riscv64: end-to-end works after patch round 1 (see "riscv64 status
-  after patch round 1"). 3 residual skips, none on the hello path.
-- No `make`, `busybox`, or further userland — boot4 stops at hello.
-  The musltcc demo continues to GNU make 4.4.1 and busybox 1.36.1; that
-  pipeline could plug in here once the libc is solid.
diff --git a/scripts/simple-patches/tcc-0.9.26/lp64-long-constant.after b/scripts/simple-patches/tcc-0.9.26/lp64-long-constant.after
@@ -0,0 +1,6 @@
+    lcount++;
+#if (!defined TCC_TARGET_X86_64 && !defined TCC_TARGET_ARM64 && !defined TCC_TARGET_RISCV64) || defined TCC_TARGET_PE
+    if (lcount == 2)
+#endif
+        must_64bit = 1;
+    ch = *p++;
diff --git a/scripts/simple-patches/tcc-0.9.26/lp64-long-constant.before b/scripts/simple-patches/tcc-0.9.26/lp64-long-constant.before
@@ -0,0 +1,6 @@
+    lcount++;
+#if !defined TCC_TARGET_X86_64 || defined TCC_TARGET_PE
+    if (lcount == 2)
+#endif
+        must_64bit = 1;
+    ch = *p++;
diff --git a/scripts/stage1-flatten.sh b/scripts/stage1-flatten.sh
@@ -140,6 +140,12 @@ apply_our_patch getcwd-stub "$SRC/tccgen.c"
 apply_our_patch ldexp-stub "$SRC/tccpp.c"
 apply_our_patch date-time-stub "$SRC/tccpp.c"
 apply_our_patch lex-char-unsigned "$SRC/tccpp.c"
+
+# LP64 constants: upstream's parser treats one `L` suffix as 64-bit
+# only on x86_64. ARM64/RISCV64 are LP64 too; without this, `-4096UL`
+# is zero-extended from 32 bits and musl's __syscall_ret rejects valid
+# high mmap addresses as errors.
+apply_our_patch lp64-long-constant "$SRC/tccpp.c"
 apply_our_patch elfinterp-stub "$SRC/tccelf.c"
 
 # x86_64 static-link PLT32 collapse: under BOOTSTRAP we force
diff --git a/tests/cc/339-lp64-unsigned-long-constant.c b/tests/cc/339-lp64-unsigned-long-constant.c
@@ -0,0 +1,19 @@
+/* LP64 integer suffix regression.
+ *
+ * tcc 0.9.26 treated one `L` suffix as 64-bit only for x86_64. On
+ * aarch64/riscv64 that made `-4096UL` become 0x00000000fffff000
+ * instead of 0xfffffffffffff000, which broke musl's __syscall_ret:
+ * valid high mmap addresses were classified as syscall errors.
+ */
+
+int main(void)
+{
+    unsigned long u = -12UL;
+    unsigned long threshold = -4096UL;
+    unsigned long high_user_addr = 0x0000ffff00000000UL;
+
+    if ((long)u != -12L) return 1;
+    if ((long)threshold != -4096L) return 2;
+    if (high_user_addr > threshold) return 3;
+    return 0;
+}