boot2

Playing with the bootstrap
git clone https://git.ryansepassi.com/git/boot2.git

commit 9dfc688cb237a9e09da34062faf775820a01c67c
parent 73824f6a72c7d9f372b48aa9b0058727c8e992a1
Author: Ryan Sepassi <rsepassi@gmail.com>
Date:   Sat,  2 May 2026 03:51:36 -0700

tcc-patch: const-expr &&/||/?: must short-circuit (C11 §6.6¶3)

gen_opic/gen_opif raised "division by zero in constant" from
1 || (1/0), 0 && (1/0), 1 ? 7 : 1/0 because expr_land/expr_lor/
expr_cond bumped nocode_wanted around the unevaluated arm but left
const_wanted set, so const-folding still aborted. Gate the error
on !nocode_wanted.
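
For reference, the three expressions named above can be written as enumerator initializers, which must be integer constant expressions. This is a minimal sketch and not one of the repository's fixtures; the enumerator names and the helper `check_short_circuit_constants` are hypothetical. The C standard's own footnote example (`static int i = 2 || 1 / 0;`) has the same shape.

```c
#include <assert.h>

/* Each enumerator requires an integer constant expression. The 1/0
 * operand sits in an arm that is never evaluated, so a conforming
 * compiler should fold these without a "division by zero" error. */
enum {
    LOR_SHORT  = 1 || (1 / 0),   /* right operand skipped: value 1 */
    LAND_SHORT = 0 && (1 / 0),   /* right operand skipped: value 0 */
    COND_SHORT = 1 ? 7 : 1 / 0   /* false arm skipped: value 7 */
};

/* Hypothetical helper: asserts each folded value, returns 0 on success. */
int check_short_circuit_constants(void)
{
    assert(LOR_SHORT == 1);
    assert(LAND_SHORT == 0);
    assert(COND_SHORT == 7);
    return 0;
}
```

Before the patch, tcc would be expected to reject all three initializers with "division by zero in constant"; with the gate in place they should fold to 1, 0 and 7.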

gcc-built ARM64 control: 173→175 passed (only 200-lex-char-type
remains as upstream tcc bug). On cc.scm-built tcc-boot2, 240/290
shifted from div-zero error into the dominant vtop-assert cluster
(cc.scm-side miscompile), so total stays 14/162; cap rises to
175/176.

Refresh docs/TCC-TODO.md with current 176-fixture counts and
re-grouped failure modes.

Diffstat:
M docs/TCC-TODO.md | 138 ++++++++++++++++++++++++++++++++++++++++++++++++++-----------------------------
A scripts/simple-patches/tcc-0.9.26/const-divzero-shortcircuit-float.after | 7 +++++++
A scripts/simple-patches/tcc-0.9.26/const-divzero-shortcircuit-float.before | 4 ++++
A scripts/simple-patches/tcc-0.9.26/const-divzero-shortcircuit-int.after | 9 +++++++++
A scripts/simple-patches/tcc-0.9.26/const-divzero-shortcircuit-int.before | 4 ++++
M scripts/stage1-flatten.sh | 6 ++++++
6 files changed, 118 insertions(+), 50 deletions(-)

diff --git a/docs/TCC-TODO.md b/docs/TCC-TODO.md
@@ -74,14 +74,17 @@ make test SUITE=tcc-cc
 Result:
 
 ```text
-14 passed, 139 failed
+14 passed, 162 failed
 ```
 
-(153 fixtures total. `068-main-noret` was retired earlier as part of
-the host-baseline cleanup; `134-decl-define-in-ifdef` was added as a
-regression for the `pps-cond-stack` promotion fix in cc.scm. The new
-fixture passes the cc-built tcc-boot2 path because tcc itself has no
-trouble with the shape; cc.scm couldn't compile it before the fix.)
+(176 fixtures total. The fixture set has grown since the previous
+snapshot; the same 14 fixtures still pass. `068-main-noret` was
+retired earlier as part of the host-baseline cleanup;
+`134-decl-define-in-ifdef` was added as a regression for the
+`pps-cond-stack` promotion fix in cc.scm. Newer additions in the
+2xx/3xx ranges exercise lex/preproc/codegen edges; only
+`220-const-promote` from that range currently makes it past compile
+on the cc.scm-built path, and it then exits wrong.)
 
 Raw run log:
 
@@ -120,29 +123,49 @@ Failure groups from per-fixture `tcc.log` files:
 
 | group | count | examples |
 |------:|------:|----------|
-| plain segfault during compile/link | 59 | `006-call-no-args`, `008-pointer-deref`, `011-struct`, `125-anon-union` |
-| `store(...); assert fail: 0`, then segfault | 43 | `002-arith`, `004-inc-dec`, `020-switch`, `133-for-continue` |
-| `assert fail: vtop[-1].r < VT_CONST && vtop[0].r < VT_CONST`, then segfault | 31 | `007-call-with-args`, `013-call`, `015-variadic`, `024-globals`, `076-vararg-recv` |
-| `too many field init` diagnostic | 3 | `001-kitchen-sink`, `012-struct-ptr`, `053-init-struct-pos` |
-| compile succeeds, generated program exits wrong | 2 | `019-zext-narrow`, `101-char-escapes` |
-| `field expected` diagnostic | 1 | `054-init-struct-desig` |
-
-The shape is unchanged: 137 of 139 failures happen before the
+| plain segfault during compile/link | 73 | `006-call-no-args`, `008-pointer-deref`, `011-struct`, `054-init-struct-desig`, `310-tag-shadow-inner-scope` |
+| `store(...); assert fail: 0`, then segfault | 54 | `002-arith`, `004-inc-dec`, `020-switch`, `133-for-continue` |
+| `assert fail: vtop[-1].r < VT_CONST && vtop[0].r < VT_CONST`, then segfault | 30 | `007-call-with-args`, `013-call`, `015-variadic`, `024-globals`, `076-vararg-recv`, `240-parse-const-shortcircuit`, `290-parse-const-ternary-shortcircuit` |
+| compile succeeds, generated program exits wrong | 3 | `019-zext-narrow`, `101-char-escapes`, `220-const-promote` |
+| `too many field init` diagnostic | 1 | `001-kitchen-sink` |
+| `field not found` diagnostic | 1 | `331-fs-compound-struct-addr` |
+
+The shape is unchanged: 159 of 162 failures happen before the
 generated fixture binary runs. The dominant problem is still the
 compiled `tcc-boot2` while it is compiling/linking C input, not the
 runtime behavior of most generated test binaries.
 
-The previous run carved out a separate "`__builtin_va_start` warning,
-then segfault" group (3 fixtures: 015-variadic, 076-vararg-recv,
-079-vararg-deep). Those have moved into the `vtop[-1].r < VT_CONST`
-cluster after the libp1pp consolidation in `cc.scm` (sub-word
-ld/st, sext/zext, lea_slot, ptr arith, struct-copy via memcpy,
-cmpset, neg/bnot/bool, switch_case): the cc.scm-built tcc-boot2 no
-longer takes the upstream warning path, so the same underlying vstack
-miscompile asserts directly. Failure counts shift slightly (58→59
-plain segfaults, 44→43 store-asserts, 28→31 vtop-asserts) for the
-same reason — it's the same compiler bug surface, regrouped, not a
-behavior regression.
+One failure is not cc.scm miscompilation — it reproduces on the
+gcc-built control (see Host Baseline below):
+
+- `200-lex-char-type` (exits 21 instead of 0 on both paths)
+
+This is an upstream tcc bug and would need a `simple-patches/`
+patch to fix. It caps the achievable cc.scm-built result at
+`175 passed, 1 failed` until tcc itself is patched.
+
+The const-expression short-circuit pair (`240-parse-const-shortcircuit`
+and `290-parse-const-ternary-shortcircuit`) used to also be capped by
+upstream tcc — `gen_opic` raised `division by zero in constant` from
+`1 || (1/0)` and `1 ? 7 : 1/0` because `expr_land`/`expr_lor`/
+`expr_cond` bumped `nocode_wanted` around the unevaluated arm but
+left `const_wanted` set, so const-folding still aborted. The
+`const-divzero-shortcircuit-int` simple-patch in
+`scripts/simple-patches/tcc-0.9.26/` gates that error on
+`!nocode_wanted`. Both fixtures now pass on the gcc-built control;
+on the cc.scm-built path they shifted into the dominant
+`vtop[-1].r < VT_CONST` cluster (cc.scm-side miscompile), which is
+why this fix didn't move the cc.scm-built pass count.
+
+Two diagnostics are new on this snapshot and look like targeted
+miscompiles rather than missing categories:
+
+- `001-kitchen-sink:29: too many field init` — this fixture parses
+  on the gcc-built tcc, so cc.scm-built tcc has a corrupted
+  field-init counter on this aggregate shape.
+- `331-fs-compound-struct-addr:3: field not found: x` — file-scope
+  compound literal (`&(struct point){3,4}`) loses its struct type
+  on the cc.scm-built path; the gcc-built control accepts it.
 
 Working hypothesis: our compiler is miscompiling tcc. In this suite,
 `tcc-boot2` is a tcc binary produced by `cc.scm`; the failures
@@ -156,17 +179,21 @@ Alpine gcc and use that gcc-built tcc to run the same `tests/cc`
 fixtures:
 
 ```text
-gcc-built ARM64 tcc.flat.c (libc + tcc hdrs): 152 passed, 0 failed
-cc.scm-built ARM64 tcc-boot2: 13 passed, 139 failed
+gcc-built ARM64 tcc.flat.c (libc + tcc hdrs): 175 passed, 1 failed
+cc.scm-built ARM64 tcc-boot2: 14 passed, 162 failed
 ```
 
-The gcc-built control is fully green: it links the flattened tcc
-against `libc.flat.c` + libtcc1 + a tiny mes-libc string runtime, and
-passes `-I tcc/include` so the bundled `<stdarg.h>` resolves under
-`-nostdlib`. Run with `scripts/run-gcc-libc-flat-tcc.sh`. This proves
-the fixtures and the flattened tcc source are coherent end-to-end, so
-the remaining 139 failures on the `cc.scm`-built path are evidence
-that our compiler is miscompiling tcc.
+The gcc-built control's only remaining failure (`200-lex-char-type`)
+is an upstream tcc bug, not cc.scm miscompilation, and it caps the
+achievable cc.scm result at 175/176 until tcc itself is patched.
+Aside from those, the gcc-built control is green: it links the
+flattened tcc against `libc.flat.c` + libtcc1 + a tiny mes-libc
+string runtime, and passes `-I tcc/include` so the bundled
+`<stdarg.h>` resolves under `-nostdlib`. Run with
+`scripts/run-gcc-libc-flat-tcc.sh`. This proves the fixtures and the
+flattened tcc source are coherent end-to-end, so the remaining 159
+cc.scm-only failures are evidence that our compiler is miscompiling
+tcc.
 
 ## Host Baseline
 
@@ -199,24 +226,32 @@ Current result:
 
 ```text
 tcc version 0.9.26 (AArch64 Linux)
-152 passed, 0 failed
+175 passed, 1 failed
 ```
 
+The only remaining failure (`200-lex-char-type`) is an upstream tcc
+bug, not a fixture or cc.scm issue — see the failure-modes notes
+above.
+
 This is the canonical sanity reference for tcc-built-from-our-source.
 The fixtures were cleaned up to drop assumptions about implicit `char`
 signedness and `long double` size; `068-main-noret` was retired because
 relying on the implicit return value of `main` is undefined.
 
-`scripts/simple-patches/tcc-0.9.26/` carries two AArch64 vararg fixes
-applied during stage1-flatten so any tcc rebuilt from this tree picks
-them up: `aarch64-stdarg-array.{before,after}` swaps the bundled
-`va_list` for `__va_list_struct[1]` (matches glibc/musl/x86_64 ABI),
-and the `arm64-va-{pointer-operand,arg-pointer}.{before,after}` pair
-teach `gen_va_start`/`gen_va_arg` to skip `gaddrof()` when the
-operand is already a pointer (the array-decayed/pointer-parameter
-case). Without this, `va_list` forwarding into a non-variadic helper
-(the `vfprintf` shape, e.g. `131-vararg-mixed`) hit `assert fail: 0`
-in `arm64-gen.c`.
+`scripts/simple-patches/tcc-0.9.26/` carries AArch64 vararg fixes and
+a const-expr fix applied during stage1-flatten so any tcc rebuilt
+from this tree picks them up. `aarch64-stdarg-array.{before,after}`
+swaps the bundled `va_list` for `__va_list_struct[1]` (matches
+glibc/musl/x86_64 ABI), and the
+`arm64-va-{pointer-operand,arg-pointer}.{before,after}` pair teach
+`gen_va_start`/`gen_va_arg` to skip `gaddrof()` when the operand is
+already a pointer (the array-decayed/pointer-parameter case).
+Without this, `va_list` forwarding into a non-variadic helper (the
+`vfprintf` shape, e.g. `131-vararg-mixed`) hit `assert fail: 0` in
+`arm64-gen.c`. `const-divzero-shortcircuit-int.{before,after}` gates
+`gen_opic`'s "division by zero in constant" error on
+`!nocode_wanted` so that the unevaluated arm of `&&`/`||`/`?:` in
+constant expressions (C11 §6.6¶3) does not abort.
 
 Two fixture cleanups are part of that baseline:
 
@@ -245,11 +280,14 @@ Start with the earliest minimal failures in each dominant group:
 - `007-call-with-args`: first clear `vtop[-1].r < VT_CONST` assertion
   on an ordinary call with arguments.
 - `006-call-no-args`: first plain segfault with a very small source.
-- `001-kitchen-sink` or `012-struct-ptr`: first incorrect tcc parser
-  diagnostics around aggregate initialization.
-- `019-zext-narrow`, `101-char-escapes`: the only current failures
-  where `tcc-boot2` successfully emits and links a binary but the
-  binary returns the wrong status.
+- `001-kitchen-sink`: only remaining `too many field init` case;
+  good lens on aggregate-init counter miscompile.
+- `331-fs-compound-struct-addr`: only remaining `field not found`
+  case; lens on file-scope compound-literal type tracking.
+- `019-zext-narrow`, `101-char-escapes`, `220-const-promote`: the
+  three current failures where `tcc-boot2` successfully emits and
+  links a binary but the binary returns the wrong status — closest
+  to "isolated codegen miscompile."
 
 Keep using `make test SUITE=cc ARCH=aarch64 NAMES=...` as the control
 path for fixture semantics, and `make test SUITE=tcc-cc NAMES=...` as
diff --git a/scripts/simple-patches/tcc-0.9.26/const-divzero-shortcircuit-float.after b/scripts/simple-patches/tcc-0.9.26/const-divzero-shortcircuit-float.after
@@ -0,0 +1,7 @@
+if (f2 == 0.0) {
+    /* See const-divzero-shortcircuit-int patch: respect
+     * nocode_wanted so unevaluated short-circuited arms
+     * don't trigger constant divide-by-zero errors. */
+    if (const_wanted && !nocode_wanted)
+        tcc_error("division by zero in constant");
+    goto general_case;
diff --git a/scripts/simple-patches/tcc-0.9.26/const-divzero-shortcircuit-float.before b/scripts/simple-patches/tcc-0.9.26/const-divzero-shortcircuit-float.before
@@ -0,0 +1,4 @@
+if (f2 == 0.0) {
+    if (const_wanted)
+        tcc_error("division by zero in constant");
+    goto general_case;
diff --git a/scripts/simple-patches/tcc-0.9.26/const-divzero-shortcircuit-int.after b/scripts/simple-patches/tcc-0.9.26/const-divzero-shortcircuit-int.after
@@ -0,0 +1,9 @@
+if (l2 == 0) {
+    /* C11 §6.6¶3: an unevaluated short-circuited arm of
+     * &&/||/?: need not be a valid constant expression.
+     * tcc's expr_land/expr_lor/expr_cond bump nocode_wanted
+     * around the unevaluated arm but leave const_wanted set,
+     * so 1 || (1/0) used to abort here. Honor nocode_wanted. */
+    if (const_wanted && !nocode_wanted)
+        tcc_error("division by zero in constant");
+    goto general_case;
diff --git a/scripts/simple-patches/tcc-0.9.26/const-divzero-shortcircuit-int.before b/scripts/simple-patches/tcc-0.9.26/const-divzero-shortcircuit-int.before
@@ -0,0 +1,4 @@
+if (l2 == 0) {
+    if (const_wanted)
+        tcc_error("division by zero in constant");
+    goto general_case;
diff --git a/scripts/stage1-flatten.sh b/scripts/stage1-flatten.sh
@@ -141,6 +141,12 @@ apply_our_patch ldexp-stub "$SRC/tccpp.c"
 apply_our_patch date-time-stub "$SRC/tccpp.c"
 apply_our_patch elfinterp-stub "$SRC/tccelf.c"
 
+# Const-expr short-circuit: gen_opic/gen_opif must respect nocode_wanted
+# so 1 || (1/0), 0 && (1/0), 1 ? 2 : 1/0 etc. don't abort with "division
+# by zero in constant" in their unevaluated arms (C11 §6.6¶3).
+apply_our_patch const-divzero-shortcircuit-int "$SRC/tccgen.c"
+apply_our_patch const-divzero-shortcircuit-float "$SRC/tccgen.c"
+
 # AArch64 vararg fixes — only relevant when targeting ARM64; harmless
 # to apply unconditionally since neither file is read on other arches.
 apply_our_patch aarch64-stdarg-array "$SRC/include/stdarg.h"
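
The gate that the two `const-divzero-shortcircuit-*` patches add can be modeled in isolation. The sketch below is a toy, not tcc code: `fold_div`, `enum fold_result` and `demo_gate` are hypothetical names, and the two flags are passed as plain parameters rather than read from tcc's parser state. It only illustrates the condition `const_wanted && !nocode_wanted`.

```c
#include <assert.h>
#include <stddef.h>

/* Toy outcome of a constant-folding attempt (hypothetical, not tcc's). */
enum fold_result { FOLD_OK, FOLD_ERROR, FOLD_GENERAL_CASE };

/* Models only the divide-by-zero gate: report an error when a constant
 * is required AND the operand is really going to be evaluated.
 * Otherwise fall through to the "general case" and leave it unfolded. */
enum fold_result fold_div(long l1, long l2, int const_wanted,
                          int nocode_wanted, long *out)
{
    if (l2 == 0) {
        if (const_wanted && !nocode_wanted)
            return FOLD_ERROR;       /* "division by zero in constant" */
        return FOLD_GENERAL_CASE;    /* unevaluated arm: no error */
    }
    *out = l1 / l2;
    return FOLD_OK;
}

/* Walks the three interesting flag combinations. */
void demo_gate(void)
{
    long v = 0;
    /* Evaluated constant context, e.g. a bare 1/0: still an error. */
    assert(fold_div(1, 0, 1, 0, NULL) == FOLD_ERROR);
    /* Unevaluated arm, e.g. the right side of 1 || (1/0): no error. */
    assert(fold_div(1, 0, 1, 1, NULL) == FOLD_GENERAL_CASE);
    /* Ordinary division folds as before. */
    assert(fold_div(6, 3, 1, 0, &v) == FOLD_OK && v == 2);
}
```

Under the pre-patch condition (`const_wanted` alone), the second case would also have returned the error, which corresponds to the behavior the commit message describes.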