boot2

Playing with the bootstrap
git clone https://git.ryansepassi.com/git/boot2.git

commit 877607adbf647dd1bf3942699d4efcef757571e4
parent 91bee1d76ff5a3d817e7322ab68c3f81945c879a
Author: Ryan Sepassi <rsepassi@gmail.com>
Date:   Sun,  3 May 2026 20:51:08 -0700

docs: refresh cc.scm map + tcc tracker for current state

cc/cc.scm.md: bump line numbers to current cc.scm; record the M1pp +
hex2++ migration in design choices (dotted local labels under .scope,
.align directives, bare-hex string emission); add %ctype-arith? to the
predicates list and %cg-merge-arith-type to the line map.

docs/TCC-TODO.md: tcc-cc result is now 176/2, exact parity with the
gcc baseline. Remaining 2 fails are 200-lex-char-type (upstream tcc
bug) and 129-extern-libp1pp (linkage-only). Replace the result-history
prose with a delta table from 148/30 -> 176/2; remove the "Next Debug
Targets" section (all targets resolved).

docs/TCC.md: stage 2 status updated from "scheme1-cc will fill the
slot" to "now runs end-to-end via `make tcc-boot2`."

Drop docs/TCC-CC-INVESTIGATION.md and scripts/dbg-load-cbz.sh: both
tracked the now-resolved `assert fail: 0@12051` cluster (14 fixtures),
with hardcoded binary addresses that no longer match the current build.

Diffstat:
M cc/cc.scm.md | 126 +++++++++++++++++++++++++++++++++++++++++--------------------------------------
D docs/TCC-CC-INVESTIGATION.md | 303 -------------------------------------------------------------------------------
M docs/TCC-TODO.md | 231 ++++++++++++++++++++++++-------------------------------------------------------
M docs/TCC.md | 20 +++++++++++---------
D scripts/dbg-load-cbz.sh | 36 ------------------------------------
5 files changed, 146 insertions(+), 570 deletions(-)

diff --git a/cc/cc.scm.md b/cc/cc.scm.md
@@ -2,7 +2,7 @@
 
 ## Overview
 
-`cc.scm` is a complete C compiler (6611 lines) written in Scheme (scheme1 dialect) that compiles C source to P1pp assembly. It implements a streaming pipeline: **lexer → preprocessor → parser → codegen**. Designed for minimal memory use with fixed pre-allocated buffers and a scratch/main heap discipline that resets per declaration. Targets the P1 64-bit RISC ISA via libp1pp macros.
+`cc.scm` is a complete C compiler (7405 lines) written in Scheme (scheme1 dialect) that compiles C source to P1pp assembly. It implements a streaming pipeline: **lexer → preprocessor → parser → codegen**. Designed for minimal memory use with fixed pre-allocated buffers and a scratch/main heap discipline that resets per declaration. Targets the P1 64-bit RISC ISA via libp1pp macros; output is consumed directly by the M1pp expander and hex2++ assembler/linker (see [docs/M1PP.md](../docs/M1PP.md), [docs/HEX2pp.md](../docs/HEX2pp.md)).
 
 ---
 
@@ -12,14 +12,14 @@
 
 | Subsystem | Lines | Role |
 |-----------|-------|------|
-| Utilities | 1–282 | Bytevector helpers, list/alist ops, output buffers, diagnostics, debug logging, name generation |
-| Data Structures | 283–583 | Record type definitions, interned primitive ctypes, ctype predicates |
-| Symbol Alphabets | ~430–~560 | Keyword and punctuator alists |
-| Lexer | 584–1676 | Tokenizes C source; trigraph/splice, comments, escape sequences |
-| Preprocessor | 1677–2503 | `#define`, `#if`, macro expansion with hide-sets; `pp-eval-cexpr` delegates to `parse-const-int` via `%pp-make-const-ps` |
-| Code Generator | 2504–4034 | P1pp assembly emission, vstack, frame allocation, all operators and control flow |
-| Parser | 4035–6505 | Recursive-descent + Pratt; declarations, statements, expressions; shared constant-expression evaluator |
-| Main Driver | 6506–6611 | CLI parsing, file I/O, pipeline initialization |
+| Utilities | 1–286 | Bytevector helpers, list/alist ops, output buffers, diagnostics, debug logging, name generation |
+| Data Structures | 287–595 | Record type definitions, interned primitive ctypes, ctype predicates |
+| Symbol Alphabets | ~530–~595 | Keyword and punctuator alists |
+| Lexer | 596–1700 | Tokenizes C source; trigraph/splice, comments, escape sequences |
+| Preprocessor | 1701–2540 | `#define`, `#if`, macro expansion with hide-sets; `pp-eval-cexpr` delegates to `parse-const-int` via `%pp-make-const-ps` |
+| Code Generator | 2541–4332 | P1pp assembly emission, vstack, frame allocation, all operators and control flow |
+| Parser | 4334–6800 | Recursive-descent + Pratt; declarations, statements, expressions; shared constant-expression evaluator |
+| Main Driver | 7372–7405 | CLI parsing, file I/O, pipeline initialization |
 
 ---
 
@@ -88,7 +88,7 @@ Source file
 **Per-function code path:**
 1. `cg-fn-begin` — emit param spills, sret setup, allocate prologue-buf
 2. `parse-fn-body` — emits P1pp directly into fn-buf via cg ops
-3. `cg-fn-end` — drain prologue-buf + fn-buf into text, emit ret block
+3. `cg-fn-end` — drain prologue-buf + fn-buf into text, emit `:.ret` label and ret block, wrap in `%fn(<name>, <frame>, { ... })`
 
 **`#if` constant-expression path:**
 `pp-eval-cexpr` → resolve `defined`, macro-expand, idents→0 → `%pp-make-const-ps` (minimal pstate, empty scope, no cg) → `parse-const-int` (shared with parser)
@@ -99,62 +99,62 @@ Source file
 
 | Lines | Description |
 |-------|-------------|
-| **1–109** | Bytevector primitives: `bv=`, `bv-prefix?`, `bv-slice`, `bv-cat`, `bv->fixnum`; list/alist utilities: `alist-ref`, `alist-update`, `any`, `every`, `count`; integer helpers: `min3`, `align-up` |
-| **110–122** | `%BUF-CAP-*` — buffer pre-allocation constants (TEXT 8MiB, DATA 2MiB, BSS 2MiB, FN 256KiB, PROLOGUE 16KiB) |
-| **123–210** | Output buffer system: `buf` record, `buf-push!`, `buf-flush`, `buf-reset!`, `buf-drain!` — fixed-capacity, no growth |
-| **211–282** | Diagnostics: `die` with loc formatting, `slurp-fd`, `write-bv-fd`; debug logging: `debug-log-on!/off!`, `trace-emit` flags; fresh name generator: `make-namer` |
-| **283–496** | Record type definitions: `loc`, `tok`, `macro`, `ctype`, `sym`, `opnd`, `loop-ctx`, `fn-ctx`, `world`, `pstate`, `cg`; interned primitive ctypes (`%t-void`, `%t-i8`…`%t-u64`, `%t-bool`, `%t-flt`, `%t-dbl`, `%t-ldbl`); ctype predicates: `%ctype-ptr?`, `%ctype-pointee`, `%ctype-unsigned?`, `%ctype-fp?`; ctype accessors |
-| **~430–~560** | `%keyword-alist` — storage/qualifiers/type specifiers/statements/operators/reserved; `%punct-alist` — punctuators longest-first, digraphs |
-| **584–640** | Lexer byte-class predicates: `%digit?`, `%hex?`, `%alpha?`, `%ident-start?`, `%ident-cont?`, `%hspace?`, `%newline?`; `%lex-scratch` buffer |
-| **641–770** | Logical byte access: `%lex-peek` with trigraph translation + line splice |
-| **771–920** | Comment stripping: `%skip-ws-and-comments`, `%skip-line-comment`, `%skip-block-comment` |
-| **921–1070** | Byte-run scanners: `%scan-while`, `%fill-while-bv`, `%accum-int-while`, `%accum-octal-bounded` |
-| **1071–1270** | Token readers: `lex-read-ident`, `%lex-read-number` (hex/octal/decimal), `%lex-read-string` (with escapes), `lex-read-char` |
-| **1271–1351** | `%lex-read-punct` with longest-match bucketing; `%punct-buckets` |
-| **1352–1676** | `lex-iter` streaming token source: `make-lex-iter`, `%lex-iter-pull` with heap-rewind discipline; `list-iter` wrapper; `lex-tokenize` test driver |
-| **1677–1820** | Preprocessor state (`pp-state`), token classification helpers (`%pp-eof?`, `%pp-nl?`, `%pp-hash?`, etc.) |
+| **1–116** | Bytevector primitives: `bv=`, `bv-prefix?`, `bv-slice`, `bv-cat`, `bv->fixnum`; list/alist utilities: `alist-ref`, `alist-update`, `any`, `every`, `count`; integer helpers: `min3`, `align-up` |
+| **117–122** | `%BUF-CAP-*` — buffer pre-allocation constants (TEXT 8MiB, DATA 2MiB, BSS 2MiB, FN 256KiB, PROLOGUE 16KiB) |
+| **124–215** | Output buffer system: `buf` record, `buf-push!`, `buf-flush`, `buf-reset!`, `buf-drain!` — fixed-capacity, no growth |
+| **216–286** | Diagnostics: `die` with loc formatting, `slurp-fd`, `write-bv-fd`; debug logging: `debug-log-on!/off!`, `trace-emit` flags; fresh name generator: `make-namer` |
+| **287–528** | Record type definitions: `loc`, `tok`, `macro`, `ctype`, `sym`, `opnd`, `loop-ctx`, `fn-ctx`, `world`, `pstate`, `cg`; interned primitive ctypes (`%t-void`, `%t-i8`…`%t-u64`, `%t-bool`, `%t-flt`, `%t-dbl`, `%t-ldbl`); ctype predicates: `%ctype-ptr?`, `%ctype-pointee`, `%ctype-unsigned?`, `%ctype-arith?`, `%ctype-fp?`; ctype accessors |
+| **530–595** | `%keyword-alist` — storage/qualifiers/type specifiers/statements/operators/reserved; `%punct-alist` — punctuators longest-first, digraphs |
+| **596–660** | Lexer byte-class predicates: `%digit?`, `%hex?`, `%alpha?`, `%ident-start?`, `%ident-cont?`, `%hspace?`, `%newline?`; `%lex-scratch` buffer |
+| **661–790** | Logical byte access: `%lex-peek` with trigraph translation + line splice |
+| **791–940** | Comment stripping: `%skip-ws-and-comments`, `%skip-line-comment`, `%skip-block-comment` |
+| **941–1090** | Byte-run scanners: `%scan-while`, `%fill-while-bv`, `%accum-int-while`, `%accum-octal-bounded` |
+| **1091–1290** | Token readers: `lex-read-ident`, `%lex-read-number` (hex/octal/decimal), `%lex-read-string` (with escapes), `lex-read-char` |
+| **1291–1370** | `%lex-read-punct` with longest-match bucketing; `%punct-buckets` |
+| **1371–1700** | `lex-iter` streaming token source: `make-lex-iter`, `%lex-iter-pull` with heap-rewind discipline; `list-iter` wrapper; `lex-tokenize` test driver |
+| **1701–1820** | Preprocessor state (`pp-state`), token classification helpers (`%pp-eof?`, `%pp-nl?`, `%pp-hash?`, etc.) |
 | **1821–1920** | Built-in macros: `__FILE__`, `__LINE__`, `__STDC__`, `__LISPCC__`, `__DATE__`, `__TIME__`, `__STDC_VERSION__`, `__STDC_HOSTED__`, `__VA_ARGS__` |
 | **1921–2020** | Streaming pp-iter: `make-pp-iter`, `%pp-iter-pull` with out-buf stashing |
 | **2021–2120** | Upstream helpers: `%pp-pull-upstream`, `%pp-peek-upstream`, `%pp-unshift-upstream!`, `%pp-collect-line-stream`, `%pp-collect-args-stream` |
 | **2121–2270** | Directive dispatch: `%pp-dispatch-step`, `%pp-dispatch-directive` → `%pp-do-define`, `%pp-do-undef`, `%pp-do-if/ifdef/ifndef/elif/else/endif` with cond-stack |
 | **2271–2370** | Directives: `%pp-do-error`, `%pp-do-line`, `%pp-do-pragma`, `%pp-do-include` |
 | **2371–2430** | Macro expansion: `%pp-emit-expanded`, `%pp-apply-macro`, `%pp-prepare-body`, `%pp-collect-args`, `%pp-bind-args` (variadic), `%pp-substitute` (`#param` stringize, `##` paste) |
-| **2431–2503** | Paste operator: `%pp-paste-tokens`; string fusion: `%pp-maybe-fuse-str`; `#if` evaluator: `%pp-make-const-ps` (IO adapter wrapping token list as minimal pstate), `pp-eval-cexpr`, `%pp-resolve-defined`, `%pp-expand-line`, `%pp-idents-as-zero` |
-| **2504–2620** | CG emission primitives: `%cg-emit-buf`, `%cg-emit`, `%cg-emit-many`, `%cg-fresh-label`, `%n` (number→bv) |
-| **2621–2720** | CG metadata: `%cg-fn-set!/%cg-fn-get`; register/label helpers: `%cg-reg→bv`, `%cg-emit-li`, `%cg-emit-la` |
-| **2721–2870** | Load/store emission: `%cg-emit-ld/st`, `%cg-emit-ld-slot-typed` (sign-extended sub-word loads), `%cg-emit-sext`, `%cg-spill-reg` |
-| **2871–3020** | Operand loading: `%cg-load-opnd-into` (imm/frame/global); vstack ops: `cg-push/pop/top/depth/dup`, snapshot/rewind for sizeof |
-| **3021–3170** | Materialize: `cg-push-imm`, `cg-push-string` (with intern), `cg-push-sym` (fn/enum/var/param), `cg-push-deref` (indirect-slot tracking) |
-| **3171–3320** | Aggregate access: `cg-push-field` with `%cg-find-field` (anonymous-member-aware lookup, shared with parser's offsetof), `cg-decay-array`; address/deref: `%cg-emit-addr-of`, `cg-copy-struct`, `cg-take-addr`, `cg-load` |
-| **3321–3470** | Type conversions: `cg-cast` (bool/ptr/widening/narrowing with sign-extend), `cg-promote`, `cg-arith-conv` |
-| **3471–3620** | Operators: `cg-binop` (pointer arithmetic scaling, comparison), `cg-unop` (neg/bnot/lnot), `cg-assign` (type coercion), post-inc/dec |
-| **3621–3770** | Function calls: `cg-call` (sret >16B struct return, arg staging a0–a3 + stack, variadic) |
-| **3771–3870** | Return: `cg-return` (void/scalar/struct); conditional: `cg-if`, `cg-ifelse`, `cg-ifelse-merge` (ternary/&&/\|\|) |
-| **3871–3970** | Loop control flow: `cg-loop`, `cg-break`, `cg-continue`; switch: `cg-switch-begin`, `cg-switch-case`, `cg-switch-default`, `cg-switch-end` (dispatch table) |
-| **3971–4000** | Variadic: `cg-va-start`, `cg-va-arg`, `cg-va-end`; labels/goto: `cg-emit-label`, `cg-goto` |
-| **4001–4034** | Globals/data: `cg-emit-global`, `cg-emit-extern`, tentatives, `cg-intern-string`; frame: `cg-alloc-slot`; lifecycle: `cg-init`, `cg-fn-begin/v`, `cg-fn-end`, `cg-finish` |
-| **4035–4160** | Scope/tag ops: `scope-enter/leave`, `scope-bind/lookup`, `tag-bind/lookup`, `typedef?` |
-| **4161–4260** | Type compatibility: `ctype-compat?`, `%fn-ctype-compat?`, `%fn-params-compat?`; symbol merge: `sym-merge` (linkage inheritance) |
-| **4261–4360** | Type constructors: `%mk-ptr`, `%mk-arr`, `%mk-fn`; qualifier handling: `eat-cv-quals!`, `skip-gnu-attribute!`, `eat-gnu-attributes!` |
-| **4361–4411** | Declaration specifiers: `parse-decl-spec` (storage/type/signedness), `resolve-base` |
-| **4412–~4480** | Aggregate parsing: `parse-aggregate-spec` (struct/union forward + complete), `parse-struct-fields` (union offset=0), `complete-agg!` (size/align/fields), `parse-enum-spec` |
-| **4412–4488** | Const-expr value helpers: `%const-trunc`, `%const-arith-conv`, `%const-arith-conv-type`, `%const-promote`, `%const-bool?` |
-| **4489–4512** | Const-expr binary-level infrastructure: `%const-binl` (generic left-associative loop), `%const-arith-op`, `%const-div-op`, `%const-cmp-op` |
-| **4473–4814** | Constant expression evaluator: `parse-const-expr` → `parse-const-cond` (ternary) → binary levels via `%const-binl` (lor/land/bor/bxor/band/eq/rel/add/mul) → `parse-const-shift` (inline; lhs-type-only) → `parse-const-cast` → `parse-const-unary` (sizeof, &, prefix ops) → `parse-const-primary` (INT/CHAR/paren/enum-const); `%const-sizeof-expr` (cg snapshot/rewind; guards against pp context) |
-| **4815–~4970** | offsetof support: `%const-parse-addrof-postfix`, `%const-parse-addrof-primary` — recognizes `&((T*)0)->field` chains; reuses `%cg-find-field` |
-| **4814–5060** | `parse-const-int`; declarators: `parse-declarator`, `parse-decl-cont`, `parse-decl-suf-cont`, `parse-fn-params` |
-| **5061–5160** | Phase 3 promotion: `%promote-pending-completions`, `rewrite-pending-completions!`, `promote-roots!`, `promote-iter-buffers!` (main/scratch boundary) |
-| **5161–5310** | Translation unit: `parse-translation-unit` with `call-with-scratch-cycle` per decl; `parse-decl-or-fn` |
-| **5311–5510** | Declarations/definitions: `handle-decl` (typedef/fn/var/static/file-scope/block-scope with tentatives) |
-| **5511–5710** | Global initializers: `parse-init-global` (string/brace/scalar with inferred-length arrays), `%parse-init-array-list` with element promotion, `%parse-init-struct-list` with designated designators and padding |
-| **5711–5860** | Local initializers: `parse-init-local-aggregate` (string/brace), `%parse-init-local-array-list`, `%parse-init-local-struct-list` (zero-pass); compound literals as frame lvalues |
-| **5861–5960** | Function body: `parse-fn-body`, `%parse-fn-body-inner` (param binding, scope enter/leave) |
-| **5961–6090** | Statements: `parse-stmt` dispatch, `parse-cstmt`, `parse-if-stmt`, `parse-while-stmt`, `parse-do-stmt`, `parse-for-stmt` (deferred condition/step), `parse-switch-stmt`, `parse-case-stmt`, `parse-default-stmt`, `parse-return-stmt`, `parse-goto-stmt`, `parse-labelled-stmt`, `parse-expr-stmt`, `parse-local-decl` |
-| **6091–6110** | `%binop-bp` — Pratt binding power table (comma=1, assign=4, `\|\|`=10, `&&`=20, bitwise=30–50, relational=60, shift=70, add=80, mul=90) |
-| **6111–6310** | Expression parser: `parse-expr` (`expr-bp(0)`), `parse-expr-bp` (Pratt climbing), `parse-binary-rhs` (comma/assign/compound-assign/ternary/logical/bitwise) |
-| **6311–6460** | Unary/cast/postfix: `parse-unary` (prefix ops, sizeof), `parse-cast-or-unary` (paren disambiguation), `parse-compound-literal`, `parse-postfix` (`[]`/call/`.`/`->`/post-inc/post-dec) |
-| **6461–6505** | Call parsing: `call-fn-type`, `parse-call-args` (param casting, variadic promotion); builtins: `parse-builtin-va-start/va-arg/va-end`; primary: `parse-primary` (literals/idents/strings/parens/enum-consts); rvalue: `rval!`, `rval-not-fn!` |
-| **6506–6611** | Driver: `%cc-slurp`, `%cc-write`, CLI flag parsing (`--cc-debug`, `--cc-trace-emit`, `--lib=PFX`), `%cc-initial-defines` (CCSCM sentinel), `cc-main` (pipeline init + `parse-translation-unit` + `cg-finish` + write) |
+| **2431–2540** | Paste operator: `%pp-paste-tokens`; string fusion: `%pp-maybe-fuse-str`; `#if` evaluator: `%pp-make-const-ps` (IO adapter wrapping token list as minimal pstate), `pp-eval-cexpr`, `%pp-resolve-defined`, `%pp-expand-line`, `%pp-idents-as-zero` |
+| **2541–2640** | CG emission primitives: `%cg-emit-buf`, `%cg-emit`, `%cg-emit-many`, `%cg-fresh-label`, `%n` (number→bv) |
+| **2641–2745** | CG metadata: `%cg-fn-set!/%cg-fn-get`; register/label helpers: `%cg-reg→bv`, `%cg-emit-li`, `%cg-emit-la`, slot-expr (`(+ %<fn>__SO N)` so the slot offset resolves through the per-fn `__SO` macro at M1pp time) |
+| **2745–2870** | Load/store emission: `%cg-emit-ld/st`, `%cg-emit-ld-slot-typed` (sign-extended sub-word loads), `%cg-emit-sext`, `%cg-spill-reg` |
+| **2871–3025** | Operand loading: `%cg-load-opnd-into` (imm/frame/global) — re-canonicalizes a frame rval against its type kind on load (sign- or zero-extend); vstack ops: `cg-push/pop/top/depth/dup`, snapshot/rewind for sizeof |
+| **3025–3170** | Materialize: `cg-push-imm`, `cg-push-string` (with intern), `cg-push-sym` (fn/enum/var/param), `cg-push-deref` (indirect-slot tracking) |
+| **3171–3360** | Aggregate access: `cg-push-field` with `%cg-find-field` (anonymous-member-aware lookup, shared with parser's offsetof), `cg-decay-array`; address/deref: `%cg-emit-addr-of`, `cg-copy-struct`, `cg-take-addr`, `cg-load` |
+| **3361–3540** | Type conversions: `cg-cast` (bool/ptr/widening/narrowing with sign-extend), `cg-promote`, `cg-arith-conv` |
+| **3541–3720** | Operators: `cg-binop` (pointer arithmetic scaling, comparison), `cg-unop` (neg/bnot/lnot), `cg-assign` (type coercion), post-inc/dec |
+| **3720–3850** | Function calls: `cg-call` (sret >16B struct return, arg staging a0–a3 + stack, variadic) |
+| **3850–3950** | Return: `cg-return` (void/scalar/struct via `%b(&.ret)` to the per-fn dotted local label); conditional: `cg-if`, `cg-ifelse`, `cg-ifelse-merge` (ternary/`&&`/`||`); `%cg-merge-arith-type` (C11 §6.5.15 result type for ternary merge) |
+| **3951–4090** | Loop control flow: `cg-loop` (opens nested `.scope` with `:.top`/`:.end`), `cg-break` / `cg-continue` (bare `%break`/`%continue` resolved by hex2++'s innermost-out scope walk); switch: `cg-switch-begin/case/default/end` (dotted case labels and dispatch table inside the switch's `.scope`) |
+| **3994–4090** | Variadic: `cg-va-start`, `cg-va-arg`, `cg-va-end`; labels/goto: `cg-emit-label`, `cg-goto` — user C labels emit as `cc__<fn>__user_<name>` global names so `goto` survives nested loop/switch scopes |
+| **4090–4332** | Globals/data: `cg-emit-global` (prefixes `.align <ctype-align>` for both `.data` and `.bss`), `cg-emit-extern`, tentatives, `cg-intern-string` (string pool with `.align 8` framing), `%cg-bv->hex-lines` (bare-hex chunked output for hex2++); frame: `cg-alloc-slot`; lifecycle: `cg-init`, `cg-fn-begin/v`, `cg-fn-end` (wraps body in `%fn(name, frame, { … })`), `cg-finish` |
+| **4334–4460** | Scope/tag ops: `scope-enter/leave`, `scope-bind/lookup`, `tag-bind/lookup`, `typedef?` |
+| **4460–4560** | Type compatibility: `ctype-compat?`, `%fn-ctype-compat?`, `%fn-params-compat?`; symbol merge: `sym-merge` (linkage inheritance) |
+| **4560–4660** | Type constructors: `%mk-ptr`, `%mk-arr`, `%mk-fn`; qualifier handling: `eat-cv-quals!`, `skip-gnu-attribute!`, `eat-gnu-attributes!` |
+| **4660–4710** | Declaration specifiers: `parse-decl-spec` (storage/type/signedness), `resolve-base` |
+| **4710–4760** | Aggregate parsing: `parse-aggregate-spec` (struct/union forward + complete), `parse-struct-fields` (union offset=0), `complete-agg!` (size/align/fields), `parse-enum-spec` |
+| **4700–4760** | Const-expr value helpers: `%const-trunc`, `%const-arith-conv`, `%const-arith-conv-type`, `%const-promote`, `%const-bool?` |
+| **4845–4900** | Const-expr binary-level infrastructure: `%const-binl` (generic left-associative loop), `%const-arith-op`, `%const-div-op`, `%const-cmp-op` |
+| **4762–5230** | Constant expression evaluator: `parse-const-expr` → `parse-const-cond` (ternary) → binary levels via `%const-binl` (lor/land/bor/bxor/band/eq/rel/add/mul) → `parse-const-shift` (inline; lhs-type-only) → `parse-const-cast` → `parse-const-unary` (sizeof, &, prefix ops) → `parse-const-primary` (INT/CHAR/paren/enum-const); `%const-sizeof-expr` (cg snapshot/rewind; guards against pp context) |
+| **5040–5230** | offsetof support: `%const-parse-addrof-postfix`, `%const-parse-addrof-primary` — recognizes `&((T*)0)->field` chains; reuses `%cg-find-field` |
+| **5230–5400** | `parse-const-int`; declarators: `parse-declarator`, `parse-decl-cont`, `parse-decl-suf-cont`, `parse-fn-params` |
+| **5400–5430** | Phase 3 promotion: `%promote-pending-completions`, `rewrite-pending-completions!`, `promote-roots!`, `promote-iter-buffers!` (main/scratch boundary) |
+| **5403–5535** | Translation unit: `parse-translation-unit` with `call-with-scratch-cycle` per decl; `parse-decl-or-fn` |
+| **5535–5860** | Declarations/definitions: `handle-decl` (typedef/fn/var/static/file-scope/block-scope with tentatives) |
+| **5864–6160** | Global initializers: `parse-init-global` (string/brace/scalar with inferred-length arrays), `%parse-init-array-list` with element promotion, `%parse-init-struct-list` with designated designators and padding |
+| **6165–6440** | Local initializers: `parse-init-local-aggregate` (string/brace), `%parse-init-local-array-list`, `%parse-init-local-struct-list` (zero-pass); compound literals as frame lvalues |
+| **6447–6470** | Function body: `parse-fn-body`, `%parse-fn-body-inner` (param binding, scope enter/leave) |
+| **6467–6760** | Statements: `parse-stmt` dispatch, `parse-cstmt`, `parse-if-stmt`, `parse-while-stmt`, `parse-do-stmt` (`.scope` with `:.body` / `:.top` for `continue`-to-cond semantics), `parse-for-stmt` (`.scope` with deferred condition/step), `parse-switch-stmt`, `parse-case-stmt`, `parse-default-stmt`, `parse-return-stmt`, `parse-goto-stmt`, `parse-labelled-stmt`, `parse-expr-stmt`, `parse-local-decl` |
+| **6767–6810** | `%binop-bp` — Pratt binding power table (comma=1, assign=4, `\|\|`=10, `&&`=20, bitwise=30–50, relational=60, shift=70, add=80, mul=90) |
+| **6810–7090** | Expression parser: `parse-expr` (`expr-bp(0)`), `parse-expr-bp` (Pratt climbing), `parse-binary-rhs` (comma/assign/compound-assign/ternary/logical/bitwise) |
+| **7089–7250** | Unary/cast/postfix: `parse-unary` (prefix ops, sizeof), `parse-cast-or-unary` (paren disambiguation), `parse-compound-literal`, `parse-postfix` (`[]`/call/`.`/`->`/post-inc/post-dec) |
+| **7152–7370** | Call parsing: `call-fn-type`, `parse-call-args` (param casting, variadic promotion); builtins: `parse-builtin-va-start/va-arg/va-end`; primary: `parse-primary` (literals/idents/strings/parens/enum-consts); rvalue: `rval!`, `rval-not-fn!` |
+| **7372–7405** | Driver: `%cc-slurp`, `%cc-write`, CLI flag parsing (`--cc-debug`, `--cc-trace-emit`, `--lib=PFX`), `%cc-initial-defines` (CCSCM sentinel), `cc-main` (pipeline init + `parse-translation-unit` + `cg-finish` + write) |
 
 ---
 
@@ -166,8 +166,12 @@ Source file
 - **Vstack-based codegen** — expression evaluation pushes/pops `opnd` records; values optionally spilled to frame slots
 - **Macro hide-sets** — `tok` carries hide set to prevent recursive expansion (C11 §6.10.3.4)
 - **Shared constant-expression evaluator** — `parse-const-*` serves both the parser (typed, with sizeof/cast/offsetof) and the preprocessor `#if` evaluator (`%pp-make-const-ps` wraps a token list as a minimal pstate with empty scope and `ps-cg = #f`); `%const-binl` provides the generic left-associative binary level pattern
-- **Sign-extension discipline** — narrow types (i8/i16/i32) stored as canonical 64-bit forms via shli/sari; widening casts are relabel-only
+- **Sign-extension discipline** — narrow types (i8/i16/i32) stored as canonical 64-bit forms via shli/sari; widening casts are relabel-only. Frame rval loads (`%cg-load-opnd-into`) re-canonicalize against the opnd's type kind so a relabel-only cast (e.g. via `cg-arith-conv`) reads correctly downstream.
 - **Sret (struct return)** — structs >16B use indirect result: caller passes pointer in `a0`
 - **Variadic ABI** — 16 contiguous 8-byte slots; args 0–3 from `a`-regs, 4+ from `LDARG`
 - **Tentative definitions** — collected in `world-tentatives`; emitted as `.bss` only if no full definition appears by TU end
 - **FP softening** — float/double types parsed and sized per SysV ABI but all FP ops emit integer bitpattern operations
+- **M1pp + hex2++ output** — bodies are wrapped in libp1pp's `%fn(name, frame, { … })` (which opens a hex2++ `.scope` and emits `%enter`/`%eret`); compiler-internal labels (`:.ret`, loop `:.top`/`:.end`, switch `:.lbl_N`) are dotted scope-locals resolved by hex2++'s innermost-out scope walk; `%break` / `%continue` resolve through the same walk to the nearest enclosing scoped loop. User C labels use a `cc__<fn>__user_<name>` global mangling so `goto` is unaffected by nested scopes (C labels have function scope, not block).
+- **Alignment via `.align`** — `cg-emit-global` emits `.align <ctype-align>` before every `.data` or `.bss` symbol; `cg-intern-string` brackets each pooled string with `.align 8` so a non-multiple-of-4 string doesn't misalign the next instruction on aarch64. Intra-struct field padding stays as inline zero bytes (the offsets are constant relative to the aligned struct start, so a `.align` directive there would be redundant).
+- **Bare-hex string emission** — string pool and `(label-ref . LBL)` initializer pieces emit as bare hex chunks (≤64 bytes / 128 hex chars per line) consumed directly by hex2++; cc.scm no longer produces M0-style quoted-text literals.
+- **Ternary common type** — `cg-ifelse-merge` runs `%cg-merge-arith-type` over both arms after they emit, so the result `opnd` carries the C11 §6.5.15 common type rather than the first arm's type. The slot stores the raw 8-byte payload; `%cg-load-opnd-into` re-canonicalizes against whichever common type was picked. `&&`/`||` callers pre-cast both arms to `%t-i32` so the merge is a no-op for them.
diff --git a/docs/TCC-CC-INVESTIGATION.md b/docs/TCC-CC-INVESTIGATION.md
@@ -1,303 +0,0 @@
-# tcc-cc bug investigation: 14 fixtures fail with `assert fail: 0@12051`
-
-## Status
-
-Bug **localized but not fixed**. Root cause is in `cc.scm`-built `tcc-boot2`'s
-runtime behavior: a corrupted `ret.type.t` in `cc__unary`'s function-call-return
-path leads to allocating a float register for an int return value, which then
-trips a legitimate `assert(0)` in `arm64-gen.c:load()` for the unsupported
-mixed int/float register class case.
-
-Of the 15 failing fixtures, 14 hit the same `assert fail: 0` at the same
-`tcc.flat.c:12051` site. The 15th (`220-const-promote`) is a separate
-"compile succeeds, exits wrong" issue not covered here.
-
-## TL;DR for the next agent
-
-You're looking for a **`cc.scm` miscompile of `cc__unary` (in `tcc.flat.c`)
-that corrupts the local `SValue ret` between `ret.type = s->type;`
-(line 7837) and `is_float(ret.type.t)` (line 7840)**.
-
-The corruption is layout-sensitive (any 4-byte instruction added in
-`cc__load` or anywhere in `cc__unary` makes the bug not fire on this
-fixture). The corrupted value reads back as `0x007B4C19` — a vstack
-region pointer-looking value, which has low 4 bits = 9 = `VT_DOUBLE`, so
-`is_float` returns true, `ret.r = TREG_F(0) = 20`, and `vsetc` pushes an
-SValue with `r=20` (a float register) for an int return. Later
-`gv(RC_INT) → load(0, vtop)` correctly asserts because there's no int↔float
-move for the `svr<0x30` register-pair case.
-
-## Reproduction
-
-```sh
-make test SUITE=tcc-cc NAMES=013-call # fails: assert fail: 0@12051
-```
-
-(The Makefile already drives `TCC_TARGET=ARM64` for the `tcc-cc` suite.
-Don't run `make tcc-boot2 ARCH=aarch64` standalone without setting
-`TCC_TARGET` — until commit `3317ca3` the default `TCC_TARGET=X86_64` was
-silently producing an x86_64-targeted `tcc.flat.c`. That's fixed.)
-
-To run native gdb against the binary:
-```sh
-podman run --rm --pull=never --platform linux/arm64 \
-  -v "$PWD":/work -w /work boot2-alpine-gcc:aarch64 \
-  sh scripts/dbg-load-cbz.sh
-```
-
-`scripts/dbg-load-cbz.sh` is a working diagnostic harness with breakpoints
-already wired up. Edit the gdb commands in `/tmp/g.gdb` (heredoc inside the
-script) to add new breakpoints. Note: the binary has no symbols — work
-in raw addresses.
-
-## Confirmed facts (verified by gdb on the running binary)
-
-1. **The `assert(0)` itself is correct given its inputs.** At the failing
-   `load()` call: `r=0` (target = `TREG_R(0)` = X0, an int reg) and
-   `sv->r=0x14` (= `TREG_F(0)` = 20, a float reg). `arm64-gen.c:12042-51`
-   asserts because there's no defined cross-class register move for the
-   `svr<0x30` branch. So the bug is upstream.
-
-2. **`vtop->r=20` was set by a `vsetc` call** at `LR=0x6c777c`, which is
-   `cc__unary` line 7864:
-   ```c
-   for (r = ret.r + ret_nregs + !ret_nregs; r-- > ret.r;) {
-       vsetc(&ret.type, r, &ret.c);
-       vtop->r2 = ret.r2;
-   }
-   ```
-   For `ret.r=20` (= TREG_F(0)) and `ret_nregs=1`, the loop runs once with
-   `r=20`, so `vsetc` pushes an SValue with `r=20`.
-
-3. **`ret.r` was set to 20 by line 7841** because `is_float(ret.type.t)`
-   returned true:
-   ```c
-   if (is_float(ret.type.t)) {
-       ret.r = reg_fret(ret.type.t); // TREG_F(0) = 20
-   } else {
-       ret.r = ((0));
-   }
-   ```
-
-4. **`is_float` was called with `t=0x007B4C19`**, not a valid type code.
-   Low 4 bits = 9 (= `VT_DOUBLE`), so `is_float` returns true. gdb output
-   confirmed: `is_float(t=0x7b4c19) -> TRUE lr=0x6c6e30`.
-
-5. **`ret.type.t` (read at SP+704 in `cc__unary`'s frame) holds
-   `0x007B4C19`** — a vstack-region pointer-like value, NOT a type code.
-   It must have been corrupted between line 7837 (the assignment `ret.type
-   = s->type`) and line 7840 (the `is_float` read).
-
-6. **Adjacent stack bytes look like memcpy loop variables.** At the
-   moment of the failing `is_float`:
-   - `[SP+704..711] = 0x007B4C19`
-   - `[SP+712..719] = 0x007B4C1A`
-
-   These differ by exactly 1, which is the signature of `_memcpy`'s
-   byte-by-byte `dest`/`src` walking (each iteration: `dest++; src++`).
-   `0x7B4C0A` is `vtop` for this fixture; `0x7B4C1A = vtop+16` is exactly
-   `&vtop->r`. So one of these slots is holding a pointer that walked
-   into the middle of an SValue during a struct-copy memcpy.
-
-7. **`mes-libc/string/memcpy.c:_memcpy` is byte-by-byte** (compiled to a
-   loop with 4 LDRBs + ORs + shifts per byte for ld_w, etc). cc.scm
-   compiles its locals into 240 bytes of frame; max slot offset used is
-   216, so it doesn't overflow its own frame. The `_memcpy → memcpy`
-   wrapper is the only `_memcpy` caller.
-
-8. **The bug is `cc.scm`-specific.** The gcc-built control
-   (`scripts/run-gcc-libc-flat-tcc.sh`) compiles the same `tcc.flat.c` and
-   passes 177/178 of the same fixtures. So the C source is fine; it's
-   cc.scm's lowering that breaks.
-
-9. **Layout-sensitive fix.** Inserting any 4-byte instruction (even
-   `%addi(t0, t0, 0)`) anywhere before the failing site in `cc__load`
-   makes the test pass. The CBZ at `0x73B6C0` (mod 64 = 0) is *not*
-   the cause — replacing all CBZ/CBNZ with CMP+B.cond+BR (which shifts
-   load() by ~144 bytes) didn't fix it. Layout sensitivity comes from
-   `tcc-boot2`'s runtime state changing as code positions shift, not from
-   any specific instruction's alignment.
-
-10. **`%li(rd, imm)` lowering was changed** from LDR-literal-pool to
-    MOVZ/MOVK chain (4 instructions, 16 bytes — same size as before).
-    This was investigated as a possible alignment fix; it isn't, but the
-    new lowering is kept as a defensible cleanup that eliminates literal
-    pool entries from the executable instruction stream.
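
The arithmetic quoted in facts 4 and 6 of the removed investigation notes can be re-derived in isolation. The sketch below is only an illustration under stated assumptions: the 4-bit mask follows the "low 4 bits" wording and the value 9 for `VT_DOUBLE` is taken from the notes, while `BTYPE_MASK`, `VT_DOUBLE_CODE`, `btype_of`, `byte_delta` and `check_recorded_values` are hypothetical names, not tcc's own `is_float` or headers.

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions taken from the notes above, not from tcc's headers:
 * the basic type code sits in the low 4 bits and VT_DOUBLE is 9.
 * All names below are hypothetical. */
#define BTYPE_MASK     0x0fu
#define VT_DOUBLE_CODE 9u

/* Basic type code that a type word would decode to. */
static unsigned btype_of(uint32_t t) { return t & BTYPE_MASK; }

/* Distance in bytes between two recorded addresses. */
static uint32_t byte_delta(uint32_t hi, uint32_t lo) { return hi - lo; }

/* Re-derives the numbers quoted in facts 4 and 6. */
static void check_recorded_values(void) {
    const uint32_t slot_704 = 0x007B4C19u; /* value read back as ret.type.t */
    const uint32_t slot_712 = 0x007B4C1Au; /* neighbouring frame slot */
    const uint32_t vtop     = 0x007B4C0Au; /* vtop for this fixture */

    assert(btype_of(slot_704) == VT_DOUBLE_CODE); /* decodes as a double */
    assert(byte_delta(slot_712, slot_704) == 1);  /* pointers one byte apart */
    assert(byte_delta(slot_712, vtop) == 16);     /* vtop+16, i.e. &vtop->r */
}
```

Calling `check_recorded_values()` from a small `main` should pass silently if the recorded values are as quoted; it says nothing about where the corruption came from, only that the recorded numbers are mutually consistent.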
- -## Hypotheses, ranked by likelihood - -### H1 (most likely): cc.scm slot-allocator bug — `ret`'s slot overlaps memcpy state - -The fact that two slots in `cc__unary`'s frame (`SP+704` and `SP+712`) -hold sequential pointer values one byte apart is the signature of -`_memcpy`'s `dest`/`src` loop variables. cc.scm allocates locals as -fixed slot offsets per function, but if its bookkeeping for `ret`'s -slot collides with another local *or* if there's interference from a -helper called via `gfunc_call`, then `ret.type.t` and `ret.type.ref` get -clobbered. - -The shape of the corrupted value (`0x7B4C19` = vstack address midway -through an SValue) strongly suggests the leak is from inside an -SValue-copying memcpy. Likely sources of such struct copies between -line 7837 (set ret.type) and line 7840 (read ret.type.t): -- there are NO C statements between those lines, so the leak must - come from how cc.scm compiles **line 7837 itself** — `ret.type = s->type` - which is a 16-byte struct copy via memcpy. -- Or a related copy emitted by cc.scm for a temporary. - -**Investigation steps:** -1. Generate `cc__unary`'s P1pp around the function-call-return path - (lines ~7800-7870). Look for the slot offsets used for `ret.type` and - `s->type` and any `%call(&memcpy)` between them. -2. Compare cc.scm's computed slot offset for `ret.type.t` against - what's actually loaded at the failing address `0x6c6dd0` (loads from - `SP+704`). Do they match? -3. Hypothesis-test by adding a deliberate stack-padding local in - `unary()` (e.g. `volatile int __pad[64];` near `ret`) and re-running. - If that fixes it, slot allocation is the issue. - -### H2: cc.scm miscompile of `s->type` member access - -Maybe `s->type` is being computed wrong — reading from the wrong -offset within Sym, returning a pointer-like value. 
cc.scm's C grammar -includes `member-of-pointer-deref`; if `s->type` (where type is a -nested struct) is mis-translated to `&s->type` (the address) or to -`s + offsetof(Sym, type) + N` for the wrong N, the source of the -struct copy is wrong. - -**Investigation steps:** -1. Look at `cc.scm`'s parsing/codegen for `->` accessing a struct - member that is itself a struct. -2. Inspect the emitted P1pp for `ret.type = s->type` — does the - memcpy source address look correct? - -### H3: ABI mismatch in cc.scm's struct-copy lowering for `CType` - -`CType` is 16 bytes (int t + Sym *ref) with 4 bytes of padding for -8-byte alignment of `ref`. If cc.scm's struct-copy lowering walks bytes -0..16 of source but skips/duplicates the padding region differently -between source and destination, `ret.type.ref`'s bytes can land in -`ret.type.t`'s position. - -The previous SValue struct-copy fix (`cc/cg-assign-struct`) was -specifically called out in `docs/TCC-TODO.md` — a similar fix may be -needed for nested CType copy. - -**Investigation steps:** -1. Read `cc.scm`'s `cg-assign-struct` and verify it handles CType-sized - (16 byte) copies. Check whether `cc-assign` for the case - "lhs is a struct member that is itself a struct" routes through - `cg-assign-struct` correctly. -2. Try adding a regression test: a tiny C program that does `dst.type - = src.type;` where type is a `CType`-shaped struct, run through - cc.scm and verify field values are preserved. - -### H4 (less likely): vstack pointer corruption - -Maybe `vtop` itself is moving incorrectly, and the SValue at the -"failing vtop position" is actually unused garbage left behind from a -prior operation. This would mean the "wrong sv->r=20" was set in some -earlier vstack slot and we're just reading stale memory. 
- -We already confirmed via watchpoint that the `r=0x14` at -`vtop[0]+16` (= `0x7B4C1A`) was *written* by the memcpy of -`vtop[-1]` (a vswap), so the source of the bad value is `vtop[-1].r = -20` immediately before the swap. Trace continues upstream: who set -`vtop[-1].r = 20`? The trace showed the vsetc(r=0x14) at lr=0x6c777c -WROTE r=20 to its target slot — and that slot is the one that becomes -vtop[-1] after the next vpush. So this is downstream of H1-H3. - -### H5 (ruled out, listed for completeness) - -- LDR-literal alignment (8-byte literals at 4-byte aligned addresses): - ruled out — replaced `%li` with MOVZ/MOVK chain, bug persists. -- CBZ at cache-line boundary (Apple Silicon erratum): ruled out — replaced - CBZ with CMP+B.cond+BR, bug persists. -- Wrong opcode encoding for cond-branch: ruled out — count of CBZ vs CBNZ - in expanded.M1 matches the count of `%ifelse_nez` vs `%cmpset_eqz` in - source. -- Hex2 mis-resolving labels: verified literals in binary point to correct - branch targets. - -## Suggested next steps in priority order - -1. **Read `cc.scm`'s `cg-assign-struct`** (search for `cg-assign-struct` - and the nested struct copy path). Verify handling for CType-sized - nested struct copies. This is the most likely culprit (H3). - -2. **Build a minimal C reproducer** that triggers the corruption without - needing the whole tcc compile. Something like: - ```c - typedef struct { int t; void *ref; } CType; - typedef struct { CType type; int r; } SValue; - int test(SValue *sv) { - SValue ret; - ret.type = sv->type; - return ret.type.t; - } - ``` - Compile with cc.scm and verify the field copy works correctly. If it - doesn't reproduce in isolation, the trigger requires more state - (e.g. specific stack frame size or memcpy interaction). - -3. 
**Compare the emitted P1pp for the failing call site against a - working call site.** Find another place in `cc__unary` that does a - similar struct copy followed by an int read, and diff the two P1pp - sequences. The buggy one will have a structural anomaly. - -4. **If P1pp looks correct, drop down to gdb tracing.** Use - `scripts/dbg-load-cbz.sh` as a starting point. Set watchpoints on - stack slots in `cc__unary`'s frame to find what writes the - pointer-like value into `ret.type.t`'s slot. Don't add new code - inside `cc__unary` for diagnostic — it shifts the layout and the bug - disappears. - -## Useful addresses (will shift if anything in the load chain changes) - -In the **current broken binary** (no `%addi` workaround): -- `cc__unary` entry: `0x006BFC70` -- `cc__unary` is_float call (line 7840): `0x006C6E2C` -- `cc__unary` vsetc call (line 7864): `0x006C7778` -- `cc__load` entry: `0x007395EC` -- `cc__load` outer-2 if test CBZ: `0x0073B6C0` (the actual assert site) -- `cc__vsetc` entry: `0x006795BC` -- `cc__vpop` entry: `0x0067A6B8` -- `cc__save_reg` entry: `0x0067DB94` -- `cc__get_reg` entry: `0x0067F130` -- `cc__gv` entry: `0x00680918` -- `cc__is_float` entry: `0x00672C54` -- `cc__vtop` (global ptr): `0x007B4A32` -- `cc__pvtop` (global ptr): `0x007B4A2A` -- `cc____vstack` (array): `0x007B4A3A` -- `_memcpy` entry: `0x006068BC` - -To regenerate addresses after a build change, use the recipe in -`scripts/dbg-load-cbz.sh` (anchor on `cc__load`'s entry signature -`FF0324D1` = SUB SP, SP, #2304, then walk byte counts in expanded.M1). - -## Files of interest - -- `tcc.flat.c:7745-7870` — `unary()`'s symbol-lookup and function-call - paths. The C code allegedly sets up `ret` correctly. -- `tcc.flat.c:5006-5022` — `vsetc` source (where the wrong r=20 ends up - written into vtop). -- `tcc.flat.c:11999-12053` — `arm64-gen.c:load()` (where the assert - fires; not actually wrong). -- `cc/cc.scm` — the cc.scm compiler. 
Search for `cg-assign-struct`, - member access through pointer, slot allocation logic. -- `P1/P1-aarch64.M1pp:385-403` — `p1_li` (already changed to MOVZ/MOVK, - not relevant to the bug). -- `scripts/dbg-load-cbz.sh` — gdb diagnostic harness with working - breakpoints. - -## What's already in the tree from this investigation - -Committed: -- `3317ca3` — Makefile: ARCH controls TCC_TARGET (no longer silently - building x86_64-targeted tcc-boot2 when running with ARCH=aarch64). - -Uncommitted (working tree): -- `P1/P1-aarch64.M1pp` — `p1_li` rewritten as MOVZ/MOVK chain. Same - 16-byte size, no functional change beyond eliminating literal-pool - reads. **Defensible cleanup; does not fix the bug.** -- `scripts/dbg-load-cbz.sh` — gdb diagnostic helper (untracked). diff --git a/docs/TCC-TODO.md b/docs/TCC-TODO.md @@ -1,9 +1,7 @@ # tcc-boot2 Current TODO Current tracker for the scheme1-hosted `cc.scm` path that builds -`tcc.flat.c` into `tcc-boot2`. Historical parser, scratch, linker, and -runtime bring-up notes have been removed from this file; those fixes -are now covered by tests and by the build rules themselves. +`tcc.flat.c` into `tcc-boot2`. Companion docs: @@ -13,12 +11,10 @@ Companion docs: ## Current State -`cc.scm` can compile the flattened tcc translation unit, the P1pp -output assembles and links, and `tcc-boot2` starts. The old blockers -around whole-file parse coverage, scratch exhaustion, tentative -definitions, anonymous members, `offsetof` const-expr, large AArch64 -stack frames, argv preservation, and `for`-`continue` lowering are -done and have focused regression tests. +`cc.scm` compiles the flattened tcc translation unit, the P1pp output +assembles and links via the M1pp + hex2++ chain, and the resulting +`tcc-boot2` is at full parity with the gcc-built control on the +`tcc-cc` acceptance suite (see Latest Result below). Useful smoke checks: @@ -39,15 +35,15 @@ make test SUITE=tcc-cc ## `tcc-cc` Suite -`tcc-cc` is the next acceptance suite. 
It runs the plain `tests/cc` -fixtures through `tcc-boot2` instead of through `cc.scm` directly. -The Makefile builds an ARM64-targeted `tcc-boot2`, builds the tiny -aarch64 `_start` object with the host assembler, then the runner does: +`tcc-cc` runs the plain `tests/cc` fixtures through `tcc-boot2` +instead of through `cc.scm` directly. The Makefile builds an +ARM64-targeted `tcc-boot2`, builds the tiny aarch64 `_start` object +with the host assembler, then the runner does: ```sh build/aarch64/tcc-boot2/tcc-boot2 \ - -nostdlib build/aarch64/tcc-cc/start.o tests/cc/NAME.c \ - -o build/aarch64/tests/tcc-cc/NAME + -nostdlib build/aarch64/tcc-cc/start.o build/aarch64/tcc-cc/mem.o \ + tests/cc/NAME.c -o build/aarch64/tests/tcc-cc/NAME ./build/aarch64/tests/tcc-cc/NAME ``` @@ -63,113 +59,46 @@ Run a subset with `NAMES`: NAMES='002-arith 007-call-with-args' make test SUITE=tcc-cc ``` -## Latest `tcc-cc` Result - -Fresh run: - -```sh -make test SUITE=tcc-cc -``` - -Result: +## Latest Result ```text -163 passed, 15 failed +make test SUITE=tcc-cc cc.scm-built tcc-boot2: 176 passed, 2 failed +scripts/run-gcc-libc-flat-tcc.sh gcc-built tcc-gcc: 176 passed, 2 failed ``` -(178 fixtures total. The 148→163 jump came from adding a tiny -`tcc-cc/mem.c` runtime providing `memcpy`/`memmove`/`memset`, -compiled with tcc-boot2 and linked alongside `start.o` for every -fixture. tcc emits calls to those for struct copies and bulk -zero-init past its inline thresholds, and its ARM64 `libtcc1` -(`lib-arm64.o`) does not define them — upstream expects libc to, -but the suite links `-nostdlib`. That cleared all 15 fixtures in -the `mem*` cluster in one shot. The earlier 14→148 jump was a -cc.scm fix: cg-assign treated `=` as scalar (8-byte load+store) -for every type, so any struct/union assignment of size > 8 bytes -silently dropped fields at offset ≥ 8. 
`SValue` is 64 bytes and -`vswap()` does three struct copies, so every vswap was a partial -no-op and the dominant `vtop[-1].r < VT_CONST` cluster all turned -green. The fix routes struct/union `=` through a new -`cg-assign-struct` that emits a memcpy (see -`tests/cc/333-struct-assign-big.c`, plus `334-struct-assign-rval-rhs.c` -for the comma-operator rval-of-struct rhs path).) - -Raw run log: - -```text -build/aarch64/.work/tests/tcc-cc/full-run.log -``` - -The remaining `assert fail: 0` lines are still prefixed with a -`vfprintf: skipping second: l` line. That's mes-libc's `vfprintf` -warning that it ignored a second `l` length-modifier in tcc's -`%lld` format strings (vendor/mes-libc/stdio/vfprintf.c:89). The -warning is benign noise from the cc.scm-built tcc-boot2's runtime -libc — not from the failing fixture itself — and it appears in a -fixture's tcc.log because we capture tcc-boot2's stderr there. - -Failure groups from per-fixture `tcc.log` files: - -| group | count | examples | -|------:|------:|----------| -| `assert fail: 0`, then segfault | 14 | `001-kitchen-sink`, `003-compound`, `013-call`, `019-static`, `027-void-call`, `071-fnptr-call`, `082-union-basic`, `117-compound-literal`, `118-const-expr`, `127-string-escapes`, `129-extern-libp1pp`, `131-vararg-mixed`, `200-lex-char-type`, `250-stringize-punct` | -| compile succeeds, generated program exits wrong | 1 | `220-const-promote` | - -14 of 15 failures still happen before the generated fixture binary -runs. The previous `mem*` undefined-symbol cluster (15 fixtures) is -gone — see `tcc-cc/mem.c` for the runtime, wired up via -`build/<arch>/tcc-cc/mem.o` in the Makefile. - -One failure is not cc.scm miscompilation — it reproduces on the -gcc-built control (see Host Baseline below): - -- `200-lex-char-type` (exits 21 instead of 0 on both paths) +Exact parity. 
Both paths fail on the same two fixtures, neither of +which is a cc.scm bug: -This is an upstream tcc bug and would need a `simple-patches/` -patch to fix. It caps the achievable cc.scm-built result at -`177 passed, 1 failed` until tcc itself is patched. +- **`200-lex-char-type`** — upstream tcc 0.9.26 bug (also fails under + the gcc-built control). Fixing it requires a `simple-patches/` patch + against tcc itself. +- **`129-extern-libp1pp`** — linkage-only failure. The fixture extern's + `libp1pp__memcpy` / `_memcmp` / `_memset` (the namespaced public + entry points from libp1pp), which neither tcc-cc nor the gcc-libc-flat + control link against. The fixture is a regression test for the cc.scm + `extern`-passthrough rule, not for tcc; running it against + tcc-built binaries is out of suite scope. -Working hypothesis for the remaining `assert fail: 0` cluster: our -compiler is still miscompiling tcc itself in narrower spots. In this -suite, `tcc-boot2` is a tcc binary produced by `cc.scm`; the -remaining failures look like that produced tcc executing bad -compiler/codegen logic and therefore asserting while compiling -fixtures. The host baseline below rules out the fixtures and -expected files. +The path from earlier results to here: -A stronger control is to compile the same ARM64 `tcc.flat.c` with -Alpine gcc and use that gcc-built tcc to run the same `tests/cc` -fixtures: - -```text -gcc-built ARM64 tcc.flat.c (libc + tcc hdrs): 177 passed, 1 failed -cc.scm-built ARM64 tcc-boot2: 163 passed, 15 failed -``` - -The gcc-built control's only remaining failure (`200-lex-char-type`) -is an upstream tcc bug, not cc.scm miscompilation, and it caps the -achievable cc.scm result at 177/178 until tcc itself is patched. -Aside from that, the gcc-built control is green: it links the -flattened tcc against `libc.flat.c` + libtcc1 + a tiny mes-libc -string runtime, and passes `-I tcc/include` so the bundled -`<stdarg.h>` resolves under `-nostdlib`. 
Run with -`scripts/run-gcc-libc-flat-tcc.sh`. This proves the fixtures and the -flattened tcc source are coherent end-to-end, so the remaining 14 -cc.scm-only failures are evidence that our compiler is still -miscompiling tcc in some places. +| Result | Delta | +|--------|-------| +| 148/30 | baseline before mem-runtime | +| 163/15 | added `tcc-cc/mem.c` runtime; cleared the `mem*` undefined-symbol cluster | +| 175/3 | cc.scm migration to M1pp + hex2++ pipeline (dotted local labels, `.scope`/`.endscope`, `.align` directives, bare-hex string emission) cleared the entire `assert fail: 0@12051` cluster (14 fixtures) plus a hex2pp.P1 BSS-overlap fix that unblocked the tcc-boot2 link itself for inputs >1 MiB | +| 176/2 | ternary-arms common-type fix in `cg-ifelse-merge` cleared `220-const-promote` (was: arm 1's type leaked through as the result type, truncating wider arm 2 to 32-bit; tcc's `gen_opic` sign-extension idiom hit this) | ## Host Baseline -The `tests/cc` fixtures themselves are coherent under a host compiler. -A temporary host harness was used to compile, run, and compare every -fixture with plain host `cc`: +The `tests/cc` fixtures are coherent under a host compiler. The +temporary host harness compiled, ran, and compared every fixture with +plain host `cc`: ```sh build/aarch64/.work/tests/tcc-cc/run-host-cc.sh ``` -Current host baseline: +Recorded baseline: ```text HOST_CC=cc @@ -186,67 +115,47 @@ podman run --rm --pull=never --platform linux/arm64 \ sh scripts/run-gcc-libc-flat-tcc.sh ``` -Current result: +This is the canonical sanity reference for "tcc-built-from-our-source" +fixture coverage; `cc.scm`-built tcc-boot2 is now at exact parity with +it. 
-```text -tcc version 0.9.26 (AArch64 Linux) -177 passed, 1 failed -``` +## Patches + +`scripts/simple-patches/tcc-0.9.26/` carries fixes applied during +`stage1-flatten` so any tcc rebuilt from this tree picks them up: -The only remaining failure (`200-lex-char-type`) is an upstream tcc -bug, not a fixture or cc.scm issue — see the failure-modes notes -above. - -This is the canonical sanity reference for tcc-built-from-our-source. -The fixtures were cleaned up to drop assumptions about implicit -`char` signedness and `long double` size; `068-main-noret` was retired -because relying on the implicit return value of `main` is undefined. - -`scripts/simple-patches/tcc-0.9.26/` carries AArch64 vararg fixes and -a const-expr fix applied during stage1-flatten so any tcc rebuilt -from this tree picks them up. `aarch64-stdarg-array.{before,after}` -swaps the bundled `va_list` for `__va_list_struct[1]` (matches -glibc/musl/x86_64 ABI), and the -`arm64-va-{pointer-operand,arg-pointer}.{before,after}` pair teach -`gen_va_start`/`gen_va_arg` to skip `gaddrof()` when the operand is -already a pointer (the array-decayed/pointer-parameter case). -Without this, `va_list` forwarding into a non-variadic helper (the -`vfprintf` shape, e.g. `131-vararg-mixed`) hit `assert fail: 0` in -`arm64-gen.c`. `const-divzero-shortcircuit-int.{before,after}` gates -`gen_opic`'s "division by zero in constant" error on -`!nocode_wanted` so that the unevaluated arm of `&&`/`||`/`?:` in -constant expressions (C11 §6.6¶3) does not abort. - -Two fixture cleanups are part of that baseline: +- `aarch64-stdarg-array.{before,after}` — swaps the bundled + `va_list` for `__va_list_struct[1]` (matches glibc/musl/x86_64 ABI). +- `arm64-va-{pointer-operand,arg-pointer}.{before,after}` — teaches + `gen_va_start`/`gen_va_arg` to skip `gaddrof()` when the operand is + already a pointer (the array-decayed/pointer-parameter case). 
Without + this, `va_list` forwarding into a non-variadic helper (the + `vfprintf` shape, e.g. `131-vararg-mixed`) hit `assert fail: 0` in + `arm64-gen.c`. +- `const-divzero-shortcircuit-int.{before,after}` — gates `gen_opic`'s + "division by zero in constant" error on `!nocode_wanted` so the + unevaluated arm of `&&`/`||`/`?:` in constant expressions + (C11 §6.6¶3) does not abort. + +## Fixture cleanups + +Two small fixtures were rewritten to drop assumptions the regular `cc` +suite shouldn't depend on: - `tests/cc/125-anon-union.c` explicitly initializes its local struct - before probing anonymous-union aliasing. Tests should not depend on + before probing anonymous-union aliasing. Tests must not depend on implicit zeroing of automatic locals. - `tests/cc/132-tentative-bss-sizing.c` returns distinct numeric exit codes instead of calling `sys_write`/`strlen`. Plain `tests/cc` - fixtures should not need stdio/libc helpers. - -The cleaned fixtures also pass the regular aarch64 `cc` path: - -```sh -NAMES='125-anon-union 132-tentative-bss-sizing' \ - make test SUITE=cc ARCH=aarch64 -# 2 passed, 0 failed -``` - -## Next Debug Targets + fixtures must not need stdio/libc helpers. -Start with the earliest minimal failures in each remaining group: +## Next steps -- `003-compound`: small fixture in the `assert fail: 0` cluster; - good entry point for finding which tcc function is still being - miscompiled. -- `013-call`, `027-void-call`, `071-fnptr-call`: short call-site - failures — likely a different code path than the struct-copy fix - that cleared the vtop cluster. -- `220-const-promote`: only remaining "compile succeeds, exits wrong" - case — closest to "isolated codegen miscompile." +The cc.scm path matches the gcc baseline; further `tcc-cc` progress is +gated on upstream tcc bugs, not on our compiler. 
Options when those
+become priorities:
+
+- Backport tcc's `200-lex-char-type` fix as a `simple-patches/` entry.
+- Either move `tests/cc/129-extern-libp1pp.c` out of the directories
+  that `tcc-cc` runs against, or wire libp1pp into the tcc-cc link
+  set (mirrors what the cc-libc suite does).
diff --git a/docs/TCC.md b/docs/TCC.md
@@ -261,17 +261,19 @@ the further hop to tcc-0.9.27 lives outside this doc.

 ## What this unlocks for the scheme1 cc

-The interface for the slot scheme CC will fill is fixed:
+The interface for the slot scheme CC fills:

 - **Input**: `tcc.flat.c` produced by stage 1.
-- **Output**: a working ELF tcc-host capable of compiling mes libc
-  and compile+linking the patched real `tcc.c` into `tcc-boot0-mes`.
-
-Stage 2 collapses to "scheme1-cc compiles tcc.flat.c and the mes libc
-sources inside a busybox container." The alpine container goes away;
-busybox + scheme1-cc covers everything from stage 1's output through
-stage 3's `tcc-boot2`. `tcc.flat.c` is a known-good, host-cc-validated
-artifact ready for scheme1-cc to chew on incrementally.
+- **Output**: a working ELF capable of compiling the same `tests/cc`
+  fixtures the regular `cc` suite covers.
+
+`make tcc-boot2 ARCH=aarch64` now runs that path end-to-end:
+`cc.scm + tcc.flat.c → tcc-boot2`, linking against a `cc.scm`-built
+`libc.flat.c` instead of mes libc. The `tcc-cc` acceptance suite
+(see [TCC-TODO.md](TCC-TODO.md)) shows full parity with the
+gcc-built control. Alpine + gcc + `tcc-host` (stage 2 of the original
+plan) is no longer in our boot2 path; the busybox + scheme1-cc chain
+covers everything from stage 1's `tcc.flat.c` to a runnable tcc.
 ## Reproducibility

diff --git a/scripts/dbg-load-cbz.sh b/scripts/dbg-load-cbz.sh
@@ -1,36 +0,0 @@
-#!/bin/sh
-set -e
-apk add --quiet gdb >/dev/null 2>&1
-
-cd /work
-cat > /tmp/g.gdb << 'GDB'
-set pagination off
-set width 0
-set print address off
-
-# vstack[7] is at 0x7b4c0a (sv->r at +16 = 0x7b4c1a)
-# Watch the r field for changes; trace what writes 20 to it.
-# Break at cc__vsetc entry — dump ret.type.t
-break *0x6795BC
-commands
-  printf " vsetc(type.t=%d, r=0x%llx) lr=0x%llx\n", *(int *)$x0, $x1, $lr
-  continue
-end
-
-break *0x6c6e2c
-commands
-  printf " is_float at 7840: X0=0x%llx vtop=0x%llx\n", $x0, *(unsigned long long *)0x7b4a32
-  printf " [vtop+0..7]=0x%llx [vtop+8..15]=0x%llx [vtop+16..23]=0x%llx\n", *(unsigned long long *)(*(unsigned long long *)0x7b4a32), *(unsigned long long *)(*(unsigned long long *)0x7b4a32 + 8), *(unsigned long long *)(*(unsigned long long *)0x7b4a32 + 16)
-  continue
-end
-
-# Also: load entry to know when we're in the failing call
-break *0x7395ec
-commands
-  printf ">> load(r=%d sv=0x%llx sv->r=0x%x)\n", $x0, $x1, *(unsigned short *)($x1 + 16)
-  continue
-end
-
-run -nostdlib build/aarch64/tcc-cc/start.o build/aarch64/tcc-cc/mem.o tests/cc/013-call.c -o /tmp/out_013
-GDB
-gdb -batch -x /tmp/g.gdb build/aarch64/tcc-boot2/tcc-boot2 2>&1
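The removed script reads `sv->r` as `*(unsigned short *)($x1 + 16)`. A rough sketch of why offset 16 is plausible on an LP64 target follows. The struct definitions are trimmed-down, hypothetical stand-ins (the real tcc `SValue` has more fields), built only from what the investigation notes state: `CType` is an `int` plus a pointer with 4 bytes of padding, and `r` sits at `vtop+16`.

```c
#include <stddef.h>
#include <string.h>

struct Sym;  /* opaque here; only a pointer to it is stored */

/* Hypothetical stand-in: int t, 4 bytes of padding, 8-byte pointer
 * => 16 bytes on LP64 (an assumption about the target ABI). */
typedef struct CTypeSketch {
    int t;
    struct Sym *ref;
} CTypeSketch;

/* Hypothetical stand-in: only the leading fields that matter for the
 * offset calculation are modelled. */
typedef struct SValueSketch {
    CTypeSketch type;
    unsigned short r;
    unsigned short r2;
} SValueSketch;

/* Offset of `r` as the compiler lays it out. */
static size_t svalue_r_offset(void)
{
    return offsetof(SValueSketch, r);
}

/* Read `r` the way the gdb expression does: 2 bytes at base + 16. */
static unsigned short read_r_like_gdb(const SValueSketch *sv)
{
    unsigned short r;
    memcpy(&r, (const unsigned char *)sv + 16, sizeof r);
    return r;
}
```

On an LP64 host `svalue_r_offset()` evaluates to 16, so a raw 2-byte read at `base + 16` and a field access through `sv->r` see the same value. On a 32-bit ABI the pointer shrinks and the offset would differ, which is one reason hardcoded offsets like this do not survive a change of target.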