boot2

Playing with the bootstrap
git clone https://git.ryansepassi.com/git/boot2.git

commit 89426553d1861d00e272ec9706e6b064eb4547e6
parent e04409821656586c20a1688e9a86ab8384cf2e2c
Author: Ryan Sepassi <rsepassi@gmail.com>
Date:   Tue, 28 Apr 2026 17:50:52 -0700

tcc-boot2: end-to-end harness from tcc.flat.c to ELF

`make tcc-boot2 ARCH=...` chains stage1-flatten (host preprocessor),
cc.scm in the per-arch container, and the existing P1pp pipeline.
TCC_TARGET selects the codegen tcc itself targets (default X86_64).

scripts/boot-build-cc.sh is a thin in-container wrapper around
`scheme1 cc.scm <src> <out>` matching the boot-build-p1pp.sh contract;
CC_DEBUG=1 forwards --cc-debug for per-phase heap telemetry.

docs/TCC-TODO.md: collapse the offsetof blocker and rework the
next-tier section around what's downstream of cc.scm now that the
parse + cg-finish reach EOF on the full 608 KB TU.
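
The message above says `TCC_TARGET` selects the codegen that tcc itself targets, and the Makefile comment in this commit pairs amd64 with X86_64 and riscv64 with RISCV64 for native runs in the container. A minimal sketch of choosing a matching pair from a wrapper script follows. The function name `tcc_target_for_arch` is hypothetical and not part of the repo, and aarch64 is left unmapped because the commit names no matching `TCC_TARGET` for it.

```sh
#!/bin/sh
# Hypothetical helper (not in the repo): map a container ARCH to the
# TCC_TARGET that lets the resulting tcc-boot2 run natively there.
# Only the two pairings named in the Makefile comment are included.
tcc_target_for_arch() {
    case "$1" in
        amd64)   echo X86_64 ;;
        riscv64) echo RISCV64 ;;
        *)       echo "no known native TCC_TARGET for ARCH=$1" >&2
                 return 1 ;;
    esac
}

# Print the make invocation instead of running it, so the sketch works
# outside a checkout of the repo.
arch=amd64
if target=$(tcc_target_for_arch "$arch"); then
    echo "make tcc-boot2 ARCH=$arch TCC_TARGET=$target"
fi
```

Cross combinations (for example an aarch64 container building an X86_64-targeting tcc, as in the logged run) remain valid for the build itself; the mapping only matters when the output binary should execute in the same container.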

Diffstat:
M Makefile | 35 ++++++++++++++++++++++++++++++++++-
M docs/TCC-TODO.md | 146 ++++++++++++++++++++++++++++++++++++++++++++-----------------------------------
A scripts/boot-build-cc.sh | 35 +++++++++++++++++++++++++++++++++++
3 files changed, 151 insertions(+), 65 deletions(-)

diff --git a/Makefile b/Makefile
@@ -17,6 +17,8 @@
 #   make hello                build hello via the bootstrap chain
 #   make scheme1              build the scheme1 interpreter for ARCH
 #   make cc                   catm the cc compiler source for ARCH
+#   make tcc-flat             flatten upstream tcc.c into one TU
+#   make tcc-boot2            cc.scm + P1pp pipeline → tcc-boot2 ELF
 #   make run                  run hello in the container
 #   make test                 every suite, every arch
 #   make test SUITE=m1pp      m1pp suite, every arch
@@ -63,7 +65,7 @@ PODMAN = podman run --rm --pull=never --platform $(PLATFORM_$(1)) \
 # --- Targets --------------------------------------------------------------
 
 .PHONY: all m1pp pokem hello scheme1 cc run test image tools tables \
-	tools-native cloc clean help
+	tools-native cloc clean help tcc-boot2 tcc-flat
 
 all: m1pp pokem
@@ -208,6 +210,37 @@ $(CC_BINS): build/%/cc/cc.scm: $(CC_SRCS) build/%/.image build/%/tools/M0
 run: $(OUT_DIR)/hello $(IMAGE_STAMP)
 	$(call PODMAN,$(ARCH)) ./$(OUT_DIR)/hello
 
+# --- tcc-boot2 end-to-end harness -----------------------------------------
+#
+# Drives stage1-flatten.sh (host preprocessor only — no container) to
+# produce build/cc-bootstrap/$(TCC_TARGET)/tcc.flat.c, then runs cc.scm
+# inside the per-arch container against the flattened TU, then assembles
+# the resulting P1pp into a runnable ELF using the standard P1pp
+# pipeline. The resulting binary embeds tcc's $(TCC_TARGET) codegen, so
+# match $(ARCH) to it (amd64↔X86_64, riscv64↔RISCV64) if you want to
+# run the binary natively in the container.
+
+TCC_TARGET ?= X86_64
+TCC_FLAT := build/cc-bootstrap/$(TCC_TARGET)/tcc.flat.c
+
+TCC_BOOT2_BINS := $(foreach a,$(ALL_ARCHES),build/$(a)/tcc-boot2/tcc-boot2)
+TCC_BOOT2_P1PPS := $(foreach a,$(ALL_ARCHES),build/$(a)/tcc-boot2/tcc.flat.P1pp)
+
+tcc-flat: $(TCC_FLAT)
+tcc-boot2: $(OUT_DIR)/tcc-boot2/tcc-boot2
+
+$(TCC_FLAT): scripts/stage1-flatten.sh
+	sh scripts/stage1-flatten.sh --arch $(TCC_TARGET)
+
+$(TCC_BOOT2_P1PPS): build/%/tcc-boot2/tcc.flat.P1pp: \
+		$(TCC_FLAT) build/%/scheme1 build/%/cc/cc.scm \
+		scripts/boot-build-cc.sh build/%/.image
+	$(call PODMAN,$*) sh scripts/boot-build-cc.sh $(TCC_FLAT) $@
+
+$(TCC_BOOT2_BINS): build/%/tcc-boot2/tcc-boot2: \
+		build/%/tcc-boot2/tcc.flat.P1pp $(P1PP_BUILD_DEPS)
+	$(call PODMAN,$*) sh scripts/boot-build-p1pp.sh $< $@
+
 # --- Native tools (opt-in dev-loop helpers) -------------------------------
 
 NATIVE_TOOLS := build/native-tools/M1 build/native-tools/hex2 build/native-tools/m1pp
diff --git a/docs/TCC-TODO.md b/docs/TCC-TODO.md
@@ -37,47 +37,56 @@ head -c 50000 build/cc-bootstrap/X86_64/tcc.flat.c \
 # then re-run the podman invocation against tcc.head.c
 ```
 
-## Blocker — offsetof-style const expr in `options_W[]` (line 18026)
+## Status — parse + cg-finish complete on tcc.flat.c
 
-Current run gets past `asm_instrs[]` and stops in the option flag
-tables:
+The full 608 KB TU now parses to EOF (line 18800) and cg-finish emits
+~6.5 MB of P1pp. No semantic-coverage gap remains in this TU. Last
+aarch64 cc-debug run:
 
-```c
-static const FlagDef options_W[] = {
-    { 0, 0, "all" },
-    { ((size_t) &((TCCState *)0)->warn_unsupported), 0, "unsupported" },
-    { ((size_t) &((TCCState *)0)->warn_write_strings), 0, "write-strings" },
-    ...
-};
 ```
-
-Diagnostic:
-
+[cc] phase=start: heap 1 225 052
+[cc] phase=slurp: heap 3 101 100 src-bytes 608 547
+[cc] decl: line 14861 heap 61 008 644
+[cc] decl: line 18024 heap 66 002 084
+[cc] decl: line 18800 heap 64 824 540 ; final decl
+[cc] phase=parse: heap 64 864 516
+[cc] phase=cg-finish: heap 90 674 020 out-bytes 6 489 215
 ```
-build/cc-bootstrap/X86_64/tcc.flat.c:18026:17: error: const-expr: bad operand: amp
-```
-
-This is the classic `offsetof` idiom: take the address of a member
-through a null pointer, cast it to `size_t`, and use the resulting
-field offset as a static integer initializer.
-The present `parse-const-expr` handles integer arithmetic, casts to
-integer types, enum constants, and `sizeof`, but not address-of,
-pointer casts, member access, or `->` in unevaluated/address contexts.
-
-Likely narrow fix shape:
-
-- allow const-expr casts to pointer types when the operand is an
-  integer constant, at least for `(T *)0`
-- add a const-expression path for unary `&` over an lvalue expression
-  made of null pointer cast + `.` / `->` field selection
-- compute the member offset from the ctype layout and return it as an
-  integer constant of the enclosing cast type
-- keep this const-only; do not make general pointer values acceptable
-  as arbitrary integer constants
-
-This should cover the `options_W[]` / `options_f[]` tables without
-expanding static initializer semantics beyond the tcc bootstrap need.
+The remaining work is downstream of cc.scm:
+
+1. **Assemble the emitted P1pp** through the existing
+   `scripts/boot-build-p1pp.sh` pipeline (m1pp → M0 → hex2). The output
+   is large by P1pp standards — about 2× the scheme1 binary's input —
+   so this exercises m1pp/M0 throughput at a scale they haven't yet
+   been used at. Expect to find table size or scratch caps that need
+   bumping in those tools, or P1pp emission patterns cc.scm produces
+   that the macro layer doesn't accept verbatim.
+2. **Run the resulting `tcc-boot2`** and verify `-version`. Beyond
+   that, milestone 4 in [CC.md §Validation milestones](CC.md) — full
+   self-host of tcc — is the end goal.
+
+Harness target: `make tcc-boot2 ARCH=amd64` (see Makefile +
+`scripts/boot-build-cc.sh`) drives stage1-flatten on the host, runs
+cc.scm on the flattened TU inside the container, and feeds the P1pp
+into the standard `boot-build-p1pp.sh` pipeline. `TCC_TARGET` selects
+which tcc codegen target gets baked into the binary
+(default `X86_64`); pick `ARCH` to match if you want the result to
+run natively in the per-arch container.
+
+## Resolved — offsetof-style const expr in `options_W[]` (line 18026)
+
+Done. `parse-const-cast` now accepts pointer-typed casts as a type
+re-tag (the integer offset rides through unchanged), and
+`parse-const-unary` has an `&` arm that runs a small postfix-style
+designator parser: a `(T *)0` head (with optional grouping parens or
+a `*` deref) followed by a chain of `->` / `.` field selectors. Field
+lookup reuses `%cg-find-field`, so anonymous union/struct members
+(needed by `struct Sym`-style layouts) work without extra plumbing.
+Scope is intentionally narrow — only the offsetof shape is admitted;
+no general pointer arithmetic in const-expr.
+Test: `tests/cc/126-offsetof-const.c` — covers `&((T*)0)->FIELD`,
+`&(*(T*)0).FIELD`, and the same form through anonymous union members.
 
 ## Resolved — scratch pressure on `asm_instrs[]` (line 14527)
 
@@ -186,15 +195,17 @@ initializer unit rewind path ([CC-INIT-SCRATCH.md](CC-INIT-SCRATCH.md))
 plus the recent scope-bind alist / scratch reclamation work.
 
 Current full-file aarch64 run against
-`build/cc-bootstrap/X86_64/tcc.flat.c`:
+`build/cc-bootstrap/X86_64/tcc.flat.c` — parse + cg-finish complete:
 
 ```
-[cc] phase=start: heap 1225052
-[cc] phase=slurp: heap 3101100 src-bytes 608547
-[cc] decl: line 14527 heap 50929348
-[cc] decl: line 14861 heap 61008644
-[cc] decl: line 18024 heap 66002084
-build/cc-bootstrap/X86_64/tcc.flat.c:18026:17: error: const-expr: bad operand: amp
+[cc] phase=start: heap 1 225 052
+[cc] phase=slurp: heap 3 101 100 src-bytes 608 547
+[cc] decl: line 14527 heap 50 929 348
+[cc] decl: line 14861 heap 61 008 644
+[cc] decl: line 18024 heap 66 002 084
+[cc] decl: line 18800 heap 64 824 540 ; final decl
+[cc] phase=parse: heap 64 864 516
+[cc] phase=cg-finish: heap 90 674 020 out-bytes 6 489 215
 ```
 
 Milestones from that run:
@@ -205,16 +216,17 @@ Milestones from that run:
 | slurp | - | 3 101 100 | 1 876 048 | 608 547-byte source loaded |
 | before `asm_instrs[]` | 14 527 | 50 929 348 | 49 704 296 | enters large static table |
 | after `asm_instrs[]` | 14 861 | 61 008 644 | 59 783 592 | table completed |
-| before current blocker | 18 024 | 66 002 084 | 64 777 032 | next decl, then line 18026 error |
+| through options tables | 18 024 | 66 002 084 | 64 777 032 | offsetof const-expr region |
+| end of TU | 18 800 | 64 824 540 | 63 599 488 | final decl reached |
+| post cg-finish | - | 90 674 020 | 89 448 968 | text/data buffers + 6.5 MB out |
 
 Observed rates:
 
-- Full progress to line 18024 consumes ~64.8 MB above start for
-  608 547 source bytes loaded and most of the TU parsed.
+- Parse to EOF holds steady around ~64.8 MB above start for 608 547
+  source bytes — i.e. ~110 bytes of resident state per source byte.
 - The `asm_instrs[]` table adds ~10.1 MB over 333 rows, about 30 KB /
   row after the streaming initializer fix.
-- The compiler now reaches line 18026 under the existing caps; the
-  current failure is semantic parser coverage, not memory exhaustion.
+- cg-finish adds ~26 MB on top of parse and produces 6.5 MB of P1pp.
 
 Older prefix probes that end at clean top-level `};` boundaries
 (HEAP_CAP_BYTES = 256 MiB, SCRATCH_CAP_BYTES = 128 MiB):
@@ -265,22 +277,28 @@ enum constants) overflowed even 128 MiB of scratch because
 O(N²) in member count. The recent scratch / alist work makes that decl
 complete with parse heap at ~31 MB on the 1612-line cut.
 
-## Expected next-tier blockers
-
-After the line 18026 const-expression issue, the remaining wave is
-still likely to include:
-
-- **More static initializer const-expr forms** — tcc tables use C
-  implementation idioms (`offsetof`, pointer-ish integer constants,
-  possibly address arithmetic) that are not covered by the current
-  integer-only const evaluator.
-- **`_Bool`, bitfield-typed struct fields, `setjmp.h` typedefs** —
-  same "parse, don't codegen" softening as floats. tcc.c carries these
-  under `HAVE_BITFIELD` / `HAVE_SETJMP` gates that are off but leave
-  the declarations in the flattened text.
-- **Throughput / wall-clock** — the current failing aarch64 run takes
-  about 29 seconds to parse to line 18026 under scheme1. A successful
-  full compile will add cg-finish and P1pp assembly time.
+## Expected next-tier blockers (downstream of cc.scm)
+
+The semantic parser has covered every construct in this TU. The next
+likely walls live in the assembly side and at runtime:
+
+- **m1pp / M0 / hex2 caps under a 6.5 MB P1pp**. These tools have only
+  ever been driven against scheme1-scale inputs (tens to hundreds of
+  KB of source, maybe a few MB after expansion). cc.scm's tcc.c output
+  is ~6.5 MB pre-expansion. Expect symbol-table, line-buffer, or
+  scratch-arena caps to need bumping.
+- **Patterns cc.scm emits that m1pp / M0 don't accept**. Until now the
+  cc has only been validated against the small `tests/cc/*` programs.
+  Larger programs may hit edge cases in label naming, literal sizing,
+  or directive ordering that the existing tests didn't reach.
+- **Wall-clock**. Parsing to EOF takes ~30 s under scheme1 today;
+  cg-finish adds another bump. Assembly is in addition. A first end-
+  to-end run will set the baseline.
+- **`tcc-boot2 -version` correctness**. Even when the toolchain
+  produces an ELF, the runtime still has to walk through tcc's setup
+  (string-table init, command-line parsing, output for `-version`)
+  without tripping on cg semantics that pass the small tests but
+  diverge from C in subtle ways.
 
 The end goal is milestone 4 in [CC.md §Validation milestones](CC.md) —
 "Compile tcc.c (under the tcc-mes defines) → tcc-lispcc; verify
diff --git a/scripts/boot-build-cc.sh b/scripts/boot-build-cc.sh
@@ -0,0 +1,35 @@
+#!/bin/sh
+## boot-build-cc.sh — in-container .c -> .P1pp via scheme1 + cc.scm.
+##
+## Pure transformation. Caller (the Makefile) ensures every fixed-path
+## input below already exists: the per-arch scheme1 ELF and the catm'd
+## cc.scm source. Mirrors boot-build-p1pp.sh's contract: env-driven,
+## one thing only, no host work.
+##
+## Env:    ARCH=aarch64|amd64|riscv64
+##         CC_DEBUG=1 (optional) — pass --cc-debug to cc.scm so it prints
+##         per-phase heap usage on stderr.
+## Usage:  boot-build-cc.sh <src.c> <out.P1pp>
+
+set -eu
+
+: "${ARCH:?ARCH must be set}"
+[ "$#" -eq 2 ] || { echo "usage: ARCH=<arch> $0 <src> <out>" >&2; exit 2; }
+
+SRC=$1
+OUT=$2
+
+SCHEME1_BIN=build/$ARCH/scheme1
+CC_SRC=build/$ARCH/cc/cc.scm
+
+[ -x "$SCHEME1_BIN" ] || { echo "missing $SCHEME1_BIN" >&2; exit 1; }
+[ -e "$CC_SRC" ] || { echo "missing $CC_SRC" >&2; exit 1; }
+[ -e "$SRC" ] || { echo "missing $SRC" >&2; exit 1; }
+
+mkdir -p "$(dirname "$OUT")"
+
+if [ "${CC_DEBUG:-0}" = "1" ]; then
+    "$SCHEME1_BIN" "$CC_SRC" --cc-debug "$SRC" "$OUT"
+else
+    "$SCHEME1_BIN" "$CC_SRC" "$SRC" "$OUT"
+fi
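
The `CC_DEBUG=1` telemetry feeds the milestone table in docs/TCC-TODO.md, where the "above start" column is each heap reading minus the `phase=start` reading. A quick arithmetic check of those figures might look like the sketch below. It uses only numbers copied from the log in this commit; the helper name `above_start` is made up for illustration.

```sh
#!/bin/sh
# Numbers copied from the cc-debug log quoted in docs/TCC-TODO.md.
START=1225052        # heap at phase=start
SRC_BYTES=608547     # size of tcc.flat.c in bytes

# Hypothetical helper: heap growth relative to phase=start.
above_start() {
    echo $(( $1 - START ))
}

eof=$(above_start 64824540)    # final decl, line 18800
cg=$(above_start 90674020)     # after cg-finish
echo "end of TU:      $eof bytes above start"
echo "post cg-finish: $cg bytes above start"

# Shell arithmetic is integer-only, so both quotients are truncated.
echo "resident bytes per source byte at EOF: $(( eof / SRC_BYTES ))"
echo "asm_instrs[] bytes per row: $(( (61008644 - 50929348) / 333 ))"
```

The two deltas match the table rows for "end of TU" and "post cg-finish". The per-source-byte quotient at EOF comes out a little above 100, the same ballpark as the rounded figure quoted in the notes.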