Windows Self-Host (current state and roadmap)
This roadmap tracks bringing kit up as a self-hosted Windows toolchain:
cross-compiling the kit binary into a PE/COFF kit.exe and running kit cc
(AOT) and kit run (JIT) natively on Windows. The hosted target profile,
sysroot mechanics, and VM are described in ../WINDOWS.md; the
ABI work is in [windows-abi context]; this document tracks the self-host goal,
the baseline already in tree, and the genuinely-open follow-ups.
Targets are PE/COFF, 64-bit only: aarch64-windows (the reference, and what
runs natively on the Apple-silicon ARM64 Win11 VM) and x86_64-windows (runs
via the in-box x64 emulator on the same VM). The hosted profile is mingw-w64
UCRT via llvm-mingw (not MSVC): kit advertises __MINGW32__/__MINGW64__, never
_MSC_VER.
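As a rough illustration of what that macro set means for portable sources, the sketch below branches on the predefined macros named above. The function name `hosted_profile` and the returned strings are hypothetical and exist only for this example; the one outside fact relied on is that mingw-w64 toolchains define `__MINGW32__` on all targets and `__MINGW64__` on 64-bit ones.

```c
/* Hypothetical helper: reports which Windows hosted profile the
 * compiler advertises, based only on predefined macros. */
const char *hosted_profile(void) {
#if defined(_MSC_VER)
    /* MSVC-style profile; per the text above, kit never defines this. */
    return "msvc";
#elif defined(__MINGW64__)
    /* 64-bit mingw-w64 (both aarch64-windows and x86_64-windows). */
    return "mingw-w64 (64-bit)";
#elif defined(__MINGW32__)
    /* mingw without __MINGW64__: a 32-bit target, not one kit supports. */
    return "mingw (32-bit)";
#else
    return "not a Windows hosted profile";
#endif
}
```

Code that keys Windows-specific paths on `_MSC_VER` alone would take the last branch's equivalent under kit, so such sources need a mingw branch as well.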
Baseline (done — context, not planned work)
Cross-build + native cc + native JIT all work on aarch64-windows, verified on the VM.
- Cross-build: scripts/windows_cross.sh aarch64 builds kit.exe (PE32+ console, ~20 MB) with the host build/kit as the cross-compiler against the llvm-mingw UCRT sysroot, overriding HOST_OS=windows HOST_ARCH=aarch64 so mk/env.mk selects driver/env/windows.c. (Windows can't bootstrap natively, since there is no seed C compiler in the VM, so the binary is cross-produced on the dev host; contrast the native Linux/FreeBSD bootstraps.)
- kit cc: compiles + links real C programs natively on the VM (hello.c → a hello.exe that prints and returns the right exit code), compiling libkit_rt on demand.
- kit run (JIT): executes self-contained programs (compute, data/globals), external calls (rand), and libc I/O via dlsym (puts).
- Toy corpus on the VM (native kit.exe, compare .expected):
  - AOT (kit cc compile + link + execute): 166 / 166 pass, 0 miscompile, 0 crash, at -O0.
  - JIT (kit run): 166 / 166. The printf-family and thread-local gaps (§1) are fixed; the two thread-local cases (141, 142) now return their values (43, 134).
The items below are what is not yet done. The concrete open bugs/blockers are collected first; the numbered sections are the larger roadmap items.
Open bugs / blockers
Concrete defects surfaced during bring-up, each blocking a roadmap item below.
- x64 self-host kit.exe crashes (0xC0000005) on most subcommands: OPEN. The x64 kit.exe loads and dispatches (help, hash work), but nm/size/cpp/as crash. Suspected root cause: the kit.exe binary itself still imports __intrinsic_setjmp from api-ms-win-crt-private-l1-1-0.dll (a CRT-private api-set that may not load cleanly on Prism). Despite the two-archive rt placement fix (§2) that resolves the issue for kit-compiled programs, something in kit.exe's own link is still pulling the COFF WEAK_EXTERNAL _setjmp → __intrinsic_setjmp alias from libucrt.a. The exact trigger is still under investigation: the strong _setjmp from libkit_rt.a does appear defined before libucrt.a is scanned, yet the import stub is pulled anyway, possibly via __imp___intrinsic_setjmp or a symbol not yet traced.
2. x86_64-windows parity
aarch64-windows is the reference. x64 codegen now shares the large-frame
stack-probe gate (§Baseline) via abi_stack_probe_interval, and the cross-link
now succeeds end to end.
What landed (this cycle):
- rt/lib/coro/x86_64_win.c now exports strong _setjmp and _setjmpex globals (in addition to the existing weak setjmp). mingw x64's setjmp.h expands setjmp(BUF) to _setjmp(BUF, frame) and declares _setjmp as a COFF WEAK_EXTERNAL aliasing __intrinsic_setjmp from api-ms-win-crt-private-l1-1-0.dll, a private CRT api-set that fails to load on some runtimes (e.g. 0xC0000139 on Prism). Providing a strong _setjmp in rt preempts the alias.
- driver/cmd/cc.c places libkit_rt.a at two positions in the Windows link order:
  - rt#1: immediately before the hosted "after" group (before libmingw32.a, libucrt.a, …). This ensures _setjmp is in "defined" before libucrt.a's lazy-pull loop runs, so the WEAK_EXTERNAL member that introduces __intrinsic_setjmp is never pulled.
  - rt#2: before crtend.o only (the original position). libucrt.a pulls libc members (printf, malloc, …) that contain large GCC-compiled stack frames whose ___chkstk_ms undefs appear only after libucrt.a's own scan completes; the second rt entry catches those late undefs. Because archive scanning is lazy, no symbol is ever defined twice.
- driver/env/windows.c: fixed a forward-reference to driver_join_path (used at its call site before the definition appeared in the same file).
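The two-position placement is easier to see in a toy model of lazy archive scanning. The sketch below is not kit's linker: the symbol bitmask, the `toy_link` entry point, and the member layout are all assumptions made for illustration. It models only the rule described above, that an archive member is pulled when it defines a symbol that is currently undefined, and it treats the weak alias as an ordinary definition.

```c
#include <assert.h>
#include <stddef.h>

/* One bit per symbol in the toy model. */
enum {
    SYM_SETJMP    = 1 << 0, /* _setjmp */
    SYM_PRINTF    = 1 << 1, /* printf */
    SYM_CHKSTK    = 1 << 2, /* ___chkstk_ms */
    SYM_INTRINSIC = 1 << 3  /* __intrinsic_setjmp import stub */
};

struct member     { unsigned defs, needs; };
struct archive    { const struct member *members; size_t count; };
struct link_state { unsigned defined, undefined; };

/* Toy libkit_rt.a: strong _setjmp and the stack probe. */
static const struct member rt_members[] = {
    { SYM_SETJMP, 0 },
    { SYM_CHKSTK, 0 },
};
/* Toy libucrt.a: the _setjmp alias, its import stub, and printf. */
static const struct member ucrt_members[] = {
    { SYM_SETJMP,    SYM_INTRINSIC },
    { SYM_INTRINSIC, 0 },
    { SYM_PRINTF,    SYM_CHKSTK },
};
static const struct archive rt_archive   = { rt_members, 2 };
static const struct archive ucrt_archive = { ucrt_members, 3 };

/* Pull members until none of them defines a still-undefined symbol. */
static void scan_archive(struct link_state *st, const struct archive *ar) {
    int pulled;
    do {
        pulled = 0;
        for (size_t i = 0; i < ar->count; i++) {
            const struct member *m = &ar->members[i];
            if (m->defs & st->undefined) {
                st->defined   |= m->defs;
                st->undefined |= m->needs;
                st->undefined &= ~st->defined;
                pulled = 1;
            }
        }
    } while (pulled);
}

/* order: 'R' = libkit_rt.a, 'U' = libucrt.a, scanned left to right.
 * The program itself is assumed to reference _setjmp and printf. */
struct link_state toy_link(const char *order) {
    struct link_state st = { 0, SYM_SETJMP | SYM_PRINTF };
    for (; *order; order++) {
        assert(*order == 'R' || *order == 'U');
        scan_archive(&st, *order == 'R' ? &rt_archive : &ucrt_archive);
    }
    return st;
}
```

In this model `toy_link("UR")` (rt only in the original position) resolves everything but ends with `SYM_INTRINSIC` defined, `toy_link("RU")` (rt only in front) avoids the stub but leaves `SYM_CHKSTK` undefined, and `toy_link("RUR")` resolves everything without pulling the stub. The second rt scan never re-pulls the `_setjmp` member, because that symbol is no longer undefined.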
VM verification status (Prism, Win11 25H2 ARM64):
- x64 kit.exe loads and dispatches (kit with no args prints the multitool help). ✓
- x64 kit.exe hash works (compute + file I/O, exit 0, correct digest). ✓
- A kit-compiled x64 non-setjmp hosted program (printf hello) runs and returns its exit code. ✓
- A kit-compiled x64 setjmp program runs end-to-end. ✓ (Fixed by the strong _setjmp in rt + two-archive placement.)
- x64 Toy AOT corpus (kit cc compile + link + execute, compare .expected): 164 / 165 at both -O0 and -O1. The one failure (118_decl_extra_attrs) is an aarch64-only ADRP-range link issue: pre-existing, not an x64 regression.
- The x64 self-host kit.exe itself is not yet usable as a compiler on the VM: nm/size/cpp/as crash (see Open bugs). Running the Toy AOT corpus through a native x64 kit.exe is blocked on fixing the __intrinsic_setjmp import.
- Known x64-windows codegen gaps already on file (from the toy VM lanes): 36/37* sret tail-call crash at -O1, and 118_decl_extra_attrs ADRP-range link is aarch64-only. Re-confirm against a native x64 kit.exe once the above are triaged.
3. A committed "compile-on-VM" test lane
test/toy/vm.sh windows today cross-compiles the toy cases on the host and
only executes on the VM, so it cannot catch self-host compile bugs (the §Baseline
stack-probe crash, for one, was invisible to it). The native self-host path
(kit.exe compiling on the VM) has been exercised ad hoc. Generalize it into a
committed lane (e.g. test/toy/vm.sh windows --native or a new harness) that
ships the case sources + a <bindir>/support/rt + the mingw sysroot to the VM,
compiles with kit.exe there, runs, and compares .expected. Gotchas to bake
in:
- PowerShell Start-Process -PassThru without -Wait always reports .ExitCode = 0. Use a [Diagnostics.Process] with WaitForExit(ms) + .ExitCode, or & exe; $LASTEXITCODE with a Defender path exclusion. Likewise, cmd %ERRORLEVEL% expands at parse time; use cmd /v:on + !ERRORLEVEL!.
- macOS tar adds AppleDouble ._* sidecars; ship case sources with COPYFILE_DISABLE=1 (same trap as the FreeBSD bootstrap).
- The mingw sysroot's <arch>-w64-mingw32/include is a symlink to generic-w64-mingw32/include; dereference (cp -RL) before shipping.
4. Distribution + default sysroot
- Ship the Windows distribution as <bindir>/support/rt (what kit install produces); rt now resolves from the real image path, so no cwd dependence.
- The Windows hosted profile requires --sysroot/KIT_SYSROOT (no default). Windows ships the runtime DLLs (ucrtbase.dll, the api-ms-win-crt-* api-sets, kernel32.dll) but no headers or import libs, so a sysroot is mandatory. Consider: bundling the mingw UCRT headers + import libs into the kit distribution (a baked sysroot), and/or teaching the COFF linker to synthesize imports directly from a system DLL's export table. The latter drops the import-lib half of the sysroot, but the crt2.o startup + libmingwex helpers (__mingw_setjmp, __local_stdio_printf_options, …) still have no DLL home and would have to move into libkit_rt for Windows.
5. Windows 3-stage bootstrap
The self-host milestone: use the cross-built kit.exe on the VM to compile
kit's own sources into a stage-2 kit.exe, then stage-3, and assert stage-2 ==
stage-3 byte-for-byte (cf. BOOTSTRAP.md, and the
native-bootstrap analogs scripts/{linux,freebsd}_bootstrap.sh). Unblocked
now that the large-frame stack-probe crash and the __FILE__ malformed UCN
bug are fixed — kit.exe compiles + links + runs the full Toy AOT corpus
(166/166) on the VM, and now compiles the rt tree on the VM cleanly (the
on-demand kit cc rt build succeeds). The malformed UCN bug turned out to be
a host-path stringization defect, not a self-compile codegen defect, so it
does not threaten stage-2==stage-3 identity. Next: confirm kit.exe can compile
the full libkit/driver source set on the VM, then add a
scripts/windows_bootstrap.sh to drive the VM-side stages and the
stage-2==stage-3 byte-identity check.
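The byte-identity assertion itself is small. The sketch below shows one way to express it in C, assuming the two stage binaries have already been copied to local paths; `files_identical` is a hypothetical helper, not something in the tree. Opening both files in binary mode matters on Windows, where text mode would translate line endings.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 1 if both files hold identical bytes,
 * 0 if they differ, and -1 if either file cannot be opened or read. */
int files_identical(const char *path_a, const char *path_b) {
    assert(path_a != NULL && path_b != NULL);

    FILE *a = fopen(path_a, "rb");
    FILE *b = fopen(path_b, "rb");
    int result = 1;

    if (!a || !b) {
        result = -1;
    } else {
        unsigned char buf_a[4096], buf_b[4096];
        for (;;) {
            size_t na = fread(buf_a, 1, sizeof buf_a, a);
            size_t nb = fread(buf_b, 1, sizeof buf_b, b);
            /* A length mismatch or any differing byte means not identical. */
            if (na != nb || memcmp(buf_a, buf_b, na) != 0) {
                result = 0;
                break;
            }
            if (na == 0)
                break; /* both files hit EOF together */
        }
        if (ferror(a) || ferror(b))
            result = -1;
    }

    if (a) fclose(a);
    if (b) fclose(b);
    return result;
}
```

A bootstrap driver would treat any result other than 1 for the stage-2 and stage-3 kit.exe as a failed run.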
6. Debugger fault-guard / SEH on Windows
driver/env/windows.c's driver_run_with_crash_guard is a no-op on Windows
(the POSIX path uses sigaction + sigsetjmp); a crashing kit run program
takes down kit.exe instead of reporting on_crash. The dbg guarded_copy also
relies on the VEH backstop rather than __try (kit's C frontend has no SEH).
A proper vectored-exception-handler port would give kit run/kit dbg
fault isolation on Windows.
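For reference, the general shape of a sigaction + sigsetjmp guard on POSIX looks roughly like the sketch below. It is not the code in the tree: `run_guarded`, `demo_ok`, and `demo_crash` are hypothetical names, and the real driver_run_with_crash_guard signature may differ. It is POSIX-only and will not build against mingw, which is the gap a vectored-exception-handler port would fill.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <string.h>

static sigjmp_buf guard_env;               /* recovery point for the guarded call */
static volatile sig_atomic_t guard_signo;  /* signal that triggered recovery */

static void guard_handler(int signo) {
    guard_signo = signo;
    siglongjmp(guard_env, 1);              /* unwind back into run_guarded */
}

/* Hypothetical helper: runs fn and returns 0 if it returned normally,
 * or the number of the fatal signal that interrupted it. Not reentrant. */
int run_guarded(void (*fn)(void)) {
    struct sigaction sa, old_segv, old_bus;
    int signo = 0;

    assert(fn != NULL);
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = guard_handler;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGSEGV, &sa, &old_segv);
    sigaction(SIGBUS, &sa, &old_bus);

    /* savemask = 1 so the signal mask is restored after siglongjmp. */
    if (sigsetjmp(guard_env, 1) == 0) {
        fn();
    } else {
        signo = guard_signo;
    }

    /* Restore whatever handlers were installed before the call. */
    sigaction(SIGSEGV, &old_segv, NULL);
    sigaction(SIGBUS, &old_bus, NULL);
    return signo;
}

/* Demo callees: one returns normally, one simulates a crash. */
void demo_ok(void) {}
void demo_crash(void) { raise(SIGSEGV); }
```

With this shape, `run_guarded(demo_ok)` returns 0 and `run_guarded(demo_crash)` returns `SIGSEGV` while the calling process keeps running, which is the isolation kit run currently lacks on Windows.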
Operational notes
- Build: scripts/windows_cross.sh [aarch64|x64] (needs make bin, the mingw sysroot via scripts/llvm_mingw_sysroot.sh prepare <arch>, and the rt variant). kit cc emits no -MMD depfiles, so the script wipes objects for a correct full rebuild each run.
- VM: scripts/windows_vm.sh boot|wait-ssh|run|ssh|stop (see ../WINDOWS.md). One ARM64 VM serves both arches.