kit

git clone https://git.ryansepassi.com/git/kit.git

commit 51b7bafacd32b879a65c5cb63ef0d8efa335e41b
parent 6cedc8f3b379c3369f243742bc1e3e23a5474e84
Author: Ryan Sepassi <rsepassi@gmail.com>
Date:   Fri, 29 May 2026 10:15:02 -0700

CG runtime doc plan

Diffstat:
A doc/GO_RUNTIME_CG_JIT.md | 279 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 279 insertions(+), 0 deletions(-)

diff --git a/doc/GO_RUNTIME_CG_JIT.md b/doc/GO_RUNTIME_CG_JIT.md
@@ -0,0 +1,279 @@
+# Go-like runtime support in CG and JIT
+
+This note sketches the interface changes cfree would likely want before
+supporting a language with Go-like runtime semantics.
+
+Go is statically typed, so the main gap is not dynamic typing in the codegen
+interface. The gap is the managed runtime model: precise GC, goroutines,
+managed stacks, interfaces/reflection metadata, panic/defer/recover, and a
+long-lived JIT image whose code and metadata can evolve safely.
+
+## Current fit
+
+`include/cfree/cg.h` is a typed storage and ABI interface. That is still the
+right boundary for a Go-like frontend. Source constructs such as slices,
+strings, interfaces, closures, maps, and channels can lower to records,
+pointers, indirect calls, and runtime helper calls.
+
+The parts that need new public or semi-public contracts are the places where a
+managed runtime must cooperate with generated code and the JIT:
+
+- where GC roots live at safepoints
+- which pointer stores need barriers
+- how managed stack growth and goroutine context are represented
+- how panic paths and implicit checks map back to language semantics
+- how runtime metadata is attached to JIT code
+- how live JIT code is appended, replaced, and retired
+
+The incremental and replacement side should build on `doc/INCREMENTAL_LINK.md`
+and `doc/HOT_RELOAD.md`.
+
+## Design principles
+
+- Keep CG typed. Do not turn `CfreeCg` into a dynamically typed language IR.
+- Keep source language concepts out of the backend where possible. Lower them
+  to storage types plus runtime metadata.
+- Make managed runtime behavior explicit. GC roots, safepoints, stack growth,
+  and barriers should not be inferred from raw loads and stores late in the
+  backend.
+- Keep AOT and normal C/Toy/Wasm JIT behavior unchanged by default.
+  Managed runtime features should be opt-in through code options, function attributes,
+  or language-specific frontend options.
+- Keep all runtime services on context/session structs. Do not add global
+  runtime state.
+
+## CG interface extensions
+
+### Precise GC metadata
+
+Add a way for frontends to describe managed roots and safepoints. A precise GC
+needs to know, for each safepoint PC, which frame slots, registers, params,
+spills, and globals contain managed pointers.
+
+Possible public shape:
+
+```c
+typedef enum CfreeCgGcRootKind {
+  CFREE_CG_GC_ROOT_LOCAL,
+  CFREE_CG_GC_ROOT_PARAM,
+  CFREE_CG_GC_ROOT_GLOBAL,
+} CfreeCgGcRootKind;
+
+typedef struct CfreeCgGcRoot {
+  uint8_t kind;
+  uint8_t pointer_kind;
+  uint16_t flags;
+  CfreeCgLocal local;
+  CfreeCgSym global;
+  uint32_t offset;
+  CfreeCgTypeId type;
+} CfreeCgGcRoot;
+
+CFREE_API void cfree_cg_safepoint(CfreeCg*, const CfreeCgGcRoot* roots,
+                                  uint32_t nroots);
+```
+
+The exact encoding should probably become compact backend metadata rather than
+DWARF-only data. DWARF can describe variables for humans, but the runtime needs
+fast PC-to-stackmap lookup.
+
+### Managed pointer identity
+
+Raw pointers and managed heap pointers need to be distinguishable. Options:
+
+- address spaces for managed heap pointers
+- pointer type attributes
+- explicit managed load/store/allocation intrinsics
+
+Address spaces are attractive because `cfree_cg_type_ptr` already accepts an
+address space. The missing piece is policy: which address spaces are scanned,
+movable, non-moving, interior, or raw.
+
+### Write barriers and allocation
+
+A Go-like runtime needs write barriers for pointer stores into managed heap
+objects. Do not require every frontend to open-code barrier sequences.
+
+Add either explicit operations:
+
+```c
+CFREE_API void cfree_cg_managed_store(CfreeCg*, CfreeCgMemAccess access,
+                                      CfreeCgEffAddr ea);
+```
+
+or runtime intrinsics:
+
+```c
+CFREE_CG_INTRIN_GC_ALLOC
+CFREE_CG_INTRIN_GC_WRITE_BARRIER
+CFREE_CG_INTRIN_GC_READ_BARRIER
+```
+
+The barrier operation should carry enough metadata for the runtime to identify
+the object base, field offset, and pointer kind. Ordinary C stores should remain
+ordinary stores.
+
+### Runtime calling convention
+
+The current call convention enum is mostly C ABI oriented. A managed runtime may
+need:
+
+- a hidden runtime/thread/goroutine context parameter
+- reserved registers
+- stack-bound checks in prologues
+- a runtime-specific helper-call ABI
+- better support for source-level multiple returns
+
+This could be expressed with a new call convention plus function attributes:
+
+```c
+CFREE_CG_CC_MANAGED
+
+typedef enum CfreeCgFuncFlag {
+  ...
+  CFREE_CG_FUNC_MANAGED_STACK = 1u << N,
+  CFREE_CG_FUNC_GC_SAFEPOINTS = 1u << M,
+} CfreeCgFuncFlag;
+```
+
+Multiple returns can be lowered through structs or sret today, but a first-class
+multi-result function model would better match Go and the internal IR's
+semantic multi-result direction.
+
+### Stack growth and goroutines
+
+Stackful coroutines already exist in the runtime, but Go-style goroutines need a
+different contract:
+
+- prologue stack checks
+- runtime calls to grow or switch stacks
+- maps describing live pointers before a growth call
+- frame relocation metadata if stacks can move
+- unwind/debug compatibility after stack movement
+
+This should be a managed-stack function attribute, not a generic default for all
+CG functions.
+
+### Panic and implicit checks
+
+Go-like languages turn many checks into language panics:
+
+- nil dereference
+- bounds check
+- divide by zero
+- failed type assertion
+- explicit `panic`
+
+CG should expose either explicit check operations or metadata attached to trap
+sites.
+The JIT/debug layer should be able to translate a trap PC into the
+language panic path instead of treating it as only a process signal.
+
+Useful metadata:
+
+- check kind
+- source location
+- recovery/cleanup target if any
+- runtime helper to call
+
+### Defer/recover and cleanup edges
+
+`defer` and `recover` require non-local control behavior that is more structured
+than plain `longjmp`. The minimal path is to lower defer management into runtime
+calls and make panic edges explicit enough for stack maps and debugger stepping.
+
+A more complete path would add cleanup/landing-pad metadata to CG, but that
+should be deferred until the runtime semantics are clearer.
+
+## JIT interface extensions
+
+### Transactional publish
+
+The existing `cfree_jit_publish` shape is the right direction. A managed runtime
+needs stronger guarantees around transactions:
+
+- compile/link failure leaves the live image unchanged
+- metadata is published atomically with code
+- old code remains executable while any frame may return into it
+- replacement increments the JIT generation
+- readers can detect generation changes
+
+This aligns with `doc/INCREMENTAL_LINK.md` and `doc/HOT_RELOAD.md`.
+
+### Runtime metadata registry
+
+Add a JIT metadata channel separate from object/debug inspection:
+
+```c
+typedef enum CfreeJitMetadataKind {
+  CFREE_JIT_META_STACK_MAP,
+  CFREE_JIT_META_FUNC_TABLE,
+  CFREE_JIT_META_TYPE_DESC,
+  CFREE_JIT_META_TRAP_TABLE,
+  CFREE_JIT_META_INLINE_TABLE,
+} CfreeJitMetadataKind;
+```
+
+The runtime needs fast queries:
+
+- PC to function
+- PC to stack map
+- PC to trap/check record
+- PC to inline/source frame
+- symbol to active generation
+
+`cfree_jit_view` can remain object/DWARF oriented. Runtime metadata should be
+compact, stable, and queryable without parsing DWARF.
+
+### Code lifetime and reclamation
+
+Hot reload for a managed runtime cannot immediately unmap old code.
Add +lifetime hooks or state transitions: + +- active +- replaced but still callable by existing frames +- retired +- reclaimable + +The runtime or debugger should be able to veto reclamation until stack scanning +proves no frame or return address is inside an old generation. + +### Managed entry invocation + +The current debugger entry call helpers are narrow: argv-style or `u64` values. +Managed language entry points need a descriptor-based call path or explicit +language trampolines so the host can pass runtime context, managed arguments, +and return slots safely. + +The conservative design is to keep the low-level JIT call ABI simple and require +the frontend/runtime to emit C-callable trampolines. + +### Thread and stop-the-world coordination + +`CfreeJitSession` currently models controlled in-process execution for a single +JIT image. A managed runtime eventually needs hooks for: + +- safepoint polling +- cooperative stop requests +- thread/goroutine enumeration +- stack scanning while stopped +- metadata refresh while workers are paused + +This does not need to be part of the first CG change, but the JIT metadata and +publish APIs should avoid assuming a single worker forever. + +## Suggested sequence + +1. Add managed pointer/address-space policy and explicit safepoint records. +2. Emit and query compact stack maps from the JIT image. +3. Add managed allocation and write-barrier intrinsics. +4. Add managed-stack function attrs and stack-check lowering. +5. Publish runtime metadata transactionally with JIT appends. +6. Add trap/check tables for panic lowering. +7. Add function replacement/lifetime support on top of hot reload. +8. Only then broaden debugger/session APIs for multi-threaded managed runtime + coordination. 
+
+The first useful milestone is smaller than "compile Go": compile a tiny
+managed-language frontend that can allocate traced objects, hit safepoints, let
+the host enumerate roots from a JIT stack map, and call through a C-callable
+runtime trampoline.
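
The note asks for fast PC-to-stackmap lookup and for the host to enumerate roots from a JIT stack map. A minimal sketch of what that query could look like follows. Everything in it is hypothetical and not part of cfree: `StackMapEntry`, `slot_mask`, `stackmap_lookup`, `enumerate_roots`, and `count_root` are illustrative names, and the layout assumes at most 32 pointer-sized frame slots per safepoint and a table sorted by ascending PC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical compact stack-map record, one per safepoint PC. */
typedef struct StackMapEntry {
  uintptr_t pc;       /* PC of the safepoint (for example a return address) */
  uint32_t slot_mask; /* bit i set: frame slot i holds a managed pointer */
} StackMapEntry;

/* Binary search over a table sorted by ascending pc.
   Returns NULL when pc is not a recorded safepoint. */
static const StackMapEntry* stackmap_lookup(const StackMapEntry* map, size_t n,
                                            uintptr_t pc) {
  size_t lo = 0, hi = n;
  while (lo < hi) {
    size_t mid = lo + (hi - lo) / 2;
    if (map[mid].pc < pc) {
      lo = mid + 1;
    } else {
      hi = mid;
    }
  }
  return (lo < n && map[lo].pc == pc) ? &map[lo] : NULL;
}

/* Callback invoked with the address of each live managed slot. */
typedef void (*RootVisitor)(void** slot, void* ctx);

/* Visits every frame slot marked live at pc.
   Returns the number of slots visited, or -1 if pc has no stack map. */
static int enumerate_roots(const StackMapEntry* map, size_t n, uintptr_t pc,
                           void** frame, RootVisitor visit, void* ctx) {
  assert(frame != NULL && visit != NULL);
  const StackMapEntry* e = stackmap_lookup(map, n, pc);
  if (e == NULL) return -1;
  int count = 0;
  for (uint32_t i = 0; i < 32; i++) {
    if (e->slot_mask & (UINT32_C(1) << i)) {
      visit(&frame[i], ctx);
      count++;
    }
  }
  return count;
}

/* Example visitor: counts roots by incrementing the int that ctx points to. */
static void count_root(void** slot, void* ctx) {
  (void)slot;
  ++*(int*)ctx;
}
```

A collector would pass a visitor that marks or relocates the object behind each slot. The sorted-array form keeps lookup at O(log n) without parsing DWARF, which is the property the note asks for; a real encoding would also need to cover registers, spills, and per-generation tables.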