commit 51b7bafacd32b879a65c5cb63ef0d8efa335e41b
parent 6cedc8f3b379c3369f243742bc1e3e23a5474e84
Author: Ryan Sepassi <rsepassi@gmail.com>
Date: Fri, 29 May 2026 10:15:02 -0700
CG runtime doc plan
Diffstat:
1 file changed, 279 insertions(+), 0 deletions(-)
diff --git a/doc/GO_RUNTIME_CG_JIT.md b/doc/GO_RUNTIME_CG_JIT.md
@@ -0,0 +1,279 @@
+# Go-like runtime support in CG and JIT
+
+This note sketches the interface changes cfree would likely want before
+supporting a language with Go-like runtime semantics.
+
+Go is statically typed, so the main gap is not dynamic typing in the codegen
+interface. The gap is the managed runtime model: precise GC, goroutines,
+managed stacks, interfaces/reflection metadata, panic/defer/recover, and a
+long-lived JIT image whose code and metadata can evolve safely.
+
+## Current fit
+
+`include/cfree/cg.h` is a typed storage and ABI interface. That is still the
+right boundary for a Go-like frontend. Source constructs such as slices,
+strings, interfaces, closures, maps, and channels can lower to records,
+pointers, indirect calls, and runtime helper calls.
+
+The parts that need new public or semi-public contracts are the places where a
+managed runtime must cooperate with generated code and the JIT:
+
+- where GC roots live at safepoints
+- which pointer stores need barriers
+- how managed stack growth and goroutine context are represented
+- how panic paths and implicit checks map back to language semantics
+- how runtime metadata is attached to JIT code
+- how live JIT code is appended, replaced, and retired
+
+The incremental and replacement side should build on `doc/INCREMENTAL_LINK.md`
+and `doc/HOT_RELOAD.md`.
+
+## Design principles
+
+- Keep CG typed. Do not turn `CfreeCg` into a dynamically typed language IR.
+- Keep source language concepts out of the backend where possible. Lower them
+ to storage types plus runtime metadata.
+- Make managed runtime behavior explicit. GC roots, safepoints, stack growth,
+ and barriers should not be inferred from raw loads and stores late in the
+ backend.
+- Keep AOT and normal C/Toy/Wasm JIT behavior unchanged by default. Managed
+ runtime features should be opt-in through code options, function attributes,
+ or language-specific frontend options.
+- Keep all runtime services on context/session structs. Do not add global
+ runtime state.
+
+## CG interface extensions
+
+### Precise GC metadata
+
+Add a way for frontends to describe managed roots and safepoints. A precise GC
+needs to know, for each safepoint PC, which frame slots, registers, params,
+spills, and globals contain managed pointers.
+
+Possible public shape:
+
+```c
+typedef enum CfreeCgGcRootKind {
+ CFREE_CG_GC_ROOT_LOCAL,
+ CFREE_CG_GC_ROOT_PARAM,
+ CFREE_CG_GC_ROOT_GLOBAL,
+} CfreeCgGcRootKind;
+
+typedef struct CfreeCgGcRoot {
+ uint8_t kind; /* a CfreeCgGcRootKind value */
+ uint8_t pointer_kind;
+ uint16_t flags;
+ CfreeCgLocal local;
+ CfreeCgSym global;
+ uint32_t offset;
+ CfreeCgTypeId type;
+} CfreeCgGcRoot;
+
+CFREE_API void cfree_cg_safepoint(CfreeCg*, const CfreeCgGcRoot* roots,
+ uint32_t nroots);
+```
+
+The exact encoding should probably become compact backend metadata rather than
+DWARF-only data. DWARF can describe variables for humans, but the runtime needs
+fast PC-to-stackmap lookup.
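+
+A minimal sketch of the PC-to-stackmap lookup, assuming the backend emits one
+record per safepoint, sorted by PC offset. `StackMapEntry` and `stackmap_find`
+are hypothetical names for illustration, not existing cfree API, and the
+32-slot bitmask is only one possible encoding.
+
+```c
+#include <stddef.h>
+#include <stdint.h>
+
+/* Hypothetical record: one per safepoint, sorted by pc_offset. */
+typedef struct StackMapEntry {
+  uint32_t pc_offset; /* safepoint PC relative to the code base */
+  uint32_t slot_mask; /* bit i set: frame slot i holds a managed pointer */
+} StackMapEntry;
+
+/* Binary search for an exact safepoint PC. Returns NULL when pc_offset is
+   not a recorded safepoint. */
+static const StackMapEntry* stackmap_find(const StackMapEntry* map, size_t n,
+                                          uint32_t pc_offset) {
+  size_t lo = 0, hi = n;
+  while (lo < hi) {
+    size_t mid = lo + (hi - lo) / 2;
+    if (map[mid].pc_offset < pc_offset)
+      lo = mid + 1;
+    else
+      hi = mid;
+  }
+  return (lo < n && map[lo].pc_offset == pc_offset) ? &map[lo] : NULL;
+}
+```
+
+A sorted array keeps the lookup at O(log n) with no allocation, which matters
+when the collector walks many frames while the world is stopped.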
+
+### Managed pointer identity
+
+Raw pointers and managed heap pointers need to be distinguishable. Options:
+
+- address spaces for managed heap pointers
+- pointer type attributes
+- explicit managed load/store/allocation intrinsics
+
+Address spaces are attractive because `cfree_cg_type_ptr` already accepts an
+address space. The missing piece is policy: which address spaces are scanned,
+movable, non-moving, interior, or raw.
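+
+One way to express that policy is a small table indexed by address space. The
+sketch below is an assumption for illustration: the flag names, the table, and
+the assignment of spaces 0 to 2 are all hypothetical, not part of cfree.
+
+```c
+#include <stdbool.h>
+#include <stdint.h>
+
+/* Hypothetical policy bits for one pointer address space. */
+enum {
+  AS_POLICY_SCANNED = 1u << 0,  /* the GC traces pointers in this space */
+  AS_POLICY_MOVABLE = 1u << 1,  /* referents may be relocated */
+  AS_POLICY_INTERIOR = 1u << 2, /* may point into the middle of an object */
+};
+
+/* Illustrative assignment: 0 = raw C pointers, 1 = managed heap pointers,
+   2 = managed interior pointers. */
+static const uint8_t as_policy[] = {
+  0,
+  AS_POLICY_SCANNED | AS_POLICY_MOVABLE,
+  AS_POLICY_SCANNED | AS_POLICY_MOVABLE | AS_POLICY_INTERIOR,
+};
+
+/* Unknown address spaces are treated as raw and are never scanned. */
+static bool as_is_scanned(uint32_t addr_space) {
+  if (addr_space >= sizeof as_policy / sizeof as_policy[0]) return false;
+  return (as_policy[addr_space] & AS_POLICY_SCANNED) != 0;
+}
+```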
+
+### Write barriers and allocation
+
+A Go-like runtime needs write barriers for pointer stores into managed heap
+objects. Do not require every frontend to open-code barrier sequences.
+
+Add either explicit operations:
+
+```c
+CFREE_API void cfree_cg_managed_store(CfreeCg*, CfreeCgMemAccess access,
+ CfreeCgEffAddr ea);
+```
+
+or runtime intrinsics:
+
+```c
+CFREE_CG_INTRIN_GC_ALLOC
+CFREE_CG_INTRIN_GC_WRITE_BARRIER
+CFREE_CG_INTRIN_GC_READ_BARRIER
+```
+
+The barrier operation should carry enough metadata for the runtime to identify
+the object base, field offset, and pointer kind. Ordinary C stores should remain
+ordinary stores.
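+
+As a rough sketch of what a barrier helper might do at runtime, the code below
+uses a Dijkstra-style insertion barrier. That choice is only for illustration;
+this note does not commit to a barrier algorithm. `GcBarrierCtx` and
+`gc_managed_store` are hypothetical names, and state lives on a context struct
+rather than in globals.
+
+```c
+#include <stdbool.h>
+#include <stddef.h>
+#include <string.h>
+
+#define GC_LOG_CAP 64
+
+/* Hypothetical per-context barrier state. */
+typedef struct GcBarrierCtx {
+  bool marking;          /* true while the collector is marking */
+  void* log[GC_LOG_CAP]; /* referents the collector must revisit */
+  size_t nlog;
+} GcBarrierCtx;
+
+/* While marking, record the new referent, then perform the store into the
+   field at obj_base + field_off. Returns false without storing when the log
+   is full, so the caller can flush and retry. */
+static bool gc_managed_store(GcBarrierCtx* ctx, void* obj_base,
+                             size_t field_off, void* new_ptr) {
+  if (ctx->marking && new_ptr != NULL) {
+    if (ctx->nlog == GC_LOG_CAP) return false;
+    ctx->log[ctx->nlog++] = new_ptr;
+  }
+  memcpy((char*)obj_base + field_off, &new_ptr, sizeof new_ptr);
+  return true;
+}
+```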
+
+### Runtime calling convention
+
+The current call convention enum is mostly C ABI oriented. A managed runtime may
+need:
+
+- a hidden runtime/thread/goroutine context parameter
+- reserved registers
+- stack-bound checks in prologues
+- a runtime-specific helper-call ABI
+- better support for source-level multiple returns
+
+This could be expressed with a new call convention plus function attributes:
+
+```c
+CFREE_CG_CC_MANAGED
+
+typedef enum CfreeCgFuncFlag {
+ ...
+ CFREE_CG_FUNC_MANAGED_STACK = 1u << N,
+ CFREE_CG_FUNC_GC_SAFEPOINTS = 1u << M,
+} CfreeCgFuncFlag;
+```
+
+Multiple returns can be lowered through structs or sret today, but a first-class
+multi-result function model would better match Go and the direction the
+internal IR is taking toward semantic multi-result functions.
+
+### Stack growth and goroutines
+
+Stackful coroutines already exist in the runtime, but Go-style goroutines need a
+different contract:
+
+- prologue stack checks
+- runtime calls to grow or switch stacks
+- maps describing live pointers before a growth call
+- frame relocation metadata if stacks can move
+- unwind/debug compatibility after stack movement
+
+This should be a managed-stack function attribute, not a generic default for all
+CG functions.
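+
+The prologue check itself is small. The sketch below models it in plain C,
+assuming a downward-growing stack and a hidden goroutine context parameter.
+`GoroutineCtx` and `stack_needs_growth` are hypothetical names; a real backend
+would emit the comparison inline rather than call a function.
+
+```c
+#include <stdbool.h>
+#include <stdint.h>
+
+/* Hypothetical goroutine context passed as the hidden parameter. */
+typedef struct GoroutineCtx {
+  uintptr_t stack_lo;    /* lowest usable stack address */
+  uintptr_t stack_guard; /* bytes kept free for runtime helpers */
+} GoroutineCtx;
+
+/* What a managed prologue would test before allocating its frame. Returns
+   true when the runtime must grow or switch stacks first. */
+static bool stack_needs_growth(const GoroutineCtx* g, uintptr_t sp,
+                               uintptr_t frame_size) {
+  uintptr_t limit = g->stack_lo + g->stack_guard;
+  return sp < limit || sp - limit < frame_size;
+}
+```
+
+Comparing `sp - limit` against the frame size, rather than `sp - frame_size`
+against the limit, avoids unsigned underflow when the frame is larger than the
+remaining stack.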
+
+### Panic and implicit checks
+
+Go-like languages turn many checks into language panics:
+
+- nil dereference
+- bounds check
+- divide by zero
+- failed type assertion
+- explicit `panic`
+
+CG should expose either explicit check operations or metadata attached to trap
+sites. The JIT/debug layer should be able to translate a trap PC into the
+language panic path instead of treating it only as a process signal.
+
+Useful metadata:
+
+- check kind
+- source location
+- recovery/cleanup target if any
+- runtime helper to call
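+
+A trap-site record carrying that metadata could look like the sketch below.
+All names are hypothetical, the messages are placeholders rather than Go's
+actual panic strings, and the runtime helper is reduced to a message lookup to
+keep the example short.
+
+```c
+#include <stddef.h>
+#include <stdint.h>
+
+/* Hypothetical check kinds, mirroring the list of checks above. */
+typedef enum TrapCheckKind {
+  TRAP_NIL_DEREF,
+  TRAP_BOUNDS,
+  TRAP_DIV_ZERO,
+  TRAP_TYPE_ASSERT,
+  TRAP_EXPLICIT_PANIC,
+} TrapCheckKind;
+
+/* Hypothetical per-site record keyed by the trapping PC. */
+typedef struct TrapRecord {
+  uint32_t pc_offset;      /* trap PC relative to the code base */
+  uint32_t line;           /* source line of the check */
+  uint32_t recover_offset; /* cleanup target, or 0 if none */
+  uint8_t kind;            /* a TrapCheckKind value */
+} TrapRecord;
+
+/* Linear scan for brevity; a real table would be sorted and searched. */
+static const TrapRecord* trap_find(const TrapRecord* t, size_t n,
+                                   uint32_t pc_offset) {
+  for (size_t i = 0; i < n; i++)
+    if (t[i].pc_offset == pc_offset) return &t[i];
+  return NULL; /* not a language check: treat as a genuine fault */
+}
+
+/* Map a check kind to the message the panic path would report. */
+static const char* trap_message(uint8_t kind) {
+  switch (kind) {
+    case TRAP_NIL_DEREF: return "nil dereference";
+    case TRAP_BOUNDS: return "index out of range";
+    case TRAP_DIV_ZERO: return "divide by zero";
+    case TRAP_TYPE_ASSERT: return "failed type assertion";
+    case TRAP_EXPLICIT_PANIC: return "explicit panic";
+    default: return "unknown trap";
+  }
+}
+```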
+
+### Defer/recover and cleanup edges
+
+`defer` and `recover` require non-local control behavior that is more structured
+than plain `longjmp`. The minimal path is to lower defer management into runtime
+calls and make panic edges explicit enough for stack maps and debugger stepping.
+
+A more complete path would add cleanup/landing-pad metadata to CG, but that
+should be deferred until the runtime semantics are clearer.
+
+## JIT interface extensions
+
+### Transactional publish
+
+The existing `cfree_jit_publish` shape is the right direction. A managed runtime
+needs stronger guarantees around transactions:
+
+- compile/link failure leaves the live image unchanged
+- metadata is published atomically with code
+- old code remains executable while any frame may return into it
+- replacement increments the JIT generation
+- readers can detect generation changes
+
+This aligns with `doc/INCREMENTAL_LINK.md` and `doc/HOT_RELOAD.md`.
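+
+The guarantees above can be sketched with an immutable snapshot that readers
+load atomically. This is an illustration under stated assumptions, not the
+actual `cfree_jit_publish` behavior: `JitSnapshot`, `JitLive`, and
+`jit_publish_snapshot` are hypothetical, and a single publisher thread is
+assumed.
+
+```c
+#include <stdatomic.h>
+#include <stdbool.h>
+#include <stddef.h>
+#include <stdint.h>
+
+/* Hypothetical snapshot: code and metadata travel together. */
+typedef struct JitSnapshot {
+  const void* code;
+  const void* metadata;
+  uint64_t generation;
+} JitSnapshot;
+
+typedef struct JitLive {
+  _Atomic(const JitSnapshot*) current; /* what readers observe */
+} JitLive;
+
+/* Publish `next` only if compile/link succeeded. On failure the live image
+   is untouched. On success, readers see the new code, metadata, and
+   generation together through one pointer store. */
+static bool jit_publish_snapshot(JitLive* live, JitSnapshot* next,
+                                 bool link_ok) {
+  if (!link_ok) return false;
+  const JitSnapshot* old =
+      atomic_load_explicit(&live->current, memory_order_acquire);
+  next->generation = old ? old->generation + 1 : 1;
+  atomic_store_explicit(&live->current, next, memory_order_release);
+  return true;
+}
+```
+
+The old snapshot is deliberately not freed here, since existing frames may
+still return into its code.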
+
+### Runtime metadata registry
+
+Add a JIT metadata channel separate from object/debug inspection:
+
+```c
+typedef enum CfreeJitMetadataKind {
+ CFREE_JIT_META_STACK_MAP,
+ CFREE_JIT_META_FUNC_TABLE,
+ CFREE_JIT_META_TYPE_DESC,
+ CFREE_JIT_META_TRAP_TABLE,
+ CFREE_JIT_META_INLINE_TABLE,
+} CfreeJitMetadataKind;
+```
+
+The runtime needs fast queries:
+
+- PC to function
+- PC to stack map
+- PC to trap/check record
+- PC to inline/source frame
+- symbol to active generation
+
+`cfree_jit_view` can remain object/DWARF oriented. Runtime metadata should be
+compact, stable, and queryable without parsing DWARF.
+
+### Code lifetime and reclamation
+
+Hot reload for a managed runtime cannot immediately unmap old code. Add
+lifetime hooks or state transitions:
+
+- active
+- replaced but still callable by existing frames
+- retired
+- reclaimable
+
+The runtime or debugger should be able to veto reclamation until stack scanning
+proves no frame or return address is inside an old generation.
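+
+One possible reading of those transitions as a state machine is sketched below.
+The names are hypothetical, and the exact conditions for each step (a clean
+stack scan before retirement, no outstanding veto before reclamation) are
+assumptions rather than settled design.
+
+```c
+#include <stdbool.h>
+#include <stdint.h>
+
+typedef enum JitCodeState {
+  JIT_CODE_ACTIVE,
+  JIT_CODE_REPLACED, /* still callable by existing frames */
+  JIT_CODE_RETIRED,
+  JIT_CODE_RECLAIMABLE,
+} JitCodeState;
+
+/* Hypothetical bookkeeping for one code generation. */
+typedef struct JitCodeRegion {
+  JitCodeState state;
+  uint32_t frames_in_region; /* result of the most recent stack scan */
+  bool vetoed;               /* runtime or debugger holds the region */
+} JitCodeRegion;
+
+/* Try to advance one step and return the resulting state. A region never
+   moves backward and never skips a state. */
+static JitCodeState jit_code_advance(JitCodeRegion* r) {
+  switch (r->state) {
+    case JIT_CODE_ACTIVE:
+      r->state = JIT_CODE_REPLACED;
+      break;
+    case JIT_CODE_REPLACED:
+      if (r->frames_in_region == 0) r->state = JIT_CODE_RETIRED;
+      break;
+    case JIT_CODE_RETIRED:
+      if (!r->vetoed) r->state = JIT_CODE_RECLAIMABLE;
+      break;
+    case JIT_CODE_RECLAIMABLE:
+      break;
+  }
+  return r->state;
+}
+```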
+
+### Managed entry invocation
+
+The current debugger entry call helpers are narrow: they accept only argv-style
+arguments or `u64` values. Managed language entry points need a descriptor-based
+call path or explicit language trampolines so the host can pass runtime context,
+managed arguments, and return slots safely.
+
+The conservative design is to keep the low-level JIT call ABI simple and require
+the frontend/runtime to emit C-callable trampolines.
+
+### Thread and stop-the-world coordination
+
+`CfreeJitSession` currently models controlled in-process execution for a single
+JIT image. A managed runtime eventually needs hooks for:
+
+- safepoint polling
+- cooperative stop requests
+- thread/goroutine enumeration
+- stack scanning while stopped
+- metadata refresh while workers are paused
+
+This does not need to be part of the first CG change, but the JIT metadata and
+publish APIs should avoid assuming a single worker forever.
+
+## Suggested sequence
+
+1. Add managed pointer/address-space policy and explicit safepoint records.
+2. Emit and query compact stack maps from the JIT image.
+3. Add managed allocation and write-barrier intrinsics.
+4. Add managed-stack function attrs and stack-check lowering.
+5. Publish runtime metadata transactionally with JIT appends.
+6. Add trap/check tables for panic lowering.
+7. Add function replacement/lifetime support on top of hot reload.
+8. Only then broaden debugger/session APIs for multi-threaded managed runtime
+ coordination.
+
+The first useful milestone is smaller than "compile Go": compile a tiny
+managed-language frontend that can allocate traced objects, hit safepoints, let
+the host enumerate roots from a JIT stack map, and call through a C-callable
+runtime trampoline.