kit

git clone https://git.ryansepassi.com/git/kit.git

commit 04a9f5552fdff742765b8d0b2b3a1f4d72f1d38f
parent 1bf64d6836d6e075edcf407dc848c199f777f2ad
Author: Ryan Sepassi <rsepassi@gmail.com>
Date:   Wed, 13 May 2026 10:59:36 -0700

cg-ext.md plan

Diffstat:
A doc/cg-ext.md | 551 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 551 insertions(+), 0 deletions(-)

diff --git a/doc/cg-ext.md b/doc/cg-ext.md
@@ -0,0 +1,551 @@
+# Public CG Extension Plan
+
+Scope: extensions needed for `include/cfree/cg.h` to serve as a portable
+direct codegen API for frontends other than C. This is not a plan for a stored
+LLVM-like IR. `CfreeCg` remains an imperative emitter bound to a
+`CfreeObjBuilder`; frontends lower their own AST/HIR/MIR directly into the API.
+
+This API is new enough that compatibility with the current draft is not a
+constraint. Make breaking changes. One clean way of doing things.
+
+The target user is a language frontend with its own parser, type checker, and
+high-level lowering: C, Zig, Rust-like languages, toy languages, emulators, and
+system DSLs. The frontend should not include internal `src/` headers, should
+not know `Type*`, `ObjSymId`, `CGTarget`, or `MCEmitter`, and should be able to
+generate correct code for every backend supported by `CfreeTarget`.
+
+## 1. Goals
+
+- Preserve the direct-emission model: no public module/value/block IR object is
+  required.
+- Focus on backend codegen coverage and correctness, not frontend ergonomics.
+- Keep backend decisions in the backend: ABI classification, TLS sequences,
+  GOT/PLT/stubs/IAT, branch relaxation, relocation encoding, and section layout.
+- Let frontends state facts that materially affect generated code: calling
+  convention, ABI attributes, memory access properties, atomics, volatility,
+  linkage, object placement, and source/debug identity.
+- Keep the surface portable but not lowest-common-denominator. Unsupported
+  target combinations should be diagnosable from API calls.
+- Keep public handles opaque/integer-sized and context-owned. No global state.
+- Maintain one way to spell each codegen fact.
+
+## 2. Non-goals
+
+- A serialized IR, textual IR, pass manager, verifier over stored functions, or
+  reusable use-def graph.
+- Language semantics above the codegen boundary. Borrow checking, comptime,
+  monomorphization, generics, trait dispatch, overload resolution, destructor
+  insertion, and safety checks belong in the frontend.
+- Arbitrary source-language types. The frontend lowers them to codegen storage,
+  ABI, memory, and debug facts before calling CG.
+- Unwind/exception handling beyond the existing setjmp/longjmp intrinsics.
+  Panic/throw paths must lower to explicit normal control flow plus calls, or
+  to noreturn runtime helpers.
+- Full LTO. Direct CG may still feed the existing optimizer wrapper, but that is
+  an implementation detail below this public API.
+
+## 3. Current Shape
+
+The existing public CG API already provides useful pieces:
+
+- Target context through `CfreeCompiler` / `CfreeTarget`.
+- Builtin integer, float, pointer, array, function, record, enum, alias, and
+  qualified types.
+- Symbol declarations, visibility, TLS model, object definitions, relocatable
+  data expressions, and direct/indirect calls.
+- A value stack with lvalue/rvalue conversion, local/param slots, labels,
+  structured scopes, arithmetic, comparisons, conversions, intrinsics, atomics,
+  inline asm, and varargs.
+
+The largest limitation is that too many important backend facts are currently
+implicit, C-shaped, duplicated between type and operation APIs, or
+unrepresentable.
+
+## 4. Type Model
+
+The type model should describe codegen storage and ABI classification, not
+source-language semantics. A Rust `u32`, C `unsigned int`, Zig `u32`, and an
+emulator's 32-bit guest register can all use the same codegen integer type.
+
+### 4.1 Integers
+
+Use width-only integer storage types. Signedness belongs on operations,
+comparisons, conversions, and ABI extension attributes, not on the integer type.
+
+Recommended integer builtins:
+
+- `i1`/`bool` as the branch and compare-result type.
+- `i8`, `i16`, `i32`, `i64`.
+- `i128`, with helper-lowered arithmetic and ABI handling for targets that lack
+  native support.
+- `isize`/`usize` are frontend aliases, not distinct codegen storage types. The
+  frontend can choose `i32` or `i64` from the target pointer size.
+
+Consequences:
+
+- Remove separate signed/unsigned integer type constructors or builtins.
+- Keep signed/unsigned operation variants where semantics differ:
+  signed/unsigned div/rem/compare, sign/zero extension, arithmetic right shift
+  versus logical right shift.
+- Constants are bit patterns interpreted by the operation that consumes them.
+
+### 4.2 Floating-Point
+
+Support only the floating storage types the backend can define and lower
+correctly.
+
+Baseline:
+
+- `f32`
+- `f64`
+
+Later additions should be explicit project choices:
+
+- `f16` / `bf16` if frontend SIMD/platform intrinsics need them.
+- `f80` / `f128` only with target ABI and helper-call support.
+
+Floating arithmetic and comparisons still need operation-level attributes; see
+section 6.
+
+### 4.3 Pointers
+
+Keep pointer types as codegen storage/ABI facts. Pointee types are useful for
+load/store defaults and debug synthesis, but memory access semantics should
+come from `CfreeCgMemAccess`, not from type qualifiers.
+
+Recommended pointer model:
+
+- One thin pointer type constructor: pointee type + address space.
+- Address space 0 is the normal target data address space.
+- No type-level nullability, restrict, readonly, volatile, or mutability.
+  Express these at the operation, declaration, or parameter-attribute site.
+- Fat pointers are frontend-lowered aggregates. Capability pointers should wait
+  until a real target requires them.
+
+### 4.4 Aggregates and Layout
+
+Keep aggregate support only where the backend needs the aggregate shape for
+correct codegen:
+
+- ABI classification of parameters and returns.
+- Natural target layout for C-like records.
+- Data object sizing/alignment.
+- Debug synthesis when possible.
+
+Frontends can lower many patterns to existing codegen constructs.
+
+The gap to close is not richer source aggregate modeling. The useful backend
+primitive is generic address arithmetic:
+
+```c
+/* Pops a pointer or lvalue address, pushes address + byte_offset as a pointer
+ * or lvalue address with the requested result type. */
+void cfree_cg_addr_offset(CfreeCg*, int64_t byte_offset,
+                          CfreeCgTypeId result_type);
+```
+
+This gives frontends one way to lower non-C layouts without asking CG to
+understand the source aggregate.
+
+### 4.5 Qualifiers
+
+Remove C-style qualified codegen types as behavior-carrying types.
+
+- `const` is a frontend type-checking fact or an object/read-only declaration
+  fact.
+- `volatile` is a memory access fact.
+- `restrict` / `noalias` is a pointer parameter or memory access fact.
+
+If debug info needs source qualifiers, they belong in debug metadata derived
+from declarations, not in backend codegen types.
+
+### 4.6 Type Queries
+
+Keep target-layout queries that frontends need for lowering:
+
+- Type kind.
+- Size and alignment.
+- Integer/float width.
+- Pointer address space and pointee.
+- Array element/count.
+- Record field offset where CG owns natural record layout.
+- Function ABI/calling-convention attributes.
+
+Avoid queries whose only purpose is reconstructing source-language types.
+
+## 5. Memory Access
+
+Memory semantics should have exactly one spelling: a memory access descriptor
+on every operation that touches memory. Do not split behavior between type
+qualifiers, lvalue flags, and special load/store variants.
+
+Recommended descriptor:
+
+```c
+typedef struct CfreeCgMemAccess {
+  CfreeCgTypeId type; /* value type loaded/stored, or element type */
+  uint32_t align; /* 0 = natural for type */
+  uint32_t address_space; /* normally inherited from pointer type */
+  uint32_t flags; /* VOLATILE, NONTEMPORAL, INVARIANT, etc. */
+  uint32_t alias_scope;
+  uint32_t noalias_scope;
+} CfreeCgMemAccess;
+```
+
+Recommended operations:
+
+```c
+void cfree_cg_load(CfreeCg*, CfreeCgMemAccess access);
+void cfree_cg_store(CfreeCg*, CfreeCgMemAccess access);
+void cfree_cg_memcpy(CfreeCg*, uint64_t size,
+                     CfreeCgMemAccess dst, CfreeCgMemAccess src);
+void cfree_cg_memmove(CfreeCg*, uint64_t size,
+                      CfreeCgMemAccess dst, CfreeCgMemAccess src);
+void cfree_cg_memset(CfreeCg*, uint8_t value, uint64_t size,
+                     CfreeCgMemAccess dst);
+```
+
+Consequences:
+
+- Remove type-level volatile behavior.
+- Remove separate fixed-size aggregate memory APIs that take only size/align.
+- Remove implicit load/store type inference when it can be ambiguous. The
+  access descriptor is the authority.
+- Keep convenience constructors for common descriptors if desired, but not
+  alternate semantic entry points.
+
+Needed access facts:
+
+- Explicit alignment, including known under-alignment.
+- Volatile load/store.
+- Non-temporal/cache hints.
+- Invariant/readonly memory for constants and promoted immutable globals.
+- Alias scopes and noalias scopes. Rust `&mut`, C `restrict`, Zig `noalias`,
+  and frontend escape analysis can all feed this conservatively.
+
+## 6. Operation Semantics
+
+Integer and floating operations need attributes describing language semantics.
+
+### 6.1 Integer Ops
+
+Keep signedness on operations, not on types.
+
+Required operation families:
+
+- Add, sub, mul, bitwise and/or/xor.
+- Signed and unsigned div/rem.
+- Left shift, logical right shift, arithmetic right shift.
+- Signed and unsigned comparisons.
+- Sign extension, zero extension, truncation.
+- Pointer/integer casts where the target permits them.
+
+Add operation flags:
+
+- No signed wrap / no unsigned wrap.
+- Exact division/shift where applicable.
+- Trap-on-overflow versus wrap.
+- Saturating arithmetic if a frontend/runtime wants direct lowering.
+
+Checked arithmetic can use intrinsics that return `(result, overflow_or_ok)`.
+That is a backend-relevant primitive and avoids forcing frontends to reproduce
+target flag idioms manually.
+
+### 6.2 Floating Ops
+
+Add floating arithmetic; the current API can push floats and convert but cannot
+fully lower C, Zig, or Rust arithmetic.
+
+Required:
+
+- Floating add/sub/mul/div/rem/neg.
+- Ordered and unordered comparisons.
+- Conversion between floats and integers with explicit signedness and rounding
+  behavior.
+- Fused multiply-add intrinsic or operation.
+
+Attributes:
+
+- Strict default semantics.
+- Optional fast-math flags: reassoc, no-NaNs, no-infs, no-signed-zeros, allow
+  reciprocal, approximate functions.
+- Rounding mode and exception behavior only if strict FP support is a goal.
+
+### 6.3 Bitcasts
+
+`convert` should mean semantic conversion. Add a distinct bit-preserving
+operation:
+
+- Scalar bitcast.
+- Aggregate/vector bitcast only when size matches and the backend can lower it
+  as a copy/reinterpretation.
+
+## 7. Control Flow and Stack Values
+
+Add:
+
+- `switch` / jump table primitive with target-chosen lowering.
+- Indirect branch (needed for C computed goto / interpreters).
+- `unreachable` as a real terminator, not only a side-effect intrinsic.
+
+Do not add landing pads, cleanup edges, or exception successors unless the
+project expands beyond setjmp/longjmp.
+
+## 8. Calls, ABI, and Function Attributes
+
+The function type currently carries return type, params, and ABI variadic. That
+is not enough for multi-language direct codegen.
+
+Add:
+
+- Calling convention on function type or call site: target C default, SysV,
+  Win64, AAPCS, wasm, interrupt, and any target-specific conventions that the
+  backends actually implement.
+- Per-function attributes: noreturn, cold, hot, naked, interrupt, stack
+  alignment, red-zone use, target features.
+- Per-call attributes: tail policy, musttail, notail, cold.
+- Per-parameter and return attributes: sret, byval, byref, inreg, noalias,
+  readonly, writeonly, nonnull, dereferenceable, signext, zeroext, align,
+  nest/context pointer.
+
+Avoid exception-related attributes such as `nounwind` unless they affect a
+supported backend output. With no unwind model, calls either return normally or
+do not return.
+
+`musttail` is important for languages that depend on tail calls or lower
+coroutines/state machines through helper functions. It should fail
+diagnostically if ABI shapes are incompatible.
+
+## 9. Symbols, Linkage, and Names
+
+The declaration API should not force C symbol mangling. C mangling is one
+frontend policy, not a universal codegen rule.
+
+Use one name model:
+
+- Linkage name: exact linker-visible spelling after the frontend has applied
+  its language mangling and any desired object-format C decoration.
+- Optional display/source name for debug info.
+
+Do not keep a separate "C source name" declaration path in the core CG API. If
+the C frontend wants C decoration, it should call a helper before declaring the
+symbol or use a C-frontend wrapper.
+
+Add:
+
+- COMDAT/linkonce/select-any groups.
+- Weak/weak-odr where object formats support it.
+- Section and partition attributes on functions and data.
+- Constructor/destructor arrays with priority.
+- Symbol versioning hooks later for ELF shared libraries.
+
+## 10. Data Definitions and Constants
+
+Keep data emission close to object bytes and relocations. That matches the
+direct-codegen model and avoids a parallel constant IR.
+
+Needed additions:
+
+- Typed null pointer constants.
+- Zero initializer and arbitrary bytes.
+- Function/data address constants with pointer address space.
+- Relocation expressions already exist; keep target-selected lowering as the
+  default. Add explicit policy only when the target needs a frontend-visible
+  distinction.
+- Per-object COMDAT, alignment, section, retention, merge/string flags, and TLS
+  model.
+
+Do not add structured aggregate constants unless they are needed to avoid
+incorrect backend output. Frontends can lay out aggregate initializers into
+bytes plus relocations.
+
+## 11. Atomics and Memory Model
+
+The current atomics have C-like memory orders. Multi-language support needs a
+few more backend-relevant details:
+
+- Atomic width legality query.
+- Strong versus weak compare-exchange.
+- Memory scope if a supported target exposes scopes beyond system-wide atomics.
+- Volatile atomic distinction for languages that expose both.
+- Fence sync scope if scopes are supported.
+
+Do not add wait/wake or futex-like primitives to core CG. They should remain
+library/runtime calls unless a backend can lower them specially.
+
+Atomic operations should also use `CfreeCgMemAccess` so type, address space,
+alignment, volatility, and alias information have the same spelling as ordinary
+memory operations.
+
+## 12. Inline Assembly
+
+The existing GCC-style constraint model is a practical starting point for C and
+Zig. Rust-style `asm!` needs a slightly more structured form, but only add
+pieces that affect backend lowering.
+
+Add:
+
+- Explicit dialect: ATT, Intel, target default.
+- Options: pure, nomem, readonly, preserves_flags, nostack, noreturn.
+- Register class and explicit register operands independent from raw constraint
+  strings.
+- Lateout/earlyclobber/tied operands.
+- Target feature requirements and target arch guard.
+- Clobber ABI sets such as "clobber all caller-saved".
+
+## 13. Dynamic Stack Allocation
+
+Rust and Zig generally avoid C VLAs but still need stack temporaries, alignment,
+and sometimes alloca-like lowering.
+
+Add:
+
+- Local slot allocation with explicit alignment and debug/address-taken flags.
+- Dynamic `alloca(size, align)` returning a pointer.
+- Stack probing for large frames as a target-selected behavior, with an option
+  to require it where platform ABI demands it.
+
+## 14. Debug Information
+
+Debug info should ride alongside ordinary CG usage as much as possible. The
+default path should not require frontends to make a second set of debug-specific
+calls for every function, parameter, local, and type.
+
+Auto-populate debug records from existing CG calls:
+
+- `cfree_cg_decl` carries linkage name, display/source name, declaration attrs,
+  type, and current source location. This is enough to create function/global
+  DIE skeletons.
+- `cfree_cg_func_begin` / `func_end` define function ranges.
+- `cfree_cg_param_slot` carries parameter index, type, name, and current source
+  location. This can create parameter DIEs and initial locations.
+- `cfree_cg_local_slot` carries local type, name, alignment, flags, and current
+  source location. This can create local variable DIEs when the name is nonzero.
+- `cfree_cg_set_loc` drives line table rows for subsequent instructions and
+  data definitions.
+- Type constructors carry enough layout information for basic debug type DIEs:
+  scalars, pointers, arrays, functions, and natural-layout records.
+
+The regular API needs a few debug-oriented fields so this works:
+
+- Source/display name separate from linkage name.
+- Compile-unit language tag and producer string.
+- Public file registration or a documented way for frontends to obtain stable
+  `CfreeSrcLoc.file_id` values.
+- Local/param flags: artificial, address-taken, optimized-out, compiler-temp.
+- Optional lexical-scope markers for frontends that want nested scopes. These
+  can be ordinary CG control-flow scope calls with debug names/flags rather
+  than a separate debug API family.
+
+Limits of auto-population:
+
+- Inlined call-site info needs explicit frontend input because ordinary CG
+  locations only describe the current emitted instruction.
+- Optimized variable locations beyond frame slots/registers may need later
+  hooks from the optimizer wrapper.
+- Source-language-specific debug types may need optional metadata. That metadata
+  should decorate normal CG types/declarations rather than replacing them with a
+  separate debug-only API.
+
+## 15. Target Capability Queries
+
+A portable direct CG frontend needs to ask what the selected target can lower
+without guessing from enum values.
+
+Add queries for:
+
+- Legal scalar widths and floating types.
+- Legal atomic widths and lock-free status.
+- Supported calling conventions.
+- Supported inline asm dialect/constraint families.
+- Object-format features: COMDAT, weak, protected visibility, TLS models,
+  common symbols, merge sections, constructor priorities.
+- Backend feature flags: SIMD extensions, unaligned memory support, strict
+  alignment, red zone, pointer authentication, branch protection.
+
+Capability queries should answer "can this target/API lower it correctly", not
+"is this fast".
+
+## 16. Diagnostics and Error Model
+
+Most current CG misuse paths panic. That is acceptable for internal compiler
+bugs, but external frontends benefit from diagnosable unsupported-feature
+failures.
+
+Use this distinction:
+
+- Malformed CG usage that indicates a frontend/compiler bug may panic.
+- Unsupported but well-formed target features should emit diagnostics and fail
+  cleanly.
+- Type/call/memory descriptors should be validated early enough that bad input
+  does not produce partial object corruption.
+
+## 17. Frontend Registration
+
+The current `CfreeLanguage` enum is fixed. That is enough for built-ins and the
+toy frontend, but not for general external language plugins.
+
+Add:
+
+- Dynamic language registration by name, default suffixes, and compile callback.
+- Per-language option payload passed through `CfreeCompileOptions`, or a generic
+  frontend user pointer.
+- A standard way for a frontend to declare whether it needs preprocessing,
+  debug info, or target feature strings.
+
+Because this API can break, the fixed enum can be removed from the generic
+frontend path. Builtin C/asm can still have fast internal dispatch.
+
+## 18. Suggested Phasing
+
+### Phase 1: One Clean Codegen Contract
+
+- Replace signed/unsigned integer types with width-only integer types.
+- Remove behavior-carrying type qualifiers.
+- Make `CfreeCgMemAccess` mandatory for loads, stores, memory ops, and atomics.
+- Use raw linkage names plus optional display/source names.
+- Add function/call/parameter attributes with calling convention and ABI attrs.
+- Add integer operation flags.
+- Add explicit sign-extension, zero-extension, truncation, pointer/integer casts,
+  and a distinct bitcast operation.
+- Add floating arithmetic, ordered/unordered comparisons, and float/integer
+  conversions.
+- Add atomic access shape: `CfreeCgMemAccess`, strong/weak compare-exchange, and
+  legality/lock-free queries.
+- Add target capability queries for scalar types, call convs, and object-format
+  symbol features.
+
+### Phase 2: Backend and Object Coverage Gaps
+
+- Generic address-offset primitive for frontend-lowered layouts.
+- Switch/jump-table primitive.
+- Dynamic alloca and local slot alignment/flags.
+- COMDAT/groups and constructor/destructor arrays.
+- Structured inline asm operands/options.
+
+### Phase 3: Debug and Frontend Integration
+
+- Complete auto debug emission from declarations, function ranges, locations,
+  params, locals, and type constructors.
+- Compile-unit language/source registration.
+- Optional lexical-scope markers through ordinary CG scopes.
+- Dynamic frontend registration.
+
+## 19. Design Rule
+
+When deciding whether a feature belongs in public CG, use this test:
+
+- If the fact changes ABI, object contents, relocation choice, instruction
+  selection, memory ordering, or debug output, CG probably needs to express it.
+- If the fact is source-language-only and can be fully lowered into existing
+  storage, calls, memory accesses, and operations, it belongs in the frontend.
+- If the fact exists only to make frontend modeling easier, keep it out unless
+  omitting it causes incorrect backend output.
+- If the fact requires whole-function analysis but does not need to be visible
+  to direct backends, it may belong in the optimizer wrapper rather than the
+  public direct-emission API.
+
+The goal is not to expose every internal compiler concept. The goal is to make
+the direct codegen boundary honest enough that C, Zig, Rust-like languages, and
+machine lifters can all lower to it without depending on internal headers or
+silently losing backend-relevant semantics.
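
Illustration (not part of the commit): a minimal standalone C sketch of the width-only integer model from sections 4.1 and 6.1 of the plan. It stores every value as a plain 32-bit pattern and puts signedness on the operation, and it shows the `(result, overflow)` shape suggested for checked arithmetic. All names here (`udiv_i32`, `sdiv_i32`, `lshr_i32`, `ashr_i32`, `CheckedI32`, `sadd_checked_i32`, `uadd_checked_i32`) are hypothetical and do not exist in `include/cfree/cg.h`. The sketch also assumes two's-complement conversion from `uint32_t` to `int32_t`, which is implementation-defined before C23 but is what common compilers do.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers: the storage type is always a 32-bit pattern
 * (uint32_t); the operation decides how the bits are interpreted. */

/* Unsigned division. Caller must ensure b != 0. */
uint32_t udiv_i32(uint32_t a, uint32_t b) { return a / b; }

/* Signed division on the same storage. Caller must ensure b != 0 and
 * must avoid INT32_MIN / -1. Assumes two's-complement conversion. */
uint32_t sdiv_i32(uint32_t a, uint32_t b) {
    return (uint32_t)((int32_t)a / (int32_t)b);
}

/* Logical right shift: vacated bits are zero. Requires n < 32. */
uint32_t lshr_i32(uint32_t a, unsigned n) { return a >> n; }

/* Arithmetic right shift: vacated bits copy the sign bit. Requires n < 32.
 * Written on unsigned values so it does not depend on how the compiler
 * shifts negative signed integers. */
uint32_t ashr_i32(uint32_t a, unsigned n) {
    uint32_t fill = (a >> 31) ? ~(UINT32_MAX >> n) : 0u;
    return (a >> n) | fill;
}

/* Result pair for checked arithmetic: wrapped result plus overflow flag. */
typedef struct CheckedI32 {
    uint32_t result;
    bool overflow;
} CheckedI32;

/* Signed checked add: overflow iff both operands share a sign bit and the
 * wrapped result has the opposite sign bit. */
CheckedI32 sadd_checked_i32(uint32_t a, uint32_t b) {
    uint32_t r = a + b; /* unsigned addition wraps modulo 2^32 */
    bool ovf = (((~(a ^ b)) & (a ^ r)) >> 31) != 0;
    return (CheckedI32){ r, ovf };
}

/* Unsigned checked add: overflow iff the wrapped result is below an operand. */
CheckedI32 uadd_checked_i32(uint32_t a, uint32_t b) {
    uint32_t r = a + b;
    return (CheckedI32){ r, r < a };
}
```

With the bit pattern `0xFFFFFFF8`, `udiv_i32(x, 2)` yields `0x7FFFFFFC` while `sdiv_i32(x, 2)` yields `0xFFFFFFFC`. The storage is the same and only the operation differs, which is the property the plan relies on when it removes signed and unsigned integer types.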