commit 04a9f5552fdff742765b8d0b2b3a1f4d72f1d38f
parent 1bf64d6836d6e075edcf407dc848c199f777f2ad
Author: Ryan Sepassi <rsepassi@gmail.com>
Date: Wed, 13 May 2026 10:59:36 -0700
cg-ext.md plan
Diffstat:
A doc/cg-ext.md | 551 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 551 insertions(+), 0 deletions(-)
diff --git a/doc/cg-ext.md b/doc/cg-ext.md
@@ -0,0 +1,551 @@
+# Public CG Extension Plan
+
+Scope: extensions needed for `include/cfree/cg.h` to serve as a portable
+direct codegen API for frontends other than C. This is not a plan for a stored
+LLVM-like IR. `CfreeCg` remains an imperative emitter bound to a
+`CfreeObjBuilder`; frontends lower their own AST/HIR/MIR directly into the API.
+
+This API is new enough that compatibility with the current draft is not a
+constraint. Make breaking changes. One clean way of doing things.
+
+The target user is a language frontend with its own parser, type checker, and
+high-level lowering: C, Zig, Rust-like languages, toy languages, emulators, and
+system DSLs. The frontend should not include internal `src/` headers, should
+not know `Type*`, `ObjSymId`, `CGTarget`, or `MCEmitter`, and should be able to
+generate correct code for every backend supported by `CfreeTarget`.
+
+## 1. Goals
+
+- Preserve the direct-emission model: no public module/value/block IR object is
+ required.
+- Focus on backend codegen coverage and correctness, not frontend ergonomics.
+- Keep backend decisions in the backend: ABI classification, TLS sequences,
+ GOT/PLT/stubs/IAT, branch relaxation, relocation encoding, and section layout.
+- Let frontends state facts that materially affect generated code: calling
+ convention, ABI attributes, memory access properties, atomics, volatility,
+ linkage, object placement, and source/debug identity.
+- Keep the surface portable but not lowest-common-denominator. Unsupported
+ target combinations should be diagnosable from API calls.
+- Keep public handles opaque/integer-sized and context-owned. No global state.
+- Maintain one way to spell each codegen fact.
+
+## 2. Non-goals
+
+- A serialized IR, textual IR, pass manager, verifier over stored functions, or
+ reusable use-def graph.
+- Language semantics above the codegen boundary. Borrow checking, comptime,
+ monomorphization, generics, trait dispatch, overload resolution, destructor
+ insertion, and safety checks belong in the frontend.
+- Arbitrary source-language types. The frontend lowers them to codegen storage,
+ ABI, memory, and debug facts before calling CG.
+- Unwind/exception handling beyond the existing setjmp/longjmp intrinsics.
+ Panic/throw paths must lower to explicit normal control flow plus calls, or
+ to noreturn runtime helpers.
+- Full LTO. Direct CG may still feed the existing optimizer wrapper, but that is
+ an implementation detail below this public API.
+
+## 3. Current Shape
+
+The existing public CG API already provides useful pieces:
+
+- Target context through `CfreeCompiler` / `CfreeTarget`.
+- Builtin integer, float, pointer, array, function, record, enum, alias, and
+ qualified types.
+- Symbol declarations, visibility, TLS model, object definitions, relocatable
+ data expressions, and direct/indirect calls.
+- A value stack with lvalue/rvalue conversion, local/param slots, labels,
+ structured scopes, arithmetic, comparisons, conversions, intrinsics, atomics,
+ inline asm, and varargs.
+
+The largest limitation is that too many important backend facts are currently
+implicit, C-shaped, duplicated between type and operation APIs, or
+unrepresentable.
+
+## 4. Type Model
+
+The type model should describe codegen storage and ABI classification, not
+source-language semantics. A Rust `u32`, C `unsigned int`, Zig `u32`, and an
+emulator's 32-bit guest register can all use the same codegen integer type.
+
+### 4.1 Integers
+
+Use width-only integer storage types. Signedness belongs on operations,
+comparisons, conversions, and ABI extension attributes, not on the integer type.
+
+Recommended integer builtins:
+
+- `i1`/`bool` as the branch and compare-result type.
+- `i8`, `i16`, `i32`, `i64`.
+- `i128`, with helper-lowered arithmetic and ABI handling for targets that lack
+ native support.
+- `isize`/`usize` are frontend aliases, not distinct codegen storage types. The
+ frontend can choose `i32` or `i64` from the target pointer size.
+
+Consequences:
+
+- Remove separate signed/unsigned integer type constructors or builtins.
+- Keep signed/unsigned operation variants where semantics differ:
+ signed/unsigned div/rem/compare, sign/zero extension, arithmetic right shift
+ versus logical right shift.
+- Constants are bit patterns interpreted by the operation that consumes them.
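+
+As a minimal sketch of the last point, the plain C below applies signed and
+unsigned interpretations to the same 32-bit pattern. The helper names are
+hypothetical and are not part of `cg.h`; they only illustrate why signedness
+can live on the operation rather than on the type. The signed casts assume
+two's-complement conversion, which C23 guarantees and mainstream compilers
+already implement.
+
+```c
+#include <stdbool.h>
+#include <stdint.h>
+
+/* Hypothetical helpers (not part of cg.h). Values are raw 32-bit patterns;
+ * each operation chooses the signed or unsigned interpretation. */
+static uint32_t div_u32(uint32_t a, uint32_t b) { return a / b; }
+
+static uint32_t div_s32(uint32_t a, uint32_t b) {
+    /* Caller must exclude b == 0 and INT32_MIN / -1. */
+    return (uint32_t)((int32_t)a / (int32_t)b);
+}
+
+static bool lt_u32(uint32_t a, uint32_t b) { return a < b; }
+static bool lt_s32(uint32_t a, uint32_t b) { return (int32_t)a < (int32_t)b; }
+
+static uint32_t shr_logical(uint32_t a, unsigned n) { return a >> n; }
+
+static uint32_t shr_arith(uint32_t a, unsigned n) {
+    /* n must be in 0..31. Replicate the sign bit into the vacated bits. */
+    uint32_t fill = (a & 0x80000000u) ? ~(UINT32_MAX >> n) : 0u;
+    return (a >> n) | fill;
+}
+```
+
+For the pattern `0xFFFFFFF8`, `div_u32(x, 2)` yields `0x7FFFFFFC` while
+`div_s32(x, 2)` yields `0xFFFFFFFC`: one storage type, two operation results.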
+
+### 4.2 Floating-Point
+
+Support only the floating storage types the backend can define and lower
+correctly.
+
+Baseline:
+
+- `f32`
+- `f64`
+
+Later additions should be explicit project choices:
+
+- `f16` / `bf16` if frontend SIMD/platform intrinsics need them.
+- `f80` / `f128` only with target ABI and helper-call support.
+
+Floating arithmetic and comparisons still need operation-level attributes; see
+section 6.
+
+### 4.3 Pointers
+
+Keep pointer types as codegen storage/ABI facts. Pointee types are useful for
+load/store defaults and debug synthesis, but memory access semantics should
+come from `CfreeCgMemAccess`, not from type qualifiers.
+
+Recommended pointer model:
+
+- One thin pointer type constructor: pointee type + address space.
+- Address space 0 is the normal target data address space.
+- No type-level nullability, restrict, readonly, volatile, or mutability.
+ Express these at the operation, declaration, or parameter-attribute site.
+- Fat pointers are frontend-lowered aggregates. Capability pointers should wait
+ until a real target requires them.
+
+### 4.4 Aggregates and Layout
+
+Keep aggregate support only where the backend needs the aggregate shape for
+correct codegen:
+
+- ABI classification of parameters and returns.
+- Natural target layout for C-like records.
+- Data object sizing/alignment.
+- Debug synthesis when possible.
+
+Frontends can lower most other source aggregate patterns to existing codegen
+constructs such as byte arrays, scalar memory accesses, and address arithmetic.
+
+The gap to close is not richer source aggregate modeling. The useful backend
+primitive is generic address arithmetic:
+
+```c
+/* Pops a pointer or lvalue address, pushes address + byte_offset as a pointer
+ * or lvalue address with the requested result type. */
+void cfree_cg_addr_offset(CfreeCg*, int64_t byte_offset,
+ CfreeCgTypeId result_type);
+```
+
+This gives frontends one way to lower non-C layouts without asking CG to
+understand the source aggregate.
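+
+A frontend that owns its layout would compute the byte offsets itself and pass
+each one to `cfree_cg_addr_offset`. The sketch below shows only the
+frontend-side offset computation; `FieldLayout`, `align_up`, and
+`layout_fields` are hypothetical names, and the placement policy (fields in
+the order given, each rounded up to its alignment) is just one choice a
+frontend might make.
+
+```c
+#include <stddef.h>
+#include <stdint.h>
+
+/* Hypothetical frontend-side record; not part of cg.h. */
+typedef struct {
+    uint32_t size;   /* field size in bytes */
+    uint32_t align;  /* field alignment in bytes, nonzero */
+    int64_t offset;  /* filled in by layout_fields */
+} FieldLayout;
+
+static int64_t align_up(int64_t value, uint32_t align) {
+    return (value + align - 1) / align * align;
+}
+
+/* Assign offsets in the order given and return the padded total size. */
+static int64_t layout_fields(FieldLayout *fields, size_t count) {
+    int64_t offset = 0;
+    uint32_t max_align = 1;
+    for (size_t i = 0; i < count; i++) {
+        offset = align_up(offset, fields[i].align);
+        fields[i].offset = offset;
+        offset += fields[i].size;
+        if (fields[i].align > max_align) max_align = fields[i].align;
+    }
+    return align_up(offset, max_align);
+}
+```
+
+Each resulting `offset` is the `byte_offset` argument for one field access;
+a frontend that reorders or packs fields changes only this computation.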
+
+### 4.5 Qualifiers
+
+Remove C-style qualified codegen types as behavior-carrying types.
+
+- `const` is a frontend type-checking fact or an object/read-only declaration
+ fact.
+- `volatile` is a memory access fact.
+- `restrict` / `noalias` is a pointer parameter or memory access fact.
+
+If debug info needs source qualifiers, they belong in debug metadata derived
+from declarations, not in backend codegen types.
+
+### 4.6 Type Queries
+
+Keep target-layout queries that frontends need for lowering:
+
+- Type kind.
+- Size and alignment.
+- Integer/float width.
+- Pointer address space and pointee.
+- Array element/count.
+- Record field offset where CG owns natural record layout.
+- Function ABI/calling-convention attributes.
+
+Avoid queries whose only purpose is reconstructing source-language types.
+
+## 5. Memory Access
+
+Memory semantics should have exactly one spelling: a memory access descriptor
+on every operation that touches memory. Do not split behavior between type
+qualifiers, lvalue flags, and special load/store variants.
+
+Recommended descriptor:
+
+```c
+typedef struct CfreeCgMemAccess {
+ CfreeCgTypeId type; /* value type loaded/stored, or element type */
+ uint32_t align; /* 0 = natural for type */
+ uint32_t address_space; /* normally inherited from pointer type */
+ uint32_t flags; /* VOLATILE, NONTEMPORAL, INVARIANT, etc. */
+ uint32_t alias_scope;
+ uint32_t noalias_scope;
+} CfreeCgMemAccess;
+```
+
+Recommended operations:
+
+```c
+void cfree_cg_load(CfreeCg*, CfreeCgMemAccess access);
+void cfree_cg_store(CfreeCg*, CfreeCgMemAccess access);
+void cfree_cg_memcpy(CfreeCg*, uint64_t size,
+ CfreeCgMemAccess dst, CfreeCgMemAccess src);
+void cfree_cg_memmove(CfreeCg*, uint64_t size,
+ CfreeCgMemAccess dst, CfreeCgMemAccess src);
+void cfree_cg_memset(CfreeCg*, uint8_t value, uint64_t size,
+ CfreeCgMemAccess dst);
+```
+
+Consequences:
+
+- Remove type-level volatile behavior.
+- Remove separate fixed-size aggregate memory APIs that take only size/align.
+- Remove implicit load/store type inference when it can be ambiguous. The
+ access descriptor is the authority.
+- Keep convenience constructors for common descriptors if desired, but not
+ alternate semantic entry points.
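+
+A convenience constructor of the kind mentioned above might look like the
+sketch below. The `CfreeCgTypeId` typedef, the `DEMO_MEM_*` flag values, and
+both helper names are assumptions made for illustration; the plan names the
+flags but does not fix their encoding.
+
+```c
+#include <stdint.h>
+
+typedef uint32_t CfreeCgTypeId; /* assumption: the real typedef may differ */
+
+typedef struct CfreeCgMemAccess {
+    CfreeCgTypeId type;     /* value type loaded/stored, or element type */
+    uint32_t align;         /* 0 = natural for type */
+    uint32_t address_space; /* normally inherited from pointer type */
+    uint32_t flags;         /* VOLATILE, NONTEMPORAL, INVARIANT, etc. */
+    uint32_t alias_scope;
+    uint32_t noalias_scope;
+} CfreeCgMemAccess;
+
+/* Hypothetical flag bits; the values are placeholders. */
+enum { DEMO_MEM_VOLATILE = 1u << 0, DEMO_MEM_NONTEMPORAL = 1u << 1 };
+
+/* Ordinary access: natural alignment, address space 0, no flags or scopes. */
+static CfreeCgMemAccess demo_mem_access(CfreeCgTypeId type) {
+    CfreeCgMemAccess access = {0};
+    access.type = type;
+    return access;
+}
+
+/* Volatile access with an explicit, possibly under-aligned, alignment. */
+static CfreeCgMemAccess demo_mem_access_volatile(CfreeCgTypeId type,
+                                                 uint32_t align) {
+    CfreeCgMemAccess access = demo_mem_access(type);
+    access.align = align;
+    access.flags |= DEMO_MEM_VOLATILE;
+    return access;
+}
+```
+
+Both helpers only fill in the descriptor; the load or store that consumes it
+remains the single semantic entry point.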
+
+Needed access facts:
+
+- Explicit alignment, including known under-alignment.
+- Volatile load/store.
+- Non-temporal/cache hints.
+- Invariant/readonly memory for constants and promoted immutable globals.
+- Alias scopes and noalias scopes. Rust `&mut`, C `restrict`, Zig `noalias`,
+ and frontend escape analysis can all feed this conservatively.
+
+## 6. Operation Semantics
+
+Integer and floating operations need attributes describing language semantics.
+
+### 6.1 Integer Ops
+
+Keep signedness on operations, not on types.
+
+Required operation families:
+
+- Add, sub, mul, bitwise and/or/xor.
+- Signed and unsigned div/rem.
+- Left shift, logical right shift, arithmetic right shift.
+- Signed and unsigned comparisons.
+- Sign extension, zero extension, truncation.
+- Pointer/integer casts where the target permits them.
+
+Add operation flags:
+
+- No signed wrap / no unsigned wrap.
+- Exact division/shift where applicable.
+- Trap-on-overflow versus wrap.
+- Saturating arithmetic if a frontend/runtime wants direct lowering.
+
+Checked arithmetic can use intrinsics that return `(result, overflow_or_ok)`.
+That is a backend-relevant primitive and avoids forcing frontends to reproduce
+target flag idioms manually.
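+
+To make the `(result, overflow_or_ok)` shape concrete, here is a portable C
+sketch of a checked signed 32-bit add over raw bit patterns. `CheckedU32` and
+`checked_add_s32` are hypothetical names; a backend would more likely lower
+such an intrinsic to the target's native overflow flag than to this bit
+manipulation.
+
+```c
+#include <stdbool.h>
+#include <stdint.h>
+
+/* Hypothetical result pair; not part of cg.h. */
+typedef struct {
+    uint32_t result; /* wrapped sum as a raw bit pattern */
+    bool overflow;   /* true if the signed interpretation overflowed */
+} CheckedU32;
+
+/* Signed add on 32-bit patterns. Unsigned arithmetic wraps without UB. */
+static CheckedU32 checked_add_s32(uint32_t a, uint32_t b) {
+    CheckedU32 out;
+    out.result = a + b;
+    /* Overflow iff the operands share a sign and the sum's sign differs. */
+    out.overflow = ((~(a ^ b) & (a ^ out.result)) >> 31) != 0;
+    return out;
+}
+```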
+
+### 6.2 Floating Ops
+
+Add floating arithmetic; the current API can push floats and convert but cannot
+fully lower C, Zig, or Rust arithmetic.
+
+Required:
+
+- Floating add/sub/mul/div/rem/neg.
+- Ordered and unordered comparisons.
+- Conversion between floats and integers with explicit signedness and rounding
+ behavior.
+- Fused multiply-add intrinsic or operation.
+
+Attributes:
+
+- Strict default semantics.
+- Optional fast-math flags: reassoc, no-NaNs, no-infs, no-signed-zeros, allow
+ reciprocal, approximate functions.
+- Rounding mode and exception behavior only if strict FP support is a goal.
+
+### 6.3 Bitcasts
+
+`convert` should mean semantic conversion. Add a distinct bit-preserving
+operation:
+
+- Scalar bitcast.
+- Aggregate/vector bitcast only when size matches and the backend can lower it
+ as a copy/reinterpretation.
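+
+The difference between the two operations can be shown in plain C. The helper
+names below are hypothetical, and the sketch assumes `float` is a 32-bit
+IEEE-754 binary32 value, which holds on the usual targets but is not required
+by the C standard.
+
+```c
+#include <stdint.h>
+#include <string.h>
+
+_Static_assert(sizeof(float) == sizeof(uint32_t), "sketch assumes 32-bit float");
+
+/* Semantic conversion: the numeric value is preserved (truncated toward 0).
+ * Only defined when the truncated value fits in uint32_t. */
+static uint32_t convert_f32_to_u32(float f) { return (uint32_t)f; }
+
+/* Bitcast: the bits are preserved, the numeric value is not. */
+static uint32_t bitcast_f32_to_u32(float f) {
+    uint32_t bits;
+    memcpy(&bits, &f, sizeof bits);
+    return bits;
+}
+
+static float bitcast_u32_to_f32(uint32_t bits) {
+    float f;
+    memcpy(&f, &bits, sizeof f);
+    return f;
+}
+```
+
+Under the binary32 assumption, converting `1.0f` gives `1` while bitcasting it
+gives `0x3F800000`.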
+
+## 7. Control Flow and Stack Values
+
+Add:
+
+- `switch` / jump table primitive with target-chosen lowering.
+- Indirect branch, needed for C computed goto and interpreter dispatch.
+- `unreachable` as a real terminator, not only a side-effect intrinsic.
+
+Do not add landing pads, cleanup edges, or exception successors unless the
+project expands beyond setjmp/longjmp.
+
+## 8. Calls, ABI, and Function Attributes
+
+The function type currently carries return type, params, and an ABI variadic
+flag. That is not enough for multi-language direct codegen.
+
+Add:
+
+- Calling convention on function type or call site: target C default, SysV,
+ Win64, AAPCS, wasm, interrupt, and any target-specific conventions that the
+ backends actually implement.
+- Per-function attributes: noreturn, cold, hot, naked, interrupt, stack
+ alignment, red-zone use, target features.
+- Per-call attributes: tail policy, musttail, notail, cold.
+- Per-parameter and return attributes: sret, byval, byref, inreg, noalias,
+ readonly, writeonly, nonnull, dereferenceable, signext, zeroext, align,
+ nest/context pointer.
+
+Avoid exception-related attributes such as `nounwind` unless they affect a
+supported backend output. With no unwind model, calls either return normally or
+do not return.
+
+`musttail` is important for languages that depend on tail calls or lower
+coroutines/state machines through helper functions. It should fail
+diagnostically if ABI shapes are incompatible.
+
+## 9. Symbols, Linkage, and Names
+
+The declaration API should not force C symbol mangling. C mangling is one
+frontend policy, not a universal codegen rule.
+
+Use one name model:
+
+- Linkage name: exact linker-visible spelling after the frontend has applied
+ its language mangling and any desired object-format C decoration.
+- Optional display/source name for debug info.
+
+Do not keep a separate "C source name" declaration path in the core CG API. If
+the C frontend wants C decoration, it should call a helper before declaring the
+symbol or use a C-frontend wrapper.
+
+Add:
+
+- COMDAT/linkonce/select-any groups.
+- Weak/weak-odr where object formats support it.
+- Section and partition attributes on functions and data.
+- Constructor/destructor arrays with priority.
+- Symbol versioning hooks later for ELF shared libraries.
+
+## 10. Data Definitions and Constants
+
+Keep data emission close to object bytes and relocations. That matches the
+direct-codegen model and avoids a parallel constant IR.
+
+Needed additions:
+
+- Typed null pointer constants.
+- Zero initializer and arbitrary bytes.
+- Function/data address constants with pointer address space.
+- Relocation expressions already exist; keep target-selected lowering as the
+ default. Add explicit policy only when the target needs a frontend-visible
+ distinction.
+- Per-object COMDAT, alignment, section, retention, merge/string flags, and TLS
+ model.
+
+Do not add structured aggregate constants unless they are needed to avoid
+incorrect backend output. Frontends can lay out aggregate initializers into
+bytes plus relocations.
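+
+A frontend-side sketch of that lowering is shown below for a record holding a
+32-bit tag and a pointer. All names (`DataImage`, `DataReloc`, `image_*`) are
+hypothetical, and the sketch assumes a little-endian target with 8-byte
+pointers; the real hand-off to CG's data-definition calls is not shown.
+
+```c
+#include <stddef.h>
+#include <stdint.h>
+#include <string.h>
+
+/* Hypothetical frontend-side types; not part of cg.h. */
+typedef struct {
+    size_t offset;      /* byte offset of the pointer slot */
+    const char *symbol; /* linkage name of the relocation target */
+    int64_t addend;
+} DataReloc;
+
+typedef struct {
+    uint8_t bytes[64];
+    size_t size;
+    DataReloc relocs[4];
+    size_t nrelocs;
+} DataImage;
+
+/* Zero-pad the image up to a power-of-two alignment. */
+static int image_align(DataImage *img, size_t align) {
+    size_t aligned = (img->size + align - 1) & ~(align - 1);
+    if (aligned > sizeof img->bytes) return -1;
+    memset(img->bytes + img->size, 0, aligned - img->size);
+    img->size = aligned;
+    return 0;
+}
+
+/* Append a little-endian 32-bit scalar. */
+static int image_put_u32le(DataImage *img, uint32_t value) {
+    if (image_align(img, 4) != 0 || img->size + 4 > sizeof img->bytes) return -1;
+    for (int i = 0; i < 4; i++)
+        img->bytes[img->size++] = (uint8_t)(value >> (8 * i));
+    return 0;
+}
+
+/* Append an 8-byte pointer slot: zero bytes plus one relocation record. */
+static int image_put_ptr64(DataImage *img, const char *symbol, int64_t addend) {
+    if (image_align(img, 8) != 0 || img->size + 8 > sizeof img->bytes) return -1;
+    if (img->nrelocs == sizeof img->relocs / sizeof img->relocs[0]) return -1;
+    img->relocs[img->nrelocs++] = (DataReloc){img->size, symbol, addend};
+    memset(img->bytes + img->size, 0, 8);
+    img->size += 8;
+    return 0;
+}
+```
+
+Appending the tag and then the pointer produces 16 bytes with a single
+relocation at offset 8, which is all the backend needs to emit the object.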
+
+## 11. Atomics and Memory Model
+
+The current atomics have C-like memory orders. Multi-language support needs a
+few more backend-relevant details:
+
+- Atomic width legality query.
+- Strong versus weak compare-exchange.
+- Memory scope if a supported target exposes scopes beyond system-wide atomics.
+- Volatile atomic distinction for languages that expose both.
+- Fence sync scope if scopes are supported.
+
+Do not add wait/wake or futex-like primitives to core CG. They should remain
+library/runtime calls unless a backend can lower them specially.
+
+Atomic operations should also use `CfreeCgMemAccess` so type, address space,
+alignment, volatility, and alias information have the same spelling as ordinary
+memory operations.
+
+## 12. Inline Assembly
+
+The existing GCC-style constraint model is a practical starting point for C and
+Zig. Rust-style `asm!` needs a slightly more structured form, but only add
+pieces that affect backend lowering.
+
+Add:
+
+- Explicit dialect: ATT, Intel, target default.
+- Options: pure, nomem, readonly, preserves_flags, nostack, noreturn.
+- Register class and explicit register operands independent from raw constraint
+ strings.
+- Lateout/earlyclobber/tied operands.
+- Target feature requirements and target arch guard.
+- Clobber ABI sets such as "clobber all caller-saved".
+
+## 13. Dynamic Stack Allocation
+
+Rust and Zig generally avoid C VLAs but still need stack temporaries, alignment,
+and sometimes alloca-like lowering.
+
+Add:
+
+- Local slot allocation with explicit alignment and debug/address-taken flags.
+- Dynamic `alloca(size, align)` returning a pointer.
+- Stack probing for large frames as a target-selected behavior, with an option
+ to require it where platform ABI demands it.
+
+## 14. Debug Information
+
+Debug info should ride alongside ordinary CG usage as much as possible. The
+default path should not require frontends to make a second set of debug-specific
+calls for every function, parameter, local, and type.
+
+Auto-populate debug records from existing CG calls:
+
+- `cfree_cg_decl` carries linkage name, display/source name, declaration attrs,
+ type, and current source location. This is enough to create function/global
+ DIE skeletons.
+- `cfree_cg_func_begin` / `func_end` define function ranges.
+- `cfree_cg_param_slot` carries parameter index, type, name, and current source
+ location. This can create parameter DIEs and initial locations.
+- `cfree_cg_local_slot` carries local type, name, alignment, flags, and current
+ source location. This can create local variable DIEs when the name is nonzero.
+- `cfree_cg_set_loc` drives line table rows for subsequent instructions and
+ data definitions.
+- Type constructors carry enough layout information for basic debug type DIEs:
+ scalars, pointers, arrays, functions, and natural-layout records.
+
+The regular API needs a few debug-oriented fields so this works:
+
+- Source/display name separate from linkage name.
+- Compile-unit language tag and producer string.
+- Public file registration or a documented way for frontends to obtain stable
+ `CfreeSrcLoc.file_id` values.
+- Local/param flags: artificial, address-taken, optimized-out, compiler-temp.
+- Optional lexical-scope markers for frontends that want nested scopes. These
+ can be ordinary CG control-flow scope calls with debug names/flags rather
+ than a separate debug API family.
+
+Limits of auto-population:
+
+- Inlined call-site info needs explicit frontend input because ordinary CG
+ locations only describe the current emitted instruction.
+- Optimized variable locations beyond frame slots/registers may need later
+ hooks from the optimizer wrapper.
+- Source-language-specific debug types may need optional metadata. That metadata
+ should decorate normal CG types/declarations rather than replacing them with a
+ separate debug-only API.
+
+## 15. Target Capability Queries
+
+A portable direct CG frontend needs to ask what the selected target can lower
+without guessing from enum values.
+
+Add queries for:
+
+- Legal scalar widths and floating types.
+- Legal atomic widths and lock-free status.
+- Supported calling conventions.
+- Supported inline asm dialect/constraint families.
+- Object-format features: COMDAT, weak, protected visibility, TLS models,
+ common symbols, merge sections, constructor priorities.
+- Backend feature flags: SIMD extensions, unaligned memory support, strict
+ alignment, red zone, pointer authentication, branch protection.
+
+Capability queries should answer "can this target/API lower it correctly", not
+"is this fast".
+
+## 16. Diagnostics and Error Model
+
+Most current CG misuse paths panic. That is acceptable for internal compiler
+bugs, but external frontends benefit from diagnosable unsupported-feature
+failures.
+
+Use this distinction:
+
+- Malformed CG usage that indicates a frontend/compiler bug may panic.
+- Unsupported but well-formed target features should emit diagnostics and fail
+ cleanly.
+- Type/call/memory descriptors should be validated early enough that bad input
+ does not produce partial object corruption.
+
+## 17. Frontend Registration
+
+The current `CfreeLanguage` enum is fixed. That is enough for built-ins and the
+toy frontend, but not for general external language plugins.
+
+Add:
+
+- Dynamic language registration by name, default suffixes, and compile callback.
+- Per-language option payload passed through `CfreeCompileOptions`, or a generic
+ frontend user pointer.
+- A standard way for a frontend to declare whether it needs preprocessing,
+ debug info, or target feature strings.
+
+Because this API can break, the fixed enum can be removed from the generic
+frontend path. Builtin C/asm can still have fast internal dispatch.
+
+## 18. Suggested Phasing
+
+### Phase 1: One Clean Codegen Contract
+
+- Replace signed/unsigned integer types with width-only integer types.
+- Remove behavior-carrying type qualifiers.
+- Make `CfreeCgMemAccess` mandatory for loads, stores, memory ops, and atomics.
+- Use raw linkage names plus optional display/source names.
+- Add function/call/parameter attributes with calling convention and ABI attrs.
+- Add integer operation flags.
+- Add explicit sign-extension, zero-extension, truncation, pointer/integer casts,
+ and a distinct bitcast operation.
+- Add floating arithmetic, ordered/unordered comparisons, and float/integer
+ conversions.
+- Add atomic access shape: `CfreeCgMemAccess`, strong/weak compare-exchange, and
+ legality/lock-free queries.
+- Add target capability queries for scalar types, call convs, and object-format
+ symbol features.
+
+### Phase 2: Backend and Object Coverage Gaps
+
+- Generic address-offset primitive for frontend-lowered layouts.
+- Switch/jump-table primitive.
+- Dynamic alloca and local slot alignment/flags.
+- COMDAT/groups and constructor/destructor arrays.
+- Structured inline asm operands/options.
+
+### Phase 3: Debug and Frontend Integration
+
+- Complete auto debug emission from declarations, function ranges, locations,
+ params, locals, and type constructors.
+- Compile-unit language/source registration.
+- Optional lexical-scope markers through ordinary CG scopes.
+- Dynamic frontend registration.
+
+## 19. Design Rule
+
+When deciding whether a feature belongs in public CG, use this test:
+
+- If the fact changes ABI, object contents, relocation choice, instruction
+ selection, memory ordering, or debug output, CG probably needs to express it.
+- If the fact is source-language-only and can be fully lowered into existing
+ storage, calls, memory accesses, and operations, it belongs in the frontend.
+- If the fact exists only to make frontend modeling easier, keep it out unless
+ omitting it causes incorrect backend output.
+- If the fact requires whole-function analysis but does not need to be visible
+ to direct backends, it may belong in the optimizer wrapper rather than the
+ public direct-emission API.
+
+The goal is not to expose every internal compiler concept. The goal is to make
+the direct codegen boundary honest enough that C, Zig, Rust-like languages, and
+machine lifters can all lower to it without depending on internal headers or
+silently losing backend-relevant semantics.