kit

git clone https://git.ryansepassi.com/git/kit.git

commit 4ae07c0721915d9b00d24e96bc6469e9eb3a38b3
parent 04a9f5552fdff742765b8d0b2b3a1f4d72f1d38f
Author: Ryan Sepassi <rsepassi@gmail.com>
Date:   Wed, 13 May 2026 11:59:54 -0700

CG api updates

Diffstat:
M doc/cg-ext.md | 165 +++++++++++++++++++++++++++++++++++++++++++++++++++++++------------------------
M include/cfree/cg.h | 765 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-----------------
2 files changed, 724 insertions(+), 206 deletions(-)

diff --git a/doc/cg-ext.md b/doc/cg-ext.md @@ -44,9 +44,9 @@ generate correct code for every backend supported by `CfreeTarget`. - Full LTO. Direct CG may still feed the existing optimizer wrapper, but that is an implementation detail below this public API. -## 3. Current Shape +## 3. Pre-Phase-1 Shape -The existing public CG API already provides useful pieces: +The pre-Phase-1 public CG API already provided useful pieces: - Target context through `CfreeCompiler` / `CfreeTarget`. - Builtin integer, float, pointer, array, function, record, enum, alias, and @@ -57,9 +57,8 @@ The existing public CG API already provides useful pieces: structured scopes, arithmetic, comparisons, conversions, intrinsics, atomics, inline asm, and varargs. -The largest limitation is that too many important backend facts are currently -implicit, C-shaped, duplicated between type and operation APIs, or -unrepresentable. +The largest limitation was that too many important backend facts were implicit, +C-shaped, duplicated between type and operation APIs, or unrepresentable. ## 4. Type Model @@ -135,7 +134,7 @@ correct codegen: Frontends can lower many patterns to existing codegen constructs. The gap to close is not richer source aggregate modeling. The useful backend -primitive is generic address arithmetic: +primitive is generic address arithmetic, now part of the Phase 1 contract: ```c /* Pops a pointer or lvalue address, pushes address + byte_offset as a pointer @@ -145,7 +144,10 @@ void cfree_cg_addr_offset(CfreeCg*, int64_t byte_offset, ``` This gives frontends one way to lower non-C layouts without asking CG to -understand the source aggregate. +understand the source aggregate. `cfree_cg_index` remains the typed +scaled-index form for ordinary pointer/array indexing; `addr_offset` is the +byte-granular escape hatch for frontend-owned record layouts and packed/custom +field offsets. 
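The difference between the byte-granular `addr_offset` form and the scaled `cfree_cg_index` form can be modeled in plain C. This is a minimal sketch, not the cfree API: `model_addr_offset`, `model_index`, and the packed record layout are hypothetical names invented here, and the functions only reproduce the address arithmetic the text describes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the byte-granular primitive: address + byte_offset.
 * No element size is involved; the frontend owns the layout. */
static void* model_addr_offset(void* base, int64_t byte_offset) {
  assert(base != NULL);
  return (unsigned char*)base + byte_offset;
}

/* Hypothetical model of the typed scaled-index form:
 * base + offset + index * elem_size. */
static void* model_index(void* base, uint64_t offset, uint64_t index,
                         size_t elem_size) {
  assert(base != NULL);
  return (unsigned char*)base + offset + index * elem_size;
}

/* A frontend-owned packed layout (assumed for illustration):
 * a u8 tag at byte 0 and an unaligned u32 length at byte 1. */
enum { PACKED_TAG_OFF = 0, PACKED_LEN_OFF = 1, PACKED_SIZE = 5 };

/* Stores the length field through a byte offset; memcpy keeps the
 * under-aligned access well defined in C. */
static void packed_store_len(unsigned char* rec, uint32_t len) {
  memcpy(model_addr_offset(rec, PACKED_LEN_OFF), &len, sizeof len);
}

/* Loads the length field back from the same byte offset. */
static uint32_t packed_load_len(unsigned char* rec) {
  uint32_t len;
  memcpy(&len, model_addr_offset(rec, PACKED_LEN_OFF), sizeof len);
  return len;
}
```

For an ordinary `int32_t` array, `model_index(arr, 0, 2, sizeof(int32_t))` and `model_addr_offset(arr, 8)` yield the same address. The packed record has no element type that would produce byte offset 1 by scaling, which is the case the byte-granular form exists for.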
### 4.5 Qualifiers @@ -218,8 +220,11 @@ Needed access facts: - Explicit alignment, including known under-alignment. - Volatile load/store. -- Non-temporal/cache hints. -- Invariant/readonly memory for constants and promoted immutable globals. +- Non-temporal/cache hints: streaming accesses unlikely to be reused soon, so + targets may select non-temporal instructions or ignore the hint. +- Invariant memory: contents known stable for the relevant program region + except through this access path. This is stronger than readonly object + placement and should be set only when the frontend can prove it. - Alias scopes and noalias scopes. Rust `&mut`, C `restrict`, Zig `noalias`, and frontend escape analysis can all feed this conservatively. @@ -244,12 +249,17 @@ Add operation flags: - No signed wrap / no unsigned wrap. - Exact division/shift where applicable. -- Trap-on-overflow versus wrap. -- Saturating arithmetic if a frontend/runtime wants direct lowering. +- Explicit signed and unsigned trap-on-overflow. Generic "overflow" is not + enough because integer types are width-only. +- Explicit signed and unsigned saturating arithmetic if a frontend/runtime + wants direct lowering. -Checked arithmetic can use intrinsics that return `(result, overflow_or_ok)`. -That is a backend-relevant primitive and avoids forcing frontends to reproduce -target flag idioms manually. +Checked arithmetic uses signed and unsigned intrinsics that return +`(result, overflow_bool)`. That is a backend-relevant primitive and avoids +forcing frontends to reproduce target flag idioms manually. + +`clz` and `ctz` have defined zero-input behavior: when the operand is zero, +the result is the operand bit width. ### 6.2 Floating Ops @@ -282,11 +292,21 @@ operation: ## 7. Control Flow and Stack Values -Add: +Phase 1 contract: - `switch` / jump table primitive with target-chosen lowering. 
-- Indirect branch (needed for C computed goto / interpreters) -- `unreachable` as a real terminator, not only a side-effect intrinsic. +- Computed goto through first-class function-local label-address values plus an + indirect local branch. This must support direct-threaded interpreters, where + label addresses are stored in dispatch tables, indexed by opcode, loaded, and + jumped through. Label-address data constants must be emitted while the + defining function is open, after the label handles are created; labels need + not be placed yet. Data emission is allowed inside an open function, so the + intended direct-threaded lowering is: declare the dispatch-table symbol, begin + the function, create labels, define the table contents as data while the + function remains open, then resume code emission. The value is opaque and + valid only for equality, storage/loading, table selection, and computed gotos + in the label's defining function. +- `unreachable` as a real terminator, not a side-effect intrinsic. Do not add landing pads, cleanup edges, or exception successors unless the project expands beyond setjmp/longjmp. @@ -298,9 +318,10 @@ is not enough for multi-language direct codegen. Add: -- Calling convention on function type or call site: target C default, SysV, - Win64, AAPCS, wasm, interrupt, and any target-specific conventions that the - backends actually implement. +- Calling convention on function type or call site. The common path is + backend-selected target C default; explicit SysV, Win64, AAPCS, wasm, + interrupt, and target-specific conventions are frontend requests for ABI + interop and must be supported by the selected backend or diagnosed. - Per-function attributes: noreturn, cold, hot, naked, interrupt, stack alignment, red-zone use, target features. - Per-call attributes: tail policy, musttail, notail, cold. @@ -349,6 +370,13 @@ Needed additions: - Typed null pointer constants. - Zero initializer and arbitrary bytes. 
- Function/data address constants with pointer address space. +- Function-local label-address constants for direct-threaded dispatch tables. + These are emitted while the defining function is open; ordinary data + definitions may be interleaved with function emission for block-scope statics + and dispatch tables. +- Enum constants are unsigned bit patterns (`uint64_t`) interpreted by the + enum's width-only integer base type; source signedness is not part of the + codegen enum type. - Relocation expressions already exist; keep target-selected lowering as the default. Add explicit policy only when the target needs a frontend-visible distinction. @@ -379,29 +407,40 @@ memory operations. ## 12. Inline Assembly -The existing GCC-style constraint model is a practical starting point for C and -Zig. Rust-style `asm!` needs a slightly more structured form, but only add -pieces that affect backend lowering. +The target constraint string is the operand contract. This is intentionally raw +because C/Zig-level inline asm needs the full target grammar: register classes, +explicit registers, immediate classes/ranges, memory/address constraints, +alternatives, matching/tied operands, earlyclobber, and target-specific +modifiers. A partial structured vocabulary would be less expressive and would +create a second spelling for facts the backend already parses from +constraints. -Add: +Phase 1 contract: -- Explicit dialect: ATT, Intel, target default. - Options: pure, nomem, readonly, preserves_flags, nostack, noreturn. -- Register class and explicit register operands independent from raw constraint - strings. -- Lateout/earlyclobber/tied operands. -- Target feature requirements and target arch guard. - Clobber ABI sets such as "clobber all caller-saved". +Later additions: + +- Target feature requirements and target arch guard. + +Phase 1 keeps template strings and raw target constraints, wrapped in +`CfreeCgInlineAsm` so asm-wide options and operand arrays have a single +descriptor. 
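The direct-threaded dispatch pattern that the label-address constants are meant to support looks roughly like this at the C source level. This is a hedged sketch: it relies on the GNU C labels-as-values extension (`&&label` and `goto *`, supported by GCC and Clang but not ISO C), and the opcode set and `run_threaded` are hypothetical names chosen for illustration. It does not call any cfree function.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical opcode set for a one-register accumulator machine. */
enum { OP_INC, OP_DOUBLE, OP_NEG, OP_HALT, OP_COUNT };

/* Runs a program that must end with OP_HALT and returns the accumulator. */
static int64_t run_threaded(const uint8_t* code) {
  /* Dispatch table of function-local label addresses. It is static data
   * whose contents refer to labels of the function being defined, which is
   * the situation the "data emitted while the function is open" rule covers. */
  static void* const dispatch[OP_COUNT] = {
    [OP_INC] = &&op_inc,
    [OP_DOUBLE] = &&op_double,
    [OP_NEG] = &&op_neg,
    [OP_HALT] = &&op_halt,
  };
  int64_t acc = 0;
  const uint8_t* pc = code;

/* Index the table by opcode, load the label address, jump through it. */
#define DISPATCH()                \
  do {                            \
    assert(*pc < OP_COUNT);       \
    goto *dispatch[*pc++];        \
  } while (0)

  DISPATCH();
op_inc:
  acc += 1;
  DISPATCH();
op_double:
  acc *= 2;
  DISPATCH();
op_neg:
  acc = -acc;
  DISPATCH();
op_halt:
  return acc;
#undef DISPATCH
}
```

Each handler ends with its own indirect jump instead of returning to a central `switch`, so the label addresses are only stored, loaded, indexed, and jumped through inside their defining function. Those are the uses the contract permits.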
+ ## 13. Dynamic Stack Allocation Rust and Zig generally avoid C VLAs but still need stack temporaries, alignment, and sometimes alloca-like lowering. -Add: +Phase 1 contract: - Local slot allocation with explicit alignment and debug/address-taken flags. +- Parameter slot allocation with the same debug/artificial/temp flags. - Dynamic `alloca(size, align)` returning a pointer. + +Later addition: + - Stack probing for large frames as a target-selected behavior, with an option to require it where platform ABI demands it. @@ -457,7 +496,7 @@ Add queries for: - Legal scalar widths and floating types. - Legal atomic widths and lock-free status. - Supported calling conventions. -- Supported inline asm dialect/constraint families. +- Supported inline asm constraint families. - Object-format features: COMDAT, weak, protected visibility, TLS models, common symbols, merge sections, constructor priorities. - Backend feature flags: SIMD extensions, unaligned memory support, strict @@ -500,28 +539,56 @@ frontend path. Builtin C/asm can still have fast internal dispatch. ### Phase 1: One Clean Codegen Contract -- Replace signed/unsigned integer types with width-only integer types. -- Remove behavior-carrying type qualifiers. -- Make `CfreeCgMemAccess` mandatory for loads, stores, memory ops, and atomics. -- Use raw linkage names plus optional display/source names. -- Add function/call/parameter attributes with calling convention and ABI attrs. -- Add integer operation flags. -- Add explicit sign-extension, zero-extension, truncation, pointer/integer casts, - and a distinct bitcast operation. -- Add floating arithmetic, ordered/unordered comparisons, and float/integer - conversions. -- Add atomic access shape: `CfreeCgMemAccess`, strong/weak compare-exchange, and - legality/lock-free queries. -- Add target capability queries for scalar types, call convs, and object-format - symbol features. +Status: public contract defined in `include/cfree/cg.h`. 
Implementation and +call-site migration are intentionally separate work. + +Phase 1 makes these breaking API choices: + +- Builtin integer types are width-only: `bool`/`i1`, `i8`, `i16`, `i32`, `i64`, + and `i128`. Signedness exists only on integer operations, comparisons, + conversions, and ABI extension attributes. +- Behavior-carrying qualified types are removed. `const` is an object/debug + fact, `volatile` is a memory-access fact, and `restrict`/`noalias` is an ABI + or memory-access fact. +- Pointer types carry pointee type plus address space. Address space 0 is the + normal target data address space. +- Generic byte-address offset is included for frontend-owned aggregate layouts. +- Function types are built from `CfreeCgFuncSig`: return type/attrs, + parameter type/attrs, calling convention, and ABI variadic bit. +- Declarations use exact raw linkage names plus optional display/source names. + CG does not apply C symbol spelling policy. +- `CfreeCgMemAccess` is the only way to spell memory semantics for loads, + stores, fixed-size memory ops, and atomics. +- Integer operations are split from floating operations and accept explicit + operation flags such as no-wrap, exact, signed/unsigned trap-on-overflow, and + signed/unsigned saturation. +- Semantic conversions are explicit: sign extension, zero extension, + truncation, pointer/integer casts, float extension/truncation, float/integer + conversions with rounding, and a distinct bitcast operation. +- Floating arithmetic and ordered/unordered comparisons are first-class API + operations, with strict defaults and optional fast-math flags. +- Calls use `CfreeCgCallAttrs` for tail policy and call-site flags. `musttail` + is represented as a contract the backend must accept or diagnose. +- Intrinsics include the backend primitives assumed by + `rt/include/cfree/{syscall,baremetal,coro}.h`. 
+- Atomics take `CfreeCgMemAccess`, include strong/weak compare-exchange, and + expose legality and lock-free capability queries. +- Target capability queries cover scalar type support, calling conventions, and + object-format symbol features. +- Inline assembly uses raw target constraints as the canonical operand contract. +- Switch/jump-table, computed goto, and unreachable terminator are explicit + control-flow operations. +- Dynamic alloca and local/parameter slot attributes are explicit stack-slot + operations. +- Inline assembly includes ABI clobber sets. +- Backend feature flags are queryable. +- Data address constants carry pointer address space. ### Phase 2: Backend and Object Coverage Gaps -- Generic address-offset primitive for frontend-lowered layouts. -- Switch/jump-table primitive. -- Dynamic alloca and local slot alignment/flags. - COMDAT/groups and constructor/destructor arrays. -- Structured inline asm operands/options. +- Stack probe requirement/request for large frames. +- More complete inline asm target-feature guards. 
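The width-only integer model in the Phase 1 list can be illustrated in plain C: the storage type is a bare 32-bit pattern, and signedness is chosen by each operation. This is a sketch under stated assumptions, not backend code. All function names and the `Checked32` pair are hypothetical, and the zero-input rule for `clz`/`ctz` follows the statement in section 6.1 that the result is the operand bit width.

```c
#include <stdbool.h>
#include <stdint.h>

/* Unsigned less-than on a 32-bit pattern. */
static bool lt_u32(uint32_t a, uint32_t b) { return a < b; }

/* Signed less-than on the same patterns: flipping the sign bit maps
 * two's-complement order onto unsigned order. */
static bool lt_s32(uint32_t a, uint32_t b) {
  return (a ^ 0x80000000u) < (b ^ 0x80000000u);
}

/* Hypothetical (result, overflow_bool) pair for checked arithmetic. */
typedef struct Checked32 {
  uint32_t result;
  bool overflow;
} Checked32;

/* Unsigned checked add: overflow iff the wrapped sum is below an operand. */
static Checked32 add_checked_u32(uint32_t a, uint32_t b) {
  Checked32 r = { (uint32_t)(a + b), false };
  r.overflow = r.result < a;
  return r;
}

/* Signed checked add: overflow iff both operands share a sign bit and the
 * wrapped sum has the opposite sign bit. */
static Checked32 add_checked_s32(uint32_t a, uint32_t b) {
  Checked32 r = { (uint32_t)(a + b), false };
  r.overflow = ((~(a ^ b) & (a ^ r.result)) >> 31) != 0;
  return r;
}

/* Count leading zeros; a zero operand yields the bit width (32). */
static uint32_t clz32(uint32_t x) {
  uint32_t n = 0;
  if (x == 0) return 32;
  while ((x & 0x80000000u) == 0) { x <<= 1; n++; }
  return n;
}

/* Count trailing zeros; a zero operand yields the bit width (32). */
static uint32_t ctz32(uint32_t x) {
  uint32_t n = 0;
  if (x == 0) return 32;
  while ((x & 1u) == 0) { x >>= 1; n++; }
  return n;
}
```

With the pattern `0xFFFFFFFF`, `lt_u32(1, 0xFFFFFFFF)` holds while `lt_s32(1, 0xFFFFFFFF)` does not. Adding 1 to that pattern overflows as an unsigned add but not as a signed one, which is why a generic "overflow" flag is not enough for width-only types.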
### Phase 3: Debug and Frontend Integration diff --git a/include/cfree/cg.h b/include/cfree/cg.h @@ -22,22 +22,17 @@ typedef uint32_t CfreeCgTypeId; #define CFREE_CG_TYPE_NONE 0u /* ============================================================ - * Types + * Types, ABI, and Target Capabilities * ============================================================ */ typedef enum CfreeCgBuiltinType { CFREE_CG_BUILTIN_VOID, - CFREE_CG_BUILTIN_BOOL, + CFREE_CG_BUILTIN_BOOL, /* i1: compare result and branch condition */ CFREE_CG_BUILTIN_I8, - CFREE_CG_BUILTIN_U8, CFREE_CG_BUILTIN_I16, - CFREE_CG_BUILTIN_U16, CFREE_CG_BUILTIN_I32, - CFREE_CG_BUILTIN_U32, CFREE_CG_BUILTIN_I64, - CFREE_CG_BUILTIN_U64, - CFREE_CG_BUILTIN_ISIZE, - CFREE_CG_BUILTIN_USIZE, + CFREE_CG_BUILTIN_I128, CFREE_CG_BUILTIN_F32, CFREE_CG_BUILTIN_F64, CFREE_CG_BUILTIN_VARARG_STATE, @@ -48,11 +43,67 @@ typedef struct CfreeCgBuiltinTypes { CfreeCgTypeId id[CFREE_CG_BUILTIN_COUNT]; } CfreeCgBuiltinTypes; -typedef enum CfreeCgTypeQual { - CFREE_CG_TQ_CONST = 1u << 0, - CFREE_CG_TQ_VOLATILE = 1u << 1, - CFREE_CG_TQ_RESTRICT = 1u << 2, -} CfreeCgTypeQual; +typedef enum CfreeCgTypeKind { + CFREE_CG_TYPE_VOID, + CFREE_CG_TYPE_BOOL, + CFREE_CG_TYPE_INT, + CFREE_CG_TYPE_FLOAT, + CFREE_CG_TYPE_PTR, + CFREE_CG_TYPE_ARRAY, + CFREE_CG_TYPE_FUNC, + CFREE_CG_TYPE_RECORD, + CFREE_CG_TYPE_ENUM, + CFREE_CG_TYPE_ALIAS, + CFREE_CG_TYPE_VARARG_STATE, +} CfreeCgTypeKind; + +typedef enum CfreeCgCallConv { + /* Backend-selected C ABI for the target triple. Frontends should use this + * unless source semantics or ABI interop explicitly require another + * convention. Non-default values are requests that must be supported by the + * selected backend or diagnosed. 
*/ + CFREE_CG_CC_TARGET_C, + CFREE_CG_CC_SYSV, + CFREE_CG_CC_WIN64, + CFREE_CG_CC_AAPCS, + CFREE_CG_CC_WASM, + CFREE_CG_CC_INTERRUPT, +} CfreeCgCallConv; + +typedef enum CfreeCgAbiAttrFlag { + CFREE_CG_ABI_NONE = 0, + CFREE_CG_ABI_SIGNEXT = 1u << 0, + CFREE_CG_ABI_ZEROEXT = 1u << 1, + CFREE_CG_ABI_SRET = 1u << 2, + CFREE_CG_ABI_BYVAL = 1u << 3, + CFREE_CG_ABI_BYREF = 1u << 4, + CFREE_CG_ABI_INREG = 1u << 5, + CFREE_CG_ABI_NOALIAS = 1u << 6, + CFREE_CG_ABI_READONLY = 1u << 7, + CFREE_CG_ABI_WRITEONLY = 1u << 8, + CFREE_CG_ABI_NONNULL = 1u << 9, + CFREE_CG_ABI_NEST = 1u << 10, +} CfreeCgAbiAttrFlag; + +typedef struct CfreeCgAbiAttrs { + uint32_t flags; /* CfreeCgAbiAttrFlag */ + uint32_t align; /* 0 = ABI default */ + uint64_t dereferenceable_size; +} CfreeCgAbiAttrs; + +typedef struct CfreeCgParam { + CfreeCgTypeId type; + CfreeCgAbiAttrs attrs; +} CfreeCgParam; + +typedef struct CfreeCgFuncSig { + CfreeCgTypeId ret; + CfreeCgAbiAttrs ret_attrs; + const CfreeCgParam* params; + uint32_t nparams; + CfreeCgCallConv call_conv; + int abi_variadic; +} CfreeCgFuncSig; typedef struct CfreeCgField { CfreeSym name; /* 0 for anonymous fields/tuple elements */ @@ -62,26 +113,26 @@ typedef struct CfreeCgField { typedef struct CfreeCgEnumValue { CfreeSym name; - int64_t value; + uint64_t value; /* bit pattern interpreted using the enum's integer base */ } CfreeCgEnumValue; -/* Builtin ids are stable for the compiler. Pointer, array, qualified, and - * function constructors return a stable id for the same shape within one - * compiler; aliases, records, and enums allocate fresh user-facing - * identities. */ +/* Builtin ids are stable for the compiler. Pointer, array, and function + * constructors return a stable id for the same shape within one compiler; + * aliases, records, and enums allocate fresh user-facing identities. + * + * Integer types are width-only storage types. Signedness is carried by + * operations, comparisons, conversions, and ABI extension attributes. 
*/ CfreeCgBuiltinTypes cfree_cg_builtin_types(CfreeCompiler*); -/* Interned structural types. */ -CfreeCgTypeId cfree_cg_type_func(CfreeCompiler*, CfreeCgTypeId ret, - const CfreeCgTypeId* params, uint32_t nparams, - int abi_variadic); -CfreeCgTypeId cfree_cg_type_ptr(CfreeCompiler*, CfreeCgTypeId pointee); +/* Interned structural types. Address space 0 is the normal target data + * address space. */ +CfreeCgTypeId cfree_cg_type_func(CfreeCompiler*, CfreeCgFuncSig sig); +CfreeCgTypeId cfree_cg_type_ptr(CfreeCompiler*, CfreeCgTypeId pointee, + uint32_t address_space); CfreeCgTypeId cfree_cg_type_array(CfreeCompiler*, CfreeCgTypeId elem, - uint32_t count); -CfreeCgTypeId cfree_cg_type_qualified(CfreeCompiler*, CfreeCgTypeId base, - uint32_t quals); + uint64_t count); -/* Fresh nominal/source-facing types. */ +/* Fresh nominal/source-facing types. Enums use a width-only integer base. */ CfreeCgTypeId cfree_cg_type_alias(CfreeCompiler*, CfreeSym name, CfreeCgTypeId base); CfreeCgTypeId cfree_cg_type_record(CfreeCompiler*, CfreeSym tag, @@ -92,23 +143,89 @@ CfreeCgTypeId cfree_cg_type_enum(CfreeCompiler*, CfreeSym tag, const CfreeCgEnumValue* values, uint32_t nvalues); -/* Type queries. */ +/* Type queries. These report codegen storage, ABI, and target layout facts. 
*/ +CfreeCgTypeKind cfree_cg_type_kind(CfreeCompiler*, CfreeCgTypeId); uint64_t cfree_cg_type_size(CfreeCompiler*, CfreeCgTypeId); uint32_t cfree_cg_type_align(CfreeCompiler*, CfreeCgTypeId); - -int cfree_cg_type_is_ptr(CfreeCompiler*, CfreeCgTypeId); -int cfree_cg_type_is_func(CfreeCompiler*, CfreeCgTypeId); -int cfree_cg_type_is_record(CfreeCompiler*, CfreeCgTypeId); +uint32_t cfree_cg_type_int_width(CfreeCompiler*, CfreeCgTypeId); +uint32_t cfree_cg_type_float_width(CfreeCompiler*, CfreeCgTypeId); CfreeCgTypeId cfree_cg_type_ptr_pointee(CfreeCompiler*, CfreeCgTypeId); +uint32_t cfree_cg_type_ptr_address_space(CfreeCompiler*, CfreeCgTypeId); +CfreeCgTypeId cfree_cg_type_array_elem(CfreeCompiler*, CfreeCgTypeId); +uint64_t cfree_cg_type_array_count(CfreeCompiler*, CfreeCgTypeId); + CfreeCgTypeId cfree_cg_type_func_ret(CfreeCompiler*, CfreeCgTypeId); +CfreeCgAbiAttrs cfree_cg_type_func_ret_attrs(CfreeCompiler*, CfreeCgTypeId); uint32_t cfree_cg_type_func_nparams(CfreeCompiler*, CfreeCgTypeId); -CfreeCgTypeId cfree_cg_type_func_param(CfreeCompiler*, CfreeCgTypeId, - uint32_t index); +CfreeCgParam cfree_cg_type_func_param(CfreeCompiler*, CfreeCgTypeId, + uint32_t index); +CfreeCgCallConv cfree_cg_type_func_call_conv(CfreeCompiler*, CfreeCgTypeId); +int cfree_cg_type_func_is_variadic(CfreeCompiler*, CfreeCgTypeId); uint32_t cfree_cg_type_record_nfields(CfreeCompiler*, CfreeCgTypeId); int cfree_cg_type_record_field(CfreeCompiler*, CfreeCgTypeId, uint32_t index, - CfreeCgField* out); + CfreeCgField* out, uint64_t* offset_out); + +typedef enum CfreeCgSymbolFeature { + CFREE_CG_SYMFEAT_WEAK, + CFREE_CG_SYMFEAT_PROTECTED_VISIBILITY, + CFREE_CG_SYMFEAT_DLLIMPORT, + CFREE_CG_SYMFEAT_DLLEXPORT, + CFREE_CG_SYMFEAT_COMDAT, + CFREE_CG_SYMFEAT_COMMON, + CFREE_CG_SYMFEAT_MERGE_SECTIONS, + CFREE_CG_SYMFEAT_CONSTRUCTOR_PRIORITY, + CFREE_CG_SYMFEAT_TLS_LOCAL_EXEC, + CFREE_CG_SYMFEAT_TLS_INITIAL_EXEC, + CFREE_CG_SYMFEAT_TLS_LOCAL_DYNAMIC, + CFREE_CG_SYMFEAT_TLS_GENERAL_DYNAMIC, +} 
CfreeCgSymbolFeature; + +typedef enum CfreeCgBackendFeatureFlag { + CFREE_CG_BACKEND_UNALIGNED_MEMORY = 1ull << 0, + CFREE_CG_BACKEND_STRICT_ALIGNMENT = 1ull << 1, + CFREE_CG_BACKEND_RED_ZONE = 1ull << 2, + CFREE_CG_BACKEND_SIMD = 1ull << 3, + CFREE_CG_BACKEND_POINTER_AUTH = 1ull << 4, + CFREE_CG_BACKEND_BRANCH_PROTECTION = 1ull << 5, +} CfreeCgBackendFeatureFlag; + +/* Capability queries answer whether the selected target/API can lower the + * requested feature correctly, not whether it is fast. These are target + * facts, not knobs: frontends use them to choose a legal lowering or to emit + * an unsupported-feature diagnostic before asking CG to produce output. */ +int cfree_cg_target_supports_call_conv(CfreeCompiler*, CfreeCgCallConv); +int cfree_cg_target_supports_symbol_feature(CfreeCompiler*, + CfreeCgSymbolFeature); +uint64_t cfree_cg_target_backend_features(CfreeCompiler*); + +/* ============================================================ + * Memory Access + * ============================================================ */ + +typedef enum CfreeCgMemAccessFlag { + CFREE_CG_MEM_NONE = 0, + /* Access is an externally observable side effect and must not be merged, + * removed, or reordered across other volatile accesses. */ + CFREE_CG_MEM_VOLATILE = 1u << 0, + /* Streaming/cache hint: the access is unlikely to be reused soon. Targets + * may select non-temporal load/store instructions or ignore the hint. */ + CFREE_CG_MEM_NONTEMPORAL = 1u << 1, + /* The pointed-to contents are known not to change for the relevant program + * region except through this access path. This is stronger than readonly + * object placement and should be set only when the frontend can prove it. 
*/ + CFREE_CG_MEM_INVARIANT = 1u << 2, +} CfreeCgMemAccessFlag; + +typedef struct CfreeCgMemAccess { + CfreeCgTypeId type; /* value type loaded/stored, or element type */ + uint32_t align; /* 0 = natural for type */ + uint32_t address_space; /* normally inherited from pointer type */ + uint32_t flags; /* CfreeCgMemAccessFlag */ + uint32_t alias_scope; + uint32_t noalias_scope; +} CfreeCgMemAccess; /* ============================================================ * Declarations and Symbols @@ -141,10 +258,18 @@ typedef enum CfreeCgFuncFlag { CFREE_CG_FUNC_NONE = 0, CFREE_CG_FUNC_NORETURN = 1u << 0, CFREE_CG_FUNC_IFUNC = 1u << 1, + CFREE_CG_FUNC_COLD = 1u << 2, + CFREE_CG_FUNC_HOT = 1u << 3, + CFREE_CG_FUNC_NAKED = 1u << 4, + CFREE_CG_FUNC_INTERRUPT = 1u << 5, + CFREE_CG_FUNC_NO_RED_ZONE = 1u << 6, } CfreeCgFuncFlag; typedef struct CfreeCgFuncAttrs { - uint32_t flags; /* CfreeCgFuncFlag */ + uint32_t flags; /* CfreeCgFuncFlag */ + uint32_t stack_align; /* 0 = ABI default */ + CfreeSym section; /* 0 = target default */ + CfreeSym target_features; } CfreeCgFuncAttrs; typedef enum CfreeCgTlsModel { @@ -169,6 +294,8 @@ typedef enum CfreeCgObjectFlag { typedef struct CfreeCgObjectAttrs { CfreeCgTlsModel tls_model; uint32_t flags; /* CfreeCgObjectFlag */ + CfreeSym section; /* 0 = target default */ + uint32_t align; /* 0 = natural */ } CfreeCgObjectAttrs; typedef enum CfreeCgDeclKind { @@ -176,30 +303,37 @@ typedef enum CfreeCgDeclKind { CFREE_CG_DECL_OBJECT, } CfreeCgDeclKind; -typedef struct CfreeCgDeclAttrs { +typedef struct CfreeCgDecl { CfreeCgDeclKind kind; + CfreeSym linkage_name; /* exact linker-visible spelling */ + CfreeSym display_name; /* optional source/debug spelling; 0 = linkage_name */ + CfreeCgTypeId type; CfreeCgSymbolAttrs sym; union { CfreeCgFuncAttrs func; CfreeCgObjectAttrs object; } as; -} CfreeCgDeclAttrs; +} CfreeCgDecl; + +typedef struct CfreeCgAlias { + CfreeSym linkage_name; + CfreeSym display_name; /* optional source/debug spelling; 0 = 
linkage_name */ + CfreeCgSym target; + CfreeCgSymbolAttrs sym; +} CfreeCgAlias; /* The declared type is the function type for function declarations and the - * object type for object declarations. A nonzero name is the linkage name - * before target object-format decoration; frontends should uniquify internal - * symbols whose source spelling is not link-unique. + * object type for object declarations. linkage_name is already mangled and + * object-format decorated as desired by the frontend; CG does not apply a + * C-language name policy. * * Undefined weak references are ordinary declarations with sym.bind = * CFREE_SB_WEAK and no definition. Weak aliases are aliases whose attrs bind * is CFREE_SB_WEAK. */ -CfreeCgSym cfree_cg_decl(CfreeCg*, CfreeSym name, CfreeCgTypeId type, - CfreeCgDeclAttrs attrs); +CfreeCgSym cfree_cg_decl(CfreeCg*, CfreeCgDecl decl); -/* Defines alias_name as another symbol for target. attrs supplies the alias - * symbol's binding, visibility, and platform export/import flags. */ -CfreeCgSym cfree_cg_alias(CfreeCg*, CfreeSym alias_name, CfreeCgSym target, - CfreeCgSymbolAttrs attrs); +/* Defines alias.linkage_name as another symbol for alias.target. 
*/ +CfreeCgSym cfree_cg_alias(CfreeCg*, CfreeCgAlias alias); /* ============================================================ * Lifecycle and Source Locations @@ -219,9 +353,28 @@ void cfree_cg_set_loc(CfreeCg*, CfreeSrcLoc); void cfree_cg_func_begin(CfreeCg*, CfreeCgSym sym); void cfree_cg_func_end(CfreeCg*); -CfreeCgSlot cfree_cg_local_slot(CfreeCg*, CfreeCgTypeId type, CfreeSym name); +typedef enum CfreeCgSlotFlag { + CFREE_CG_SLOTFLAG_NONE = 0, + CFREE_CG_SLOT_ADDRESS_TAKEN = 1u << 0, + CFREE_CG_SLOT_ARTIFICIAL = 1u << 1, + CFREE_CG_SLOT_OPTIMIZED_OUT = 1u << 2, + CFREE_CG_SLOT_COMPILER_TEMP = 1u << 3, +} CfreeCgSlotFlag; + +typedef struct CfreeCgSlotAttrs { + CfreeSym name; + uint32_t align; /* 0 = natural */ + uint32_t flags; /* CfreeCgSlotFlag */ +} CfreeCgSlotAttrs; + +CfreeCgSlot cfree_cg_local_slot(CfreeCg*, CfreeCgTypeId type, + CfreeCgSlotAttrs attrs); CfreeCgSlot cfree_cg_param_slot(CfreeCg*, uint32_t index, CfreeCgTypeId type, - CfreeSym name); + CfreeCgSlotAttrs attrs); + +/* Pops a byte size and pushes a pointer to stack storage with at least align + * alignment. The allocation lifetime is the current function activation. 
*/ +void cfree_cg_alloca(CfreeCg*, uint32_t align, CfreeCgTypeId result_ptr_type); /* ============================================================ * Control flow @@ -258,6 +411,46 @@ void cfree_cg_jump(CfreeCg*, CfreeCgLabel); void cfree_cg_branch_true(CfreeCg*, CfreeCgLabel); void cfree_cg_branch_false(CfreeCg*, CfreeCgLabel); +typedef struct CfreeCgSwitchCase { + uint64_t value; /* bit pattern interpreted using selector_type */ + CfreeCgLabel label; +} CfreeCgSwitchCase; + +typedef enum CfreeCgSwitchHint { + CFREE_CG_SWITCH_TARGET_DEFAULT, + CFREE_CG_SWITCH_BRANCH_CHAIN, + CFREE_CG_SWITCH_JUMP_TABLE, +} CfreeCgSwitchHint; + +typedef struct CfreeCgSwitch { + CfreeCgTypeId selector_type; + CfreeCgLabel default_label; + const CfreeCgSwitchCase* cases; + uint32_t ncases; + CfreeCgSwitchHint hint; +} CfreeCgSwitch; + +/* Pops an integer selector and branches to the matching case or default. The + * target may ignore hint when another lowering is required for correctness. */ +void cfree_cg_switch(CfreeCg*, CfreeCgSwitch sw); + +/* Pushes the address of a label in the current function. Label addresses are + * first-class pointer values for direct-threaded interpreters: they may be + * stored, loaded, selected from tables, compared for equality, and consumed by + * cfree_cg_computed_goto. They are only valid within the defining function's + * dynamic activation and must not be called or dereferenced as data. */ +void cfree_cg_push_label_addr(CfreeCg*, CfreeCgLabel, CfreeCgTypeId ptr_type); + +/* Pops a label address and branches to it. valid_targets may be NULL when the + * frontend cannot enumerate them, but providing it lets targets validate and + * apply branch-protection lowering. */ +void cfree_cg_computed_goto(CfreeCg*, const CfreeCgLabel* valid_targets, + uint32_t ntargets); + +/* Terminates the current block with unreachable code. This is a real + * terminator, not a side-effect intrinsic. 
*/ +void cfree_cg_unreachable(CfreeCg*); + /* ============================================================ * Value Stack and Lvalues * ============================================================ */ @@ -269,6 +462,7 @@ void cfree_cg_rot3(CfreeCg*); /* [..., a, b, c] -> [..., b, c, a] */ void cfree_cg_push_int(CfreeCg*, uint64_t value, CfreeCgTypeId type); void cfree_cg_push_float(CfreeCg*, double value, CfreeCgTypeId type); +void cfree_cg_push_null(CfreeCg*, CfreeCgTypeId ptr_type); void cfree_cg_push_local(CfreeCg*, CfreeCgSlot slot); /* Anonymous immutable data. Returns a local readonly object symbol; callers @@ -288,10 +482,16 @@ void cfree_cg_push_symbol_addr(CfreeCg*, CfreeCgSym sym, int64_t addend); * indirect lvalue. */ void cfree_cg_push_symbol_lvalue(CfreeCg*, CfreeCgSym sym, int64_t addend); -/* Computes base + offset + index * elemsz and pushes the element lvalue. - * Stack is [base, index]. elemsz is inferred from the base pointer/array - * type; index may be a constant produced by cfree_cg_push_int. */ -void cfree_cg_index(CfreeCg*, uint32_t offset); +/* Pops a pointer rvalue or lvalue address and pushes address + byte_offset as + * the requested result pointer/lvalue type. This is the generic primitive for + * frontend-owned aggregate layouts and non-standard record field offsets. */ +void cfree_cg_addr_offset(CfreeCg*, int64_t byte_offset, + CfreeCgTypeId result_type); + +/* Computes base + offset + index * element-size and pushes the element lvalue. + * Stack is [base, index]. The element size comes from the base pointer/array + * type and the access descriptor used by the eventual memory operation. */ +void cfree_cg_index(CfreeCg*, uint64_t offset); /* Pops a record lvalue and pushes the field lvalue. Offset is inferred from * the record type and field_index. Use cfree_cg_addr after this when an @@ -300,9 +500,9 @@ void cfree_cg_field(CfreeCg*, uint32_t field_index); /* Converts a pointer rvalue TOS from *T to an lvalue T. 
 */
 void cfree_cg_indirect(CfreeCg*);
-void cfree_cg_load(CfreeCg*);
+void cfree_cg_load(CfreeCg*, CfreeCgMemAccess access);
 void cfree_cg_addr(CfreeCg*);
-void cfree_cg_store(CfreeCg*); /* [..., lv, rv] -> [] */
+void cfree_cg_store(CfreeCg*, CfreeCgMemAccess access); /* [lv, rv] -> [] */
 /* ============================================================
  * ABI variadic argument access
@@ -321,92 +521,245 @@ void cfree_cg_vararg_end(CfreeCg*); /* pop &state */
 void cfree_cg_vararg_copy(CfreeCg*); /* pop &dst, &src */
 /* ============================================================
- * Operators, Calls, Intrinsics, and Atomics
+ * Integer Operations
+ * ============================================================ */
+
+typedef enum CfreeCgIntBinOp {
+  CFREE_CG_INT_ADD,
+  CFREE_CG_INT_SUB,
+  CFREE_CG_INT_MUL,
+  CFREE_CG_INT_SDIV,
+  CFREE_CG_INT_UDIV,
+  CFREE_CG_INT_SREM,
+  CFREE_CG_INT_UREM,
+  CFREE_CG_INT_AND,
+  CFREE_CG_INT_OR,
+  CFREE_CG_INT_XOR,
+  CFREE_CG_INT_SHL,
+  CFREE_CG_INT_LSHR,
+  CFREE_CG_INT_ASHR,
+} CfreeCgIntBinOp;
+
+typedef enum CfreeCgIntOpFlag {
+  CFREE_CG_INTOP_NONE = 0,
+  CFREE_CG_INTOP_NSW = 1u << 0,
+  CFREE_CG_INTOP_NUW = 1u << 1,
+  CFREE_CG_INTOP_EXACT = 1u << 2,
+  /* Overflow semantics are explicit because integer types are width-only.
+   * Signed and unsigned trap/saturate flags are mutually exclusive. */
+  CFREE_CG_INTOP_TRAP_SIGNED_OVERFLOW = 1u << 3,
+  CFREE_CG_INTOP_TRAP_UNSIGNED_OVERFLOW = 1u << 4,
+  CFREE_CG_INTOP_SATURATE_SIGNED = 1u << 5,
+  CFREE_CG_INTOP_SATURATE_UNSIGNED = 1u << 6,
+} CfreeCgIntOpFlag;
+
+typedef enum CfreeCgIntCmpOp {
+  CFREE_CG_INT_EQ,
+  CFREE_CG_INT_NE,
+  CFREE_CG_INT_LT_S,
+  CFREE_CG_INT_LE_S,
+  CFREE_CG_INT_GT_S,
+  CFREE_CG_INT_GE_S,
+  CFREE_CG_INT_LT_U,
+  CFREE_CG_INT_LE_U,
+  CFREE_CG_INT_GT_U,
+  CFREE_CG_INT_GE_U,
+} CfreeCgIntCmpOp;
+
+typedef enum CfreeCgIntUnOp {
+  CFREE_CG_INT_NEG,
+  CFREE_CG_INT_NOT,
+  CFREE_CG_INT_BNOT,
+} CfreeCgIntUnOp;
+
+void cfree_cg_int_binop(CfreeCg*, CfreeCgIntBinOp, uint32_t flags);
+void cfree_cg_int_unop(CfreeCg*, CfreeCgIntUnOp, uint32_t flags);
+void cfree_cg_int_cmp(CfreeCg*, CfreeCgIntCmpOp);
+
+/* ============================================================
+ * Floating-Point Operations
+ * ============================================================ */
+
+typedef enum CfreeCgFpBinOp {
+  CFREE_CG_FP_ADD,
+  CFREE_CG_FP_SUB,
+  CFREE_CG_FP_MUL,
+  CFREE_CG_FP_DIV,
+  CFREE_CG_FP_REM,
+} CfreeCgFpBinOp;
+
+typedef enum CfreeCgFpCmpOp {
+  CFREE_CG_FP_OEQ,
+  CFREE_CG_FP_ONE,
+  CFREE_CG_FP_OLT,
+  CFREE_CG_FP_OLE,
+  CFREE_CG_FP_OGT,
+  CFREE_CG_FP_OGE,
+  CFREE_CG_FP_UEQ,
+  CFREE_CG_FP_UNE,
+  CFREE_CG_FP_ULT,
+  CFREE_CG_FP_ULE,
+  CFREE_CG_FP_UGT,
+  CFREE_CG_FP_UGE,
+} CfreeCgFpCmpOp;
+
+typedef enum CfreeCgFpUnOp {
+  CFREE_CG_FP_NEG,
+} CfreeCgFpUnOp;
+
+typedef enum CfreeCgFpFlag {
+  CFREE_CG_FP_NONE = 0,
+  CFREE_CG_FP_REASSOC = 1u << 0,
+  CFREE_CG_FP_NO_NANS = 1u << 1,
+  CFREE_CG_FP_NO_INFS = 1u << 2,
+  CFREE_CG_FP_NO_SIGNED_ZEROS = 1u << 3,
+  CFREE_CG_FP_ALLOW_RECIP = 1u << 4,
+  CFREE_CG_FP_APPROX = 1u << 5,
+} CfreeCgFpFlag;
+
+void cfree_cg_fp_binop(CfreeCg*, CfreeCgFpBinOp, uint32_t flags);
+void cfree_cg_fp_unop(CfreeCg*, CfreeCgFpUnOp, uint32_t flags);
+void cfree_cg_fp_cmp(CfreeCg*, CfreeCgFpCmpOp);
+
+/* ============================================================
+ * Conversions
+ * ============================================================ */
+
+typedef enum CfreeCgRounding {
+  CFREE_CG_ROUND_DEFAULT,
+  CFREE_CG_ROUND_NEAREST_EVEN,
+  CFREE_CG_ROUND_TOWARD_ZERO,
+  CFREE_CG_ROUND_DOWN,
+  CFREE_CG_ROUND_UP,
+} CfreeCgRounding;
+
+void cfree_cg_sext(CfreeCg*, CfreeCgTypeId dst);
+void cfree_cg_zext(CfreeCg*, CfreeCgTypeId dst);
+void cfree_cg_trunc(CfreeCg*, CfreeCgTypeId dst);
+void cfree_cg_ptr_to_int(CfreeCg*, CfreeCgTypeId dst);
+void cfree_cg_int_to_ptr(CfreeCg*, CfreeCgTypeId dst);
+void cfree_cg_bitcast(CfreeCg*, CfreeCgTypeId dst);
+void cfree_cg_fpext(CfreeCg*, CfreeCgTypeId dst);
+void cfree_cg_fptrunc(CfreeCg*, CfreeCgTypeId dst);
+void cfree_cg_sint_to_float(CfreeCg*, CfreeCgTypeId dst,
+                            CfreeCgRounding rounding);
+void cfree_cg_uint_to_float(CfreeCg*, CfreeCgTypeId dst,
+                            CfreeCgRounding rounding);
+void cfree_cg_float_to_sint(CfreeCg*, CfreeCgTypeId dst,
+                            CfreeCgRounding rounding);
+void cfree_cg_float_to_uint(CfreeCg*, CfreeCgTypeId dst,
+                            CfreeCgRounding rounding);
+
+/* ============================================================
+ * Calls and Returns
  * ============================================================ */
-typedef enum CfreeCgBinOp {
-  CFREE_CG_ADD,
-  CFREE_CG_SUB,
-  CFREE_CG_MUL,
-  CFREE_CG_SDIV,
-  CFREE_CG_UDIV,
-  CFREE_CG_SREM,
-  CFREE_CG_UREM,
-  CFREE_CG_AND,
-  CFREE_CG_OR,
-  CFREE_CG_XOR,
-  CFREE_CG_SHL,
-  CFREE_CG_SHR_S,
-  CFREE_CG_SHR_U,
-} CfreeCgBinOp;
-
-typedef enum CfreeCgCmpOp {
-  CFREE_CG_EQ,
-  CFREE_CG_NE,
-  CFREE_CG_LT_S,
-  CFREE_CG_LE_S,
-  CFREE_CG_GT_S,
-  CFREE_CG_GE_S,
-  CFREE_CG_LT_U,
-  CFREE_CG_LE_U,
-  CFREE_CG_GT_U,
-  CFREE_CG_GE_U,
-} CfreeCgCmpOp;
-
-typedef enum CfreeCgUnOp {
-  CFREE_CG_NEG,
-  CFREE_CG_NOT,
-  CFREE_CG_BNOT,
-} CfreeCgUnOp;
-
-void cfree_cg_binop(CfreeCg*, CfreeCgBinOp);
-void cfree_cg_unop(CfreeCg*, CfreeCgUnOp);
-void cfree_cg_cmp(CfreeCg*, CfreeCgCmpOp);
-void cfree_cg_convert(CfreeCg*, CfreeCgTypeId dst);
+typedef enum CfreeCgTailPolicy {
+  CFREE_CG_TAIL_DEFAULT,
+  CFREE_CG_TAIL_ALLOWED,
+  CFREE_CG_TAIL_MUST,
+  CFREE_CG_TAIL_NEVER,
+} CfreeCgTailPolicy;
+
+typedef enum CfreeCgCallFlag {
+  CFREE_CG_CALL_NONE = 0,
+  CFREE_CG_CALL_COLD = 1u << 0,
+} CfreeCgCallFlag;
+
+typedef struct CfreeCgCallAttrs {
+  CfreeCgTailPolicy tail;
+  uint32_t flags; /* CfreeCgCallFlag */
+} CfreeCgCallAttrs;
 /* cfree_cg_call pops a computed function pointer plus nargs arguments.
  * cfree_cg_call_symbol emits a direct call to the declared function symbol,
- * allowing the backend/linker to choose PLT/stub/IAT/direct/IFUNC handling. */
-void cfree_cg_call(CfreeCg*, uint32_t nargs, CfreeCgTypeId fn_type);
-void cfree_cg_tail_call(CfreeCg*, uint32_t nargs, CfreeCgTypeId fn_type);
-void cfree_cg_call_symbol(CfreeCg*, CfreeCgSym sym, uint32_t nargs);
-void cfree_cg_tail_call_symbol(CfreeCg*, CfreeCgSym sym, uint32_t nargs);
+ * allowing the backend/linker to choose PLT/stub/IAT/direct/IFUNC handling.
+ * MUST tail calls should fail diagnostically if the ABI shapes are not
+ * compatible. */
+void cfree_cg_call(CfreeCg*, uint32_t nargs, CfreeCgTypeId fn_type,
+                   CfreeCgCallAttrs attrs);
+void cfree_cg_call_symbol(CfreeCg*, CfreeCgSym sym, uint32_t nargs,
+                          CfreeCgCallAttrs attrs);
 void cfree_cg_ret(CfreeCg*);
 void cfree_cg_ret_void(CfreeCg*);
+/* ============================================================
+ * Intrinsics
+ * ============================================================ */
+
 typedef enum CfreeCgIntrinsic {
   CFREE_CG_INTRIN_TRAP,
-  CFREE_CG_INTRIN_UNREACHABLE,
-  CFREE_CG_INTRIN_CLZ,
-  CFREE_CG_INTRIN_CTZ,
+  CFREE_CG_INTRIN_CLZ,            /* zero input returns bit width */
+  CFREE_CG_INTRIN_CTZ,            /* zero input returns bit width */
   CFREE_CG_INTRIN_POPCOUNT,
   CFREE_CG_INTRIN_BSWAP,
   CFREE_CG_INTRIN_SETJMP,         /* pop &buf; push i32 */
   CFREE_CG_INTRIN_LONGJMP,        /* pop &buf, val; no return */
-  CFREE_CG_INTRIN_ADD_OVERFLOW,   /* pop a, b; push result, ok_bool */
-  CFREE_CG_INTRIN_SUB_OVERFLOW,   /* pop a, b; push result, ok_bool */
-  CFREE_CG_INTRIN_MUL_OVERFLOW,   /* pop a, b; push result, ok_bool */
+  CFREE_CG_INTRIN_SADD_OVERFLOW,  /* pop a, b; push result, overflow */
+  CFREE_CG_INTRIN_UADD_OVERFLOW,  /* pop a, b; push result, overflow */
+  CFREE_CG_INTRIN_SSUB_OVERFLOW,  /* pop a, b; push result, overflow */
+  CFREE_CG_INTRIN_USUB_OVERFLOW,  /* pop a, b; push result, overflow */
+  CFREE_CG_INTRIN_SMUL_OVERFLOW,  /* pop a, b; push result, overflow */
+  CFREE_CG_INTRIN_UMUL_OVERFLOW,  /* pop a, b; push result, overflow */
+  CFREE_CG_INTRIN_FMA,            /* pop a, b, c; push a * b + c */
   CFREE_CG_INTRIN_PREFETCH,       /* pop addr; no result */
   CFREE_CG_INTRIN_EXPECT,         /* pop val, expected; push val */
   CFREE_CG_INTRIN_ASSUME_ALIGNED, /* pop ptr; push aligned ptr */
-  CFREE_CG_INTRIN_MEMCPY,         /* pop dst, src, n; push dst */
-  CFREE_CG_INTRIN_MEMMOVE,        /* pop dst, src, n; push dst */
-  CFREE_CG_INTRIN_MEMSET,         /* pop dst, byte, n; push dst */
-  CFREE_CG_INTRIN_MEMCMP,         /* pop lhs, rhs, n; push i32 */
+  CFREE_CG_INTRIN_SYSCALL,        /* pop nr, args...; push long */
+  CFREE_CG_INTRIN_IRQ_SAVE,       /* push unsigned long */
+  CFREE_CG_INTRIN_IRQ_RESTORE,    /* pop prev */
+  CFREE_CG_INTRIN_IRQ_DISABLE,
+  CFREE_CG_INTRIN_IRQ_ENABLE,
+  CFREE_CG_INTRIN_DMB,            /* pop CfreeCgBarrierScope */
+  CFREE_CG_INTRIN_DSB,            /* pop CfreeCgBarrierScope */
+  CFREE_CG_INTRIN_ISB,
+  CFREE_CG_INTRIN_DCACHE_CLEAN,   /* pop ptr, size */
+  CFREE_CG_INTRIN_DCACHE_INVALIDATE,
+  CFREE_CG_INTRIN_DCACHE_CLEAN_INVALIDATE,
+  CFREE_CG_INTRIN_ICACHE_INVALIDATE,
+  CFREE_CG_INTRIN_CPU_NOP,
+  CFREE_CG_INTRIN_CPU_YIELD,
+  CFREE_CG_INTRIN_WFI,
+  CFREE_CG_INTRIN_WFE,            /* arm/aarch64 only */
+  CFREE_CG_INTRIN_SEV,            /* arm/aarch64 only */
+  CFREE_CG_INTRIN_CORO_SWITCH,    /* pop from, to, value; push value */
 } CfreeCgIntrinsic;
+typedef enum CfreeCgBarrierScope {
+  CFREE_CG_BARRIER_FULL,
+  CFREE_CG_BARRIER_INNER,
+  CFREE_CG_BARRIER_INNER_STORE,
+  CFREE_CG_BARRIER_OUTER,
+  CFREE_CG_BARRIER_OUTER_STORE,
+  CFREE_CG_BARRIER_NON_SHARE,
+} CfreeCgBarrierScope;
+
 /* Pops nargs operands. Pushes result_type unless result_type is
  * CFREE_CG_TYPE_NONE or void. Overflow intrinsics push two values:
- * result, ok_bool regardless of result_type. */
+ * result, overflow_bool regardless of result_type. Syscall uses nargs = argc + 1:
+ * the syscall number plus 0..6 long arguments. Runtime-extension intrinsics
+ * mirror rt/include/cfree/{syscall,baremetal,coro}.h and must diagnose targets
+ * where the primitive has no legal lowering. */
 void cfree_cg_intrinsic(CfreeCg*, CfreeCgIntrinsic, uint32_t nargs,
                         CfreeCgTypeId result_type);
-/* Fixed-size aggregate memory operations. Stack:
+/* ============================================================
+ * Fixed-Sized Memory Operations
+ * ============================================================ */
+
+/* Stack:
  * memcpy/memmove: [dst, src] -> []
  * memset: [dst] -> [] */
-void cfree_cg_memcpy(CfreeCg*, uint32_t size, uint32_t align);
-void cfree_cg_memmove(CfreeCg*, uint32_t size, uint32_t align);
-void cfree_cg_memset(CfreeCg*, uint8_t val, uint32_t size, uint32_t align);
+void cfree_cg_memcpy(CfreeCg*, uint64_t size, CfreeCgMemAccess dst,
+                     CfreeCgMemAccess src);
+void cfree_cg_memmove(CfreeCg*, uint64_t size, CfreeCgMemAccess dst,
+                      CfreeCgMemAccess src);
+void cfree_cg_memset(CfreeCg*, uint8_t val, uint64_t size,
+                     CfreeCgMemAccess dst);
+
+/* ============================================================
+ * Atomics
+ * ============================================================ */
 typedef enum CfreeCgAtomicOp {
   CFREE_CG_ATOMIC_XCHG,
@@ -427,12 +780,19 @@ typedef enum CfreeCgMemOrder {
   CFREE_CG_MO_SEQ_CST,
 } CfreeCgMemOrder;
-void cfree_cg_atomic_load(CfreeCg*, CfreeCgMemOrder);
-void cfree_cg_atomic_store(CfreeCg*, CfreeCgMemOrder);
-void cfree_cg_atomic_rmw(CfreeCg*, CfreeCgAtomicOp, CfreeCgMemOrder);
+int cfree_cg_atomic_is_legal(CfreeCompiler*, CfreeCgMemAccess access,
+                             CfreeCgMemOrder order);
+int cfree_cg_atomic_is_lock_free(CfreeCompiler*, CfreeCgMemAccess access);
+void cfree_cg_atomic_load(CfreeCg*, CfreeCgMemAccess access,
+                          CfreeCgMemOrder order);
+void cfree_cg_atomic_store(CfreeCg*, CfreeCgMemAccess access,
+                           CfreeCgMemOrder order);
+void cfree_cg_atomic_rmw(CfreeCg*, CfreeCgMemAccess access, CfreeCgAtomicOp,
+                         CfreeCgMemOrder order);
 /* Stack: [ptr, expected, desired] -> [prior, ok_bool]. */
-void cfree_cg_atomic_cmpxchg(CfreeCg*, CfreeCgMemOrder success,
-                             CfreeCgMemOrder failure);
+void cfree_cg_atomic_cmpxchg(CfreeCg*, CfreeCgMemAccess access,
+                             CfreeCgMemOrder success,
+                             CfreeCgMemOrder failure, int weak);
 void cfree_cg_atomic_fence(CfreeCg*, CfreeCgMemOrder);
 /* ============================================================
@@ -448,26 +808,48 @@ typedef enum CfreeCgAsmDir {
 typedef enum CfreeCgAsmFlag {
   CFREE_CG_ASM_NONE = 0,
   CFREE_CG_ASM_VOLATILE = 1u << 0,
+  CFREE_CG_ASM_PURE = 1u << 1,
+  CFREE_CG_ASM_NOMEM = 1u << 2,
+  CFREE_CG_ASM_READONLY = 1u << 3,
+  CFREE_CG_ASM_PRESERVES_FLAGS = 1u << 4,
+  CFREE_CG_ASM_NOSTACK = 1u << 5,
+  CFREE_CG_ASM_NORETURN = 1u << 6,
 } CfreeCgAsmFlag;
+typedef enum CfreeCgAsmClobberAbiSet {
+  CFREE_CG_ASM_CLOBBER_ABI_NONE = 0,
+  CFREE_CG_ASM_CLOBBER_ABI_CALLER_SAVED = 1u << 0,
+} CfreeCgAsmClobberAbiSet;
+
 typedef struct CfreeCgAsmOperand {
-  CfreeSym constraint; /* interned GCC-style constraint string */
+  CfreeSym constraint; /* interned target constraint string */
   CfreeSym name;       /* interned symbolic operand name; 0 if absent */
   CfreeCgTypeId type;
   uint8_t dir; /* CfreeCgAsmDir */
   uint8_t pad[3];
 } CfreeCgAsmOperand;
+typedef struct CfreeCgInlineAsm {
+  CfreeSym tmpl;
+  const CfreeCgAsmOperand* outputs;
+  uint32_t noutputs;
+  const CfreeCgAsmOperand* inputs;
+  uint32_t ninputs;
+  const CfreeSym* clobbers;
+  uint32_t nclobbers;
+  uint32_t flags;            /* CfreeCgAsmFlag */
+  uint32_t clobber_abi_sets; /* CfreeCgAsmClobberAbiSet */
+} CfreeCgInlineAsm;
+
 /* Inputs are popped in declaration order. Outputs are pushed in declaration
  * order as fresh values after the asm block. INOUT outputs consume one
  * initial value each after the explicit inputs, in output declaration order;
- * the implementation binds those values with matching constraints. Template,
- * constraints, and clobbers are pre-interned strings. */
-void cfree_cg_inline_asm(CfreeCg*, CfreeSym tmpl,
-                         const CfreeCgAsmOperand* outputs, uint32_t noutputs,
-                         const CfreeCgAsmOperand* inputs, uint32_t ninputs,
-                         const CfreeSym* clobbers, uint32_t nclobbers,
-                         uint32_t flags);
+ * tied operands, earlyclobber, register classes, explicit registers,
+ * immediates, memory operands, and target-specific alternatives are expressed
+ * in the per-operand constraint string. Template, constraints, and clobbers
+ * are pre-interned strings. clobber_abi_sets names target-defined ABI register
+ * sets such as all caller-saved registers. */
+void cfree_cg_inline_asm(CfreeCg*, CfreeCgInlineAsm asm_block);
 /* ============================================================
  * Data Definitions
@@ -489,7 +871,12 @@ typedef struct CfreeCgDataDefAttrs {
 /* data_begin defines storage for an already-declared object symbol.
  * data_common defines tentative/common zero-initialized storage when the
- * target format supports it. */
+ * target format supports it.
+ *
+ * Data definitions may be emitted while a function body is open. The current
+ * function remains open across data_begin/data_end so frontends can define
+ * block-scope statics and computed-goto dispatch tables that need function
+ * label context. */
 void cfree_cg_data_begin(CfreeCg*, CfreeCgSym sym, CfreeCgDataDefAttrs attrs);
 void cfree_cg_data_common(CfreeCg*, CfreeCgSym sym, uint64_t size,
                           uint32_t align);
@@ -505,9 +892,24 @@ void cfree_cg_data_zero(CfreeCg*, uint64_t size);
 /* Relocatable data expressions. These describe the value encoded in the data
  * stream; they do not request a lowering strategy such as GOT, PLT, TLVP, or
- * a TLS access model. width is the encoded field width in bytes. */
+ * a TLS access model. width is the encoded field width in bytes. address_space
+ * is the pointer address space of address constants; use 0 for the target data
+ * address space. */
 void cfree_cg_data_addr(CfreeCg*, CfreeCgSym target, int64_t addend,
-                        uint32_t width);
+                        uint32_t width, uint32_t address_space);
+/* Encodes a function-local label address for direct-threaded dispatch tables.
+ * The target label must have been created by cfree_cg_label_new; it does not
+ * need to be placed yet, and the containing function must still be open so the
+ * label-address object can be tied to that function's label namespace. This
+ * supports the normal direct-threaded lowering: declare a dispatch-table
+ * symbol, begin the function, create labels, emit the table as data while the
+ * function is open, then resume code emission. The resulting value is an opaque
+ * label-address pointer: it may be loaded, stored, compared for equality,
+ * selected from tables, and consumed by cfree_cg_computed_goto in the label's
+ * defining function. It must not be called, dereferenced as data, or used by
+ * another function's computed goto. */
+void cfree_cg_data_label_addr(CfreeCg*, CfreeCgLabel target, int64_t addend,
+                              uint32_t width, uint32_t address_space);
 void cfree_cg_data_pcrel(CfreeCg*, CfreeCgSym target, int64_t addend,
                          uint32_t width);
 void cfree_cg_data_symdiff(CfreeCg*, CfreeCgSym lhs, CfreeCgSym rhs,
@@ -524,28 +926,77 @@ static inline void cfree_cg_push_bytes(CfreeCg* cg, const uint8_t* data,
   cfree_cg_push_symbol_addr(cg, sym, 0);
 }
+static inline void cfree_cg_call_default(CfreeCg* cg, uint32_t nargs,
+                                         CfreeCgTypeId fn_type) {
+  CfreeCgCallAttrs attrs;
+  attrs.tail = CFREE_CG_TAIL_DEFAULT;
+  attrs.flags = 0;
+  cfree_cg_call(cg, nargs, fn_type, attrs);
+}
+
+static inline void cfree_cg_call_symbol_default(CfreeCg* cg, CfreeCgSym sym,
+                                                uint32_t nargs) {
+  CfreeCgCallAttrs attrs;
+  attrs.tail = CFREE_CG_TAIL_DEFAULT;
+  attrs.flags = 0;
+  cfree_cg_call_symbol(cg, sym, nargs, attrs);
+}
+
+static inline void cfree_cg_tail_call(CfreeCg* cg, uint32_t nargs,
+                                      CfreeCgTypeId fn_type) {
+  CfreeCgCallAttrs attrs;
+  attrs.tail = CFREE_CG_TAIL_ALLOWED;
+  attrs.flags = 0;
+  cfree_cg_call(cg, nargs, fn_type, attrs);
+}
+
+static inline void cfree_cg_tail_call_symbol(CfreeCg* cg, CfreeCgSym sym,
+                                             uint32_t nargs) {
+  CfreeCgCallAttrs attrs;
+  attrs.tail = CFREE_CG_TAIL_ALLOWED;
+  attrs.flags = 0;
+  cfree_cg_call_symbol(cg, sym, nargs, attrs);
+}
+
+static inline void cfree_cg_musttail_call(CfreeCg* cg, uint32_t nargs,
+                                          CfreeCgTypeId fn_type) {
+  CfreeCgCallAttrs attrs;
+  attrs.tail = CFREE_CG_TAIL_MUST;
+  attrs.flags = 0;
+  cfree_cg_call(cg, nargs, fn_type, attrs);
+}
+
+static inline void cfree_cg_musttail_call_symbol(CfreeCg* cg, CfreeCgSym sym,
+                                                 uint32_t nargs) {
+  CfreeCgCallAttrs attrs;
+  attrs.tail = CFREE_CG_TAIL_MUST;
+  attrs.flags = 0;
+  cfree_cg_call_symbol(cg, sym, nargs, attrs);
+}
+
 /* Increment/decrement an lvalue in place. Stack: [lv] -> [result].
  * post=1 pushes the old value; post=0 pushes the new value.
- * op is CFREE_CG_ADD or CFREE_CG_SUB. ty is the promoted integer type
+ * op is CFREE_CG_INT_ADD or CFREE_CG_INT_SUB. ty is the promoted integer type
  * of the lvalue. */
-static inline void cfree_cg_inc_dec(CfreeCg* cg, CfreeCgBinOp op, int post,
-                                    CfreeCgTypeId ty) {
-  cfree_cg_dup(cg);  /* [lv, lv] */
-  cfree_cg_load(cg); /* [lv, old] */
+static inline void cfree_cg_inc_dec(CfreeCg* cg, CfreeCgIntBinOp op, int post,
+                                    CfreeCgTypeId ty,
+                                    CfreeCgMemAccess access) {
+  cfree_cg_dup(cg);          /* [lv, lv] */
+  cfree_cg_load(cg, access); /* [lv, old] */
   if (post) {
-    cfree_cg_dup(cg);             /* [lv, old, old] */
-    cfree_cg_push_int(cg, 1, ty); /* [lv, old, old, 1] */
-    cfree_cg_binop(cg, op);       /* [lv, old, new] */
-    cfree_cg_rot3(cg);            /* [old, new, lv] */
-    cfree_cg_swap(cg);            /* [old, lv, new] */
-    cfree_cg_store(cg);           /* [old] */
+    cfree_cg_dup(cg);              /* [lv, old, old] */
+    cfree_cg_push_int(cg, 1, ty);  /* [lv, old, old, 1] */
+    cfree_cg_int_binop(cg, op, 0); /* [lv, old, new] */
+    cfree_cg_rot3(cg);             /* [old, new, lv] */
+    cfree_cg_swap(cg);             /* [old, lv, new] */
+    cfree_cg_store(cg, access);    /* [old] */
   } else {
-    cfree_cg_push_int(cg, 1, ty); /* [lv, old, 1] */
-    cfree_cg_binop(cg, op);       /* [lv, new] */
-    cfree_cg_dup(cg);             /* [lv, new, new] */
-    cfree_cg_rot3(cg);            /* [new, new, lv] */
-    cfree_cg_swap(cg);            /* [new, lv, new] */
-    cfree_cg_store(cg);           /* [new] */
+    cfree_cg_push_int(cg, 1, ty);  /* [lv, old, 1] */
+    cfree_cg_int_binop(cg, op, 0); /* [lv, new] */
+    cfree_cg_dup(cg);              /* [lv, new, new] */
+    cfree_cg_rot3(cg);             /* [new, new, lv] */
+    cfree_cg_swap(cg);             /* [new, lv, new] */
+    cfree_cg_store(cg, access);    /* [new] */
   }
 }
@@ -578,11 +1029,11 @@ static inline void cfree_cg_bitget(CfreeCg* cg, CfreeCgTypeId ty, uint32_t lo,
                                    uint32_t width) {
   if (lo > 0) {
     cfree_cg_push_int(cg, lo, ty);
-    cfree_cg_binop(cg, CFREE_CG_SHR_U);
+    cfree_cg_int_binop(cg, CFREE_CG_INT_LSHR, 0);
   }
   if (width < 64) {
     cfree_cg_push_int(cg, (1ULL << width) - 1, ty);
-    cfree_cg_binop(cg, CFREE_CG_AND);
+    cfree_cg_int_binop(cg, CFREE_CG_INT_AND, 0);
   }
 }
@@ -598,17 +1049,17 @@ static inline void cfree_cg_bitset(CfreeCg* cg, CfreeCgTypeId ty, uint32_t lo,
   cfree_cg_swap(cg); /* [src, dst] */
   cfree_cg_dup(cg);  /* [src, dst, dst] */
   cfree_cg_push_int(cg, clear_mask, ty);
-  cfree_cg_binop(cg, CFREE_CG_AND); /* [src, dst, dst_cleared] */
-  cfree_cg_rot3(cg);                /* [dst, dst_cleared, src] */
+  cfree_cg_int_binop(cg, CFREE_CG_INT_AND, 0); /* [src, dst, dst_cleared] */
+  cfree_cg_rot3(cg);                           /* [dst, dst_cleared, src] */
   if (lo > 0) {
     cfree_cg_push_int(cg, lo, ty);
-    cfree_cg_binop(cg, CFREE_CG_SHL); /* [dst, dst_cleared, src<<lo] */
+    cfree_cg_int_binop(cg, CFREE_CG_INT_SHL, 0); /* [dst, dst_cleared, src<<lo] */
   }
   cfree_cg_push_int(cg, field_mask, ty);
-  cfree_cg_binop(cg, CFREE_CG_AND); /* [dst, dst_cleared, bits] */
-  cfree_cg_rot3(cg);                /* [dst_cleared, bits, dst] */
-  cfree_cg_drop(cg);                /* [dst_cleared, bits] */
-  cfree_cg_binop(cg, CFREE_CG_OR);  /* [result] */
+  cfree_cg_int_binop(cg, CFREE_CG_INT_AND, 0); /* [dst, dst_cleared, bits] */
+  cfree_cg_rot3(cg);                           /* [dst_cleared, bits, dst] */
+  cfree_cg_drop(cg);                           /* [dst_cleared, bits] */
+  cfree_cg_int_binop(cg, CFREE_CG_INT_OR, 0);  /* [result] */
 }
 #endif
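The stack comments in `cfree_cg_inc_dec` can be checked with a small host-side model. The sketch below is not part of `cfree/cg.h`: the `Slot` and `Sim` types and every `sim_*` helper are hypothetical stand-ins, and the `rot3` behaviour (`[a, b, c] -> [b, c, a]`) is an assumption inferred from the stack comments in the diff, not from the implementation. It replays the same op sequence on a plain `int64_t` cell and returns the value left on the stack.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical host-side model of the CG value stack; not part of cfree/cg.h. */
typedef struct Slot {
  int is_lvalue; /* 1: ptr names a memory cell; 0: val is a plain value */
  int64_t* ptr;
  int64_t val;
} Slot;

typedef struct Sim {
  Slot s[8];
  int n;
} Sim;

static void sim_push(Sim* m, Slot v) {
  assert(m->n < 8);
  m->s[m->n++] = v;
}
static Slot sim_pop(Sim* m) {
  assert(m->n > 0);
  return m->s[--m->n];
}
static void sim_push_lv(Sim* m, int64_t* p) {
  Slot v = {1, p, 0};
  sim_push(m, v);
}
static void sim_push_int(Sim* m, int64_t x) {
  Slot v = {0, 0, x};
  sim_push(m, v);
}
static void sim_dup(Sim* m) {
  assert(m->n > 0);
  sim_push(m, m->s[m->n - 1]);
}
static void sim_swap(Sim* m) { /* [a, b] -> [b, a] */
  Slot b = sim_pop(m);
  Slot a = sim_pop(m);
  sim_push(m, b);
  sim_push(m, a);
}
static void sim_rot3(Sim* m) { /* [a, b, c] -> [b, c, a] (assumed) */
  Slot c = sim_pop(m);
  Slot b = sim_pop(m);
  Slot a = sim_pop(m);
  sim_push(m, b);
  sim_push(m, c);
  sim_push(m, a);
}
static void sim_load(Sim* m) { /* [lv] -> [value] */
  Slot lv = sim_pop(m);
  assert(lv.is_lvalue);
  sim_push_int(m, *lv.ptr);
}
static void sim_store(Sim* m) { /* [lv, rv] -> [] */
  Slot rv = sim_pop(m);
  Slot lv = sim_pop(m);
  assert(lv.is_lvalue && !rv.is_lvalue);
  *lv.ptr = rv.val;
}
static void sim_binop(Sim* m, int64_t sign) { /* [a, b] -> [a + sign * b] */
  Slot b = sim_pop(m);
  Slot a = sim_pop(m);
  sim_push_int(m, a.val + sign * b.val);
}

/* Mirrors the op order of cfree_cg_inc_dec. sign is +1 for the ADD case and
 * -1 for the SUB case; returns the single value left on the stack. */
static int64_t sim_inc_dec(int64_t* cell, int64_t sign, int post) {
  Sim m;
  m.n = 0;
  sim_push_lv(&m, cell); /* [lv] */
  sim_dup(&m);           /* [lv, lv] */
  sim_load(&m);          /* [lv, old] */
  if (post) {
    sim_dup(&m);         /* [lv, old, old] */
    sim_push_int(&m, 1); /* [lv, old, old, 1] */
    sim_binop(&m, sign); /* [lv, old, new] */
    sim_rot3(&m);        /* [old, new, lv] */
    sim_swap(&m);        /* [old, lv, new] */
    sim_store(&m);       /* [old] */
  } else {
    sim_push_int(&m, 1); /* [lv, old, 1] */
    sim_binop(&m, sign); /* [lv, new] */
    sim_dup(&m);         /* [lv, new, new] */
    sim_rot3(&m);        /* [new, new, lv] */
    sim_swap(&m);        /* [new, lv, new] */
    sim_store(&m);       /* [new] */
  }
  assert(m.n == 1);
  return sim_pop(&m).val;
}
```

Under these assumptions, the post form leaves the old value on the stack and the pre form leaves the new value, with the cell updated in both cases, which matches the `post=1` / `post=0` description in the header comment.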