kit

git clone https://git.ryansepassi.com/git/kit.git

commit e807eee24c6272883531ca5fa1add0b7a4659ef8
parent 4e76fca39ef26ce1dee46abb8952f1a57ebd111a
Author: Ryan Sepassi <rsepassi@gmail.com>
Date:   Wed, 13 May 2026 20:24:14 -0700

TOY.md spec

Diffstat:
A doc/TOY.md      | 839 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
D doc/toy-todo.md | 623 -------------------------------------------------------------------------------
2 files changed, 839 insertions(+), 623 deletions(-)

diff --git a/doc/TOY.md b/doc/TOY.md @@ -0,0 +1,839 @@ +# Toy Language Specification + +Toy is a small, explicit language for exercising the public code generation API +in `include/cfree/cg.h`. It is not C. It uses C-like expressions where that +keeps tests readable, and prefix-oriented syntax where C syntax would make +parsing or lowering ambiguous. + +Toy source is statically typed, single-pass friendly, and LL(1)-oriented. +Definitions are local/private by default. Public linkage, external +declarations, target details, ABI details, and low-level code generation +features are spelled explicitly. + +## Program Structure + +A program is a sequence of declarations: + +```toy +type Word = u64; + +record Pair { + a: i32, + b: i32, +} + +pub fn main(): i32 { + let p: Pair = Pair { a: 10 }; + return p.a + p.b; +} +``` + +Declarations must appear before use unless a declaration form explicitly +defines an external symbol. Source names use C linkage spelling for object-file +symbols. + +## Lexical Conventions + +Identifiers are ASCII names beginning with a letter or `_`, followed by letters, +digits, or `_`. Integer literals are decimal. Floating literals contain a +decimal point. Byte string literals use double quotes. + +Byte string literals support `\0`, `\n`, `\t`, `\"`, `\\`, and `\xNN` escapes. +The type of a byte string literal is `[N]u8`, where `N` is the number of +emitted bytes. There is no separate text string type and no implicit trailing +zero byte. + +Language builtins use an `@` prefix. Builtin constants, attribute names, and +attribute arguments use dot names, for example `.seq_cst`, `.sysv`, +`.strict_alignment`, `.bind`, and `.section`. This keeps names such as `bind`, +`visibility`, and `section` available as ordinary identifiers. + +Attribute lists use `@[...]`. 
+ +## Types + +Scalar types: + +- `void` +- `bool` +- `i8`, `u8`, `i16`, `u16`, `i32`, `u32`, `i64`, `u64`, `i128`, `u128` +- `isize`, `usize` +- `f32`, `f64` +- `va_list` + +Compound and user types: + +- Pointer: `*T` +- Address-space pointer: `*addrspace(N) T` +- Array: `[N]T` +- Function: `fn(T, U): R` +- Variadic function: `fn(T, ...): R` +- Transparent alias: `type Name = T;` +- Record: `record Name { field: T, ... }` +- Tuple record: `tuple Name { T, U }` +- Anonymous record: `record { field: T, ... }` +- Enum: `enum Name: BaseInt { .value = N, ... }` + +`type Name = T;` creates a transparent source alias. Values of `T` type-check +where `Name` is expected and values of `Name` type-check where `T` is expected. +The alias preserves source spelling for diagnostics and debug info but does not +create a distinct nominal type. + +Named records, tuple records, and enums are nominal. To get a distinct type with +the same representation as another type, wrap it in a record or tuple record: + +```toy +tuple Word { u64 } +``` + +Recursive types are expressed with nominal records and pointers. A record body +may refer to its own name through a pointer, and mutually recursive records use +forward declarations: + +```toy +record Node { + value: i32, + next: *Node, +} + +record A; +record B; + +record A { b: *B } +record B { a: *A } +``` + +Direct by-value recursion is rejected because it has no finite size. + +For code generation, recursive pointer fields may be lowered with type-erased +pointer storage. The frontend preserves the source pointee type and recovers it +when checking field access, loads, stores, calls, and casts. The erased storage +representation must have the same size and alignment as the source pointer type. + +Function types are not first-class values by themselves. Source values use +pointer-to-function types such as `*fn(i32): i32` or `*AliasToFunction`. 
+ +Type qualifiers are prefix forms: + +```toy +const T +volatile T +restrict T +``` + +`restrict` is valid only on pointer types after qualifier folding. Qualifiers +inform memory access and aliasing rules while lowering to the corresponding code +generation storage type. + +## Declarations + +Top-level declarations: + +```toy +fn local_helper(x: i32): i32 { ... } +pub fn exported(x: i32): i32 { ... } +extern fn puts(s: *i8): i32; + +var counter: i64; +let answer: i32 = 42; +pub var exported_counter: i64; +extern var errno: i32; +extern let ro_table: *u8; +var @[.threadlocal] tls_counter: i32; +extern var @[.threadlocal] errno_tls: i32; + +alias public_name = private_name; +pub alias exported_alias = target_name; +``` + +`fn`, `let`, `var`, and `alias` definitions are local binding by default. +`pub` changes the default binding to global. `extern` declares without defining +and defaults to global binding. The `.threadlocal` object attribute marks object +storage as thread-local and may appear with `pub` or `extern`. + +`let` defines readonly object storage unless it is `extern`. `var` defines +mutable object storage. `alias` targets must already be declared. + +Attribute lists are placed after the syntactic keyword that introduces the item +they decorate. Declaration attributes come after `fn`, `let`, `var`, or +`alias`. Record attributes come after `record`, `tuple`, or `enum`. Field and +parameter attributes come after the field or parameter name. Return ABI +attributes come after the return type. + +```toy +pub fn @[.bind(.weak), .visibility(.hidden), .section(".text.hot"), .hot] +fast_path(x: i32): i32 { ... 
} + +pub let @[.section(".rodata.tests"), .align(16), .used] +table: [4]i32 = [1, 2, 3, 4]; + +extern var @[.threadlocal, .tls_model(.local_exec)] tls_counter: i32; +``` + +Symbol attributes: + +- `.bind(.local|.global|.weak)` +- `.visibility(.default|.hidden|.protected)` +- `.used`, `.dllimport`, `.dllexport` + +Function attributes: + +- `.section("name")`, `.noreturn`, `.ifunc`, `.cold`, `.hot` +- `.naked`, `.interrupt`, `.no_red_zone` +- `.stack_align(N)`, `.target_features("features")` +- `.callconv(.target_c|.sysv|.win64|.aapcs|.wasm|.interrupt)` + +Object attributes: + +- `.section("name")`, `.align(N)`, `.readonly`, `.threadlocal`, `.static` +- `.tls_model(.auto|.local_exec|.initial_exec|.local_dynamic|.general_dynamic)` +- `.common` + +Data-definition attributes for object definitions: + +- `.retain`, `.merge`, `.strings`, `.entsize(N)` + +Parameter and return ABI attributes are written inline: + +```toy +extern fn sx(x @[.signext]: i8): u8 @[.zeroext]; +extern fn byval(p @[.byval, .align(16)]: *Pair): void; +extern fn borrowed(p @[.byref, .readonly, .noalias, .nonnull, + .dereferenceable(32)]: *Pair): void; +``` + +ABI attributes are `.signext`, `.zeroext`, `.sret`, `.byval`, `.byref`, +`.inreg`, `.noalias`, `.readonly`, `.writeonly`, `.nonnull`, `.nest`, +`.align(N)`, and `.dereferenceable(N)`. + +## Records, Enums, And Aggregates + +Named records, tuple records, and enums are nominal. Record fields are named. +Tuple record fields are anonymous and are accessed by numeric field names in +declaration order. + +```toy +record @[.packed] Header { + tag: u8, + len: u32, +} + +record Padded { + x @[.align(16)]: i32, + y: i32, +} + +tuple Tuple { + i32, + i32, +} + +enum Color: i32 { + .red = 1, + .green = 2, + .blue = 3, +} +``` + +Record fields, tuple fields, and enum values are comma-separated. A trailing +comma is allowed. + +Enum values are dot constants. 
They require an expected enum type from context, +such as a typed initializer, switch selector, parameter type, or explicit cast. + +Record literals use named fields. Tuple record literals use positional fields. +Array literals use the expected array type for count and element type. Omitted +fields and trailing array elements are zero-filled. + +```toy +let p: Pair = Pair { a: 10, b: 32 }; +let pz: Pair = Pair {}; +let t: Tuple = Tuple { 10, 32 }; +let xs: [4]i32 = [1, 2]; +let zeros: [4]i32 = []; +let z: Pair = {}; +let c: Color = .green; +``` + +Aggregate assignment is allowed for identical array, record, or alias-expanded +storage types. + +Anonymous records are structural product types. They are valid as builtin result +types, inline assembly result types, local inference results, and explicit +`record { ... }` type literals. Two anonymous record types are identical when +their field names, field order, and field types are identical. They cannot be +recursive and do not introduce source declarations. + +When a code generation intrinsic produces multiple stack values, Toy treats +those values as an anonymous record at the source level. Field projection may +consume the component directly. If the value must be stored, passed, returned, +or addressed, the frontend materializes a private code generation record type +with the same field layout. + +## Expressions + +Toy values are statically typed. There is limited local inference: an +initialized block-local `let` or `var` may omit `: T` when the initializer has a +complete, unambiguous type. Function parameters, function returns, top-level +objects, external declarations, empty aggregate literals, and numeric literals +that need a storage width require an expected or explicit type. + +There is no implicit numeric conversion except function-name decay to a +function pointer. Explicit conversion uses `expr as T`. Conditions accept +`bool` and integer-like values; zero is false and non-zero is true. 
+ +`NULL` is the null pointer literal. It requires an expected pointer type or an +explicit cast, for example `NULL as *i32`. + +Operators are parsed in small precedence islands. Operators within the same +island associate left-to-right. Operators from different binary islands must be +parenthesized explicitly, except that additive and multiplicative operators may +be mixed with the usual precedence rules. For example, `a + b * c` is accepted +as `a + (b * c)`, but `a + b << c` is rejected and must be parenthesized. + +- Postfix chain: call, index, field, pointer dereference `expr.*` +- Prefix unary: `+`, `-`, `!`, `~`, `&` +- Cast: `expr as T` +- Multiplicative: `*`, `/`, `%` +- Additive: `+`, `-` +- Shift: `<<`, `>>` +- Less-than: `<` +- Less-than-or-equal: `<=` +- Greater-than: `>` +- Greater-than-or-equal: `>=` +- Equal: `==` +- Not-equal: `!=` +- Bitwise and: `&` +- Bitwise xor: `^` +- Bitwise or: `|` +- Logical and: `and` +- Logical or: `or` + +Assignment is statement-only and does not produce a value. + +### Lvalues + +Lvalue forms: + +- Variable: `x` +- Dereference: `expr.*` +- Index: `expr[index]` +- Named field: `expr.field` +- Tuple field: `expr.N` +- Address-of: `&lvalue` + +Indexing works for array lvalues, pointer-to-element values, and +pointer-to-array values. Indexing a `*[N]T` implicitly dereferences the pointer +to the array, so `p[i]` is equivalent to `p.*[i]` in that case. Field access on +a `*Record` implicitly dereferences the pointer; Toy has no `->` operator. + +`&arr` produces `*[N]T`, a pointer to the whole array. `&arr[0]` produces `*T`, +a pointer to the first element. + +## Statements + +Blocks introduce lexical local scopes: + +```toy +{ + let x: i32 = 1; + var y: i32 = 2; +} +``` + +Statement forms: + +```toy +let name: T = expr; +let name = expr; +var name: T = expr; +var name = expr; +lvalue = expr; +expr; + +if cond { ... } else { ... } +while cond { ... } +label: while cond { ... } +switch expr { ... } +label: switch expr { ... 
} +break; +break expr; +break label; +break label expr; +continue; +continue label; +return; +return expr; +return tail callee(args); +``` + +`return tail callee(args);` supports direct functions and function-pointer +callees. Tail calls to variadic functions are rejected. + +Block-local `let` and `var` declarations may use `@[.static]` to allocate +function-local static storage with internal linkage and lexical visibility: + +```toy +fn next_id(): i32 { + var @[.static] id: i32 = 0; + id = id + 1; + return id; +} +``` + +## Expression Control Flow + +`if` can be an expression when both arms produce the same type. Each arm is a +block: it may contain intervening statements, and the final expression without a +semicolon is the block value. + +```toy +let x: i32 = if cond { + let base = 4; + base + 6 +} else { + let base = 10; + base + 10 +}; +``` + +A final expression with a semicolon is an expression statement and does not +provide the block value. + +Result-typed loops use `while<T>` and an `else` expression for fallthrough: + +```toy +let found: i32 = while<i32> i < n { + if xs[i] == needle { + break i; + } + i = i + 1; + continue; +} else { + -1 +}; +``` + +Structured loops and switches may have labels. `break label;`, +`break label expr;`, and `continue label;` target the named enclosing control +scope rather than the innermost one. `continue label;` is valid only when the +target is a loop. A value-bearing break must match the target scope's result +type. + +```toy +let found: i32 = outer: while<i32> row < rows { + while col < cols { + if grid[row][col] == needle { + break outer row; + } + col = col + 1; + } + row = row + 1; + continue outer; +} else { + -1 +}; +``` + +Unlabeled `break` and `continue` still target the innermost valid scope. + +`break expr;` is valid only when the target is a result-typed loop or expression +switch and must match the target result type. + +## Switch And Labels + +`switch` can be a statement or an expression. 
There is no `case` keyword. +Statement arms use blocks: + +```toy +switch @[.jump_table] tag { + 0 { + return 10; + } + 1, 2 { + return 20; + } + default { + return 30; + } +} +``` + +In expression context, switch arms use the same block shape and must all produce +the same type. As with `if` expression blocks, statements may precede the final +expression. + +```toy +let value: i32 = switch tag { + 0 { + let x = 4; + x + 6 + } + 1, 2 { + 20 + } + default { + 30 + } +}; +``` + +The selector must be integer-like, `bool`, or an enum. Arm labels are integer +or enum constants. `default` is optional for statement switches and required +for expression switches unless the selector is an enum and all values are +covered. Switch strategy hints are `.branch_chain` and `.jump_table`. + +Labels must be declared before placement: + +```toy +label again; +label done; + +let target: *void = @labeladdr(again); +goto *target within (again, done); + +again: +... +done: +``` + +Label address values have type `*void`. `goto *target;` is a computed goto. +`within (...)` is optional; when present, it gives the valid target set for +diagnostics and target branch-protection lowering. Omitting it is legal but may +be rejected by targets that require an explicit branch-protection target set. + +## Calls And Varargs + +Direct and indirect calls use the same syntax: + +```toy +fn add1(x: i32): i32 { + return x + 1; +} + +let fp: *fn(i32): i32 = add1; +let fp2: *fn(i32): i32 = &add1; +return fp(41); +``` + +A bare function name in rvalue position decays to a function pointer. Taking +the address of a function name produces the same value. + +Varargs builtins: + +- `@va_start(ap)` +- `@va_arg<T>(ap)` +- `@va_copy(dst, src)` +- `@va_end(ap)` + +## Type Queries + +Type query builtins: + +- `@sizeof<T>()` +- `@alignof<T>()` +- `@offsetof<T>(field)` + +`@offsetof` accepts named fields and tuple field indexes. 
+ +## Memory + +Stack allocation and memory operations: + +```toy +let p: *i32 = @alloca<i32>(count, 16); + +@memmove(dst, src, size, align); +@memcpy(dst, src, size, align, .volatile); +@memset(dst, 0, size, align, .nontemporal); +``` + +When `size` and `align` are compile-time constants, `@memcpy`, `@memmove`, and +`@memset` lower to fixed-size code generation memory operations. Otherwise they +lower to the target/runtime dynamic memory intrinsic selected by the frontend. + +Memory flags are `.volatile`, `.nontemporal`, and `.invariant`. Address space +comes from pointer types. Alias and noalias scopes come from language semantics, +such as `restrict` and ABI attributes. + +Ordinary lvalues, pointer qualifiers, and assignment are the primary spelling +for loads and stores. + +## Data Definitions + +Ordinary `let` and `var` definitions use typed initializers: + +```toy +pub let answer: i32 = 42; +pub let pi: f64 = 3.0; +pub let msg: [6]u8 = "hello\0"; +pub let table: [4]i32 = [1, 2]; +pub let pair: Pair = Pair { a: 1 }; +pub let panswer: *i32 = &answer; + +pub var @[.common] tentative: i64; +``` + +Low-level relocatable data expressions may appear in typed object initializers: + +```toy +pub let rels: [2]i32 = [ + @pcrel(target, 4), + @symdiff(end, start, 0), +]; +``` + +Function-local static object initializers may use label addresses while the +containing function is open: + +```toy +fn run(op: i32): void { + label case0; + label case1; + + let @[.static] dispatch: [2]*void = [ + @labeladdr(case0), + @labeladdr(case1), + ]; + + goto *dispatch[op] within (case0, case1); + +case0: + return; +case1: + return; +} +``` + +Data initializer builtins: + +- `@pad(N, value)` +- `@align(N)` +- `@pcrel(symbol, addend)` +- `@symdiff(lhs, rhs, addend)` +- `@labeladdr(label)` + +`@pcrel` and `@symdiff` are valid only inside object initializers. `@labeladdr` +in an object initializer is valid only for function-local static objects in the +function that owns the label. 
+ +`@pcrel` and `@symdiff` require an expected integer type from the object +initializer slot. That type determines the encoded relocation field width. There +is no default width: use an explicitly typed object, array element, record +field, or cast when context is ambiguous. Signed integer types are preferred for +range diagnostics, but unsigned integer storage is allowed when the bit pattern +is intentional. + +## Atomics + +Atomic builtins: + +- `@atomic_load<T>(ptr, order, access(...))` +- `@atomic_store<T>(ptr, value, order, access(...))` +- `@atomic_rmw<T>(op, ptr, value, order, access(...))` +- `@atomic_cmpxchg<T>(ptr, expected, desired, success_order, failure_order, + strength, access(...))` +- `@atomic_fence(order)` +- `@atomic_is_legal<T>(order, access(...))` +- `@atomic_is_lock_free<T>(access(...))` + +The `access(...)` group is optional. Empty `access()` means natural alignment, +address space from the pointer operand or zero for query-only builtins, and no +memory flags. Access entries are `.align(N)`, `.addrspace(N)`, `.volatile`, +`.nontemporal`, `.invariant`, `.alias_scope(N)`, and `.noalias_scope(N)`. +Operation builtins normally derive address space from `ptr`; `.addrspace(N)` is +required only for query-only builtins targeting a nonzero address space. + +Memory orders are `.relaxed`, `.consume`, `.acquire`, `.release`, `.acq_rel`, +and `.seq_cst`. + +RMW operations are `.xchg`, `.add`, `.sub`, `.and`, `.or`, `.xor`, and `.nand`. +Compare-exchange strengths are `.strong` and `.weak`. 
+ +`@atomic_cmpxchg<T>` returns an anonymous record: + +```toy +{ prior: T, ok: bool } +``` + +## Intrinsics + +Scalar and arithmetic intrinsics: + +- `@trap()` +- `@unreachable()` +- `@compile_error("message")` +- `@clz(x)`, `@ctz(x)`, `@popcount(x)`, `@bswap(x)` +- `@expect(value, expected)` +- `@bitget(value, lo, width)` +- `@bitset(dst, src, lo, width)` +- `@fma(a, b, c)` +- `@prefetch(ptr)` +- `@assume_aligned<T>(ptr, align)` + +Low-level conversion builtins are available when tests need to select the exact +code generation conversion rather than the source-level `as` conversion: + +- `@sext<T>(x)`, `@zext<T>(x)`, `@trunc<T>(x)` +- `@ptr_to_int<T>(x)`, `@int_to_ptr<T>(x)`, `@bitcast<T>(x)` +- `@fpext<T>(x)`, `@fptrunc<T>(x)` +- `@sint_to_float<T>(x, rounding)`, `@uint_to_float<T>(x, rounding)` +- `@float_to_sint<T>(x, rounding)`, `@float_to_uint<T>(x, rounding)` + +Rounding modes are `.default`, `.nearest_even`, `.toward_zero`, `.down`, and +`.up`. + +Overflow intrinsics: + +- `@add_overflow<T>(a, b)` +- `@sub_overflow<T>(a, b)` +- `@mul_overflow<T>(a, b)` + +Overflow builtins return an anonymous record: + +```toy +{ value: T, overflow: bool } +``` + +Syscall: + +- `@syscall(nr, arg0, ..., arg5)` returns `isize` + +Non-local control transfer: + +- `@setjmp(buf)` returns `i32` +- `@longjmp(buf, value)` does not return + +`buf` is an lvalue or pointer to target-defined setjmp buffer storage. + +Bare-metal, cache, barrier, and coroutine intrinsics: + +- `@irq_save()`, `@irq_restore(prev)`, `@irq_disable()`, `@irq_enable()` +- `@dmb(scope)`, `@dsb(scope)`, `@isb()` +- `@dcache_clean(ptr, size)` +- `@dcache_invalidate(ptr, size)` +- `@dcache_clean_invalidate(ptr, size)` +- `@icache_invalidate(ptr, size)` +- `@cpu_nop()`, `@cpu_yield()`, `@wfi()`, `@wfe()`, `@sev()` +- `@coro_switch<T>(from, to, value)` returns `T` + +Barrier scopes are `.full`, `.inner`, `.inner_store`, `.outer`, +`.outer_store`, and `.non_share`. 
+ +Target-specific intrinsics that cannot be legally lowered for the selected +target are compile-time errors. + +`@compile_error` emits a compile-time diagnostic and can appear in any expected +expression type. + +## Inline Assembly + +Inline assembly uses one typed builtin: + +```toy +let v: i32 = switch @target_arch() { + .arm64 { + @asm<i32>( + "add %w0, %w1, %w2", + outputs(out("=r", value: i32)), + inputs(in("r", a), in("r", b)), + clobbers("cc"), + flags(.volatile) + ) + } + .x64 { + @asm<i32>( + "leal (%1,%2), %0", + outputs(out("=r", value: i32)), + inputs(in("r", a), in("r", b)), + clobbers("cc"), + flags(.volatile) + ) + } + .rv64 { + @asm<i32>( + "addw %0, %1, %2", + outputs(out("=r", value: i32)), + inputs(in("r", a), in("r", b)), + clobbers(), + flags(.volatile) + ) + } + default { + @compile_error("unsupported asm target") + } +}; + +@asm<void>("nop", outputs(), inputs(), clobbers(), flags(.volatile)); +``` + +The template is a byte string literal. Target-specific assembly is selected with +ordinary fold-only control flow over `@target_arch()` rather than a special asm +selector form. + +Operand wrappers: + +- `in("constraint", expr)` +- `in("m", lvalue)` +- `in("i", const_expr)` +- `name = in("constraint", expr)` +- `out("constraint", name: T)` +- `inout("constraint", expr)` + +Groups: + +- `outputs(...)` +- `inputs(...)` +- `clobbers("memory", "cc", ...)` +- `clobber_abi(.caller_saved)` +- `flags(.volatile, .pure, .nomem, .readonly, .preserves_flags, .nostack, + .noreturn)` + +`outputs(...)` is required. Empty trailing groups may be omitted. + +`@asm<void>` produces no value. `@asm<T>` with one output returns that value. +`@asm<Record>` with multiple outputs maps output names to record fields. Inout +operands count as outputs for result-shape purposes. 
Anonymous record type +literals are valid as asm result types: + +```toy +let pair = @asm<record { lo: i32, hi: i32 }>( + "...", + outputs(out("=r", lo: i32), out("=r", hi: i32)), + inputs() +); +``` + +## Target Capability Queries + +Capability queries are fold-only constants for the selected target: + +- `@target_arch()` +- `@supports_callconv(kind)` +- `@supports_symbol_feature(feature)` +- `@has_backend_feature(feature)` + +Arch constants: + +- `.x86`, `.x64`, `.arm32`, `.arm64`, `.rv32`, `.rv64`, `.wasm` + +`@target_arch()` returns the selected target's arch enum value. + +Call convention constants: + +- `.target_c`, `.sysv`, `.win64`, `.aapcs`, `.wasm`, `.interrupt` + +Symbol feature constants: + +- `.weak`, `.protected_visibility`, `.dllimport`, `.dllexport`, `.comdat` +- `.common`, `.merge_sections`, `.constructor_priority` +- `.tls_local_exec`, `.tls_initial_exec` +- `.tls_local_dynamic`, `.tls_general_dynamic` + +Backend feature constants: + +- `.unaligned_memory`, `.strict_alignment`, `.red_zone`, `.simd` +- `.pointer_auth`, `.branch_protection` + +Unknown constants are compile-time errors. Known unsupported features evaluate +to `false`. diff --git a/doc/toy-todo.md b/doc/toy-todo.md @@ -1,623 +0,0 @@ -# Toy CG Coverage TODO - -Toy should become the primary way to exercise the public CG API in -`include/cfree/cg.h`. This file tracks the gaps between what the current toy -language can express and the full CG surface. - -Baseline: `lang/toy/toy.c` currently has core integer functions, locals, -globals, direct calls, tail calls, pointers to `int`, variadic helpers, inline -asm helpers, atomics, memory helpers, a hard-coded record field test, and a -hard-coded type-query test. - -## Language Extension Plan - -This section is a decision log for source-language extensions that make toy -able to express the full CG API surface. Items move from `Proposed` to -`Accepted` or `Rejected` as decisions are made. 
- -### Group 1: Types And Literals - -Status: Accepted - -Goal: make toy able to name every CG type shape and produce scalar constants -for those types. Later work on calls, atomics, globals, records, and memory -operations should build on these type forms instead of adding one-off builtins. - -Proposal: - -- Builtin scalar type names: `void`, `bool`, `i8`, `u8`, `i16`, `u16`, `i32`, - `u32`, `i64`, `u64`, `isize`, `usize`, `f32`, `f64`, and `va_list`. -- Keep `int` as a compatibility alias for `isize` for existing toy tests. -- Pointer types remain prefix: `*T`. -- Arrays use prefix count syntax: `[N]T`, for example `[4]i32`. -- Function pointer types use `fn(T, U): R` as a type form. Variadic function - types use `fn(T, ...): R`. -- Qualified types use prefix qualifiers: `const T`, `volatile T`, and - `restrict T`. Multiple qualifiers are allowed as repeated prefixes. -- Boolean literals are `true` and `false`. -- Numeric literals keep their current unsuffixed form and are context-typed by - the target declaration, parameter, return, or cast. Add decimal float literals - for `f32`/`f64`. -- Add explicit casts as `expr as T` so tests can force conversion edges without - needing C-like cast ambiguity in the grammar. -- Add type query expressions: `sizeof<T>()`, `alignof<T>()`, and - `offsetof<T>(field)`. - -Decisions: - -- Use `true`/`false` for boolean literals. -- Use `[N]T` for arrays. -- Use `expr as T` for casts. -- Include `offsetof<T>(field)` with the type-query expressions. - -### Group 2: Declarations, Linkage, And Attributes - -Status: Accepted - -Goal: expose function/data declaration modes and symbol attributes without -turning toy into C. Declarations should stay regular enough that coverage cases -are easy to read and failures point at CG behavior rather than parser trivia. - -Proposal: - -- Function and data definitions are static/private by default. -- Exported definitions use `pub`: - `pub fn name(args): ret { ... 
}`, `pub let x: T = init;`, and - `pub var x: T = init;`. -- Imported function declarations use explicit `extern`: - `extern fn name(args): ret;` - This emits `cfree_cg_func_decl` without `cfree_cg_func_begin`. -- Function definitions keep the current non-exported shape: - `fn name(args): ret { ... }`. -- Global definitions keep `let`/`var`: - `let x: T = init;`, `var x: T = init;`, and zero-init `var x: T;`. -- Imported data declarations use `extern`: - `extern let name: T;`, `extern var name: T;`. - These emit `cfree_cg_data_decl` without a definition. -- Attributes use a prefix bracket list before `pub`, `extern`, `fn`, `let`, or - `var`: - `#[bind(local), visibility(hidden), section(".foo"), align(16), used]` -- Supported attributes: - `bind(global|local|weak)`, `visibility(default|hidden|protected)`, - `section("name")`, `align(N)`, `readonly`, `tls`, `tls_model(default|local_exec|initial_exec|local_dynamic|general_dynamic|tlvp)`, - `common`, `used`, and `noreturn`. -- `pub` maps to global binding unless an explicit `bind(...)` attribute - overrides it for an error-case test. Non-`pub` definitions map to local - binding by default. -- `let` implies readonly for defined data; `readonly` remains accepted on `var` - to exercise the declaration flag directly. - -Decisions: - -- Use `#[...]` attributes. -- Default definitions are static/private. -- Use `pub` to mark exported definitions. -- Use explicit `extern` for outside functions and data. - -### Group 3: Composite Types And Initializers - -Status: Accepted - -Goal: let toy source define records, enums, aliases, arrays, and aggregate -initializers directly, so CG type construction, layout, data emission, field -access, and ABI aggregate behavior are all covered without hard-coded helper -types. 
- -Proposal: - -- Type aliases: - `type Word = u64;` -- Record declarations: - `record Pair { a: i32; b: i32; }` -- Anonymous record fields use `_`: - `record Tuple2 { _: i32; _: i32; }` -- Field alignment uses field attributes: - `record Padded { #[align(16)] x: i32; y: i32; }` -- Packed records use a record attribute: - `#[packed] record Header { tag: u8; len: u32; }` -- Enum declarations: - `enum Color: i32 { red = 1; green = 2; blue = 3; }` - The base type is required so enum layout is explicit. -- Record literals use named fields: - `Pair { a: 1, b: 2 }` -- Array literals use bracket values: - `[1, 2, 3, 4]` - The expected type supplies the element type and count. -- Zero aggregate literal: - `zero<T>()` - This gives tests a concise way to produce zeroed arrays/records/globals. -- Global initializers may use scalar literals, bool/float literals, string/byte - literals, array literals, record literals, `zero<T>()`, and symbol addresses. -- String literals have type `*[N]u8` or `*u8` only through decay contexts; exact - behavior can be narrowed during implementation. -- Symbol address initializers use `&name` initially; symbol addends and - non-`ADDR` reference kinds are handled in a later symbol-reference group. - -Decisions: - -- Use `record`; do not add `struct` as an alias initially. -- Support `Color.red` and `.red` enum value references. `.red` requires an - expected enum type from context. -- Use `zero<T>()` for zero values and zero aggregate initializers. - -### Group 4: Lvalues, Field Access, Indexing, And Memory - -Status: Accepted - -Goal: make normal toy expressions produce the lvalue shapes CG needs for -`addr`, `indirect`, `load`, `store`, `index`, `field`, `memcpy`, and `memset`. -The language should express memory operations through ordinary source forms -first, with explicit builtins kept for exact CG knob coverage. - -Proposal: - -- Field access: - `expr.field` - If `expr` has record type, this selects the field directly. 
If `expr` has - pointer-to-record type, field access implicitly dereferences before selecting - the field. There is no `->` operator. -- Array and pointer indexing: - `expr[index]` - This lowers through `cfree_cg_index`; arrays use array lvalues, pointers use - pointer rvalues converted to element lvalues. -- Address-of accepts any lvalue: - `&x`, `&*p`, `&arr[i]`, `&rec.field`, and `&ptr.field`. -- Assignment accepts any assignable lvalue: - `x = value;`, `*p = value;`, `arr[i] = value;`, `rec.field = value;`, - `ptr.field = value;`. -- Aggregate copy is expressed by assignment when source and destination have - the same array or record type. -- Explicit memory builtins remain for exact size/alignment coverage: - `memcpy(dst, src, size, align)`, `memset(dst, value, size, align)`. - They return `void`. - -Decisions: - -- Do not add `->`; use `.` and let type checking decide whether the base is - direct or indirect. -- Allow aggregate assignment for both records and arrays. -- `memcpy` and `memset` return `void`. -- Do not add explicit offset indexing syntax or a helper for now. -- Do not add compound assignment or inc/dec syntax for now. - -### Group 5: Operators, Casts, And Control Values - -Status: Accepted - -Goal: exercise every scalar binop, cmp, unop, and conversion path while keeping -the expression grammar close to the CG stack model. - -Proposal: - -- Arithmetic operators are type-directed: - signed integer types use signed division/remainder and signed comparisons; - unsigned integer types use unsigned division/remainder and unsigned - comparisons; floats use float arithmetic/comparisons when supported by CG. -- Shift operators: - `<<` always maps to `CFREE_CG_SHL`; `>>` maps to signed or unsigned shift - based on the left operand type. -- Bitwise operators apply to integer types: `&`, `|`, `^`, and unary `~`. -- Logical operators apply to `bool`: `!`, `&&`, `||`. -- Comparisons produce `bool`, not `int`. 
-- Conditions and logical operators accept integer values using C-style truth:
-  zero is false, any non-zero value is true. The operation result is `bool`.
-- Casts use the accepted spelling `expr as T`.
-- Enum values cast explicitly to/from their base integer type through `as`.
-- Pointer casts are allowed through `as *T` for CG conversion coverage.
-- Pointer comparisons are allowed for `==`, `!=`, `<`, `<=`, `>`, and `>=`.
-  Ordered pointer comparisons are address-order comparisons using the target
-  pointer representation.
-- Add bit helpers as explicit builtins instead of syntax:
-  `bitget(value, lo, width)` and `bitset(dst, src, lo, width)`.
-  These map to the inline CG composites and require integer operands.
-- Keep no compound assignment and no inc/dec syntax.
-
-Decisions:
-
-- Integer truth uses zero for false and any non-zero value for true.
-- Add `bitget` and `bitset` builtins.
-- Allow all pointer comparisons, including ordered address comparisons.
-
-### Group 6: Control Flow And Expression Scopes
-
-Status: Accepted
-
-Goal: expose the structured scope operations, labels/branches through normal
-control flow, and unreachable paths without turning toy into an assembly-like
-language.
-
-Proposal:
-
-- Keep statement control flow:
-  `if cond { ... } else { ... }`, `while cond { ... }`, `break;`,
-  `continue;`, and `return`.
-- Add expression `if`:
-  `if cond { expr } else { expr }`
-  This produces a value and exercises expression-valued scopes.
-- Add result-typed `while`:
-  `while<T> cond { ... }`
-  The loop body may use `break expr;` to exit with a value of `T`. This maps to
-  `cfree_cg_scope_begin` with a non-`NONE` result type.
-- Conditional break/continue stays source-level structured:
-  `if cond { break; }` and `if cond { continue; }`
-  The compiler may lower these through the conditional CG helpers where useful,
-  but toy does not add `break if` or `continue if` syntax.
-- Add `unreachable();` and `trap();` as intrinsics in the intrinsic group, but
-  control-flow tests should also use them to cover dead-end blocks.
-- Do not add source-level labels/goto initially. Short-circuiting, `if`, and
-  loops should cover labels and branches at the CG API level.
-
-Decisions:
-
-- Expression `if` uses braces.
-- Use `while<T> cond { ... }` for value-producing loops.
-- Keep source as `if cond { break; }` and `if cond { continue; }`; do not add
-  conditional break/continue syntax.
-
-### Group 7: Calls, Function Pointers, ABI, And Variadics
-
-Status: Accepted
-
-Goal: make toy able to exercise direct calls, indirect calls, tail calls,
-scalar and aggregate ABI paths, and variadic argument handling using ordinary
-typed source.
-
-Proposal:
-
-- Direct calls keep the existing spelling:
-  `callee(arg0, arg1)`.
-- Function values are addressable with `&name` and have type `*fn(...): R`.
-- Bare function names are also allowed in value position and produce function
-  values, so `let fp: *fn(i32): i32 = f;` is valid.
-- Indirect calls use normal call syntax when the callee expression has function
-  or pointer-to-function type:
-  `fp(arg0, arg1)`.
-- Tail calls keep explicit source spelling:
-  `return tail callee(args);`
-  The callee may be direct or indirect. Variadic tail calls remain rejected.
-- Function parameters and returns may use every scalar type, pointers, enums,
-  aliases, records, and arrays where CG supports the shape.
-- Records can be passed and returned by value. Tests should include small
-  records, large records, mixed integer/float records, and homogeneous float
-  aggregates on targets where the ABI distinguishes them.
-- Arrays act like records: they are first-class values and pass by value. There
-  is no implicit array-to-pointer decay.
-- `&arr` is equivalent to `&arr[0]`, producing a pointer to the first element
-  rather than a pointer to the whole array.
-- Variadic functions keep `...` in the parameter list:
-  `fn sum(count: i32, ...): i32 { ... }`
-- `va_start`, `va_arg<T>`, `va_copy`, and `va_end` become typed builtins:
-  `va_start(ap);`, `va_arg<T>(ap)`, `va_copy(dst, src);`, `va_end(ap);`
-- Variadic call sites accept any expression type. Default argument promotion
-  is explicit through `as` instead of hidden C-like promotion.
-- Function type queries from Group 1 should be used in tests to validate
-  function pointer and variadic type shapes.
-
-Decisions:
-
-- Allow bare function names in value position.
-- Arrays are first-class by-value aggregates, like records.
-- Do not add implicit array decay.
-- `&arr` is the same as `&arr[0]`.
-- Variadic call-site promotions are explicit through `as`.
-
-### Group 8: Intrinsics And Atomics
-
-Status: Accepted
-
-Goal: expose the CG intrinsic and atomic APIs with typed, explicit source forms
-that make memory order, result shape, and multi-result behavior visible in toy
-tests.
-
-Proposal:
-
-- Intrinsics use named builtin functions:
-  `trap()`, `unreachable()`, `clz(x)`, `ctz(x)`, `popcount(x)`, `bswap(x)`,
-  `prefetch(ptr)`, `expect(value, expected)`, and
-  `assume_aligned(ptr, align)`.
-- `setjmp`/`longjmp` use typed builtins:
-  `setjmp(buf)` returns `i32`; `longjmp(buf, value)` returns `void` and is
-  noreturn in control-flow analysis.
-- Overflow intrinsics return a small record shape:
-  `add_overflow(a, b)`, `sub_overflow(a, b)`, and `mul_overflow(a, b)`.
-  The result is an anonymous record with fields `{ value: T; ok: bool; }`.
-- Atomic memory order names are enum-like builtin constants:
-  `.relaxed`, `.consume`, `.acquire`, `.release`, `.acq_rel`, `.seq_cst`.
-- Atomic builtins take memory order explicitly:
-  `atomic_load<T>(ptr, order)`, `atomic_store<T>(ptr, value, order)`,
-  `atomic_rmw<T>(op, ptr, value, order)`,
-  `atomic_cmpxchg<T>(ptr, expected, desired, success_order, failure_order)`,
-  and `atomic_fence(order)`.
-- Atomic RMW op names are enum-like constants:
-  `.xchg`, `.add`, `.sub`, `.and`, `.or`, `.xor`, `.nand`.
-- `atomic_cmpxchg<T>` returns a builtin record:
-  an anonymous record with fields `{ prior: T; ok: bool; }`
-  so tests can inspect both CG results.
-- The parser/type checker should reject invalid compare-exchange order pairs
-  when it can do so locally.
-- Keep compatibility aliases for the current simple builtins only during test
-  migration if needed: `atomic_add`, `atomic_sub`, `atomic_cas_ok`, and
-  `fence`.
-
-Decisions:
-
-- Overflow and compare-exchange builtins return anonymous records.
-- Use dot constants for memory orders and RMW ops.
-- Remove compatibility aliases after test migration.
-
-### Group 9: Symbol References, Relocations, And TLS
-
-Status: Accepted
-
-Goal: support thread-local storage and normal imported/exported symbol access
-as language features. Do not add a use-site relocation escape hatch unless the
-CG API grows a clear language-level semantic for it.
-
-Proposal:
-
-- Keep ordinary symbol address syntax:
-  `&name`
-  The declaration attributes on `name`, plus target and code-model context,
-  select the ordinary reference kind. For example, TLS attributes select the
-  default TLS access path, and imported functions/data can use the platform's
-  normal PLT/GOT behavior where appropriate.
-- Global data symbol initializers also use ordinary address syntax:
-  `let p: *i32 = &name;`
-  This maps to a normal `cfree_cg_data_symbol` emission selected from the same
-  declaration/context information.
-- TLS variables are declared with Group 2 attributes:
-  `#[tls, tls_model(local_exec)] var tls_counter: i32;`
-- TLS imports use `extern` plus the same attributes:
-  `#[tls, tls_model(initial_exec)] extern var errno: i32;`
-- Normal TLS variable access and `&tls_counter` use the declaration's TLS
-  attributes and model. This is the language-level path for exercising TLS
-  codegen.
-- Supported source TLS models are:
-  `default`, `local_exec`, `initial_exec`, `local_dynamic`,
-  `general_dynamic`, and `tlvp`.
-- Target-incompatible TLS models should produce diagnostics rather than silently
-  falling back.
-- For non-TLS imports, ordinary calls, bare function values, variable loads, and
-  `&name` should choose direct, PLT, GOT, or PC-relative references according
-  to declaration binding, visibility, target, and output mode.
-
-Rationale:
-
-- Thread-local storage is a real source-language feature. It belongs in toy
-  because normal loads, stores, address-taking, imports, and TLS models all
-  exercise meaningful codegen behavior.
-- Use-site relocation forms such as explicit GOT/PLT/PCREL selection are not
-  currently accepted as toy language extensions. If the only way to exercise a
-  `CfreeCgSymbolRefKind` is an artificial escape hatch, that is evidence that
-  the CG API may need narrowing or re-shaping around declaration attributes,
-  target mode, and semantic operations instead.
-- Low-level relocation record tests can remain in object/linker-specific
-  harnesses until there is a clear language-level operation that needs them.
-
-Follow-up CG API questions:
-
-- Should the public CG API keep `CfreeCgSymbolRefKind` on `push_symbol` and
-  `data_symbol`, or should reference kind be inferred from declaration attrs
-  and compile/output mode?
-- For source TLS models, should `default` always choose the platform default,
-  or should toy require an explicit model in tests that assert a specific
-  relocation sequence?
-
-Decisions:
-
-- Support TLS as a normal source-language feature through declaration
-  attributes and normal variable/address access.
-- Do not add explicit use-site relocation syntax to toy for now.
-
-### Group 10: Inline Assembly
-
-Status: Accepted
-
-Goal: expose `cfree_cg_inline_asm` through a source form that is explicit
-enough for CG tests but still reads like a language feature rather than a C API
-dump.
-
-Proposal:
-
-- Use an expression form:
-  `asm<T>(template, outputs, inputs, clobbers, flags)`
-  It returns `T`, or `void` for statement-only asm.
-- Template is a string literal or existing `arch(...)` selector.
-- Inputs are typed expressions with constraints:
-  `in("r", expr)`, `in("m", lvalue)`, `in("i", const_expr)`.
-- Outputs declare constraints and result names:
-  `out("=r", name: T)`, `out("=&r", name: T)`.
-- Inout operands consume and produce one value:
-  `inout("+r", expr)`.
-- Multiple outputs return an anonymous record with fields named from the output
-  operands. A single output returns the output value directly.
-- Named operands use the output/input names when provided, giving tests a path
-  to symbolic operand names.
-- Clobbers are string literals:
-  `clobbers("memory", "cc")`.
-- Flags are dot constants:
-  `.volatile`
-  Omitted flags mean non-volatile asm.
-- Example:
-  `asm<i32>("add %0, %1, %2", out("=r", value: i32), inputs(in("r", a), in("r", b)), clobbers(), .volatile)`
-
-Decisions:
-
-- Use a single `asm<T>(...)` form for statement and expression asm.
-- Use wrapper groups such as `inputs(...)` and `clobbers(...)`.
-- Multi-output asm returns an anonymous record.
-
-## Type System
-
-- Add syntax for all builtin scalar types, not just `int`/`isize`, implicit
-  `void` returns, and `va_list`: `bool`, `i8`, `u8`, `i16`, `u16`, `i32`,
-  `u32`, `i64`, `u64`, `usize`, `f32`, and `f64`.
-- Add float literals and operations so toy exercises `cfree_cg_push_float`,
-  floating arithmetic, floating calls/returns, and float ABI paths.
-- Add array type syntax and declaration support. `typecheck()` currently
-  constructs one array type internally, but toy programs cannot declare array
-  locals, globals, parameters, or return values.
-- Add named alias syntax. `typecheck()` constructs an alias, but aliases cannot
-  be introduced or used by toy source.
-- Add qualified type syntax for `const`, `volatile`, and `restrict`.
-  `typecheck()` only constructs a `const int` and does not exercise qualified
-  values in declarations or memory operations.
-- Add record/struct declarations with arbitrary field names, field counts,
-  field types, anonymous fields, packed alignment, and explicit field alignment.
-  Toy currently has only one internal `Pair { int a; int b; }`.
-- Add enum declarations and enum constants as source-level values. The current
-  enum coverage is only an internal constructor call inside `typecheck()`.
-- Add function pointer type syntax. Toy can form direct function types for
-  declared functions, but source cannot declare function pointer variables,
-  parameters, globals, or indirect calls.
-- Exercise `cfree_cg_type_record_field`. The current type-query builtin checks
-  record-ness and field count, but does not query field metadata.
-- Exercise type layout through source constructs, not only hard-coded helper
-  calls: `sizeof(type)`, `alignof(type)`, and possibly `offsetof(record, field)`.
-
-## Declarations And Linkage
-
-- Add function declarations without definitions, including external functions.
-  Toy calls `cfree_cg_func_decl`, but only immediately before defining the same
-  function.
-- Add global data declarations without definitions to exercise
-  `cfree_cg_data_decl`.
-- Add declaration attributes: non-default binding, hidden/protected visibility,
-  custom sections, explicit alignment, readonly, TLS, common, used, and noreturn.
-- Add TLS globals and source-level selection of TLS models.
-- Add support for undefined external data/function references so link-time
-  symbol handling is exercised from toy.
-
-## Global Data Initializers
-
-- Add byte/string data initializers. Toy has a hard-coded `byteconst()` helper
-  for `cfree_cg_push_bytes`, but no source-level string or byte literal data.
-- Add aggregate initializers for arrays and records.
-- Add explicit zero-fill ranges inside aggregate initializers, not only whole
-  object zero initialization.
-- Add symbol initializer addends and non-pointer-width symbol records.
-- Add source coverage for non-`ADDR` data symbol reference kinds: `PCREL`,
-  `GOT`, `PLT`, `TLS_LE`, `TLS_IE`, `TLS_LD`, `TLS_GD`, and `TLVP`.
-
-## Symbol References
-
-- Add source-level control over `cfree_cg_push_symbol` reference kind and addend.
-  Current code references use `CFREE_CG_SYMREF_ADDR` with addend `0`.
-- Add GOT/PLT and PC-relative code reference tests.
-- Add TLS code reference tests for all TLS symbol reference kinds.
-- Add TLVP coverage for Mach-O-style thread-local references.
-
-## Values And Lvalues
-
-- Add string literals as expression values.
-- Generalize address-of beyond identifiers. Source cannot write `&*p`,
-  `&index(p, i)`, or `&record.field`.
-- Generalize assignment targets beyond variables and unary `*p`. Source cannot
-  assign through indexed or field lvalues directly.
-- Add first-class array and record values so `load`, `store`, `addr`,
-  `indirect`, `field`, `index`, `memcpy`, and ABI aggregate paths can interact
-  naturally.
-- Add pre/post increment and decrement to exercise `cfree_cg_inc_dec`.
-
-## Operators And Conversions
-
-- Add unsigned arithmetic and comparisons. Missing source coverage includes
-  `CFREE_CG_UDIV`, `CFREE_CG_UREM`, `CFREE_CG_SHR_U`, `CFREE_CG_LT_U`,
-  `CFREE_CG_LE_U`, `CFREE_CG_GT_U`, and `CFREE_CG_GE_U`.
-- Add explicit casts to exercise `cfree_cg_convert` across integer widths,
-  signedness, pointers, booleans, and floats. Current conversion coverage is
-  essentially comparison `i1 -> int`.
-- Add a real `bool` type instead of representing truth values as `int`.
-- Add bitfield-like extraction/insertion helpers or syntax to exercise
-  `cfree_cg_bitget` and `cfree_cg_bitset`.
-- Check whether toy should allow chained comparisons or keep one comparison per
-  expression; the current parser accepts only one comparison operator at that
-  precedence level.
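
As an illustration of why the `_U` operations listed under Operators And Conversions need their own source coverage, the following C sketch applies signed and unsigned division, remainder, right shift, and ordering to the same 32-bit pattern. The function names are hypothetical and are not part of the cfree API; any correspondence to `CFREE_CG_UDIV`, `CFREE_CG_UREM`, `CFREE_CG_SHR_U`, and `CFREE_CG_LT_U` is assumed from those names only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; none of these names exist in cfree. */

/* Signed division truncates toward zero; the remainder takes the dividend's sign. */
int32_t sdiv32(int32_t a, int32_t b) { return a / b; }
int32_t srem32(int32_t a, int32_t b) { return a % b; }

/* Unsigned division and remainder treat the same bits as a non-negative value. */
uint32_t udiv32(uint32_t a, uint32_t b) { return a / b; }
uint32_t urem32(uint32_t a, uint32_t b) { return a % b; }

/* Arithmetic right shift, written portably: `>>` on a negative signed value is
 * implementation-defined in C, so shift the complement instead. n must be 0..31. */
int32_t sar32(int32_t x, unsigned n) {
    return x < 0 ? ~(~x >> n) : x >> n;
}

/* Logical right shift fills the vacated high bits with zeros. */
uint32_t shr32(uint32_t x, unsigned n) { return x >> n; }

/* Ordering depends on the interpretation of the operands. */
bool slt32(int32_t a, int32_t b) { return a < b; }
bool ult32(uint32_t a, uint32_t b) { return a < b; }

/* One bit pattern (0xFFFFFFF8), two interpretations, different results. */
void check_signedness_examples(void) {
    const int32_t s = -8;
    const uint32_t u = (uint32_t)s; /* 0xFFFFFFF8, i.e. 4294967288 */

    assert(sdiv32(s, 3) == -2);
    assert(udiv32(u, 3) == 1431655762u);
    assert(srem32(s, 3) == -2);
    assert(urem32(u, 3) == 2u);
    assert(sar32(s, 1) == -4);
    assert(shr32(u, 1) == 0x7FFFFFFCu);
    assert(slt32(s, 3));
    assert(!ult32(u, 3));
}
```

A toy case for the unsigned operations could follow the same pattern: feed one bit pattern through a signed and an unsigned type and assert that the two results differ.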
-
-## Stack Operations
-
-- Keep direct stack manipulation out of normal toy syntax if possible, but add
-  small targeted builtins when needed to exercise stack-sensitive CG behavior.
-- `cfree_cg_dup`, `cfree_cg_swap`, `cfree_cg_drop`, and `cfree_cg_rot3` are used
-  incidentally today. Add explicit regression cases for stack order if bugs
-  appear around multi-result operations, inc/dec, atomics, or inline asm.
-
-## Control Flow
-
-- Add expression-valued scopes so `cfree_cg_scope_begin` with a non-`NONE`
-  result type, `cfree_cg_break`, `cfree_cg_break_true`, `cfree_cg_break_false`,
-  and `cfree_cg_scope_end` are exercised with carried values.
-- Add conditional loop continuation syntax or builtins for
-  `cfree_cg_continue_true` and `cfree_cg_continue_false`.
-- Add direct tests for `cfree_cg_break_true`; current loops use
-  `cfree_cg_break_false` for while conditions.
-- Keep label/jump/branch coverage through short-circuiting and `if`, but add
-  targeted cases for awkward CFG shapes: empty blocks, nested conditionals,
-  early returns inside loops, and branches after unreachable paths.
-
-## Calls And ABI Coverage
-
-- Add indirect calls through function pointers.
-- Add calls and returns for every scalar builtin type.
-- Add by-value record parameters and record returns, including large records and
-  ABI-specific aggregate classifications.
-- Add HFA/HVA-style float aggregate coverage where the target ABI supports it.
-- Add variadic arguments beyond integer values: pointers, floats, and records.
-- Decide whether to support variadic tail calls or document that toy should
-  reject them permanently.
-
-## Intrinsics
-
-- Add source forms for `trap()` and `unreachable()`.
-- Add `setjmp`/`longjmp` coverage, including buffer storage.
-- Add overflow intrinsics: `add_overflow`, `sub_overflow`, and `mul_overflow`.
-  These require a way to consume the two-value `(result, ok)` CG result.
-- Add `prefetch(addr)`.
-- Add `assume_aligned(ptr, align)` or an equivalent helper.
-- Add result-type checks for existing intrinsics across more than `int` once
-  more scalar types exist.
-
-## Atomics
-
-- Add source-level memory order selection for relaxed, consume, acquire,
-  release, acquire-release, and sequentially consistent operations.
-- Add remaining RMW operations: exchange, and, or, xor, and nand.
-- Add atomic operations over widths other than toy `int`.
-- Add compare-exchange forms that expose both returned values: the prior value
-  and the success flag. `atomic_cas_ok` currently drops the prior value.
-- Add invalid-order diagnostics for compare-exchange success/failure pairs.
-
-## Inline Assembly
-
-- Add symbolic operand names.
-- Add multiple outputs and multiple inout operands.
-- Add asm operands for non-`int` types once those types exist.
-- Add non-volatile asm support. Current helpers always set
-  `CFREE_CG_ASM_VOLATILE`.
-- Add richer clobber lists instead of only `memory` or one arch-selected string.
-- Add target-specific tests for unsupported constraints and diagnostics.
-
-## Memory Operations
-
-- Add configurable alignment operands for `memcpy` and `memset`; current toy
-  always emits alignment `1`.
-- Add configurable `index` offset; current toy always passes offset `0`.
-- Add direct field access syntax for arbitrary records. `fieldtest()` only
-  stores and loads field index `1` of the internal `Pair`.
-- Add array indexing over array lvalues, not just pointer indexing.
-- Add record and array copies that lower through normal source constructs.
-
-## Debug Location Coverage
-
-- Add tests that verify `cfree_cg_set_loc` flows through functions, scopes,
-  locals, params, instructions, and data definitions.
-- Add multi-line expressions and initializers so sticky source locations are
-  exercised across nested parse/codegen calls.
-
-## Test Harness Work
-
-- Keep each new feature paired with a small toy corpus case under
-  `test/toy/cases`.
-- Prefer one feature family per case so CG regressions point at a small surface.
-- Run targeted toy cases across representative architectures when the feature
-  touches ABI, relocation, TLS, or inline asm behavior.