kit

git clone https://git.ryansepassi.com/git/kit.git

IR

This document defines kit's semantic CG IR: the function-level, recorded form of the internal CgTarget interface, captured as a CgIrModule of CgIrFunc bodies (see src/cg/ir.h). It is the stable hinge between the frontend's typed CG-API calls and everything downstream that wants a durable program form rather than immediate emission: the optimizer, the threaded interpreter, and source-like replay backends. This is the authoritative reference for the IR's semantics. For how the IR is produced and replayed, see CODEGEN.md; for the optimizer's own derived form, see OPT.md; for interpreted execution, see INTERPRETER.md.

What the IR is, and what it is not

The IR is a faithful tape of CgTarget calls. The CgTarget interface (src/cg/cgtarget.h) is the semantic codegen API: typed locals, labels, structured scopes, memory ops, ABI-shaped calls, atomics, intrinsics, inline asm. A backend can implement that interface to emit code immediately (the O0 native path, the C-source path). The IR recorder (src/cg/ir_recorder.c) is a second implementation of the same interface that, instead of emitting, records each call into a CgIrInst on the current CgIrFunc. Replaying the recorded tape through a direct target reproduces exactly what immediate emission would have done.
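The record-then-replay relationship described above can be sketched in miniature. Everything in this block is hypothetical and invented for illustration (MiniTarget, DirectTarget, RecorderTarget, mini_replay, and the two-method interface); the real CgTarget interface in src/cg/cgtarget.h is far larger and its signatures are not reproduced here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical two-method stand-in for a codegen interface. */
typedef struct MiniTarget MiniTarget;
struct MiniTarget {
    void (*load_imm)(MiniTarget *t, int dst, long value);
    void (*add)(MiniTarget *t, int dst, int lhs, int rhs);
};

/* Direct target: turns each call into text immediately. */
typedef struct {
    MiniTarget base; /* first member, so a MiniTarget* converts back */
    char out[256];
    size_t len;
} DirectTarget;

static void direct_append(DirectTarget *d, const char *line) {
    size_t n = strlen(line);
    assert(d->len + n < sizeof d->out);
    memcpy(d->out + d->len, line, n + 1);
    d->len += n;
}
static void direct_load_imm(MiniTarget *t, int dst, long value) {
    char line[64];
    snprintf(line, sizeof line, "l%d = %ld\n", dst, value);
    direct_append((DirectTarget *)t, line);
}
static void direct_add(MiniTarget *t, int dst, int lhs, int rhs) {
    char line[64];
    snprintf(line, sizeof line, "l%d = l%d + l%d\n", dst, lhs, rhs);
    direct_append((DirectTarget *)t, line);
}
static void direct_init(DirectTarget *d) {
    d->base.load_imm = direct_load_imm;
    d->base.add = direct_add;
    d->out[0] = '\0';
    d->len = 0;
}

/* Recorder: same interface, but each call becomes one tape entry. */
typedef enum { MINI_LOAD_IMM, MINI_ADD } MiniOp;
typedef struct { MiniOp op; int dst, lhs, rhs; long imm; } MiniInst;
typedef struct {
    MiniTarget base;
    MiniInst tape[32];
    size_t count;
} RecorderTarget;

static void rec_push(RecorderTarget *r, MiniInst inst) {
    assert(r->count < sizeof r->tape / sizeof r->tape[0]);
    r->tape[r->count++] = inst;
}
static void rec_load_imm(MiniTarget *t, int dst, long value) {
    rec_push((RecorderTarget *)t, (MiniInst){ MINI_LOAD_IMM, dst, 0, 0, value });
}
static void rec_add(MiniTarget *t, int dst, int lhs, int rhs) {
    rec_push((RecorderTarget *)t, (MiniInst){ MINI_ADD, dst, lhs, rhs, 0 });
}
static void recorder_init(RecorderTarget *r) {
    r->base.load_imm = rec_load_imm;
    r->base.add = rec_add;
    r->count = 0;
}

/* Replay: feed the tape back through any target, in recorded order. */
static void mini_replay(const RecorderTarget *r, MiniTarget *into) {
    for (size_t i = 0; i < r->count; i++) {
        const MiniInst *in = &r->tape[i];
        switch (in->op) {
        case MINI_LOAD_IMM: into->load_imm(into, in->dst, in->imm); break;
        case MINI_ADD:      into->add(into, in->dst, in->lhs, in->rhs); break;
        }
    }
}

/* A stand-in frontend that drives whichever target it is handed. */
static void mini_frontend(MiniTarget *t) {
    t->load_imm(t, 1, 2);
    t->load_imm(t, 2, 40);
    t->add(t, 3, 1, 2);
}
```

Driving mini_frontend once against a DirectTarget, and once against a RecorderTarget followed by mini_replay into a fresh DirectTarget, yields the same text in both cases, which is the property the paragraph above states for the real recorder.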

Because of that, the IR carries no optimizer state and no machine state. There are no basic blocks, no SSA values, no phis, no dominance, no liveness, no virtual or hard registers, no spill slots, and no call plans in the CG IR. Those are all derived, consumer-private views. In particular the optimizer's Func IR (src/opt/ir.h) is a separate representation with its own op set (IR_PHI, IR_PARAM_DECL, IR_CONST_I, ...); it is built from the CG IR, not a superset of it. Do not conflate the two: the CG IR enum is CgIrOp in src/cg/ir.h; the optimizer enum is IROp in src/opt/ir.h.

The IR is target-data-layout-specific but not target-instruction-specific. Type sizes, alignments, record field offsets, bitfield bit ranges, ABI classifications, and pointer widths are already resolved for the compile target by the time the recorder sees a call. The IR does not know about machine instructions, addressing-mode legality, or register files.

No undefined behavior

The CG IR has no undefined behavior. Every operation, on every input, has a fully determined meaning that falls into exactly one of three categories: portably defined, target-defined, or malformed.

There is deliberately no fourth "anything may happen" category. Where C would say undefined behavior, the CG IR says portably defined, target-defined, or malformed — never unconstrained. The runtime half of this guarantee (what each op computes on every input) is spelled out in Well-definedness: edge-case semantics; the structural half is the Well-formedness list.

Pipeline position

frontend
  -> KitCg          (public CG API: stack/lvalue model)
     -> CgTarget       (semantic codegen interface)
          |-> direct native target      (O0 emit)
          |-> direct C-source target     (--emit=c)
          \-> IR recorder -> CgIrModule  (O1/O2, interpreter)
                                  |-> opt: derive Func (CFG/SSA/MIR) -> native emit
                                  \-> opt: derive Func (reduced) -> interpreter

KitCg lowers the frontend's stack/lvalue source operations into flat CgTarget calls. At O0 those calls hit a direct target and become code right away. At O1/O2 and under the interpreter they hit the recorder and become a CgIrModule. The recorder is created by the optimizer (src/opt/opt.c calls cg_ir_recorder_new); it notifies the optimizer per completed function and at finalize through callbacks so cross-function work (inlining, reachability, alias resolution) can run before the buffered IR is lowered into the wrapped direct target.

Module and function structure

A CgIrModule owns the translation unit's recorded functions, symbol aliases, and file-scope __asm__ blocks. File-scope asm is retained on the module rather than emitted during recording because the optimizer path has no live emit target at recording time; it is replayed at finalize.

A CgIrFunc is one function body and owns everything needed to replay, optimize, or interpret it: the preserved CGFuncDesc (symbol, function type, result/param descriptors, source location, attributes, inline policy); a linear instruction stream (CgIrInst tape); and side tables for locals, params, labels, and scopes. It also caches two ObjSymSets — the set of symbols it calls and the set of globals it references — populated as operands are recorded, so reachability and alias passes need not rescan the tape.

There is one local namespace per function. A CgIrLocal is a mutable typed location identified by a CGLocal id (1-based; CG_LOCAL_NONE is the sentinel). A local records its CGLocalDesc (type, size, align, source name/loc), whether it is a parameter (with parameter index), and whether its address has been taken. Parameters are declarations, not executable ops: the recorder adds the parameter local and a CgIrParam entry; there is no parameter instruction in the tape. Taking the address of a local (CG_IR_ADDR_OF, or the dedicated local_addr recording) sets the local's address_taken flag, which downstream consumers use to decide it needs a concrete memory home; non-address-taken scalar locals may live in registers, SSA values, or interpreter slots as the consumer sees fit.
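A minimal sketch of the local side table, assuming the sentinel is zero (the text states this explicitly only for scope ids) and using invented names throughout (DemoLocal, DemoFunc, demo_add_param, and so on). It shows 1-based ids, parameters registered as declarations rather than tape entries, and the address-taken flag a consumer might consult.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names; the zero sentinel is an assumption for locals. */
enum { DEMO_LOCAL_NONE = 0, DEMO_MAX_LOCALS = 16 };

typedef struct {
    unsigned size, align;
    bool is_param;
    unsigned param_index;
    bool address_taken;
} DemoLocal;

typedef struct {
    DemoLocal locals[DEMO_MAX_LOCALS];
    unsigned count;
} DemoFunc;

/* Adds a local and returns its 1-based id (index + 1). */
static unsigned demo_add_local(DemoFunc *f, unsigned size, unsigned align) {
    assert(f->count < DEMO_MAX_LOCALS);
    f->locals[f->count] = (DemoLocal){ size, align, false, 0, false };
    return ++f->count;
}

/* A parameter is only a declaration: a local plus its parameter index. */
static unsigned demo_add_param(DemoFunc *f, unsigned size, unsigned align,
                               unsigned index) {
    unsigned id = demo_add_local(f, size, align);
    f->locals[id - 1].is_param = true;
    f->locals[id - 1].param_index = index;
    return id;
}

static DemoLocal *demo_local(DemoFunc *f, unsigned id) {
    assert(id != DEMO_LOCAL_NONE && id <= f->count);
    return &f->locals[id - 1];
}

/* Recording an address-of marks the local; nothing ever clears the flag. */
static void demo_take_address(DemoFunc *f, unsigned id) {
    demo_local(f, id)->address_taken = true;
}

/* A consumer's question: must this local live in addressable memory? */
static bool demo_needs_memory_home(DemoFunc *f, unsigned id) {
    return demo_local(f, id)->address_taken;
}
```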

Labels, scopes, and derived blocks

Labels are first-class because CG control-flow ops name them: branch targets, switch case/default targets, label-address materialization, and the closed target set of a computed goto. A CgIrLabel records its id and the source location of its first placement. Placement appears in the tape as a CG_IR_LABEL instruction.

Structured scopes (CgIrScope) capture CG's structured control model. There are two scope kinds (ScopeKind in src/cg/cgtarget.h): SCOPE_BLOCK, a forward-only region whose break skips to the end, and SCOPE_LOOP, whose break exits forward and whose continue jumps to an explicit loop-header target. if/if-else is not a distinct scope kind: the frontend lowers it to a pair of nested forward blocks (kit_cg_if_begin/_else/_end), so there is no else op in the IR. Backends able to express structure (the C-source target, a future Wasm target; see WASM.md) replay scopes directly; native CFG consumers flatten them to ordinary labels and branches.

Basic blocks are not part of the IR. A consumer that needs CFG form derives it by splitting the linear tape at labels, scope boundaries, and terminators. That derived CFG, with its predecessor/successor edges, layout order, and dominance, belongs to the consumer (the optimizer builds exactly this in opt_func_from_cg_ir).
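As an illustration of that derivation, the sketch below finds block leaders in a linear tape: a block starts at index 0, at every label, and at the instruction after a terminator. It is not opt_func_from_cg_ir; TapeKind and tape_block_starts are hypothetical, scope boundaries are left out for brevity, and every branch is treated as a terminator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced instruction kinds. */
typedef enum { TAPE_OP, TAPE_LABEL, TAPE_BRANCH, TAPE_RET } TapeKind;

static bool tape_is_terminator(TapeKind k) {
    return k == TAPE_BRANCH || k == TAPE_RET;
}

/* Writes the index of each block's first instruction into starts[]
 * and returns the number of blocks found. */
static size_t tape_block_starts(const TapeKind *tape, size_t n,
                                size_t *starts, size_t cap) {
    size_t count = 0;
    bool after_terminator = true; /* index 0 always opens a block */
    for (size_t i = 0; i < n; i++) {
        /* A label right after a terminator opens one block, not two. */
        if (tape[i] == TAPE_LABEL || after_terminator) {
            assert(count < cap);
            starts[count++] = i;
        }
        after_terminator = tape_is_terminator(tape[i]);
    }
    return count;
}
```

For the tape op, branch, label, op, op, label, ret, the leaders are indices 0, 2, and 5.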

Instructions and operands

A CgIrInst has an op (CgIrOp), a sticky source location captured from the last set_loc, an operand array, and an extra union holding op-specific auxiliary data: a raw immediate, constant bytes, a MemAccess, or an arena pointer to an op-specific aux struct. There is no separate result-type field; each operand carries its own KitCgTypeId and destinations name typed locals, so the instruction's types are recoverable from its operands and aux.

Most ops map one-to-one to a CgTarget method, and the operand order in the tape follows the method's argument order — destination first where there is one. Multi-result ops (calls, compare-and-swap, checked-arithmetic intrinsics) name several destination locals.
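To make "several destination locals" concrete, here is a hedged sketch of a checked add with two results, written destination-first to mirror the ordering rule above. The function name, the order of the two destinations, and the choice to store the wrapped sum on overflow are all assumptions for illustration, not the recorded shape of any real intrinsic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical two-result op: destinations first, then sources. */
static void demo_checked_add32(int32_t *dst_value, bool *dst_overflow,
                               int32_t a, int32_t b) {
    assert(dst_value != NULL && dst_overflow != NULL);
    int64_t wide = (int64_t)a + (int64_t)b; /* cannot overflow in 64 bits */
    uint32_t low = (uint32_t)wide;          /* low 32 bits, modular conversion */
    memcpy(dst_value, &low, sizeof low);    /* reinterpret the bits as int32_t */
    *dst_overflow = wide < INT32_MIN || wide > INT32_MAX;
}
```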

Operands use the shared Operand shape (src/cg/cgtarget.h), every variant typed by a KitCgTypeId:

There is deliberately no register operand kind. Register-like temporaries are just locals; physical registers are a backend concern that never appears in the CG IR.

Types

Every local and operand carries a KitCgTypeId — a CG storage type already selected for the target, not a frontend AST type. Enums and typedef aliases have already collapsed to their storage type; record/array field identity is gone, replaced by byte offsets, bit ranges, and aggregate sizes. The CG type system covers void, a boolean/i1 condition type, width-only integers, the float widths, pointers (with address space), function-pointer values, opaque fixed-size aggregates, and the per-arch vararg-state object. Signedness is not a property of an integer type; it is carried by the operation that consumes the value (signed vs unsigned divide, compare, shift, extend, and int/float conversion). ABI decomposition — splitting one source value into several storage parts for argument passing or returns — is recorded in the call and return descriptors, not by re-typing ordinary value ops.
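The "signedness lives on the operation" rule can be illustrated with a width-only 32-bit value whose bits are interpreted only by the op applied to it. DemoBinOp and demo_binop32 are hypothetical names; the edge cases (zero divisor, INT_MIN / -1, oversized shift counts) are excluded by assertion here because this sketch is only about signedness.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical op tags: the same bits, read as signed or unsigned. */
typedef enum { DEMO_SDIV, DEMO_UDIV, DEMO_SHR_S, DEMO_SHR_U } DemoBinOp;

/* Reinterpret 32 bits as a two's-complement signed value. */
static int32_t demo_as_signed(uint32_t bits) {
    int32_t s;
    memcpy(&s, &bits, sizeof s);
    return s;
}

static uint32_t demo_binop32(DemoBinOp op, uint32_t a, uint32_t b) {
    switch (op) {
    case DEMO_SDIV:
        assert(b != 0 && !(a == 0x80000000u && b == 0xFFFFFFFFu));
        return (uint32_t)(demo_as_signed(a) / demo_as_signed(b));
    case DEMO_UDIV:
        assert(b != 0);
        return a / b;
    case DEMO_SHR_S: {
        assert(b < 32);
        /* Arithmetic shift built from a logical shift plus sign fill. */
        uint32_t fill = (a & 0x80000000u) ? ~(UINT32_MAX >> b) : 0u;
        return (a >> b) | fill;
    }
    case DEMO_SHR_U:
        assert(b < 32);
        return a >> b;
    }
    assert(!"unknown DemoBinOp");
    return 0;
}
```

With a = 0xFFFFFFF8, the signed divide by 2 gives 0xFFFFFFFC (that is, -8 / 2 = -4), while the unsigned divide of the same bits gives 0x7FFFFFFC.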

Operation families

The complete op set is CgIrOp in src/cg/ir.h; the categories below describe its semantics. The textual dumper (src/cg/ir_dump.c, reachable as cg_ir_func_dump) is the canonical rendering and a good cross-check for the spelling and operand order of any op.

Administrative

Data movement

All memory and aggregate/bitfield ops rely on target layout facts already encoded in their operands and aux records; consumers must not reinterpret layout for a different target.

Arithmetic, compare, convert

Source operands of binop/unop/cmp may be OPK_IMM as well as OPK_LOCAL; the backend or interpreter decides whether to fold a small immediate into an instruction form or materialize it. The operation tag families (BinOp, UnOp, CmpOp, ConvKind, AtomicOp, MemOrder, IntrinKind) are defined in src/cg/cgtarget.h and are open to vector/SIMD extension — consumers must switch with a default arm rather than assume exhaustiveness.
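One way a consumer might follow the default-arm advice is sketched below: a constant folder that handles the tags it knows and reports "not handled" for anything else, so a tag added later does not silently fall through. DemoOpTag and demo_fold are hypothetical and unrelated to the real BinOp enum's members.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tag family; assume more tags may be added later. */
typedef enum { DEMO_OP_ADD, DEMO_OP_SUB, DEMO_OP_MUL } DemoOpTag;

/* Returns true and writes *out if the tag is understood; returns false
 * otherwise, so the caller can leave the instruction untouched.
 * Unsigned arithmetic keeps every case wrapping and well-defined in C. */
static bool demo_fold(int tag, uint64_t a, uint64_t b, uint64_t *out) {
    assert(out != NULL);
    switch (tag) {
    case DEMO_OP_ADD: *out = a + b; return true;
    case DEMO_OP_SUB: *out = a - b; return true;
    case DEMO_OP_MUL: *out = a * b; return true;
    default:          return false; /* unknown or future tag */
    }
}
```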

Calls and returns

Tail calls are modeled as a CG_IR_CALL carrying the CG_CALL_TAIL flag, not as a property of CG_IR_RET. CG verifies realizability before setting the flag (through the target's tail_call_unrealizable_reason query, which the recorder forwards to its configured callback); the recorder preserves the tail policy so replay can emit a sibling call, fall back to call-plus-return, or diagnose.

Branching and computed goto

Label addresses are opaque, function-local tokens. They may be stored, loaded, compared, selected, and consumed by CG_IR_INDIRECT_BRANCH within the same function activation; they are not callable function pointers and not dereferenceable data.

Function-local static data

Structured scopes

These ops preserve CG's C-like structured control model — block and loop scopes — so backends that express structure directly (the C-source target, a future Wasm target) can replay it without rebuilding a CFG. CFG-based consumers ignore the structure and reconstruct control flow from the underlying labels and branches instead. if/if-else has no dedicated op or scope kind; the frontend builds it from nested forward block scopes plus CG_IR_BREAK_TO.

Scope ids are 1-based with CG_SCOPE_NONE as the zero sentinel. The structured form is advisory metadata layered over the same primitive control flow: a consumer that flattens scopes to labels and branches produces the same observable behavior as one that replays the structure natively.

Stack allocation and variadics

Atomics

Atomic ops carry both a MemAccess and memory-order metadata in their aux. They are observable and must preserve the ordering the memory model requires.

Intrinsics and inline asm

Semantic modes: portable vs target-defined

A handful of integer and conversion operations have edge cases whose cheapest lowering differs across targets: integer division by zero and INT_MIN / -1, shift counts at or beyond the operand width, and out-of-range or NaN float→int conversions. For these the IR offers two semantics, portable and target-defined, chosen per instruction by the frontend.

The choice rides in CgIrInst.flags (CgIrInstFlag in src/cg/ir.h):

| Flag | Affects | Cleared (portable default) | Set (target-defined) |
| --- | --- | --- | --- |
| CG_IR_INST_TARGET_DIV_EDGES | BINOP sdiv/udiv/srem/urem | div-by-zero traps; INT_MIN/-1 wraps | target divide instruction |
| CG_IR_INST_TARGET_SHIFT_EDGES | BINOP shl/shr_s/shr_u | count reduced modulo width | target shift instruction |
| CG_IR_INST_TARGET_FPTOINT_EDGES | CONVERT ftoi_s/ftoi_u | saturate; NaN→0 | target convert instruction |

Both modes are fully defined: target-defined is still deterministic per target, never unconstrained. This flag set is the only place the IR's value semantics depend on a producer choice rather than on the op alone; everything else is fixed by the op. Memory-safety faults are always target-defined and are not governed by these flags — there is no portable bounds-checking mode (see Memory).
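The portable (flag-cleared) column can be written out in plain C for the 32-bit case. This is a sketch under stated assumptions, not the interpreter's or any backend's code: the function names are invented, a trap is modeled as a false return value, and only sdiv, shl, and a double-to-signed-32 conversion are shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Portable sdiv: division by zero "traps" (modeled as returning false);
 * INT32_MIN / -1 wraps to INT32_MIN instead of overflowing. */
static bool portable_sdiv32(int32_t a, int32_t b, int32_t *out) {
    assert(out != NULL);
    if (b == 0) return false;
    if (a == INT32_MIN && b == -1) { *out = INT32_MIN; return true; }
    *out = a / b;
    return true;
}

/* Portable shl: the count is reduced modulo the operand width. */
static uint32_t portable_shl32(uint32_t v, uint32_t count) {
    return v << (count % 32u);
}

/* Portable ftoi_s: NaN becomes 0, out-of-range values saturate, and
 * in-range values truncate toward zero. */
static int32_t portable_ftoi_s32(double x) {
    if (x != x) return 0;                      /* NaN compares unequal to itself */
    if (x >= 2147483648.0) return INT32_MAX;   /* at or above 2^31 */
    if (x <= -2147483648.0) return INT32_MIN;  /* at or below -2^31 */
    return (int32_t)x;
}
```

Each function is total over its inputs, which is the point of the portable mode: no input reaches a C-level undefined operation.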

Portable is the safe default for a consumer that has not yet been taught a flag: implementing portable semantics where the op asked for target-defined is always legal, because the opt-in is only ever taken when the source language permits any behavior at that edge. Wiring the public CG API and recorder to set these bits, and teaching each consumer (optimizer, interpreter, native and C-source backends) to honor them, is implementation work tracked separately from this spec; the bits are defined here so the IR can carry the choice.

Well-definedness: edge-case semantics

This section pins down every operation's behavior on the inputs that a structural reading of the op set leaves open. It mirrors the operation families above. Unless a rule is marked target-defined, it is portably defined.

Integer arithmetic and bitwise

Floating point

The IR's floating-point operations are strict IEEE-754 in the target's default environment: round-to-nearest-ties-to-even, non-trapping exceptions (status-flag only), no denormal flushing. These are portable; the IR does not represent alternate rounding modes or fast-math relaxations (the public API's rounding argument and FP fast-math flags are dropped at the IR level unless the frontend realizes them as explicit operations).

Conversions

Memory: load, store, aggregate, bitfield

Control flow

Calls and returns

Stack allocation and variadics

Atomics

Intrinsics and inline asm

Operand shapes are fixed per IntrinKind (src/cg/cgtarget.h). Semantic edges:

Well-formedness (invariants)

A well-formed tape satisfies all of the following; consumers may assume them, and a violation is a producer bug (malformed IR), not program behavior. These are the structural half of "no undefined behavior" — the runtime half is the edge-case section above.

Consumer guidance

Anything that reads the IR is reading a layout-resolved, ABI-shaped, but machine-neutral program. The contract a consumer must respect: preserve target-data-layout semantics, memory observability (the MemFlag set and alias roots on each access), the ABI shape of calls and returns, and CFG validity. It must also implement at least the portable edge-case semantics of every op, and honor the CgIrInst.flags semantic-mode bits where it understands them — falling back to portable semantics (a safe refinement) for any bit it does not. A consumer may assume a well-formed tape; it must not introduce undefined behavior of its own where the IR defines a result.

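Two consumers exist today, and they take different paths: the native path, where the optimizer derives a Func (CFG/SSA/MIR) and emits native code, and the interpreter path, where the optimizer derives a reduced Func for the interpreter (see Pipeline position).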

Source-like backends (the C-source target, a future Wasm target) can instead replay the tape op-by-op into a direct CgTarget, taking advantage of the retained structured scopes and switch descriptors.