CG / ObjBuilder Lifecycle

This is the target lifecycle for semantic code generation and object building. It is motivated by LTO, but it should hold for ordinary one-TU (single translation unit) compilation as well: ObjBuilder owns object lifetime, while KitCg borrows an object and finishes codegen into it.

Status (2026-06-04): the borrowed CG/object lifecycle is implemented as the only public CG session interface. kit_cg_free aborts and detaches without flushing, lowering, debug-emitting, or finalizing the borrowed object. Shared-library LTO remains disabled until that output path is exercised.

Problem

Historically KitCg had an object-shaped lifecycle:

cg_begin_object(cg, ob, code_opts);
frontend_compile_cg(..., cg);
cg_end_object(cg);
kit_obj_builder_finalize(ob);

That was the wrong ownership boundary: KitCg does not create, emit, link, or free the object; the caller does. In the borrowed lifecycle, kit_cg_finish finalizes the CG target and emits debug info, while kit_cg_detach drops the borrowed object/target links. kit_cg_free follows the abort path and never finishes a partial object as a side effect of cleanup.

The object-shaped lifecycle also made LTO harder to finish cleanly. LTO needs to collect multiple source units into one object, then finish semantic codegen only after the driver/linker has enough information to provide preserved/export policy. That handoff should be a KitCg finish option, not a driver-owned pseudo-unit abstraction.

Ownership Model

ObjBuilder owns object state:

KitCg owns a semantic codegen session attached to an object:

The driver or API caller owns orchestration:

Target API Shape

The exact names can change, but the shape should be explicit:

KitObjBuilder* ob = NULL;
KitCg* cg = NULL;

kit_obj_builder_new(compiler, &ob);
kit_cg_new(compiler, &cg);

kit_cg_begin(cg, ob, &code_opts);       /* borrow ob, attach backend */
kit_cg_begin_unit(cg, &unit_opts);       /* source contribution */
frontend_compile_cg(..., cg);
kit_cg_end_unit(cg);
kit_cg_finish(cg, &finish_opts);        /* flush/lower/debug into ob */
kit_cg_detach(cg);                      /* drop borrowed links */

kit_obj_builder_finalize(ob);

For multi-source LTO, only the unit loop grows:

kit_obj_builder_new(compiler, &ob);
kit_cg_new(compiler, &cg);
kit_cg_begin(cg, ob, &code_opts);

for each semantic source:
  kit_cg_begin_unit(cg, &unit_opts);
  frontend_compile_cg(..., cg);
  kit_cg_end_unit(cg);

kit_cg_finish(cg, &finish_opts);
kit_cg_detach(cg);
kit_obj_builder_finalize(ob);
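
The call order in the two listings above can be read as a small state machine. The following is a minimal sketch of that reading, not the kit implementation: every Mock* type and mock_* function is a hypothetical stand-in, and the specific rules it enforces (finish is rejected while a unit is open, zero units is allowed) are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the real kit types. */
typedef struct MockObj {
  bool finalized;  /* set only by the caller-owned finalize step */
  bool cg_output;  /* set once finish has written semantic output */
  int units;       /* number of completed unit contributions */
} MockObj;

typedef enum { CG_IDLE, CG_OPEN, CG_IN_UNIT, CG_FINISHED } MockCgState;

typedef struct MockCg {
  MockCgState state;
  MockObj* ob; /* borrowed, never owned */
} MockCg;

/* Each step returns false when it is called out of order. */
bool mock_cg_begin(MockCg* cg, MockObj* ob) {
  if (cg->state != CG_IDLE || ob->finalized) return false;
  cg->ob = ob;
  cg->state = CG_OPEN;
  return true;
}

bool mock_cg_begin_unit(MockCg* cg) {
  if (cg->state != CG_OPEN) return false;
  cg->state = CG_IN_UNIT;
  return true;
}

bool mock_cg_end_unit(MockCg* cg) {
  if (cg->state != CG_IN_UNIT) return false;
  cg->ob->units++;
  cg->state = CG_OPEN;
  return true;
}

bool mock_cg_finish(MockCg* cg) {
  if (cg->state != CG_OPEN) return false;
  cg->ob->cg_output = true; /* writes into ob, does not finalize it */
  cg->state = CG_FINISHED;
  return true;
}

bool mock_cg_detach(MockCg* cg) {
  if (cg->state != CG_FINISHED) return false;
  cg->ob = NULL;
  cg->state = CG_IDLE;
  return true;
}

bool mock_obj_finalize(MockObj* ob) {
  if (ob->finalized) return false;
  ob->finalized = true;
  return true;
}

/* Drive n units through one borrowed object.
   Returns the number of units recorded, or -1 on any ordering error. */
int mock_compile_units(int n) {
  MockObj ob = {0};
  MockCg cg = {CG_IDLE, NULL};
  if (!mock_cg_begin(&cg, &ob)) return -1;
  for (int i = 0; i < n; i++) {
    if (!mock_cg_begin_unit(&cg)) return -1;
    /* a frontend would emit semantic code for this unit here */
    if (!mock_cg_end_unit(&cg)) return -1;
  }
  if (!mock_cg_finish(&cg)) return -1;
  if (ob.finalized) return -1; /* finish must leave finalization to the caller */
  if (!mock_cg_detach(&cg)) return -1;
  if (!mock_obj_finalize(&ob)) return -1;
  return ob.units;
}
```

With this mock, mock_compile_units(1) models the one-TU path and mock_compile_units(3) models a three-source LTO build; the only difference between them is the loop count.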

Opaque frontends do not attach to KitCg; they compile directly into their own ObjBuilder and enter link/archive/relocatable order as ordinary objects.

Object vs Unit

An object is the emitted product. It may contain one source unit or many.

A unit is one semantic source contribution inside the object. Unit boundaries are not object boundaries. They exist so codegen can track:

Finish Options

kit_cg_finish is where link-picture-dependent policy enters semantic optimization. For LTO, finish options should eventually carry:

The finish operation may use internal ObjSymId sets when the linker/driver has already resolved names into the shared ObjBuilder. A public API can offer a name-based adapter if needed, but the core should prefer symbol ids once an object exists.
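
As an illustration of how a preserved set expressed in symbol ids might be consulted at finish time, here is a small sketch. It is an assumption-laden example, not the kit API: DemoSymId stands in for ObjSymId, and DemoFinishOpts, its fields, and demo_may_internalize are hypothetical names. Output kind is left out on purpose. The sketch also assumes that a missing policy means "stay conservative", in line with the migration note that global roots stay conservative until preserved-symbol computation lands.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t DemoSymId; /* hypothetical stand-in for ObjSymId */

/* Hypothetical finish options: only the preserved set is modelled. */
typedef struct DemoFinishOpts {
  const DemoSymId* preserved; /* ids that must stay externally visible */
  size_t npreserved;
  bool have_preserved;        /* false: no policy supplied by the driver */
} DemoFinishOpts;

/* Linear scan; a real implementation would likely use a denser set. */
bool demo_is_preserved(const DemoFinishOpts* o, DemoSymId id) {
  for (size_t i = 0; i < o->npreserved; i++) {
    if (o->preserved[i] == id) return true;
  }
  return false;
}

/* A global is a candidate for internalization only when a policy was
   supplied and the symbol is not listed as preserved. Without a policy
   nothing is internalized. */
bool demo_may_internalize(const DemoFinishOpts* o, DemoSymId id) {
  if (o == NULL || !o->have_preserved) return false;
  return !demo_is_preserved(o, id);
}
```

Keying the set on ids rather than names avoids a second name lookup once the symbols already live in the shared object, which is the preference stated above.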

kit_cg_finish must not call kit_obj_builder_finalize. The caller finalizes the object after CG has finished writing semantic output into it.

Failure Model

Cleanup must not finalize by accident.

This fixes the old wart where freeing an open KitCg could finalize a partial object.
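
The abort-on-free rule can be sketched with a toy model. None of this is kit code: ToyObj, ToyCg, and the toy_* functions are hypothetical stand-ins, and the two boolean flags are an assumed simplification of "object state". The point is only that freeing an open session drops the borrowed link and writes nothing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the borrowed object and the CG session. */
typedef struct ToyObj {
  bool finalized;     /* only the caller ever sets this */
  bool has_cg_output; /* only a successful finish sets this */
} ToyObj;

typedef struct ToyCg {
  ToyObj* ob; /* borrowed; NULL when detached */
} ToyCg;

ToyCg* toy_cg_new(void) { return calloc(1, sizeof(ToyCg)); }

void toy_cg_begin(ToyCg* cg, ToyObj* ob) { cg->ob = ob; }

/* Finish writes semantic output but leaves finalization to the caller. */
void toy_cg_finish(ToyCg* cg) { cg->ob->has_cg_output = true; }

void toy_cg_detach(ToyCg* cg) { cg->ob = NULL; }

/* Abort drops the borrowed link without touching the object. */
void toy_cg_abort(ToyCg* cg) { cg->ob = NULL; }

/* Free takes the abort path when a session is still open. */
void toy_cg_free(ToyCg* cg) {
  if (cg == NULL) return;
  if (cg->ob != NULL) toy_cg_abort(cg);
  free(cg);
}

/* Simulate a frontend error after begin: cleanup runs, finish never does.
   Returns true if the session was created and torn down. */
bool toy_compile_with_error(ToyObj* ob) {
  ToyCg* cg = toy_cg_new();
  if (cg == NULL) return false;
  toy_cg_begin(cg, ob);
  /* the frontend reports an error here, so finish is skipped */
  toy_cg_free(cg); /* abort path: ob is left untouched */
  return true;
}
```

After toy_compile_with_error returns, the caller still owns an object that is neither finished nor finalized and can discard it or report the error as it sees fit.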

Boundary Rules

Frontends should only see the KitCg semantic API or the object-only API they explicitly implement. A semantic frontend should not own ObjBuilder finalization, and an opaque frontend should not need a fake KitCg.

ObjBuilder should remain the single source of truth for object symbol identity and storage. CG may ask it to declare/define/merge contributions, but CG should not own object lifetime.

The driver should not implement symbol merge, semantic finalization, or internalization policy. It should gather sources, gather opaque inputs, compute or request preserved/export policy, and pass that policy to kit_cg_finish.

Migration Plan

  1. Introduce borrowed-lifecycle names as the public API: kit_cg_begin, kit_cg_finish, kit_cg_detach, and kit_cg_abort.
  2. Make one-TU semantic compilation use the same borrowed lifecycle that LTO uses: caller creates ObjBuilder, CG borrows it, CG finishes, caller finalizes the object.
  3. Add begin_unit / end_unit bookkeeping and use it in ordinary one-TU and multi-source LTO paths.
  4. Move output-kind and preserved/export input into kit_cg_finish options. The driver now passes output-kind/interposition policy for supported outputs; preserved-symbol computation, internalization, and shared-library LTO remain follow-up work, so global roots stay conservative.
  5. Move duplicate function/data contribution bookkeeping toward the ObjBuilder/CG contribution boundary so src/opt and src/cg/data.c do not each own fragments of LTO symbol-resolution policy.

Non-Goals