kit

git clone https://git.ryansepassi.com/git/kit.git

lib/ — libkit_rt.a source

Runtime helpers for kit, derived from compiler-rt 18.1.8 (lib/builtins/) and stripped of all target-dispatch ifdefs. Every helper that varies across targets is selected by directory + compile flags, not preprocessor branches inside source.

The build compiles exactly one master .c (and/or .S) file per feature flag — no globbing of per-op files. Per-op snippets are inlined directly into the master, with the per-precision / per-(src,dst) machinery from impl/ re-applied for each section.

License: Apache-2.0 WITH LLVM-exception (see LICENSE-compiler-rt.txt). The hand-written mem/mem.c is 0BSD; relicense as desired.

Layout

Master files (each becomes one object in libkit_rt.a)

| File | Purpose | Built on |
|---|---|---|
| int/int.c | Integer helpers needed on every target | All |
| int32/int32.c | 64-bit ops synthesized from 32-bit | ILP32 only |
| int64/int64.c | 128-bit ops implemented on explicit 64-bit lanes | LP64 / LLP64 only |
| fp/fp.c | Soft-float sf (binary32) + df (binary64) + sf↔df + fp_mode | FPU-less (RV{32,64}I, ARM softfp, WASM) |
| fp_tf/fp_tf.c | Soft-float tf (binary128) + sf↔tf + df↔tf + i128↔tf | Targets with binary128 long double (e.g. aarch64 -mlong-double-128) |
| fp_ti/fp_ti.c | __int128 ↔ sf/df + sf/df → ti fix | LP64 / LLP64 + soft-float |
| arm/aeabi_thumb2.S | AEABI div/mod/mem* + soft-float compares (ARMv7+/Thumb2) | 32-bit ARM, ARMv7+/Thumb2 |
| arm/aeabi_thumb1.S | Same, Thumb1-tuned (no tail-calls, simpler instr forms) | 32-bit ARM, ARMv6-M (Cortex-M0/M0+/M1) |
| arm/aeabi.c | AEABI __aeabi_drsub / __aeabi_frsub (ISA-agnostic) | 32-bit ARM (both ISA modes) |
| riscv/rv32.S | __riscv_save_* + __riscv_restore_* (rv32) | RISC-V rv32 with -msave-restore |
| riscv/rv64.S | __riscv_save_* + __riscv_restore_* (rv64) | RISC-V rv64 with -msave-restore |
| mem/mem.c | memcpy / memmove / memset / memcmp (weak) | All; user libc overrides |
| atomic/atomic_freestanding.c | __atomic_* fallback shim | All |
| coro/<arch>.c | Per-arch primitives: setjmp / longjmp (<setjmp.h>) + __kit_coro_ctx_init / __kit_coro_switch / __kit_coro_trampoline (internal; the public <kit/coro.h> API sits on top via coro/coro.c) | One of aarch64, arm32, arm32_thumb1, i386, riscv32, riscv64, x86_64, x86_64_win. Not built for wasm32. |
| coro/coro.c | Arch-agnostic asymmetric layer: coro_init / coro_resume / coro_yield / coro_self (<kit/coro.h>) | All variants that ship a coro/<arch>.c. |
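
As a rough illustration of what "64-bit ops synthesized from 32-bit" means for int32/int32.c, the sketch below shifts a 64-bit value left using only 32-bit operations. It is not taken from the kit sources: the type sk_u64_words and the functions sk_shl64, sk_split and sk_join are hypothetical names chosen for this example, and the real master may structure the helper differently.

```c
#include <stdint.h>

/* Hypothetical two-word view of a 64-bit value (illustrative names only). */
typedef struct {
    uint32_t lo;
    uint32_t hi;
} sk_u64_words;

/* Shift a 64-bit value left by b (0 <= b < 64) using only 32-bit operations. */
static sk_u64_words sk_shl64(sk_u64_words a, unsigned b) {
    sk_u64_words r;
    if (b == 0) {
        return a;                      /* avoids a 32-bit shift by 32 below */
    }
    if (b >= 32) {
        r.lo = 0;                      /* the low word is shifted out entirely */
        r.hi = a.lo << (b - 32);
    } else {
        r.lo = a.lo << b;
        r.hi = (a.hi << b) | (a.lo >> (32 - b));  /* carry bits across words */
    }
    return r;
}

/* Host-side helpers, used only to compare against the native 64-bit type. */
static sk_u64_words sk_split(uint64_t v) {
    sk_u64_words w;
    w.lo = (uint32_t)v;
    w.hi = (uint32_t)(v >> 32);
    return w;
}

static uint64_t sk_join(sk_u64_words w) {
    return ((uint64_t)w.hi << 32) | w.lo;
}
```

On a host with a native 64-bit type, sk_join(sk_shl64(sk_split(v), n)) can be compared against v << n to check the word-level logic.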

Build-time include dirs (consumed by the masters; nothing here lands in libkit_rt.a)

| Directory | Consumed by |
|---|---|
| impl/ | int/int.c (via int_div_impl.inc); every fp* master (via fp_*_impl.inc, fp_extend_impl.inc, fp_trunc_impl.inc, int_to_fp_impl.inc) |
| include/common/ | All masters (transitively, via the per-target int_lib.h); arm/aeabi_thumb*.S includes assembly.h |
| include/lp64_le/ | LP64 builds; selected via -Iinclude/lp64_le |
| include/llp64_le/ | LLP64 (Win64) builds |
| include/ilp32_le/ | ILP32 builds |
| include/lp64_le_ldbl128/ | Extra -include tf_supplement.h when compiling fp_tf/fp_tf.c on binary128-long-double targets |
| atomic/atomic_common.inc | atomic/atomic_freestanding.c |

How the build picks files

target tuple                              ⟶ compile

x86_64-linux  / x86_64-darwin / aarch64-darwin / rv64
                                          ⟶ int/int.c int64/int64.c fp/fp.c
                                            atomic/atomic_freestanding.c
                                            mem/mem.c
                                          -Iinclude/lp64_le
                                          -DHAS_INT128=1

x86_64-windows                            ⟶ same set, -Iinclude/llp64_le
                                          -DHAS_INT128=1

i386-* / arm32-* / rv32 / wasm32          ⟶ int/int.c int32/int32.c fp/fp.c
                                            atomic/atomic_freestanding.c
                                            mem/mem.c
                                          -Iinclude/ilp32_le
                                          -DHAS_INT128=0

aarch64-linux (binary128 long double)     ⟶ above LP64 set + fp_tf/fp_tf.c
                                            + fp_ti/fp_ti.c
                                          -include include/lp64_le_ldbl128/tf_supplement.h

rv32 with -msave-restore                  ⟶ above + riscv/rv32.S
rv64 with -msave-restore                  ⟶ above + riscv/rv64.S
arm32 ARMv7+/Thumb2 (AEABI)               ⟶ above + arm/aeabi_thumb2.S + arm/aeabi.c
arm32 ARMv6-M Thumb1 (Cortex-M0/M0+/M1)   ⟶ above + arm/aeabi_thumb1.S + arm/aeabi.c

-Iinclude/common and -Iimpl are always added. The full set of variants is in build.sh.

Endianness

All headers assume __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__. kit's supported targets (x86, ARM-LE, RISC-V, WASM) are LE in practice. Big-endian support would need a parallel *_be/ header set; not provided.
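
The little-endian assumption could be enforced at compile time along these lines. This is a minimal sketch, not a copy of the kit headers: sk_dwords and sk_low_word are hypothetical names, and the union layout is only one plausible way a header might split a 64-bit value into words.

```c
#include <stdint.h>

/* Refuse to build on anything that is not little-endian. __BYTE_ORDER__ and
   __ORDER_LITTLE_ENDIAN__ are predefined by GCC and Clang. */
#if !defined(__BYTE_ORDER__) || __BYTE_ORDER__ != __ORDER_LITTLE_ENDIAN__
#error "this sketch assumes a little-endian target"
#endif

/* Hypothetical word view of a 64-bit value. On a little-endian target the
   low word sits first in memory, so `low` must be declared first. */
typedef union {
    uint64_t all;
    struct {
        uint32_t low;
        uint32_t high;
    } s;
} sk_dwords;

static uint32_t sk_low_word(uint64_t v) {
    sk_dwords d;
    d.all = v;
    return d.s.low;
}
```

A big-endian header set would have to swap the order of low and high in the struct, which is why a parallel set of headers would be needed.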

Re-includable templates in impl/ and include/common/fp_lib.h

Each master compiles to one TU. To support multiple precisions and multiple (src, dst) pairs in that single TU, the upstream compiler-rt's single-precision-per-TU model was extended:

In short: every template that the master includes more than once per TU either uses a precision-derived suffix (auto, via FP_LIB_SUFFIX) or a caller-defined suffix.
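
The suffix idea can be sketched in a single file. Since one self-contained example cannot re-include a second file, the sketch below swaps in a function-like macro (SK_TEMPLATE_BODY) for the re-included template; the token pasting that gives each precision its own symbol names is the part being illustrated. All SK_* and sk_* names are hypothetical, and FP_LIB_SUFFIX itself is not used here.

```c
#include <stdint.h>

/* Two-level paste so that macro arguments are expanded before ## is applied. */
#define SK_PASTE_(a, b) a##b
#define SK_PASTE(a, b) SK_PASTE_(a, b)

/* Stand-in for a re-includable template: every function it defines carries
   the precision suffix, so several instantiations coexist in one TU. */
#define SK_TEMPLATE_BODY(SUFFIX, FP_T, REP_T, SIGN_BIT)                   \
    static REP_T SK_PASTE(sk_to_rep_, SUFFIX)(FP_T x) {                   \
        union { FP_T f; REP_T i; } u;                                     \
        u.f = x;                                                          \
        return u.i;                                                       \
    }                                                                     \
    static int SK_PASTE(sk_signbit_, SUFFIX)(FP_T x) {                    \
        return (SK_PASTE(sk_to_rep_, SUFFIX)(x) & (SIGN_BIT)) != 0;       \
    }

/* "Include" the template once per precision. */
SK_TEMPLATE_BODY(sf, float, uint32_t, UINT32_C(0x80000000))
SK_TEMPLATE_BODY(df, double, uint64_t, UINT64_C(0x8000000000000000))
```

After both instantiations the TU contains sk_to_rep_sf, sk_signbit_sf, sk_to_rep_df and sk_signbit_df with no name clashes, which is the property a multi-precision master needs.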

Files with surviving preprocessor logic

The "no target-dispatch ifdefs" rule is applied to every master. Some templates retain preprocessor logic that is not target dispatch:

Notes per master

mem/mem.c

Hand-written portable C (not from compiler-rt). All four functions are weak so a user libc, or a tuned arch-specific replacement, wins at link time. arm/aeabi_thumb{1,2}.S's aeabi_mem* symbols forward to these.
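
A weak definition of this kind might look like the sketch below. It assumes the GCC/Clang __attribute__((weak)) spelling and uses a hypothetical name, sk_memcpy, so that it can be compiled next to a real libc without clashing; the actual mem/mem.c defines the standard names.

```c
#include <stddef.h>

/* Weak, byte-at-a-time copy. Because the symbol is weak, a strong definition
   of the same name in another object (a libc or a tuned arch-specific
   routine) replaces it at link time without a duplicate-symbol error. */
__attribute__((weak))
void *sk_memcpy(void *restrict dst, const void *restrict src, size_t n) {
    unsigned char *d = dst;
    const unsigned char *s = src;
    while (n--) {
        *d++ = *s++;
    }
    return dst;
}
```

If no stronger definition is linked in, callers simply get this portable fallback.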

coro/<arch>.c + coro/coro.c

The coro module ships in two layers:

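coro/<arch>.c (one per arch): per-target primitives, written as file-scope __asm__ inside a .c file (not a separate .S) so that the tiny C __kit_coro_ctx_init and the asm save/restore stay co-located. Provides setjmp / longjmp (<setjmp.h>) and the internal __kit_coro_ctx_init / __kit_coro_switch / __kit_coro_trampoline.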

The three primitives that need register save/restore (setjmp, longjmp, __kit_coro_switch) share one pair of C string-concat macros SAVE_INTO(reg) / RESTORE_FROM(reg) so the same instruction bytes are emitted in all three places. Symbol naming uses __USER_LABEL_PREFIX__ so the same source compiles for ELF / Mach-O / COFF.
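
The macro technique can be illustrated without assembling anything. In the sketch below the instruction sequences are hypothetical and heavily abbreviated (two x86_64 registers at made-up offsets), the SK_* names are invented for the example, and the resulting strings are stored in arrays rather than passed to a file-scope __asm__ as a real coro/<arch>.c would do.

```c
#include <string.h>

#define SK_STR_(x) #x
#define SK_STR(x) SK_STR_(x)

/* Symbol name with the object format's user-label prefix prepended:
   "_" on Mach-O, empty on ELF. __USER_LABEL_PREFIX__ is predefined by
   GCC and Clang. */
#define SK_SYM(name) SK_STR(__USER_LABEL_PREFIX__) #name

/* Hypothetical, abbreviated save/restore sequences. Adjacent string
   literals concatenate, so `reg` is spliced into each instruction. */
#define SK_SAVE_INTO(reg)            \
    "movq %rbx, 0(" reg ")\n\t"      \
    "movq %rbp, 8(" reg ")\n\t"
#define SK_RESTORE_FROM(reg)         \
    "movq 0(" reg "), %rbx\n\t"      \
    "movq 8(" reg "), %rbp\n\t"

/* Two "primitives" built from the same macros; in a real file these
   strings would be the operand of a file-scope __asm__(...). */
static const char sk_setjmp_text[] =
    SK_SYM(sk_setjmp) ":\n\t" SK_SAVE_INTO("%rdi") "xorl %eax, %eax\n\tret\n";
static const char sk_switch_text[] =
    SK_SYM(sk_switch) ":\n\t" SK_SAVE_INTO("%rdi") SK_RESTORE_FROM("%rsi") "ret\n";

/* Returns 1 when both primitives contain the identical save sequence. */
static int sk_shares_save_sequence(void) {
    return strstr(sk_setjmp_text, SK_SAVE_INTO("%rdi")) != NULL &&
           strstr(sk_switch_text, SK_SAVE_INTO("%rdi")) != NULL;
}
```

Because every primitive expands the same macro, a change to the saved-register set is made in one place and cannot drift between setjmp, longjmp and the switch routine.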

ARM ships two variants: arm32.c (Thumb-2, ARMv7+, optional VFP d8-d15 gated on __ARM_FP) and arm32_thumb1.c (ARMv6-M / Cortex-M0/M0+; no IT blocks, no VFP, data-processing restricted to r0-r7, no str sp / str rN, [sp,...] — the asm sequences don't share with arm32.c so it's a separate file).

Not provided for wasm32 (would need an Asyncify-fiber port).

coro/coro.c (arch-agnostic) — the public asymmetric API: coro_init / coro_resume / coro_yield / coro_self. Tracks the current coroutine in a static, threads each coro_t's resumer slot through the resume chain, and dispatches the per-arch trampoline via a thunk that runs the user's coro_fn, marks the coroutine CORO_DEAD, and switches back to the resumer. Built once per coro variant and linked alongside the per-arch master.
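
The bookkeeping described above (a static current pointer, a per-coroutine resumer slot, and a thunk that marks the coroutine dead before switching back) can be sketched on a hosted POSIX system. This is not the kit implementation: POSIX ucontext is swapped in for the per-arch __kit_coro_switch primitives, and every type, function name and signature below (sk_coro, sk_init, sk_resume, sk_yield, sk_self, sk_demo_worker) is an assumption made for illustration rather than the <kit/coro.h> API.

```c
#include <stddef.h>
#include <ucontext.h>

enum sk_state { SK_SUSPENDED, SK_RUNNING, SK_DEAD };

typedef void (*sk_fn)(void *arg);

typedef struct sk_coro {
    ucontext_t ctx;            /* stands in for the per-arch saved context */
    struct sk_coro *resumer;   /* who to switch back to on yield or exit */
    sk_fn fn;
    void *arg;
    enum sk_state state;
} sk_coro;

static sk_coro sk_main;                 /* the root, non-coroutine context */
static sk_coro *sk_current = &sk_main;  /* currently running coroutine */

/* Entry thunk: run the user function, mark dead, return to the resumer. */
static void sk_thunk(void) {
    sk_coro *self = sk_current;
    self->fn(self->arg);
    self->state = SK_DEAD;
    sk_current = self->resumer;
    swapcontext(&self->ctx, &self->resumer->ctx);
}

static void sk_init(sk_coro *c, sk_fn fn, void *arg, void *stack, size_t size) {
    getcontext(&c->ctx);
    c->ctx.uc_stack.ss_sp = stack;
    c->ctx.uc_stack.ss_size = size;
    c->ctx.uc_link = NULL;     /* never used: the thunk switches away itself */
    c->fn = fn;
    c->arg = arg;
    c->resumer = NULL;
    c->state = SK_SUSPENDED;
    makecontext(&c->ctx, sk_thunk, 0);
}

static void sk_resume(sk_coro *c) {
    sk_coro *prev = sk_current;
    if (c->state != SK_SUSPENDED) {
        return;                /* dead or already running: nothing to do */
    }
    c->resumer = prev;         /* thread the resumer slot through the chain */
    c->state = SK_RUNNING;
    sk_current = c;
    swapcontext(&prev->ctx, &c->ctx);
}

static void sk_yield(void) {
    sk_coro *self = sk_current;
    if (self == &sk_main) {
        return;                /* the root context has no resumer */
    }
    self->state = SK_SUSPENDED;
    sk_current = self->resumer;
    swapcontext(&self->ctx, &self->resumer->ctx);
}

static sk_coro *sk_self(void) {
    return sk_current;
}

/* Demo body: bump a counter, yield once, bump it again, then finish. */
static void sk_demo_worker(void *arg) {
    int *hits = arg;
    (*hits)++;
    sk_yield();
    (*hits)++;
}
```

Resuming sk_demo_worker twice runs it to completion: the first resume returns at the yield, the second returns from the thunk with the state set to SK_DEAD, and any further resume is ignored.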

atomic/atomic_freestanding.c

Defines a pointer-sized _Atomic(uintptr_t) spinlock as the lock primitive (no OS dependency) then #includes atomic_common.inc, which contains the dispatch logic and all __atomic_*_N expansions. The shim calls the GCC __atomic_* builtin family (the one kit documents in doc/builtins.md); upstream's Clang-only __c11_atomic_* calls were translated. Public symbols are exported via #pragma redefine_extname from _c-suffixed names so they don't collide with the clang builtins of the same name.
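
A lock-based fallback in this spirit might look like the following sketch. It makes several simplifying assumptions that may not match atomic_common.inc: a single global lock rather than whatever lock selection the real shim uses, a plain uintptr_t driven through the GCC __atomic_* builtins rather than an _Atomic-qualified object, and hypothetical function names (sk_atomic_load, sk_atomic_compare_exchange) in place of the exported symbols.

```c
#include <stddef.h>
#include <stdint.h>

/* Pointer-sized spinlock word: 0 = free, 1 = held. */
static uintptr_t sk_lock;

static void sk_lock_acquire(void) {
    /* Swap in 1 until the previous value was 0, i.e. we took the lock. */
    while (__atomic_exchange_n(&sk_lock, (uintptr_t)1, __ATOMIC_ACQUIRE) != 0) {
        /* spin; no OS call is needed */
    }
}

static void sk_lock_release(void) {
    __atomic_store_n(&sk_lock, (uintptr_t)0, __ATOMIC_RELEASE);
}

/* Load `size` bytes from src into dst as one indivisible step. */
static void sk_atomic_load(size_t size, const void *src, void *dst) {
    const unsigned char *s = src;
    unsigned char *d = dst;
    size_t i;
    sk_lock_acquire();
    for (i = 0; i < size; i++) {
        d[i] = s[i];
    }
    sk_lock_release();
}

/* If *ptr equals *expected, store *desired and return 1. Otherwise copy the
   current value into *expected and return 0. */
static int sk_atomic_compare_exchange(size_t size, void *ptr, void *expected,
                                      const void *desired) {
    unsigned char *p = ptr;
    unsigned char *e = expected;
    const unsigned char *d = desired;
    int equal = 1;
    size_t i;
    sk_lock_acquire();
    for (i = 0; i < size; i++) {
        if (p[i] != e[i]) {
            equal = 0;
        }
    }
    for (i = 0; i < size; i++) {
        if (equal) {
            p[i] = d[i];
        } else {
            e[i] = p[i];
        }
    }
    sk_lock_release();
    return equal;
}
```

The byte loops stand in for memcpy / memcmp so that the sketch has no libc dependency, matching the freestanding intent.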

arm/aeabi_thumb{1,2}.S

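AEABI aliases for div/mod, soft-float compares, and the aeabi_mem{cpy,move,set,clr} size-specialized wrappers. There are two ISA-mode variants: aeabi_thumb2.S for ARMv7+/Thumb2, and aeabi_thumb1.S, which provides the same set tuned for Thumb1 on ARMv6-M (no tail-calls, simpler instruction forms).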

arm/aeabi.c

__aeabi_drsub and __aeabi_frsub. Built alongside whichever aeabi_thumb{1,2}.S is selected.

riscv/rv{32,64}.S

Each file combines the __riscv_save_* and __riscv_restore_* routines for one xlen. Upstream's save.S / restore.S are gated on __riscv_xlen; kit splits them into one file per xlen. The embedded ABI variants (__riscv_32e / __riscv_64e) are not carried over.

fp_tf/fp_tf.c

Compile only on targets where long double is IEEE binary128 (typically aarch64 with -mlong-double-128). The build must pass -include include/lp64_le_ldbl128/tf_supplement.h so that tf_float, CRT_HAS_TF_MODE, etc. are defined before fp_lib.h is processed.

Compare helpers

The comparesf2 / comparedf2 / comparetf2 sections of the fp masters define every variant (__eqXf2, __ltXf2, __neXf2, __gtXf2) as a separate function rather than using COMPILER_RT_ALIAS. This replaces an object-format-conditional macro with a handful of one-line wrappers.
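
The wrapper pattern can be sketched for binary32 as follows. The comparison core follows the usual soft-float approach of comparing the bit patterns as integers, and the choice of which base each variant forwards to mirrors the conventional libgcc-style return values; whether the kit masters are laid out exactly this way is an assumption. All names carry an sk_ prefix so they do not collide with the real runtime symbols.

```c
#include <stdint.h>

static uint32_t sk_to_rep(float x) {
    union { float f; uint32_t i; } u;
    u.f = x;
    return u.i;
}

static float sk_from_rep(uint32_t bits) {
    union { float f; uint32_t i; } u;
    u.i = bits;
    return u.f;
}

/* Shared core: -1 if a < b, 0 if equal, 1 if a > b, and `unordered` when
   either operand is NaN. */
static int sk_cmp_core(float a, float b, int unordered) {
    const int32_t a_int = (int32_t)sk_to_rep(a);
    const int32_t b_int = (int32_t)sk_to_rep(b);
    const uint32_t a_abs = (uint32_t)a_int & UINT32_C(0x7fffffff);
    const uint32_t b_abs = (uint32_t)b_int & UINT32_C(0x7fffffff);
    if (a_abs > UINT32_C(0x7f800000) || b_abs > UINT32_C(0x7f800000)) {
        return unordered;                        /* at least one NaN */
    }
    if ((a_abs | b_abs) == 0) {
        return 0;                                /* +0 and -0 compare equal */
    }
    if ((a_int & b_int) >= 0) {
        /* At least one operand is non-negative: integer order matches. */
        return a_int < b_int ? -1 : (a_int == b_int ? 0 : 1);
    }
    /* Both negative: integer order is reversed. */
    return a_int > b_int ? -1 : (a_int == b_int ? 0 : 1);
}

/* Base entry points: they differ only in the value returned for NaN. */
static int sk_lesf2(float a, float b) { return sk_cmp_core(a, b, 1); }
static int sk_gesf2(float a, float b) { return sk_cmp_core(a, b, -1); }

/* One-line wrappers in place of symbol aliases. */
static int sk_eqsf2(float a, float b) { return sk_lesf2(a, b); }
static int sk_ltsf2(float a, float b) { return sk_lesf2(a, b); }
static int sk_nesf2(float a, float b) { return sk_lesf2(a, b); }
static int sk_gtsf2(float a, float b) { return sk_gesf2(a, b); }
```

Each wrapper costs one extra call or jump, but the source no longer needs a macro whose expansion depends on whether the object format supports symbol aliases.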

int_util section of int/int.c

Upstream's hosted/kernel/Apple/Win32 abort cascade is replaced with a single freestanding __compilerrt_abort_impl that calls __builtin_trap().

Things this lib does NOT cover

These are documented in doc/builtins.md but not provided here:

Updating from a newer compiler-rt

The per-op layout that lived in upstream's lib/builtins/<op>.c is now inlined into masters here. To pull in a newer release:

  1. Identify which builtins changed between releases (git log or release notes). For each changed builtin:
    • Find its inlined block in the matching kit master (each block is prefixed by // ---- <upstream filename> ----).
    • Diff that block against upstream lib/builtins/<op>.c. Drop any re-introduced target-dispatch ifdefs (__ARM_EABI__, __MINGW32__, __SOFTFP__).
  2. If int_util.c changed upstream, re-strip to the freestanding-only abort path.
  3. If riscv/save.S / restore.S changed, re-split into the per-xlen inline blocks of riscv/rv{32,64}.S.
  4. Diff int_lib.h / fp_lib.h against the per-target copies under include/; update for any new feature gates that aren't legitimately target-orthogonal. Note: upstream's int_endianness.h and int_types.h are folded into each per-target int_lib.h. Note also that kit's fp_lib.h has been refactored into a re-includable form (per-precision suffix-renaming plus bare-name aliases) — pull upstream changes into the precision blocks; don't revert the suffix machinery.
  5. Run bash build.sh and confirm all 13 variants pass.