lib/ — libkit_rt.a source
Runtime helpers for kit, derived from
compiler-rt 18.1.8
(lib/builtins/) and stripped of all target-dispatch ifdefs. Every helper
that varies across targets is selected by directory + compile flags, not
preprocessor branches inside source.
The build compiles exactly one master .c (and/or .S) file per feature
flag — no globbing of per-op files. Per-op snippets are inlined directly
into the master, with the per-precision / per-(src,dst) machinery from
impl/ re-applied for each section.
License: Apache-2.0 WITH LLVM-exception (see LICENSE-compiler-rt.txt). The
hand-written mem/mem.c is 0BSD; relicense as desired.
Layout
Master files (each becomes one object in libkit_rt.a)
| File | Purpose | Built on |
|---|---|---|
| int/int.c | Integer helpers needed on every target | All |
| int32/int32.c | 64-bit ops synthesized from 32-bit | ILP32 only |
| int64/int64.c | 128-bit ops implemented on explicit 64-bit lanes | LP64 / LLP64 only |
| fp/fp.c | Soft-float sf (binary32) + df (binary64) + sf↔df + fp_mode | FPU-less (RV{32,64}I, ARM softfp, WASM) |
| fp_tf/fp_tf.c | Soft-float tf (binary128) + sf↔tf + df↔tf + i128↔tf | Targets with binary128 long double (e.g. aarch64 -mlong-double-128) |
| fp_ti/fp_ti.c | __int128 ↔ sf/df + sf/df → ti fix | LP64 / LLP64 + soft-float |
| arm/aeabi_thumb2.S | AEABI div/mod/mem* + soft-float compares (ARMv7+/Thumb2) | 32-bit ARM, ARMv7+/Thumb2 |
| arm/aeabi_thumb1.S | Same, Thumb1-tuned (no tail-calls, simpler instr forms) | 32-bit ARM, ARMv6-M (Cortex-M0/M0+/M1) |
| arm/aeabi.c | AEABI __aeabi_drsub / __aeabi_frsub (ISA-agnostic) | 32-bit ARM (both ISA modes) |
| riscv/rv32.S | __riscv_save_* + __riscv_restore_* (rv32) | RISC-V rv32 with -msave-restore |
| riscv/rv64.S | __riscv_save_* + __riscv_restore_* (rv64) | RISC-V rv64 with -msave-restore |
| mem/mem.c | memcpy / memmove / memset / memcmp (weak) | All; user libc overrides |
| atomic/atomic_freestanding.c | __atomic_* fallback shim | All |
| coro/<arch>.c | Per-arch primitives: setjmp / longjmp (<setjmp.h>) + __kit_coro_ctx_init / __kit_coro_switch / __kit_coro_trampoline (internal; the public <kit/coro.h> API sits on top via coro/coro.c) | One of aarch64, arm32, arm32_thumb1, i386, riscv32, riscv64, x86_64, x86_64_win. Not built for wasm32. |
| coro/coro.c | Arch-agnostic asymmetric layer: coro_init / coro_resume / coro_yield / coro_self (<kit/coro.h>) | All variants that ship a coro/<arch>.c. |
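
To make "64-bit ops synthesized from 32-bit" concrete, here is a minimal sketch of a wrapping 64-bit multiply built only from 32-bit operations. It is not kit's or compiler-rt's code: the demo_u64 type and every demo_* name are hypothetical, and the 16-bit limb split is just one common way to keep each partial product within 32 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical two-word view of a 64-bit value; not kit's actual types. */
typedef struct {
    uint32_t lo;
    uint32_t hi;
} demo_u64;

/* 32x32 -> 64 multiply via 16-bit limbs: every intermediate fits in 32 bits. */
static demo_u64 demo_mul32x32(uint32_t a, uint32_t b)
{
    uint32_t a0 = a & 0xFFFFu, a1 = a >> 16;
    uint32_t b0 = b & 0xFFFFu, b1 = b >> 16;
    uint32_t p00 = a0 * b0, p01 = a0 * b1;
    uint32_t p10 = a1 * b0, p11 = a1 * b1;
    /* Bits 16..31 of the product plus carry; at most 0x2FFFC, so no overflow. */
    uint32_t mid = (p00 >> 16) + (p01 & 0xFFFFu) + (p10 & 0xFFFFu);
    demo_u64 r;
    r.lo = (p00 & 0xFFFFu) | (mid << 16);
    r.hi = p11 + (p01 >> 16) + (p10 >> 16) + (mid >> 16);
    return r;
}

/* 64x64 -> 64 multiply (wrapping), using only 32-bit arithmetic. */
static demo_u64 demo_mul64(demo_u64 a, demo_u64 b)
{
    demo_u64 r = demo_mul32x32(a.lo, b.lo);
    r.hi += a.lo * b.hi + a.hi * b.lo; /* cross terms only affect the high word */
    return r;
}

/* Host-side helpers, used only to compare against native 64-bit arithmetic. */
static demo_u64 demo_split(uint64_t v)
{
    demo_u64 r = { (uint32_t)v, (uint32_t)(v >> 32) };
    return r;
}

static uint64_t demo_join(demo_u64 v)
{
    return ((uint64_t)v.hi << 32) | v.lo;
}

static void demo_mul64_check(uint64_t x, uint64_t y)
{
    assert(demo_join(demo_mul64(demo_split(x), demo_split(y))) == x * y);
}
```

The check helper leans on the host's native 64-bit multiply, which is exactly what an ILP32 target would lack; it is there only to validate the sketch.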
Build-time include dirs (consumed by the masters; nothing here lands in libkit_rt.a)
| Directory | Consumed by |
|---|---|
| impl/ | int/int.c (via int_div_impl.inc); every fp* master (via fp_*_impl.inc, fp_extend_impl.inc, fp_trunc_impl.inc, int_to_fp_impl.inc) |
| include/common/ | All masters (transitively, via the per-target int_lib.h); arm/aeabi_thumb*.S includes assembly.h |
| include/lp64_le/ | LP64 builds; selected via -Iinclude/lp64_le |
| include/llp64_le/ | LLP64 (Win64) builds |
| include/ilp32_le/ | ILP32 builds |
| include/lp64_le_ldbl128/ | Extra -include tf_supplement.h when compiling fp_tf/fp_tf.c on binary128-long-double targets |
| atomic/atomic_common.inc | atomic/atomic_freestanding.c |
How the build picks files

    target tuple                              ⟶ compile
    x86_64-linux / x86_64-darwin / aarch64-darwin / rv64
                                              ⟶ int/int.c int64/int64.c fp/fp.c
                                                 atomic/atomic_freestanding.c
                                                 mem/mem.c
                                                 -Iinclude/lp64_le
                                                 -DHAS_INT128=1
    x86_64-windows                            ⟶ same set, -Iinclude/llp64_le
                                                 -DHAS_INT128=1
    i386-* / arm32-* / rv32 / wasm32          ⟶ int/int.c int32/int32.c fp/fp.c
                                                 atomic/atomic_freestanding.c
                                                 mem/mem.c
                                                 -Iinclude/ilp32_le
                                                 -DHAS_INT128=0
    aarch64-linux (binary128 long double)     ⟶ above LP64 set + fp_tf/fp_tf.c
                                                 + fp_ti/fp_ti.c
                                                 -include include/lp64_le_ldbl128/tf_supplement.h
    rv32 with -msave-restore                  ⟶ above + riscv/rv32.S
    rv64 with -msave-restore                  ⟶ above + riscv/rv64.S
    arm32 ARMv7+/Thumb2 (AEABI)               ⟶ above + arm/aeabi_thumb2.S + arm/aeabi.c
    arm32 ARMv6-M Thumb1 (Cortex-M0/M0+/M1)   ⟶ above + arm/aeabi_thumb1.S + arm/aeabi.c

-Iinclude/common and -Iimpl are always added. The full set of variants
is in build.sh.
Endianness
All headers assume __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__. kit's
supported targets (x86, ARM-LE, RISC-V, WASM) are LE in practice. Big-endian
support would need a parallel *_be/ header set; not provided.
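
As an illustration of why the headers are endianness-specific, the sketch below guards on the byte order and overlays two 32-bit words on one 64-bit value. The demo_udwords union and the demo_* functions are hypothetical names modelled on the usual word-pair idiom, not kit's actual definitions; a big-endian header set would have to swap the field order.

```c
#include <assert.h>
#include <stdint.h>

#if !defined(__BYTE_ORDER__) || __BYTE_ORDER__ != __ORDER_LITTLE_ENDIAN__
#error "this sketch, like the headers described above, assumes a little-endian target"
#endif

/* Hypothetical word-pair union: on LE the low word sits at the lower address. */
typedef union {
    uint64_t all;
    struct {
        uint32_t low;
        uint32_t high;
    } s;
} demo_udwords;

static_assert(sizeof(demo_udwords) == 8,
              "two 32-bit words must overlay one 64-bit value exactly");

/* Extract the least significant 32 bits through the overlay. */
uint32_t demo_low_word(uint64_t v)
{
    demo_udwords u;
    u.all = v;
    return u.s.low;
}

/* Extract the most significant 32 bits through the overlay. */
uint32_t demo_high_word(uint64_t v)
{
    demo_udwords u;
    u.all = v;
    return u.s.high;
}
```

The preprocessor guard turns a silent mis-layout into a build failure, which is usually the safer behavior for a runtime library.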
Re-includable templates in impl/ and include/common/fp_lib.h
Each master compiles to one TU. To support multiple precisions and multiple (src, dst) pairs in that single TU, the upstream compiler-rt's single-precision-per-TU model was extended:
- include/common/fp_lib.h is re-includable. Caller #defines SINGLE_PRECISION, DOUBLE_PRECISION, or QUAD_PRECISION, then #include "fp_lib.h". The header emits suffix-renamed typedefs and static inlines (fp_t_sf, rep_clz_df, ...) once per (TU, precision), and sets bare-name #define aliases (fp_t, rep_clz, ...) so caller code uses bare names that resolve to the right suffixed entity.
- include/common/fp_lib_undef.h clears the bare aliases between sections that switch precision or (src,dst) pair.
- impl/fp_*_impl.inc (add, mul, div, compare, fixint, fixuint) are re-includable. Their static __addXf3__ / __leXf2__ / etc. are suffix-renamed via _FP_NAME(...); emission is gated per (TU, precision) via FP_<OP>_<SUFFIX>_EMITTED. fp_fixint_impl.inc / fp_fixuint_impl.inc take an additional caller-supplied FP_FIX_SUFFIX so each fix call site gets its own helper instance.
- impl/fp_extend_impl.inc, fp_trunc_impl.inc, and int_to_fp_impl.inc suffix-rename by the (src, dst) pair token (sfdf, sftf, dftf, ...). One emission per (TU, pair). Each inc bundles its own type/helper setup at the top (formerly the separate fp_extend.h / fp_trunc.h / int_to_fp.h partner headers).
- impl/int_div_impl.inc takes a caller-supplied INT_DIV_SUFFIX and emits suffixed __udivXi3_<suf> / __divXi3_<suf> etc. so the master can include the inc multiple times in one TU (e.g. once for each of divdi3, udivdi3, moddi3, umoddi3 in int.c).

In short: every template that the master includes more than once per TU
either uses a precision-derived suffix (auto, via FP_LIB_SUFFIX) or a
caller-defined suffix.
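
The suffix-plus-alias scheme can be sketched in a single file. This is an assumption-laden illustration, not the contents of fp_lib.h: the template body is collapsed into a function-like macro (DEMO_FP_LIB_BODY) instead of a re-included file, and every demo_* name, as well as the bare alias sig_mask, is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for a re-includable header body. A real template would be a file
 * included once per precision; a macro keeps this sketch in one file. */
#define DEMO_FP_LIB_BODY(SUF, REP_T, SIG_BITS)        \
    typedef REP_T demo_rep_t_##SUF;                   \
    static inline REP_T demo_sig_mask_##SUF(void)     \
    {                                                 \
        return ((REP_T)1 << (SIG_BITS)) - 1;          \
    }

DEMO_FP_LIB_BODY(sf, uint32_t, 23) /* binary32: 23 stored significand bits */
DEMO_FP_LIB_BODY(df, uint64_t, 52) /* binary64: 52 stored significand bits */

static_assert(sizeof(demo_rep_t_sf) == 4 && sizeof(demo_rep_t_df) == 8,
              "each precision gets its own suffixed representation type");

/* ---- single-precision section: bare names alias the _sf entities ---- */
#define rep_t demo_rep_t_sf
#define sig_mask demo_sig_mask_sf
rep_t demo_significand_sf(rep_t bits)
{
    return bits & sig_mask();
}
/* Equivalent of including an "undef" header before switching precision. */
#undef rep_t
#undef sig_mask

/* ---- double-precision section: same bare names, now the _df entities ---- */
#define rep_t demo_rep_t_df
#define sig_mask demo_sig_mask_df
rep_t demo_significand_df(rep_t bits)
{
    return bits & sig_mask();
}
#undef rep_t
#undef sig_mask
```

The two section bodies are textually identical apart from the function name, which is the point: the per-op code is written once against bare names, and the aliases decide which suffixed entity it binds to.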
Files with surviving preprocessor logic
The "no target-dispatch ifdefs" rule is applied to every master. Some templates retain preprocessor logic that is not target dispatch:
- impl/*.inc, include/common/fp_lib.h: parameterized via SINGLE_PRECISION / DOUBLE_PRECISION / QUAD_PRECISION and the SRC_* / DST_* selectors set by the master before each inclusion.
- include/common/assembly.h: abstracts assembler syntax (ELF vs Mach-O vs COFF symbol decoration, ARM/Thumb mode markers, etc.). Heavily ifdef'd by design; consumed only by arm/aeabi_thumb{1,2}.S.
- atomic/atomic_common.inc: keys the 16-byte cases off HAS_INT128, set by the build (-DHAS_INT128=1 on 64-bit, -DHAS_INT128=0 on 32-bit).

Notes per master
mem/mem.c
Hand-written portable C (not from compiler-rt). All four functions are weak
so a user libc, or a tuned arch-specific replacement, wins at link time.
arm/aeabi_thumb{1,2}.S's aeabi_mem* symbols forward to these.
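
A minimal sketch of the weak-definition idea follows, assuming a GCC- or Clang-style `__attribute__((weak))`. The function is deliberately named demo_memmove (hypothetical) so it cannot clash with the host libc, and the byte-at-a-time loops are an illustration rather than the contents of mem/mem.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Weak: a strong definition of the same symbol elsewhere wins at link time. */
__attribute__((weak)) void *demo_memmove(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    assert(n == 0 || (d != NULL && s != NULL)); /* host-side sanity check only */

    if ((uintptr_t)d < (uintptr_t)s) {
        /* Destination below source: copy forward so unread bytes stay intact. */
        for (size_t i = 0; i < n; i++)
            d[i] = s[i];
    } else if ((uintptr_t)d > (uintptr_t)s) {
        /* Destination above source: copy backward for the same reason. */
        for (size_t i = n; i > 0; i--)
            d[i - 1] = s[i - 1];
    }
    return dst;
}
```

One caveat when the function really is named memmove: optimizing compilers may recognize such loops and emit a call to memmove itself, so freestanding builds typically disable that transformation (for example with GCC's -fno-tree-loop-distribute-patterns). Whether kit's build does so is not stated here.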
coro/<arch>.c + coro/coro.c
The coro module ships in two layers:
coro/<arch>.c (one per arch) — per-target primitives, file-scope
__asm__ inside a .c file (not a separate .S) so the tiny C
__kit_coro_ctx_init and the asm save/restore stay co-located.
Provides:
- setjmp / longjmp (public, <setjmp.h>).
- __kit_coro_switch(from, to, value): symmetric register switch, exposed in <kit/coro.h> as a compiler-builtin-style primitive for advanced schedulers; the asymmetric layer below also uses it.
- __kit_coro_ctx_init / __kit_coro_trampoline: internal.

The three primitives that need register save/restore (setjmp,
longjmp, __kit_coro_switch) share one pair of C string-concat
macros SAVE_INTO(reg) / RESTORE_FROM(reg) so the same instruction
bytes are emitted in all three places. Symbol naming uses
__USER_LABEL_PREFIX__ so the same source compiles for ELF / Mach-O
/ COFF.
ARM ships two variants: arm32.c (Thumb-2, ARMv7+, optional VFP
d8-d15 gated on __ARM_FP) and arm32_thumb1.c (ARMv6-M /
Cortex-M0/M0+; no IT blocks, no VFP, data-processing restricted to
r0-r7, no str sp / str rN, [sp,...] — the asm sequences don't
share with arm32.c so it's a separate file).
Not provided for wasm32 (would need an Asyncify-fiber port).
coro/coro.c (arch-agnostic) — the public asymmetric API:
coro_init / coro_resume / coro_yield / coro_self. Tracks the
current coroutine in a static, threads each coro_t's resumer slot
through the resume chain, and dispatches the per-arch trampoline via
a thunk that runs the user's coro_fn, marks the coroutine
CORO_DEAD, and switches back to the resumer. Built once per coro
variant and linked alongside the per-arch master.
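
The shared string-concat macros and the __USER_LABEL_PREFIX__ symbol naming described for coro/<arch>.c can be sketched as follows. Everything here is hypothetical: the DEMO_* macros, the x86_64 register list (cut short on purpose), and the buffer offsets are illustrative and are not kit's layout. The assembled text is kept in ordinary strings so the sketch builds on any host; in a real per-arch file each string would presumably be the operand of a file-scope __asm__(...).

```c
#include <assert.h>
#include <string.h>

#ifndef __USER_LABEL_PREFIX__
#define __USER_LABEL_PREFIX__ /* predefined by GCC and Clang; empty fallback */
#endif

#define DEMO_STR_(x) #x
#define DEMO_STR(x) DEMO_STR_(x)
/* Symbol as the assembler sees it, e.g. "_name" on Mach-O, "name" on ELF. */
#define DEMO_SYM(name) DEMO_STR(__USER_LABEL_PREFIX__) #name

/* Spill a few callee-saved registers to the buffer addressed by `reg`.
 * The remaining callee-saved registers are omitted for brevity. */
#define DEMO_SAVE_INTO(reg)           \
    "  movq %rbx,  0(%" #reg ")\n"    \
    "  movq %rbp,  8(%" #reg ")\n"    \
    "  movq %r12, 16(%" #reg ")\n"

/* Mirror image of DEMO_SAVE_INTO: reload the same registers. */
#define DEMO_RESTORE_FROM(reg)        \
    "  movq  0(%" #reg "), %rbx\n"    \
    "  movq  8(%" #reg "), %rbp\n"    \
    "  movq 16(%" #reg "), %r12\n"

static_assert(sizeof(DEMO_SAVE_INTO(rdi)) == sizeof(DEMO_SAVE_INTO(rsi)),
              "same instruction text regardless of the base register");

/* Illustrative only: not a complete setjmp (no stack pointer, no return PC). */
const char demo_setjmp_text[] =
    ".globl " DEMO_SYM(demo_setjmp) "\n"
    DEMO_SYM(demo_setjmp) ":\n"
    DEMO_SAVE_INTO(rdi)
    "  xorl %eax, %eax\n"
    "  ret\n";

/* Illustrative switch: save into the first buffer, restore from the second. */
const char demo_switch_text[] =
    ".globl " DEMO_SYM(demo_switch) "\n"
    DEMO_SYM(demo_switch) ":\n"
    DEMO_SAVE_INTO(rdi)
    DEMO_RESTORE_FROM(rsi)
    "  ret\n";
```

Because both routines expand the same macro, any change to the saved-register set lands in every user at once, which is the property the shared macros are meant to guarantee.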
atomic/atomic_freestanding.c
Defines a pointer-sized _Atomic(uintptr_t) spinlock as the lock primitive
(no OS dependency) then #includes atomic_common.inc, which contains the
dispatch logic and all __atomic_*_N expansions. The shim calls the GCC
__atomic_* builtin family (the one kit documents in doc/builtins.md);
upstream's Clang-only __c11_atomic_* calls were translated. Public symbols
are exported via #pragma redefine_extname from _c-suffixed names so they
don't collide with the clang builtins of the same name.
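
The lock-then-copy fallback can be sketched as below, assuming a compiler that provides the GCC-style __atomic_* builtins. The demo_* names are hypothetical, a single global lock stands in for whatever lock selection atomic_common.inc actually performs, and a plain uintptr_t is used for the lock word so the builtins apply directly (the text above describes an _Atomic(uintptr_t)).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Pointer-sized lock word: 0 = free, 1 = held. No OS primitives involved. */
static uintptr_t demo_lock_word;

static_assert(sizeof(uintptr_t) == sizeof(void *), "lock word is pointer-sized");

static void demo_lock(void)
{
    while (__atomic_exchange_n(&demo_lock_word, (uintptr_t)1, __ATOMIC_ACQUIRE) != 0) {
        /* spin until the current holder stores 0 */
    }
}

static void demo_unlock(void)
{
    __atomic_store_n(&demo_lock_word, (uintptr_t)0, __ATOMIC_RELEASE);
}

/* Generic-size load: copy the object out while holding the lock. */
void demo_atomic_load(size_t size, const void *src, void *dst)
{
    demo_lock();
    memcpy(dst, src, size);
    demo_unlock();
}

/* Generic-size compare-exchange: on failure, report the observed value. */
int demo_atomic_compare_exchange(size_t size, void *ptr, void *expected,
                                 const void *desired)
{
    int ok;
    demo_lock();
    ok = memcmp(ptr, expected, size) == 0;
    if (ok)
        memcpy(ptr, desired, size);
    else
        memcpy(expected, ptr, size);
    demo_unlock();
    return ok;
}
```

A spinlock like this is only appropriate where waiting threads can make progress without the OS; it is shown for sizes (such as 16 bytes without HAS_INT128) that have no lock-free path.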
arm/aeabi_thumb{1,2}.S
AEABI aliases for div/mod, soft-float compares, and the
aeabi_mem{cpy,move,set,clr} size-specialized wrappers. Two ISA-mode
variants:
- Thumb2: used on ARMv7+/Thumb2; tail-calls into memcpy etc., uses subs/muls folding.
- Thumb1: used on ARMv6-M (Cortex-M0/M0+/M1); avoids tail-calls (no b memcpy) and the folded subs/muls form.

Both files contain the same ISA-agnostic helpers (aeabi_ldivmod, aeabi_uldivmod, aeabi_{d,f}cmp) inline plus their respective ISA-tuned versions of the dual helpers (aeabi_idivmod, aeabi_uidivmod, and the four aeabi_mem*).
arm/aeabi.c
__aeabi_drsub and __aeabi_frsub. Built alongside whichever
aeabi_thumb{1,2}.S is selected.
riscv/rv{32,64}.S
Combined save_* + restore_*. Upstream's save.S / restore.S are
gated on __riscv_xlen; kit splits per xlen. The embedded ABI variants
(__riscv_32e / __riscv_64e) are not carried over.
fp_tf/fp_tf.c
Compile only on targets where long double is IEEE binary128 (typically
aarch64 with -mlong-double-128). The build must -include
lp64_le_ldbl128/tf_supplement.h so tf_float, CRT_HAS_TF_MODE, etc.
are defined before fp_lib.h is processed.
Compare helpers
The comparesf2 / comparedf2 / comparetf2 sections of the fp masters
define every variant (__eqXf2, __ltXf2, __neXf2, __gtXf2) as a
separate function rather than using COMPILER_RT_ALIAS. This replaces an
object-format-conditional macro with a handful of one-line wrappers.
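
A sketch of the wrapper approach for binary32, with hypothetical demo_* names in place of the real __XXsf2 symbols. The core follows the usual soft-float compare on raw bits; the only thing that differs between the two wrapper families is the value returned for unordered (NaN) operands, which is 1 for the le family and -1 for the ge family in upstream compiler-rt.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(float) == sizeof(uint32_t), "sketch assumes binary32 float");

enum { DEMO_LESS = -1, DEMO_EQUAL = 0, DEMO_GREATER = 1 };

/* Shared core on raw binary32 bits; `unordered` is returned for NaN operands. */
static int demo_cmp_core(float a, float b, int unordered)
{
    int32_t ia, ib;
    memcpy(&ia, &a, sizeof ia);
    memcpy(&ib, &b, sizeof ib);

    const uint32_t abs_mask = 0x7FFFFFFFu, inf_rep = 0x7F800000u;
    const uint32_t a_abs = (uint32_t)ia & abs_mask;
    const uint32_t b_abs = (uint32_t)ib & abs_mask;

    if (a_abs > inf_rep || b_abs > inf_rep)
        return unordered;                 /* at least one NaN */
    if ((a_abs | b_abs) == 0)
        return DEMO_EQUAL;                /* +0 and -0 compare equal */
    if ((ia & ib) >= 0)                   /* not both negative: integer order holds */
        return ia < ib ? DEMO_LESS : (ia == ib ? DEMO_EQUAL : DEMO_GREATER);
    /* Both negative: sign-magnitude encoding reverses the integer order. */
    return ia > ib ? DEMO_LESS : (ia == ib ? DEMO_EQUAL : DEMO_GREATER);
}

/* One-line wrappers instead of aliases: each is a real, separately named function. */
int demo_lesf2(float a, float b) { return demo_cmp_core(a, b, DEMO_GREATER); }
int demo_eqsf2(float a, float b) { return demo_cmp_core(a, b, DEMO_GREATER); }
int demo_ltsf2(float a, float b) { return demo_cmp_core(a, b, DEMO_GREATER); }
int demo_nesf2(float a, float b) { return demo_cmp_core(a, b, DEMO_GREATER); }
int demo_gesf2(float a, float b) { return demo_cmp_core(a, b, DEMO_LESS); }
int demo_gtsf2(float a, float b) { return demo_cmp_core(a, b, DEMO_LESS); }
```

The wrappers cost one call or tail-jump each, but they need no alias directive, so the same source works regardless of how a given object format spells symbol aliases.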
int_util section of int/int.c
Replaced upstream's hosted/kernel/Apple/Win32 abort cascade with a single
freestanding __compilerrt_abort_impl that calls __builtin_trap().
Things this lib does NOT cover
These are documented in doc/builtins.md but not provided here:
- __riscv_save_* for __riscv_32e / __riscv_64e (rare embedded ABIs).
- Big-endian targets.
- 80-bit xf (x86 long double) soft-float: kit's spec doesn't list xf ops in the runtime contract; x86 always has the FPU for long double.
- Half-precision (hf) conversions: not in the kit contract.
Updating from a newer compiler-rt
The per-op layout that lived in upstream's lib/builtins/<op>.c is now
inlined into masters here. To pull in a newer release:
- Identify which builtins changed between releases (git log or release notes).
- For each changed builtin:
  - Find its inlined block in the matching kit master (each block is prefixed by // ---- <upstream filename> ----).
  - Diff that block against upstream lib/builtins/<op>.c. Drop any re-introduced target-dispatch ifdefs (__ARM_EABI__, __MINGW32__, __SOFTFP__).
- If int_util.c changed upstream, re-strip to the freestanding-only abort path.
- If riscv/save.S / restore.S changed, re-split into the per-xlen inline blocks of riscv/rv{32,64}.S.
- Diff int_lib.h / fp_lib.h against the per-target copies under include/; update for any new feature gates that aren't legitimately target-orthogonal. Note: upstream's int_endianness.h and int_types.h are folded into each per-target int_lib.h. Note also that kit's fp_lib.h has been refactored into a re-includable form (per-precision suffix-renaming plus bare-name aliases); pull upstream changes into the precision blocks and don't revert the suffix machinery.
- Run bash build.sh and confirm all 13 variants pass.