kit

git clone https://git.ryansepassi.com/git/kit.git

commit 76182e3a5a1b436ec5df54f8405da50f3ee1e16f
parent 0dc9d9b9d6fd9100c28f165cfa7e71e3033a16cf
Author: Ryan Sepassi <rsepassi@gmail.com>
Date:   Thu, 28 May 2026 14:32:42 -0700

doc: refresh binary-trees -O1 numbers after known-frame prologue

Re-ran the binary-trees sweep on a clean release build (RELEASE=1,
COMPILE_REPEATS=3, RUN_REPEATS=3). cfree -O1 runtime 3146 -> 2973 ms;
vs gcc -O0 0.84x -> 0.89x. Other rows unchanged (not re-run).
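
The ratios quoted above can be reproduced from the table rows in the diff. The sketch below assumes the "vs gcc-O0" column is the gcc -O0 runtime divided by the cfree -O1 runtime, so values below 1.0 mean cfree is slower; that reading is inferred from the numbers and is not stated in the commit. The helper name `speedup` is hypothetical.

```python
def speedup(baseline_ms: float, cfree_ms: float) -> float:
    """Baseline runtime over cfree runtime; below 1.0 means cfree is slower."""
    return baseline_ms / cfree_ms


# binary-trees runtimes in ms, copied from the old and new table rows.
before = speedup(baseline_ms=2639, cfree_ms=3146)
after = speedup(baseline_ms=2647, cfree_ms=2973)

print(f"before: {before:.2f}x")  # before: 0.84x
print(f"after:  {after:.2f}x")   # after:  0.89x
```

Under the same assumption, the unchanged rows are consistent too: for example, lists gives 8868 / 4843, which rounds to 1.83.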

Diffstat:
M doc/OPT_O1_PERF_TODO.md | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/doc/OPT_O1_PERF_TODO.md b/doc/OPT_O1_PERF_TODO.md
@@ -27,7 +27,7 @@ the cached baseline in `scripts/opt_bench_baseline.csv`; regenerate with
 | bench | cfree -O1 | gcc -O0 | vs gcc-O0 | mir -O1 | vs mir-O1 | behind |
 | --- | ---: | ---: | ---: | ---: | ---: | --- |
-| binary-trees | 3146 | 2639 | **0.84×** (slower) | n/a¹ | — | gcc |
+| binary-trees | 2973 | 2647 | **0.89×** (slower) | n/a¹ | — | gcc |
 | lists | 4843 | 8868 | 1.83× ✓ | 4997 | 1.03× | mir |
 | hash2 | 4988 | 7481 | 1.50× ✓ | 3863 | **0.77×** | mir |
 | sieve | 5148 | 5077 | 0.99× (~tied) | 4028 | **0.78×** | gcc (~tied), mir |
@@ -39,7 +39,8 @@ the cached baseline in `scripts/opt_bench_baseline.csv`; regenerate with
 ## Per-benchmark notes
 ### binary-trees — slower than unoptimized gcc (highest priority)
-The only case where cfree `-O1` is *slower than gcc -O0* (0.84×). Workload is
+The only case where cfree `-O1` is *slower than gcc -O0* (0.89×, up from 0.84×
+after the known-frame prologue landed — item 1 below). Workload is
 recursive tree build/walk: four tiny functions (`NewTreeNode`, `ItemCheck`,
 `BottomUpTree`, `DeleteTree`) called ~7.6M times at depth=19, plus a
 `malloc`/`free` per node. The **body** of each function is fine — cfree -O1