boot2

Playing with the boostrap
git clone https://git.ryansepassi.com/git/boot2.git
Log | Files | Refs | README

commit 364cb5c1b8cdf9e70f6dfb81015cb5d8e22c0cee
parent 313276b56269e5599f1d94a89090fb6dd9ec6461
Author: Ryan Sepassi <rsepassi@gmail.com>
Date:   Mon, 27 Apr 2026 08:01:46 -0700

scheme1: add (heap-mark) and (heap-rewind!) for arena-style reclamation

heap-mark returns bytes-consumed since heap_init as a tagged fixnum.
heap-rewind! resets heap_next to (heap_buf_ptr + mark), reclaiming
everything allocated past the mark in a single step. Same body as
heap-usage for the mark; kept distinct for callsite clarity.

UNSAFE by design: the caller must guarantee no surviving reference
into the freed region. The intended A→B→C ergonomics:

  (define (B input)
    (let ((out  (cons 0 0))
          (mark (heap-mark)))
      (C out input)         ; allocates scratch, mutates out
      (heap-rewind! mark)
      out))                 ; final env walk reads dropped-but-intact cells

A and C are arena-unaware; only B knows. The invariant that makes the
trailing return safe is that heap-rewind! only resets heap_next — it
does not zero memory — so the bare-symbol lookup walks cells whose
bytes are still valid as long as nothing has allocated since the
rewind. By the time A receives the value, all retained references
are to `out` itself (allocated pre-mark, permanently safe).

tests/scheme1/93-heap-mark-rewind exercises the full A→B→C pattern
and asserts both that the result is correct and that the rewind
genuinely reclaims C's scratch (delta < delta2 vs a no-rewind sibling).

Diffstat:
Mscheme1/scheme1.P1pp | 41++++++++++++++++++++++++++++++++++++++---
Atests/scheme1/93-heap-mark-rewind.expected-exit | 1+
Atests/scheme1/93-heap-mark-rewind.scm | 75+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
3 files changed, 114 insertions(+), 3 deletions(-)

diff --git a/scheme1/scheme1.P1pp b/scheme1/scheme1.P1pp @@ -4811,6 +4811,32 @@ %mkfix(a0, a0) %ret +# (heap-mark) -> tagged fixnum: bytes consumed since heap_init. +# Capture before transient allocations; pass to (heap-rewind! m) to +# discard everything allocated after the mark. UNSAFE: any pointer +# referencing the rewound region becomes dangling. Caller must keep only +# fixnums / symbols / pointers into surviving (pre-mark) heap across the +# rewind. Same body as heap-usage; kept distinct for callsite clarity. +:prim_heap_mark_entry + %ld_global(t0, &heap_next) + %ld_global(t1, &heap_buf_ptr) + %sub(a0, t0, t1) + %mkfix(a0, a0) + %ret + +# (heap-rewind! mark) -> unspec. Resets heap_next to (heap_buf_ptr + +# mark). UNSAFE: subsequent allocations overwrite the freed region; any +# surviving reference into it is dangling. No bounds check -- caller +# passes a value previously returned by (heap-mark). +:prim_heap_rewind_bang_entry + %car(t0, a0) + %untag_fix(t0, t0) + %ld_global(t1, &heap_buf_ptr) + %add(t0, t0, t1) + %st_global(t0, &heap_next, t1) + %li(a0, %imm_val(%IMM.UNSPEC)) + %ret + # Surface names. Length is hard-coded at the call site; no NUL needed # because intern takes (ptr, len). Aligned padding via "\0" bytes is # fine -- M0 emits ASCII verbatim. @@ -4896,9 +4922,16 @@ :name_write "write" :name_error "error" :name_format "format" -;; "heap-usage" + auto-NUL = 11 bytes; pad to 16 (multiple of 8) so -;; subsequent 8-aligned data slots (prim_table $() rows, the bss -;; pointer slots) stay aligned. M0 appends a NUL to every "..." string. +;; The last three names are padded individually to 16 bytes (multiple +;; of 8) so subsequent 8-aligned data slots (prim_table $() rows, the +;; bss pointer slots) stay aligned. M0 appends a NUL to every "..." +;; string, so the listed length below counts the trailing NUL plus the +;; explicit '00' bytes. +;; "heap-mark" + NUL = 10 bytes; pad with 6 '00' to reach 16. +;; "heap-rewind!"+ NUL = 13 bytes; pad with 3 '00' to reach 16. +;; "heap-usage" + NUL = 11 bytes; pad with 5 '00' to reach 16. +:name_heap_mark "heap-mark" '00' '00' '00' '00' '00' '00' +:name_heap_rewind_bang "heap-rewind!" '00' '00' '00' :name_heap_usage "heap-usage" '00' '00' '00' '00' '00' # Writer string constants. Lengths are hard-coded at the bv_putn call @@ -4979,6 +5012,8 @@ &name_error %(0) $(5) &prim_error_entry %(0) &name_format %(0) $(6) &prim_format_entry %(0) &name_heap_usage %(0) $(10) &prim_heap_usage_entry %(0) +&name_heap_mark %(0) $(9) &prim_heap_mark_entry %(0) +&name_heap_rewind_bang %(0) $(12) &prim_heap_rewind_bang_entry %(0) :prim_table_end :msg_usage "scheme1: usage: scheme1 SOURCE.scm" '0a' '00' diff --git a/tests/scheme1/93-heap-mark-rewind.expected-exit b/tests/scheme1/93-heap-mark-rewind.expected-exit @@ -0,0 +1 @@ +42 diff --git a/tests/scheme1/93-heap-mark-rewind.scm b/tests/scheme1/93-heap-mark-rewind.scm @@ -0,0 +1,75 @@ +; heap-mark / heap-rewind! — A → B → C ergonomics. +; +; A: caller. Arena-unaware; just calls (B input) and gets a value back. +; B: arena boundary. Allocates the output cell BEFORE marking, calls C +; to fill it in, rewinds, returns the output to A. +; C: worker. Arena-unaware; allocates scratch and mutates the output. +; +; Invariant for B: between (heap-rewind! mark) and the function's final +; return of out, NO heap allocation may occur. heap-rewind! only resets +; heap_next; it does not zero memory. The trailing env walk that resolves +; the bare `out` symbol reads dropped-but-intact cells, which is well- +; defined as long as nothing has allocated since the rewind. By the time +; A receives the value, all references are to `out` itself (allocated +; pre-mark, permanently safe). +; +; C is tail-recursive; proper tail calls collapse host frames, and the +; recursive heap envs all die together at the rewind. + +(define (C-loop out xs sum count rev) + (if (null? xs) + (begin + (set-car! out sum) + (set-cdr! out count)) + (C-loop out + (cdr xs) + (+ sum (car xs)) + (+ count 1) + (cons (car xs) rev)))) + +(define (C out input) + (C-loop out input 0 0 '())) + +(define (B input) + (let ((out (cons 0 0)) ; pre-mark: survives rewind + (mark (heap-mark))) ; everything past here is C's scratch + (C out input) + (heap-rewind! mark) + out)) + +(define (A) + (let ((input (cons 1 (cons 2 (cons 3 (cons 4 (cons 5 '()))))))) + (B input))) + +; A no-rewind sibling lets us assert the rewind actually saves bytes. +(define (B-noreclaim input) + (let ((out (cons 0 0))) + (C out input) + out)) + +(define (A-noreclaim) + (let ((input (cons 1 (cons 2 (cons 3 (cons 4 (cons 5 '()))))))) + (B-noreclaim input))) + +(define before (heap-mark)) +(define result (A)) +(define after (heap-mark)) +(define delta (- after before)) + +(define before2 (heap-mark)) +(define result2 (A-noreclaim)) +(define after2 (heap-mark)) +(define delta2 (- after2 before2)) + +(sys-exit + (if (= (car result) 15) + (if (= (cdr result) 5) + (if (= (car result2) 15) + (if (= (cdr result2) 5) + (if (< delta delta2) + 42 + 43) + 44) + 45) + 46) + 47))