lisp.M1 steps 5-6: printer + eval (closures, define, apply) - boot2

commit 30406eb9364d5d247ef5f49b1993874b0d35b1d4
parent c049a3b304a6f4291d0903294a972db245f5d02c
Author: Ryan Sepassi <rsepassi@gmail.com>
Date:   Tue, 21 Apr 2026 06:14:53 -0700

lisp.M1 steps 5-6: printer + eval (closures, define, apply)

Step 5 closes the read-print cycle: display/write for all tagged
values, write_string with "/\ escaping, write_pair with proper
dotted-pair handling.

Step 6 adds eval + apply: globals via alist, lookup chains local
then global, special forms (quote/if/begin/lambda/define) dispatched
by interned-symbol pointer identity, closures as 32-byte
[hdr|params|body|env] headered objects, env_extend walks paired
name/val lists under PROLOGUE_N4, apply handles only closures for
now (errors on other callees).

_start is now the top-level driver: reads src_text one form at a
time, evals each under an empty local env, keeps the last result
in r4 (callee-saved across the loop), writes it at EOF, and exits
with its fixnum payload as $?.

p1_gen.py tranche 11 adds the new ops the interpreter needs (MOVs,
BNE_R1_R0, LB_R1_R6_0, ADDI NEG4/NEG6 for tag-strip, 9 LD + 5 ST
for 4-slot frame and closure-field access). Syscall table also
gains CLONE/EXECVE/WAITID for a future exec primitive.

Verified on all three arches: make PROG=lisp run-all prints 42
and exits 42 from (define id (lambda (x) x))(id 42).

Diffstat:
M lisp.M1  | 1263 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-----
M p1_gen.py  | 146 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++---

2 files changed, 1336 insertions(+), 73 deletions(-)
diff --git a/lisp.M1 b/lisp.M1
@@ -40,11 +40,18 @@
 
 
 ## ---- _start ----------------------------------------------------------
-## Runs the step-3 intern battery and exits 42 on success. Skips argv
-## entirely for now — step 4 will restore the file-mode entry.
+## Step 6 driver: loops over top-level forms from src_text, evaluates
+## each in an empty local env (so unqualified lookups fall through to
+## the global hash), keeps the last result. Final result is displayed
+## with write (closing the step-5 read-print cycle) and its fixnum
+## payload becomes the exit status — lets test harnesses assert via $?.
+##
+## r4 holds the running last_result. It survives read_expr/eval across
+## iterations because r4 is callee-saved; _start itself has no PROLOGUE
+## and never returns, so its r4 is sovereign.
 :_start
-    ## Align heap_next up to 8 (same reason as step 2 — pair tags
-    ## require an 8-aligned raw pointer).
+    ## Align heap_next up to 8 (pair/closure tags need 8-aligned raw
+    ## pointers — see step 2).
     P1_LI_R4
     &heap_next
     P1_LD_R0_R4_0
@@ -52,43 +59,67 @@
     P1_ADDI_R0_R0_1
     P1_ST_R0_R4_0
 
-    ## t1: intern "foo" → sym_a.
+    ## Intern the special-form symbols up front. eval_pair compares the
+    ## car of every compound expression against these pointers, so they
+    ## must exist before the first eval call. The intern result (r0 =
+    ## tagged sym) is stashed at &sym_* via ST_R0_R2_0 (the target addr
+    ## in r2); ST_R0_R1_0 isn't in the P1 op table.
     P1_LI_R1
-    &str_foo
+    &str_quote
     P1_LI_R2
-    '03000000'
+    '05000000'
     P1_LI_BR
     &intern
     P1_CALL
-    P1_MOV_R6_R0                    ## r6 = sym_a (callee-saved keeps it alive)
+    P1_LI_R2
+    &sym_quote
+    P1_ST_R0_R2_0
 
-    ## t2: intern "foo" again → sym_b. Must pointer-eq sym_a.
     P1_LI_R1
-    &str_foo
+    &str_if
     P1_LI_R2
-    '03000000'
+    '02000000'
     P1_LI_BR
     &intern
-    P1_CALL                         ## r0 = sym_b
+    P1_CALL
+    P1_LI_R2
+    &sym_if
+    P1_ST_R0_R2_0
+
+    P1_LI_R1
+    &str_begin
+    P1_LI_R2
+    '05000000'
     P1_LI_BR
-    &err_intern_not_eq
-    P1_BNE_R0_R6
+    &intern
+    P1_CALL
+    P1_LI_R2
+    &sym_begin
+    P1_ST_R0_R2_0
 
-    ## t3: intern "bar" → must be pointer-distinct from "foo".
     P1_LI_R1
-    &str_bar
+    &str_lambda
     P1_LI_R2
-    '03000000'
+    '06000000'
     P1_LI_BR
     &intern
-    P1_CALL                         ## r0 = sym_bar
+    P1_CALL
+    P1_LI_R2
+    &sym_lambda
+    P1_ST_R0_R2_0
+
+    P1_LI_R1
+    &str_define
+    P1_LI_R2
+    '06000000'
     P1_LI_BR
-    &err_intern_collision
-    P1_BEQ_R0_R6
+    &intern
+    P1_CALL
+    P1_LI_R2
+    &sym_define
+    P1_ST_R0_R2_0
 
-    ## ---- Step 4 self-test: parse + display -------------------------
-    ## Seed the reader globals to point at src_text. src_cursor/line/col
-    ## are already initialized to 0/1/1 in BSS.
+    ## Seed reader globals.
     P1_LI_R1
     &src_base
     P1_LI_R2
@@ -97,36 +128,65 @@
     P1_LI_R1
     &src_len
     P1_LI_R2
-    '1B000000'                      ## strlen("(defn  (foo 42)  ( bar 7 ))") = 27
+    &src_text_len
+    P1_LD_R2_R2_0
     P1_ST_R2_R1_0
 
-    ## Parse one expression.
+    ## r4 = last_result. Initial unspec (0x27) so an empty source exits 0.
+    P1_LI_R4
+    '27000000'
+
+:_start_loop
+    P1_LI_BR
+    &skip_ws
+    P1_CALL
+
+    P1_LI_BR
+    &peek_char
+    P1_CALL
+    ## EOF → done.
+    P1_LI_R1
+    '00000000'
+    P1_ADDI_R1_R1_NEG1
+    P1_LI_BR
+    &_start_done
+    P1_BEQ_R0_R1
+
     P1_LI_BR
     &read_expr
-    P1_CALL                         ## r0 = tagged value
+    P1_CALL                         ## r0 = parsed expr
 
-    ## Display it.
     P1_MOV_R1_R0
+    P1_LI_R2
+    '07000000'                      ## empty local env = nil
     P1_LI_BR
-    &display
-    P1_CALL
+    &eval
+    P1_CALL                         ## r0 = evaluated result
 
-    ## Write a trailing newline.
-    P1_LI_R0
-    SYS_WRITE
+    P1_MOV_R4_R0
+    P1_LI_BR
+    &_start_loop
+    P1_B
+
+:_start_done
+    ## Print the final result (write form) + newline for visibility.
+    P1_MOV_R1_R4
+    P1_LI_BR
+    &write
+    P1_CALL
     P1_LI_R1
-    '01000000'
-    P1_LI_R2
-    &msg_newline
-    P1_LI_R3
-    '01000000'
-    P1_SYSCALL
+    '0A000000'
+    P1_LI_BR
+    &putc
+    P1_CALL
 
-    ## Success: exit(42).
+    ## Exit status = decoded fixnum of last_result (bit-cast to low 8
+    ## bits by the kernel). Non-fixnum results still exit cleanly —
+    ## the low 8 bits of the tagged word are used.
     P1_LI_R0
     SYS_EXIT
-    P1_LI_R1
-    '2A000000'
+    P1_MOV_R1_R4
+    P1_SARI_R1_R1_3
     P1_SYSCALL
 
 
@@ -1265,6 +1325,11 @@
     P1_LI_BR
     &display_symbol
     P1_BEQ_R1_R3
+    P1_LI_R3
+    '04000000'
+    P1_LI_BR
+    &display_string
+    P1_BEQ_R1_R3
 
     ## Unknown — treat as "?"
     P1_LI_R1
@@ -1412,35 +1477,1097 @@
     P1_RET
 
 
-## ---- Static strings -------------------------------------------------
-:msg_error_prefix
-"error: "
-:msg_newline
-"
-"
-:msg_oom
-"heap exhausted"
-:msg_intern_not_eq
-"intern: foo a != foo b (same name)"
-:msg_intern_collision
-"intern: foo == bar (distinct names collided)"
-:msg_reader_bad
-"reader: malformed input"
-:msg_at
-" at "
-:msg_colon
-":"
+## ---- display_string(r2=tagged_string) ------------------------------
+## Tail-called from display's dispatcher with r2 holding the tagged
+## string (low-3-bit tag = 100 = 4). Strips the tag, reads the 48-bit
+## length out of the 8-byte header, then writes the payload bytes to
+## stdout. Shares display's PROLOGUE_N2 frame, so EPILOGUE_N2 + RET
+## unwinds back to display's caller.
+:display_string
+    P1_ADDI_R2_R2_NEG4                  ## r2 = raw header ptr
+    P1_LD_R3_R2_0                       ## r3 = header word
+    P1_SHLI_R3_R3_16
+    P1_SHRI_R3_R3_16                    ## r3 = length (low 48 bits)
+    P1_ADDI_R2_R2_8                     ## r2 = payload ptr
+    P1_LI_R0
+    SYS_WRITE
+    P1_LI_R1
+    '01000000'
+    P1_SYSCALL
+    P1_EPILOGUE_N2
+    P1_RET
 
-## Interned name samples used by the self-test.
-:str_foo
-"foo"
-:str_bar
-"bar"
 
-## Step-4 test source. Exercises nested lists, multi-digit numbers,
-## multi-char symbols, extra whitespace, and a nested empty list.
-:src_text
-"(defn  (foo 42)  ( bar 7 ))"
+## ---- write(r1=val) -------------------------------------------------
+## Printer form: strings get quoted with escapes honored, everything
+## else renders identically to display. Sharing display_nil/fixnum/
+## symbol works because write's PROLOGUE_N2 frame matches display's —
+## those handlers unwind the right amount on their own EPILOGUE_N2.
+:write
+    P1_PROLOGUE_N2
+
+    P1_LI_R2
+    '07000000'
+    P1_LI_BR
+    &display_nil
+    P1_BEQ_R1_R2                        ## nil
+
+    P1_MOV_R2_R1
+    P1_ANDI_R1_R1_7                     ## r1 = tag
+    P1_LI_R3
+    '01000000'
+    P1_LI_BR
+    &display_fixnum
+    P1_BEQ_R1_R3
+    P1_LI_R3
+    '02000000'
+    P1_LI_BR
+    &write_pair
+    P1_BEQ_R1_R3
+    P1_LI_R3
+    '05000000'
+    P1_LI_BR
+    &display_symbol
+    P1_BEQ_R1_R3
+    P1_LI_R3
+    '04000000'
+    P1_LI_BR
+    &write_string
+    P1_BEQ_R1_R3
+
+    ## Unknown (closure/prim, etc.) — emit '?' placeholder.
+    P1_LI_R1
+    '3F000000'
+    P1_LI_BR
+    &putc
+    P1_CALL
+    P1_EPILOGUE_N2
+    P1_RET
+
+
+## ---- write_string(r2=tagged_string) --------------------------------
+## Emits "..." with `"` and `\` escaped as `\"` and `\\`. Shares
+## write's PROLOGUE_N2 frame; r6=cursor, r7=remaining-bytes saved in
+## the two slots across putc calls.
+:write_string
+    P1_MOV_R3_SP
+    P1_ST_R6_R3_8
+    P1_ST_R7_R3_16
+
+    P1_ADDI_R2_R2_NEG4
+    P1_LD_R3_R2_0
+    P1_SHLI_R3_R3_16
+    P1_SHRI_R3_R3_16                    ## r3 = length
+    P1_ADDI_R2_R2_8                     ## r2 = payload ptr
+    P1_MOV_R6_R2                        ## r6 = cursor
+    P1_MOV_R7_R3                        ## r7 = length
+
+    P1_LI_R1
+    '22000000'                          ## '"'
+    P1_LI_BR
+    &putc
+    P1_CALL
+
+:write_string_loop
+    P1_LI_R1
+    '00000000'
+    P1_LI_BR
+    &write_string_done
+    P1_BEQ_R7_R1                        ## length == 0 → done
+
+    P1_LB_R1_R6_0                       ## r1 = *cursor
+
+    P1_LI_R2
+    '22000000'                          ## '"'
+    P1_LI_BR
+    &write_string_escape
+    P1_BEQ_R1_R2
+
+    P1_LI_R2
+    '5C000000'                          ## '\\'
+    P1_LI_BR
+    &write_string_escape
+    P1_BEQ_R1_R2
+
+    P1_LI_BR
+    &putc
+    P1_CALL
+
+:write_string_next
+    P1_ADDI_R6_R6_1
+    P1_ADDI_R7_R7_NEG1
+    P1_LI_BR
+    &write_string_loop
+    P1_B
+
+:write_string_escape
+    ## Emit '\\' then re-emit the char (cursor still points to it).
+    P1_LI_R1
+    '5C000000'
+    P1_LI_BR
+    &putc
+    P1_CALL
+    P1_LB_R1_R6_0
+    P1_LI_BR
+    &putc
+    P1_CALL
+    P1_LI_BR
+    &write_string_next
+    P1_B
+
+:write_string_done
+    P1_LI_R1
+    '22000000'                          ## '"'
+    P1_LI_BR
+    &putc
+    P1_CALL
+    P1_MOV_R3_SP
+    P1_LD_R6_R3_8
+    P1_LD_R7_R3_16
+    P1_EPILOGUE_N2
+    P1_RET
+
+
+## ---- write_pair(r2=tagged_pair) ------------------------------------
+## Mirror of display_pair but recurses through write (strings quoted).
+## Shares write's PROLOGUE_N2 frame.
+:write_pair
+    P1_MOV_R3_SP
+    P1_ST_R6_R3_8
+    P1_ST_R7_R3_16
+
+    P1_MOV_R6_R2                        ## r6 = tagged pair cursor
+
+    P1_LI_R1
+    '28000000'                          ## '('
+    P1_LI_BR
+    &putc
+    P1_CALL
+
+:write_pair_loop
+    P1_MOV_R1_R6
+    P1_LI_BR
+    &car
+    P1_CALL
+    P1_MOV_R1_R0
+    P1_LI_BR
+    &write
+    P1_CALL
+
+    P1_MOV_R1_R6
+    P1_LI_BR
+    &cdr
+    P1_CALL
+    P1_MOV_R6_R0
+
+    P1_LI_R1
+    '07000000'
+    P1_LI_BR
+    &write_pair_close
+    P1_BEQ_R6_R1
+
+    P1_MOV_R1_R6
+    P1_ANDI_R1_R1_7
+    P1_LI_R2
+    '02000000'
+    P1_LI_BR
+    &write_pair_continue
+    P1_BEQ_R1_R2
+
+    ## improper tail: " . x)"
+    P1_LI_R1
+    '20000000'
+    P1_LI_BR
+    &putc
+    P1_CALL
+    P1_LI_R1
+    '2E000000'
+    P1_LI_BR
+    &putc
+    P1_CALL
+    P1_LI_R1
+    '20000000'
+    P1_LI_BR
+    &putc
+    P1_CALL
+    P1_MOV_R1_R6
+    P1_LI_BR
+    &write
+    P1_CALL
+    P1_LI_BR
+    &write_pair_close
+    P1_B
+
+:write_pair_continue
+    P1_LI_R1
+    '20000000'
+    P1_LI_BR
+    &putc
+    P1_CALL
+    P1_LI_BR
+    &write_pair_loop
+    P1_B
+
+:write_pair_close
+    P1_LI_R1
+    '29000000'                          ## ')'
+    P1_LI_BR
+    &putc
+    P1_CALL
+
+    P1_MOV_R3_SP
+    P1_LD_R6_R3_8
+    P1_LD_R7_R3_16
+    P1_EPILOGUE_N2
+    P1_RET
+
+
+## ================ Step 6: Eval =======================================
+## Recursive-descent evaluator. Compound forms dispatch by comparing
+## the car against cached interned pointers (sym_quote/if/begin/
+## lambda/define, seeded in _start). Anything else is a function
+## application: callee and args get evaluated, then apply runs the
+## closure. Local env is a flat alist of (sym . val) pairs; global
+## bindings live in a singly-linked alist hanging off global_env_cell.
+
+## ---- gset(r1=sym, r2=val) ------------------------------------------
+## Prepends (sym . val) to the global alist. Last-write-wins semantics
+## fall out naturally because lookup_alist returns the leftmost match.
+:gset
+    P1_PROLOGUE_N2
+    P1_MOV_R3_SP
+    P1_ST_R1_R3_8                       ## save sym
+    P1_ST_R2_R3_16                      ## save val
+
+    P1_LI_BR
+    &cons
+    P1_CALL                             ## r0 = (sym . val)
+
+    P1_MOV_R1_R0
+    P1_LI_R2
+    &global_env_cell
+    P1_LD_R2_R2_0                       ## r2 = old global alist
+    P1_LI_BR
+    &cons
+    P1_CALL                             ## r0 = new alist
+
+    P1_LI_R2
+    &global_env_cell
+    P1_ST_R0_R2_0
+    P1_EPILOGUE_N2
+    P1_RET
+
+
+## ---- lookup_alist(r1=sym, r2=env) -> r0 = pair or nil ---------------
+## Walks alist env returning the (sym . val) cons cell that matches, or
+## nil on miss. Caller distinguishes via tag/nil check.
+:lookup_alist
+    P1_PROLOGUE_N3
+    P1_MOV_R3_SP
+    P1_ST_R1_R3_8                       ## slot 1 = sym
+    P1_ST_R2_R3_16                      ## slot 2 = cursor
+
+:lookup_alist_loop
+    P1_MOV_R3_SP
+    P1_LD_R2_R3_16
+    P1_LI_R1
+    '07000000'
+    P1_LI_BR
+    &lookup_alist_miss
+    P1_BEQ_R2_R1                        ## cursor == nil → miss
+
+    ## pair = car(cursor)
+    P1_MOV_R1_R2
+    P1_LI_BR
+    &car
+    P1_CALL                             ## r0 = (k . v)
+
+    P1_MOV_R3_SP
+    P1_ST_R0_R3_24                      ## slot 3 = pair
+
+    P1_MOV_R1_R0
+    P1_LI_BR
+    &car
+    P1_CALL                             ## r0 = key
+
+    P1_MOV_R3_SP
+    P1_LD_R1_R3_8
+    P1_LI_BR
+    &lookup_alist_hit
+    P1_BEQ_R0_R1
+
+    ## advance cursor := cdr(cursor)
+    P1_LD_R1_R3_16
+    P1_LI_BR
+    &cdr
+    P1_CALL
+    P1_MOV_R3_SP
+    P1_ST_R0_R3_16
+    P1_LI_BR
+    &lookup_alist_loop
+    P1_B
+
+:lookup_alist_hit
+    P1_MOV_R3_SP
+    P1_LD_R0_R3_24
+    P1_EPILOGUE_N3
+    P1_RET
+
+:lookup_alist_miss
+    P1_LI_R0
+    '07000000'
+    P1_EPILOGUE_N3
+    P1_RET
+
+
+## ---- lookup(r1=sym, r2=env) -> r0 = value --------------------------
+## Local first, global second. Errors out via &err_unbound on miss.
+:lookup
+    P1_PROLOGUE_N2
+    P1_MOV_R3_SP
+    P1_ST_R1_R3_8                       ## save sym for possible global retry
+
+    P1_LI_BR
+    &lookup_alist
+    P1_CALL                             ## r0 = local pair or nil
+
+    P1_LI_R1
+    '07000000'
+    P1_LI_BR
+    &lookup_global
+    P1_BEQ_R0_R1
+
+    P1_MOV_R1_R0
+    P1_LI_BR
+    &cdr
+    P1_CALL                             ## r0 = value
+    P1_EPILOGUE_N2
+    P1_RET
+
+:lookup_global
+    P1_MOV_R3_SP
+    P1_LD_R1_R3_8
+    P1_LI_R2
+    &global_env_cell
+    P1_LD_R2_R2_0
+    P1_LI_BR
+    &lookup_alist
+    P1_CALL
+
+    P1_LI_R1
+    '07000000'
+    P1_LI_BR
+    &err_unbound
+    P1_BEQ_R0_R1
+
+    P1_MOV_R1_R0
+    P1_LI_BR
+    &cdr
+    P1_CALL
+    P1_EPILOGUE_N2
+    P1_RET
+
+
+## ---- env_extend(r1=names, r2=vals, r3=env) -> r0 = new env ---------
+## For each (name, val) pair, prepends (name . val) to env. Iterative;
+## uses a 4-slot frame because the per-iteration state is (names,
+## vals, env-acc, name-temp) and cons clobbers r0-r3 so name has to
+## live somewhere stable across the val-car call.
+:env_extend
+    P1_PROLOGUE_N4
+    P1_MOV_R0_R3                        ## save env before clobbering r3
+    P1_MOV_R3_SP
+    P1_ST_R1_R3_8                       ## slot 1 = names
+    P1_ST_R2_R3_16                      ## slot 2 = vals
+    P1_ST_R0_R3_24                      ## slot 3 = env accumulator
+
+:env_extend_loop
+    P1_MOV_R3_SP
+    P1_LD_R1_R3_8
+    P1_LI_R2
+    '07000000'
+    P1_LI_BR
+    &env_extend_done
+    P1_BEQ_R1_R2                        ## names == nil → done
+
+    P1_LD_R1_R3_16
+    P1_LI_R2
+    '07000000'
+    P1_LI_BR
+    &err_arity
+    P1_BEQ_R1_R2                        ## vals == nil → arity error
+
+    ## name = car(names)
+    P1_MOV_R3_SP
+    P1_LD_R1_R3_8
+    P1_LI_BR
+    &car
+    P1_CALL                             ## r0 = name
+    P1_MOV_R3_SP
+    P1_ST_R0_R3_32                      ## slot 4 = name
+
+    ## val = car(vals)
+    P1_LD_R1_R3_16
+    P1_LI_BR
+    &car
+    P1_CALL                             ## r0 = val
+
+    ## pair = cons(name, val)
+    P1_MOV_R2_R0
+    P1_MOV_R3_SP
+    P1_LD_R1_R3_32
+    P1_LI_BR
+    &cons
+    P1_CALL                             ## r0 = (name . val)
+
+    ## env := cons(pair, env)
+    P1_MOV_R1_R0
+    P1_MOV_R3_SP
+    P1_LD_R2_R3_24
+    P1_LI_BR
+    &cons
+    P1_CALL                             ## r0 = new env
+
+    P1_MOV_R3_SP
+    P1_ST_R0_R3_24
+
+    ## names := cdr(names)
+    P1_LD_R1_R3_8
+    P1_LI_BR
+    &cdr
+    P1_CALL
+    P1_MOV_R3_SP
+    P1_ST_R0_R3_8
+
+    ## vals := cdr(vals)
+    P1_LD_R1_R3_16
+    P1_LI_BR
+    &cdr
+    P1_CALL
+    P1_MOV_R3_SP
+    P1_ST_R0_R3_16
+
+    P1_LI_BR
+    &env_extend_loop
+    P1_B
+
+:env_extend_done
+    P1_MOV_R3_SP
+    P1_LD_R0_R3_24
+    P1_EPILOGUE_N4
+    P1_RET
+
+
+## ---- make_closure(r1=params, r2=body, r3=env) -> r0 = tagged -------
+## 32-byte heap object: [header(type=4) | params | body | env]. Tag=6
+## (110) matches the combined closure/prim tag band. Pre-zeroed BSS
+## heap lets us skip writing the header's low 7 bytes and only stamp
+## byte 7 with the type code.
+:make_closure
+    P1_PROLOGUE_N3
+    P1_MOV_R0_R3                        ## save env
+    P1_MOV_R3_SP
+    P1_ST_R1_R3_8                       ## slot 1 = params
+    P1_ST_R2_R3_16                      ## slot 2 = body
+    P1_ST_R0_R3_24                      ## slot 3 = env
+
+    P1_LI_R1
+    '20000000'                          ## 32 bytes
+    P1_LI_BR
+    &alloc
+    P1_CALL                             ## r0 = raw ptr
+
+    P1_LI_R1
+    '04000000'
+    P1_SB_R1_R0_7                       ## type = 4 (closure)
+
+    P1_MOV_R3_SP
+    P1_LD_R1_R3_8
+    P1_ST_R1_R0_8
+    P1_LD_R1_R3_16
+    P1_ST_R1_R0_16
+    P1_LD_R1_R3_24
+    P1_ST_R1_R0_24
+
+    P1_ORI_R0_R0_4
+    P1_ORI_R0_R0_2                      ## tag = 0b110 = 6
+
+    P1_EPILOGUE_N3
+    P1_RET
+
+
+## ---- eval_args(r1=args, r2=env) -> r0 = evaluated args list --------
+## Recursive map: eval each, cons the results left-to-right.
+:eval_args
+    P1_PROLOGUE_N3
+    P1_MOV_R3_SP
+    P1_ST_R1_R3_8                       ## slot 1 = args cursor
+    P1_ST_R2_R3_16                      ## slot 2 = env
+
+    P1_LI_R2
+    '07000000'
+    P1_LI_BR
+    &eval_args_done
+    P1_BEQ_R1_R2
+
+    ## head = eval(car(args), env)
+    P1_LI_BR
+    &car
+    P1_CALL
+    P1_MOV_R1_R0
+    P1_MOV_R3_SP
+    P1_LD_R2_R3_16
+    P1_LI_BR
+    &eval
+    P1_CALL                             ## r0 = head value
+
+    P1_MOV_R3_SP
+    P1_ST_R0_R3_24
+
+    ## tail = eval_args(cdr(args), env)
+    P1_LD_R1_R3_8
+    P1_LI_BR
+    &cdr
+    P1_CALL
+    P1_MOV_R1_R0
+    P1_MOV_R3_SP
+    P1_LD_R2_R3_16
+    P1_LI_BR
+    &eval_args
+    P1_CALL                             ## r0 = tail
+
+    P1_MOV_R2_R0
+    P1_MOV_R3_SP
+    P1_LD_R1_R3_24
+    P1_LI_BR
+    &cons
+    P1_CALL
+    P1_EPILOGUE_N3
+    P1_RET
+
+:eval_args_done
+    P1_LI_R0
+    '07000000'
+    P1_EPILOGUE_N3
+    P1_RET
+
+
+## ---- apply(r1=callee, r2=args) -> r0 = result ----------------------
+## Only closures for now. Frame has 4 slots: callee-then-params,
+## args, body, closure-env. env_extend consumes the three params/
+## args/closure-env triple, then eval runs body under the extended
+## env.
+:apply
+    P1_PROLOGUE_N4
+    P1_MOV_R3_SP
+    P1_ST_R1_R3_8                       ## slot 1 = callee (tagged)
+    P1_ST_R2_R3_16                      ## slot 2 = args
+
+    P1_ANDI_R1_R1_7
+    P1_LI_R0
+    '06000000'
+    P1_LI_BR
+    &err_not_callable
+    P1_BNE_R1_R0
+
+    P1_MOV_R3_SP
+    P1_LD_R1_R3_8
+    P1_ADDI_R1_R1_NEG6                  ## r1 = raw ptr
+
+    P1_LD_R0_R1_8                       ## params
+    P1_ST_R0_R3_8                       ## slot 1 = params (overwrite)
+    P1_LD_R0_R1_16                      ## body
+    P1_ST_R0_R3_24                      ## slot 3 = body
+    P1_LD_R0_R1_24                      ## closure env
+    P1_ST_R0_R3_32                      ## slot 4 = closure env
+
+    ## env_extend(params, args, closure_env)
+    P1_LD_R1_R3_8
+    P1_LD_R2_R3_16
+    P1_LD_R3_R3_32
+    P1_LI_BR
+    &env_extend
+    P1_CALL                             ## r0 = new env
+
+    P1_MOV_R2_R0
+    P1_MOV_R3_SP
+    P1_LD_R1_R3_24                      ## r1 = body
+    P1_LI_BR
+    &eval
+    P1_CALL
+    P1_EPILOGUE_N4
+    P1_RET
+
+
+## ---- eval(r1=expr, r2=env) -> r0 = value ---------------------------
+## Self-evaluating: nil/fixnum/string/closure. Symbols → lookup. Pairs
+## → eval_pair (special-form or application dispatch).
+:eval
+    P1_PROLOGUE_N3
+    P1_MOV_R3_SP
+    P1_ST_R1_R3_8                       ## slot 1 = expr
+    P1_ST_R2_R3_16                      ## slot 2 = env
+
+    P1_LI_R2
+    '07000000'
+    P1_LI_BR
+    &eval_self_slot1
+    P1_BEQ_R1_R2
+
+    P1_ANDI_R1_R1_7                     ## r1 = tag
+
+    P1_LI_R2
+    '01000000'
+    P1_LI_BR
+    &eval_self_slot1
+    P1_BEQ_R1_R2                        ## fixnum
+
+    P1_LI_R2
+    '05000000'
+    P1_LI_BR
+    &eval_sym
+    P1_BEQ_R1_R2
+
+    P1_LI_R2
+    '02000000'
+    P1_LI_BR
+    &eval_pair
+    P1_BEQ_R1_R2
+
+    ## Other tags (string, closure) self-evaluate.
+    P1_LI_BR
+    &eval_self_slot1
+    P1_B
+
+:eval_self_slot1
+    P1_MOV_R3_SP
+    P1_LD_R0_R3_8
+    P1_EPILOGUE_N3
+    P1_RET
+
+:eval_sym
+    P1_MOV_R3_SP
+    P1_LD_R1_R3_8
+    P1_LD_R2_R3_16
+    P1_LI_BR
+    &lookup
+    P1_CALL
+    P1_EPILOGUE_N3
+    P1_RET
+
+:eval_pair
+    ## Compound expression. Dispatch on car against cached sym_*
+    ## pointers; otherwise treat as function application.
+    P1_MOV_R3_SP
+    P1_LD_R1_R3_8
+    P1_LI_BR
+    &car
+    P1_CALL                             ## r0 = callee-expr
+
+    P1_LI_R1
+    &sym_quote
+    P1_LD_R1_R1_0
+    P1_LI_BR
+    &eval_quote
+    P1_BEQ_R0_R1
+
+    P1_LI_R1
+    &sym_if
+    P1_LD_R1_R1_0
+    P1_LI_BR
+    &eval_if
+    P1_BEQ_R0_R1
+
+    P1_LI_R1
+    &sym_begin
+    P1_LD_R1_R1_0
+    P1_LI_BR
+    &eval_begin
+    P1_BEQ_R0_R1
+
+    P1_LI_R1
+    &sym_lambda
+    P1_LD_R1_R1_0
+    P1_LI_BR
+    &eval_lambda
+    P1_BEQ_R0_R1
+
+    P1_LI_R1
+    &sym_define
+    P1_LD_R1_R1_0
+    P1_LI_BR
+    &eval_define
+    P1_BEQ_R0_R1
+
+    ## Application: callee = eval(callee-expr, env)
+    P1_MOV_R1_R0
+    P1_MOV_R3_SP
+    P1_LD_R2_R3_16
+    P1_LI_BR
+    &eval
+    P1_CALL                             ## r0 = callee value
+
+    P1_MOV_R3_SP
+    P1_ST_R0_R3_24                      ## slot 3 = callee
+
+    ## args = eval_args(cdr(expr), env)
+    P1_LD_R1_R3_8
+    P1_LI_BR
+    &cdr
+    P1_CALL
+    P1_MOV_R1_R0
+    P1_MOV_R3_SP
+    P1_LD_R2_R3_16
+    P1_LI_BR
+    &eval_args
+    P1_CALL                             ## r0 = args list
+
+    P1_MOV_R2_R0
+    P1_MOV_R3_SP
+    P1_LD_R1_R3_24
+    P1_LI_BR
+    &apply
+    P1_CALL
+    P1_EPILOGUE_N3
+    P1_RET
+
+
+## ---- eval_quote / eval_if / eval_begin -----------------------------
+## All run inside eval's PROLOGUE_N3 frame: slot 1 = expr, slot 2 =
+## env, slot 3 = per-form scratch.
+:eval_quote
+    P1_MOV_R3_SP
+    P1_LD_R1_R3_8
+    P1_LI_BR
+    &cdr
+    P1_CALL                             ## r0 = (x)
+    P1_MOV_R1_R0
+    P1_LI_BR
+    &car
+    P1_CALL                             ## r0 = x
+    P1_EPILOGUE_N3
+    P1_RET
+
+:eval_if
+    ## (if cond then else). Save (then else) tail into slot 3, eval
+    ## cond, branch to the correct arm.
+    P1_MOV_R3_SP
+    P1_LD_R1_R3_8
+    P1_LI_BR
+    &cdr
+    P1_CALL                             ## r0 = (cond then else)
+    P1_MOV_R1_R0
+    P1_LI_BR
+    &cdr
+    P1_CALL                             ## r0 = (then else)
+    P1_MOV_R3_SP
+    P1_ST_R0_R3_24
+
+    P1_LD_R1_R3_8
+    P1_LI_BR
+    &cdr
+    P1_CALL                             ## r0 = (cond then else)
+    P1_MOV_R1_R0
+    P1_LI_BR
+    &car
+    P1_CALL                             ## r0 = cond expr
+    P1_MOV_R1_R0
+    P1_MOV_R3_SP
+    P1_LD_R2_R3_16
+    P1_LI_BR
+    &eval
+    P1_CALL                             ## r0 = cond value
+
+    P1_LI_R1
+    '07000000'
+    P1_LI_BR
+    &eval_if_else
+    P1_BEQ_R0_R1
+
+    ## Then branch: eval car(slot3)
+    P1_MOV_R3_SP
+    P1_LD_R1_R3_24
+    P1_LI_BR
+    &car
+    P1_CALL
+    P1_MOV_R1_R0
+    P1_MOV_R3_SP
+    P1_LD_R2_R3_16
+    P1_LI_BR
+    &eval
+    P1_CALL
+    P1_EPILOGUE_N3
+    P1_RET
+
+:eval_if_else
+    ## Else branch: eval car(cdr(slot3))
+    P1_MOV_R3_SP
+    P1_LD_R1_R3_24
+    P1_LI_BR
+    &cdr
+    P1_CALL
+    P1_MOV_R1_R0
+    P1_LI_BR
+    &car
+    P1_CALL
+    P1_MOV_R1_R0
+    P1_MOV_R3_SP
+    P1_LD_R2_R3_16
+    P1_LI_BR
+    &eval
+    P1_CALL
+    P1_EPILOGUE_N3
+    P1_RET
+
+:eval_begin
+    ## (begin e1 e2 ...). Slot 1 holds the moving cursor (we overwrite
+    ## the original expr arg since it's no longer needed). Slot 3
+    ## accumulates the running last-result so it survives the cdr
+    ## call each iteration.
+    P1_MOV_R3_SP
+    P1_LD_R1_R3_8
+    P1_LI_BR
+    &cdr
+    P1_CALL                             ## r0 = body list
+    P1_MOV_R3_SP
+    P1_ST_R0_R3_8                       ## slot 1 := cursor
+
+    P1_LI_R0
+    '07000000'
+    P1_ST_R0_R3_24                      ## slot 3 := nil (empty-begin result)
+
+:eval_begin_loop
+    P1_MOV_R3_SP
+    P1_LD_R2_R3_8
+    P1_LI_R1
+    '07000000'
+    P1_LI_BR
+    &eval_begin_done
+    P1_BEQ_R2_R1
+
+    P1_MOV_R1_R2
+    P1_LI_BR
+    &car
+    P1_CALL
+    P1_MOV_R1_R0
+    P1_MOV_R3_SP
+    P1_LD_R2_R3_16
+    P1_LI_BR
+    &eval
+    P1_CALL
+
+    P1_MOV_R3_SP
+    P1_ST_R0_R3_24                      ## slot 3 = last result
+
+    P1_LD_R1_R3_8
+    P1_LI_BR
+    &cdr
+    P1_CALL
+    P1_MOV_R3_SP
+    P1_ST_R0_R3_8
+    P1_LI_BR
+    &eval_begin_loop
+    P1_B
+
+:eval_begin_done
+    P1_MOV_R3_SP
+    P1_LD_R0_R3_24
+    P1_EPILOGUE_N3
+    P1_RET
+
+:eval_lambda
+    ## (lambda params body). Slot 1 gets repurposed to hold params
+    ## once we no longer need the original expr.
+    P1_MOV_R3_SP
+    P1_LD_R1_R3_8
+    P1_LI_BR
+    &cdr
+    P1_CALL                             ## r0 = (params body)
+    P1_MOV_R3_SP
+    P1_ST_R0_R3_24                      ## slot 3 = (params body)
+
+    P1_MOV_R1_R0
+    P1_LI_BR
+    &car
+    P1_CALL                             ## r0 = params
+    P1_MOV_R3_SP
+    P1_ST_R0_R3_8                       ## slot 1 := params
+
+    P1_LD_R1_R3_24
+    P1_LI_BR
+    &cdr
+    P1_CALL                             ## r0 = (body)
+    P1_MOV_R1_R0
+    P1_LI_BR
+    &car
+    P1_CALL                             ## r0 = body
+
+    P1_MOV_R2_R0
+    P1_MOV_R3_SP
+    P1_LD_R1_R3_8                       ## r1 = params
+    P1_LD_R3_R3_16                      ## r3 = env
+    P1_LI_BR
+    &make_closure
+    P1_CALL
+    P1_EPILOGUE_N3
+    P1_RET
+
+:eval_define
+    ## (define sym val-expr). gset-binds the evaluated val into the
+    ## global alist, returns nil.
+    P1_MOV_R3_SP
+    P1_LD_R1_R3_8
+    P1_LI_BR
+    &cdr
+    P1_CALL                             ## r0 = (sym val-expr)
+    P1_MOV_R3_SP
+    P1_ST_R0_R3_24                      ## slot 3 := (sym val-expr)
+
+    P1_MOV_R1_R0
+    P1_LI_BR
+    &car
+    P1_CALL                             ## r0 = sym
+    P1_MOV_R3_SP
+    P1_ST_R0_R3_8                       ## slot 1 := sym
+
+    P1_LD_R1_R3_24
+    P1_LI_BR
+    &cdr
+    P1_CALL                             ## r0 = (val-expr)
+    P1_MOV_R1_R0
+    P1_LI_BR
+    &car
+    P1_CALL                             ## r0 = val-expr
+    P1_MOV_R1_R0
+    P1_MOV_R3_SP
+    P1_LD_R2_R3_16
+    P1_LI_BR
+    &eval
+    P1_CALL                             ## r0 = val
+
+    P1_MOV_R2_R0
+    P1_MOV_R3_SP
+    P1_LD_R1_R3_8
+    P1_LI_BR
+    &gset
+    P1_CALL
+
+    P1_LI_R0
+    '07000000'
+    P1_EPILOGUE_N3
+    P1_RET
+
+
+## ---- Step-6 error landing pads --------------------------------------
+:err_unbound
+    P1_LI_R1
+    &msg_unbound
+    P1_LI_R2
+    '0E000000'                          ## strlen("unbound symbol") == 14
+    P1_LI_BR
+    &error
+    P1_B
+
+:err_arity
+    P1_LI_R1
+    &msg_arity
+    P1_LI_R2
+    '0E000000'                          ## strlen("arity mismatch") == 14
+    P1_LI_BR
+    &error
+    P1_B
+
+:err_not_callable
+    P1_LI_R1
+    &msg_not_callable
+    P1_LI_R2
+    '0C000000'                          ## strlen("not callable") == 12
+    P1_LI_BR
+    &error
+    P1_B
+
+
+## ---- Static strings -------------------------------------------------
+:msg_error_prefix
+"error: "
+:msg_newline
+"
+"
+:msg_oom
+"heap exhausted"
+:msg_intern_not_eq
+"intern: foo a != foo b (same name)"
+:msg_intern_collision
+"intern: foo == bar (distinct names collided)"
+:msg_reader_bad
+"reader: malformed input"
+:msg_at
+" at "
+:msg_colon
+":"
+:msg_unbound
+"unbound symbol"
+:msg_arity
+"arity mismatch"
+:msg_not_callable
+"not callable"
+
+## Interned name samples used by the self-test.
+:str_foo
+"foo"
+:str_bar
+"bar"
+
+## Special-form name strings. Lengths are baked into _start's intern
+## calls (update both sides together).
+:str_quote
+"quote"
+:str_if
+"if"
+:str_begin
+"begin"
+:str_lambda
+"lambda"
+:str_define
+"define"
+
+## Step-6 test source. `(define id (lambda (x) x))` binds the identity
+## function globally; `(id 42)` applies it. Expected exit status = 42.
+:src_text
+"(define id (lambda (x) x))(id 42)"
+## 8-byte LE length of the preceding src_text literal. Loaded at
+## startup into src_len so the reader knows where EOF is.
+:src_text_len
+'21000000'
+'00000000'
+
+
+## ---- Special-form symbol slots --------------------------------------
+## Zero-initialized; _start populates each slot with the interned
+## tagged-symbol pointer so eval_pair can dispatch by pointer identity.
+:sym_quote
+'00000000'
+'00000000'
+:sym_if
+'00000000'
+'00000000'
+:sym_begin
+'00000000'
+'00000000'
+:sym_lambda
+'00000000'
+'00000000'
+:sym_define
+'00000000'
+'00000000'
+
+## Global-binding alist head. Zero-initialized (which is a valid
+## untagged pair/nil sentinel? No — nil = 0x07. Seed to nil on entry
+## from _start; the zero here would be interpreted as a pair pointer
+## otherwise).
+:global_env_cell
+'07000000'
+'00000000'
 
 
 ## ---- Symbol table (4096 slots × 8 bytes = 32 KiB) -------------------
diff --git a/p1_gen.py b/p1_gen.py
@@ -743,11 +743,17 @@ SYSOPEN_HEX = {
 }
 
 ## --- Syscall numbers (little-endian 32-bit for LI operand) ---
+## aarch64 and riscv64 share the asm-generic table (write=64, exit=93,
+## clone=220, execve=221, waitid=95). amd64 has its own arch-specific
+## numbers. `wait4` is deliberately absent — it's only defined for
+## 32-bit compat on aarch64/riscv64; `waitid` is the portable choice.
 SYS_NUM = {
-    'aarch64': {'SYS_WRITE': 64, 'SYS_EXIT': 93, 'SYS_READ': 63, 'SYS_CLOSE': 57},
-    # amd64 uses its own table
-    'amd64':   {'SYS_WRITE': 1,  'SYS_EXIT': 60, 'SYS_READ': 0,  'SYS_CLOSE': 3},
-    'riscv64': {'SYS_WRITE': 64, 'SYS_EXIT': 93, 'SYS_READ': 63, 'SYS_CLOSE': 57},
+    'aarch64': {'SYS_WRITE': 64, 'SYS_EXIT': 93, 'SYS_READ': 63, 'SYS_CLOSE': 57,
+                'SYS_CLONE': 220, 'SYS_EXECVE': 221, 'SYS_WAITID': 95},
+    'amd64':   {'SYS_WRITE': 1,  'SYS_EXIT': 60, 'SYS_READ': 0,  'SYS_CLOSE': 3,
+                'SYS_CLONE': 56,  'SYS_EXECVE': 59,  'SYS_WAITID': 247},
+    'riscv64': {'SYS_WRITE': 64, 'SYS_EXIT': 93, 'SYS_READ': 63, 'SYS_CLOSE': 57,
+                'SYS_CLONE': 220, 'SYS_EXECVE': 221, 'SYS_WAITID': 95},
 }
 
 
@@ -783,7 +789,8 @@ def rows():
 
     # --- Syscall numbers ---
     R.append(Banner('Linux syscall numbers (per-arch table). LE-32 immediate operands for LI.'))
-    for name in ('SYS_WRITE', 'SYS_EXIT', 'SYS_READ', 'SYS_CLOSE'):
+    for name in ('SYS_WRITE', 'SYS_EXIT', 'SYS_READ', 'SYS_CLOSE',
+                 'SYS_CLONE', 'SYS_EXECVE', 'SYS_WAITID'):
         R.append(Literal(name=name, hex_by_arch={a: le32(SYS_NUM[a][name]) for a in ARCHES}))
 
     # --- Reg-reg-reg arith tuples used by demo/lisp ---
@@ -1099,6 +1106,135 @@ def rows():
         R.append(Mem(name=f'{op}_{rt.upper()}_{rn.upper()}_{suf}',
                      op=op, rT=rt, rN=rn, off=off))
 
+    # --- Tranche 11: seed-Lisp step-5/6 (printer + eval) ---
+    R.append(Banner('Seed-Lisp step 5/6 extensions (tranche 11): printer + eval'))
+
+    # MOV: write_string copies the string length (r3) into r7 so r7
+    # can survive putc calls inside the escape loop. _start stashes
+    # the last eval result in r4 (callee-saved) across iterations, so
+    # MOV_R4_R0 captures it after eval and MOV_R1_R4 reloads it for
+    # write / SYS_EXIT.
+    R.append(Mov(name='MOV_R7_R3', rD='r7', rA='r3'))
+    R.append(Mov(name='MOV_R4_R0', rD='r4', rA='r0'))
+    R.append(Mov(name='MOV_R1_R4', rD='r1', rA='r4'))
+
+    # BNE_R1_R0: apply's callee-tag check branches to err_not_callable
+    # when the low-3 tag doesn't equal 0b110.
+    R.append(CondB(name='BNE_R1_R0', op='BNE', rA='r1', rB='r0'))
+
+    # LB_R1_R6_0: write_string reads a single byte at the current
+    # cursor (r6) into r1 before the escape-test/putc sequence.
+    R.append(Mem(name='LB_R1_R6_0', op='LB', rT='r1', rN='r6', off=0))
+
+    # ADDI: NEG4 strips the STRING tag (0b100) from a tagged string,
+    # NEG6 strips the CLOSURE tag (0b110) inside apply.
+    for d, a, imm in [
+        ('r1','r1',-6),
+        ('r2','r2',-4),
+    ]:
+        suf = f'NEG{-imm}' if imm < 0 else f'{imm}'
+        R.append(AddI(name=f'ADDI_{d.upper()}_{a.upper()}_{suf}',
+                      rD=d, rA=a, imm=imm))
+
+    # Memory ops:
+    #   LD_R0_R1_{16,24}   — apply reads body/env from raw closure ptr
+    #   LD_R0_R3_8         — eval_self_slot1 returns the saved expr
+    #   LD_R1_R3_16        — loads slot 2 into r1 when r1 is the cons/car
+    #                        arg and we want a clean register
+    #   LD_R1_R3_32        — env_extend fetches saved name from slot 4
+    #   LD_R2_R3_8/24      — eval_begin cursor reload; env_extend env reload
+    #   LD_R3_R3_16/32     — eval_lambda / apply load env into r3 as the
+    #                        3rd argument to make_closure / env_extend
+    #   ST_R0_R3_16/32     — env_extend spills updated vals / saves name
+    #   ST_R1_R0_{8,16,24} — make_closure writes fields into fresh heap
+    for op, rt, rn, off in [
+        ('LD','r0','r1',16),
+        ('LD','r0','r1',24),
+        ('LD','r0','r3',8),
+        ('LD','r1','r3',16),
+        ('LD','r1','r3',32),
+        ('LD','r2','r3',8),
+        ('LD','r2','r3',24),
+        ('LD','r3','r3',16),
+        ('LD','r3','r3',32),
+        ('ST','r0','r3',16),
+        ('ST','r0','r3',32),
+        ('ST','r1','r0',8),
+        ('ST','r1','r0',16),
+        ('ST','r1','r0',24),
+    ]:
+        suf = f'NEG{-off}' if off < 0 else f'{off}'
+        R.append(Mem(name=f'{op}_{rt.upper()}_{rn.upper()}_{suf}',
+                     op=op, rT=rt, rN=rn, off=off))
+
+    # --- Tranche 12: kaem-minimal (line shell for build scripts) ---
+    R.append(Banner('kaem-minimal extensions (tranche 12): line shell for build'))
+
+    # MOVs: read_file_all parks (fd, buf, cap, total) in r4–r7 across
+    # SYSCALLs (callee-saved natives) so the read loop doesn't need a
+    # prologue. write_cstr uses the same trick across its strlen call.
+    for d, a in [('r0','r7'), ('r2','r5'), ('r3','r0'), ('r3','r6'),
+                 ('r4','r1'), ('r4','r2'), ('r5','r1'), ('r5','r2'),
+                 ('r6','r3'), ('r1','r2')]:
+        # MOV_R1_R2 already exists in tranche 10 — skip via duplicate check.
+        if d == 'r1' and a == 'r2':
+            continue
+        R.append(Mov(name=f'MOV_{d.upper()}_{a.upper()}', rD=d, rA=a))
+
+    # RRR: bump-pointer and accumulator updates in read_file_all's inner
+    # loop (buf += n, cap -= n, total += n). ADD_R1_R1_R0 is end_token's
+    # argv_out[count] address calc.
+    for op, d, a, b in [
+        ('ADD','r1','r1','r0'),
+        ('ADD','r5','r5','r0'),
+        ('ADD','r7','r7','r0'),
+        ('SUB','r3','r3','r2'),
+        ('SUB','r6','r6','r0'),
+    ]:
+        R.append(RRR(name=f'{op}_{d.upper()}_{a.upper()}_{b.upper()}',
+                     op=op, rD=d, rA=a, rB=b))
+
+    # ADDI_R1_R0_2: _start computes r1 = argc + 2 before scaling by 8
+    # to land on envp.
+    R.append(AddI(name='ADDI_R1_R0_2', rD='r1', rA='r0', imm=2))
+
+    # SHLI_R0_R3_3: end_token scales token_count by 8 (slot size) into
+    # r0 for the argv_out address addition.
+    R.append(ShiftI(name='SHLI_R0_R3_3', op='SHLI', rD='r0', rA='r3', imm=3))
+
+    # Branches. BLT_R0_R1 covers the two fd/argc-sign checks in _start
+    # and the read EOF check; BEQ_R3_R2 is the strlen-zero-byte check;
+    # BNE_R3_R0 is arena_putc's "token already open?" guard.
+    for op, a, b in [
+        ('BEQ','r3','r2'),
+        ('BLT','r0','r1'),
+        ('BNE','r3','r0'),
+    ]:
+        R.append(CondB(name=f'{op}_{a.upper()}_{b.upper()}',
+                       op=op, rA=a, rB=b))
+
+    # Memory ops. LB at offset 24 reads si_status out of the waitid
+    # siginfo_buf (SIGCHLD case: byte 24 is the exit code / signal).
+    # LD_R3_R3_0 is end_token dereferencing token_arena_next;
+    # LD_R3_R1_0 loads mutable globals into r3 for use as a cursor.
+    # ST_R3_R1_0 / ST_R3_R2_0 / ST_R0_R1_0 persist cursor/count.
+    # SB_{R0,R1}_R3_0 write the NUL terminator / token byte into the
+    # arena at token_arena_next.
+    for op, rt, rn, off in [
+        ('LB','r1','r1',24),
+        ('LB','r3','r2',0),
+        ('LD','r3','r1',0),
+        ('LD','r3','r3',0),
+        ('SB','r0','r3',0),
+        ('SB','r1','r3',0),
+        ('ST','r0','r1',0),
+        ('ST','r3','r1',0),
+        ('ST','r3','r2',0),
+    ]:
+        suf = f'NEG{-off}' if off < 0 else f'{off}'
+        R.append(Mem(name=f'{op}_{rt.upper()}_{rn.upper()}_{suf}',
+                     op=op, rT=rt, rN=rn, off=off))
+
     return R

	boot2 Playing with the boostrap
	git clone https://git.ryansepassi.com/git/boot2.git
	Log \| Files \| Refs

M	lisp.M1	\|	1263	++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-----
M	p1_gen.py	\|	146	++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++---