boot2

Playing with the boostrap
git clone https://git.ryansepassi.com/git/boot2.git
Log | Files | Refs

commit f10eab299b822884ec68f4c6b3b8285859515ca9
parent 2bf50d2a8f56b2106a737e260c4900abfb98d07a
Author: Ryan Sepassi <rsepassi@gmail.com>
Date:   Thu, 23 Apr 2026 21:14:37 -0700

Update Lisp plan for P1

Diffstat:
Mdocs/LISP-C.md | 521++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-----------------
1 file changed, 414 insertions(+), 107 deletions(-)

diff --git a/docs/LISP-C.md b/docs/LISP-C.md @@ -1,16 +1,43 @@ -# LISP: Scheme interpreter in C +# Scheme in P1 ## Goal A small tree-walking interpreter for the Scheme subset specified in -[LISP-SUBSET.md](LISP-SUBSET.md). Intent: simplicity and correctness -first; performance is a later concern. +[LISP.md](LISP.md), written in P1 via the M1PP + M1 + M0 + hex2 +toolchain. Intent: simplicity and correctness first; performance is a +later concern. The self-hosted C compiler (see [PLAN.md](PLAN.md)) will run under this interpreter, as will shell-style scripts built on [shell.scm](../../shell.scm) and [prelude.scm](../lisp/prelude.scm). +## Toolchain + +``` +scheme.P1 (hand-written M1PP source using P1.M1pp) + | prepend arch backend + P1.M1pp + | m1pp + v +scheme.M1 (target-specific M1 ops) + | M0 + v +scheme.hex2 + | hex2 + v +scheme (native ELF) +``` + +C-shaped snippets in this document are descriptive of the target shape, +not the source. The source is M1PP-preprocessed P1 assembly. + +## Word size + +P1v2-64 is the initial target. Every heap word is 8 bytes; every object +is 8-byte aligned. P1v2-32 is out of scope for the first pass; porting +would require a parallel layout story because M1PP's `%struct` stride +is fixed at 8. + ## Settled decisions Load-bearing; the rest of the document assumes them. @@ -18,11 +45,11 @@ Load-bearing; the rest of the document assumes them. 1. **Tree-walking eval/apply over raw s-expressions.** No AST preprocessing, no bytecode, no compilation step. The reader's output is the interpreter's input. -2. **Static buffers, no GC.** Fixed-size BSS regions for heap, symbol - table, reader buffer, writer buffer. Heap exhaustion aborts with - an error. -3. **Bump allocation, 8-byte aligned.** Single `heap_next` pointer; - every object is 8-byte aligned by construction. +2. **Static buffers, no GC.** Fixed-size labeled byte reservations for + heap, symbol table, reader buffer, writer buffer. Heap exhaustion + aborts with an error. +3. **Bump allocation, 8-byte aligned.** Single `heap_next` pointer + (pinned in `s0`); every object is 8-byte aligned by construction. 4. **Tagged values, low-3-bit tag.** See §Value representation. 5. **Symbols are interned indices.** No heap object per symbol; the value is an index into the symbol table, which stores the name @@ -41,9 +68,10 @@ Load-bearing; the rest of the document assumes them. recursion between local helpers. 10. **`pmatch` is a built-in special form.** Saves writing a macro expander for one feature. -11. **Tail calls via host `musttail`.** All tail positions in `eval` - and `apply` are marked. Scheme-level tail-call correctness falls - out. +11. **Tail calls via P1 `TAIL` / `TAILR`.** Every tail position in + `eval` and `apply` compiles to `TAIL` (label) or `TAILR` (register) + rather than `CALL` + `RET`. Scheme-level tail-call correctness + falls out. 12. **Primitive failure is undefined behavior.** No runtime checks on `(car '())`, `(quotient n 0)`, out-of-bounds bytevector access, integer overflow, or mutation of literals. Follows the spec's @@ -51,18 +79,18 @@ Load-bearing; the rest of the document assumes them. ## Memory layout -Single BSS region, fixed at link time. Starting sizes: +Labeled byte reservations in the final image. Starting sizes: | Region | Size | Purpose | |---------------|----------|--------------------------------------| -| `heap[]` | 64 MB | bump-allocated Scheme objects | -| `symtab[]` | 16384 × | open-addressing intern table | -| `readbuf[]` | 16 MB | slurped source file | -| `writebuf[]` | 64 KB | staged output for display/write | -| `argbuf[]` | 64 × 8 B | primitive dispatch scratch | +| `heap` | 64 MB | bump-allocated Scheme objects | +| `symtab` | 16384 × | open-addressing intern table | +| `readbuf` | 16 MB | slurped source file | +| `writebuf` | 64 KB | staged output for display/write | +| `argbuf` | 64 × 8 B | primitive dispatch scratch | -All sized via `#define`s; easy to retune. Overflow of any buffer → -`error(...)` → `exit(1)`. +All sized via M1PP constants; easy to retune. Overflow of any buffer → +`error(...)` → `sys_exit(1)`. ### Symbol table entry @@ -78,16 +106,18 @@ struct sym_entry { `global_val` gives O(1) top-level lookup: every symbol's global binding is reachable without hash probing once the symbol is resolved. -The symbol table and heap are the only global roots. +M1PP layout (widened to word-per-field for 8-byte stride): -## Allocation +``` +%struct SYMENT { name_bv global_val name_hash chain_next } +; .name_bv = 0, .global_val = 8, .name_hash = 16, .chain_next = 24, +; .SIZE = 32 +``` -`alloc(bytes)` rounds up to 8, checks `heap_next + bytes <= heap_end`, -advances the bump pointer, returns the pre-bump address. Overflow → -`error("heap exhausted")` → `exit(1)`. +The `hash` and `next` fields are logically 32-bit but occupy a full +word each to stay on the `%struct` stride; simpler than packing. -8-byte alignment is enforced by the bump allocator; every object -constructor assumes it. +The symbol table and heap are the only global roots. ## Value representation @@ -102,18 +132,24 @@ All values are 64-bit tagged words. Low 3 bits are the tag. | `100` | immediate singleton | `#f`, `#t`, `()`, unspec, unbound | | `101`–`111` | reserved | | -Immediates (bit patterns chosen at implementation time, all tag `100`): +### Tag and immediate constants -| Constant | Meaning | -|-----------|---------------------------------------------| -| `FALSE` | the Scheme `#f` | -| `TRUE` | the Scheme `#t` | -| `NIL` | the empty list `()` | -| `UNSPEC` | result of `set!`, `define`, etc. | -| `UNBOUND` | sentinel for undefined globals | +Encoded as M1PP `%enum`s. Concrete values derive via expression: -Truthiness: every value except `FALSE` is truthy. Test is a single -compare. +``` +%enum TAG { FIXNUM PAIR SYM HEAP IMM } ; raw indices 0..4 +%enum IMM { FALSE TRUE NIL UNSPEC UNBOUND } ; raw indices 0..4 + +; a tag 100 value in low bits is literally %TAG.IMM == 4 +%macro MKIMM(idx) +$((| (<< idx 3) %TAG.IMM)) +%endm + +; usage: %MKIMM(%IMM.FALSE), %MKIMM(%IMM.NIL), etc. +``` + +Truthiness: every value except `%MKIMM(%IMM.FALSE)` is truthy. Test is +a single compare. ## Object types @@ -128,23 +164,205 @@ This is the complete runtime type set. | pair | `[car][cdr]`, 16 B, tag 001, no header | | bytevector | `[hdr = BV | len<<8][bytes...]`, tag 011 | | closure | `[hdr = CLOSURE][params][body][env]`, tag 011 | -| primitive | `[hdr = PRIM | code_id<<8]`, tag 011 | +| primitive | `[hdr = PRIM | entry_label]`, tag 011 | | record-td | `[hdr = TD][name_sym][nfields]`, tag 011 | | record | `[hdr = REC][td][field_0]...[field_{n-1}]`, tag 011| -Headered object header word: +Headered object header word (little-endian bytes): ``` [ type:8 | aux:56 ] ``` where `aux` carries type-specific data (length for bytevectors, -code id for primitives, nfields count for records, etc.). One load +entry-label for primitives, nfields for records, etc.). One load tells you the type. +### Layouts + +``` +%struct PAIR { car cdr } ; .SIZE = 16 +%struct CLOSURE { hdr params body env } ; .SIZE = 32 +%struct BV { hdr } ; bytes follow inline +%struct PRIM { hdr } ; header encodes entry +%struct TD { hdr name_sym nfields } ; .SIZE = 24 +; record is variable-width; td+nfields known from TD +%struct REC_HDR { hdr td } ; field_i at offset 16 + 8*i +%enum HDR { BV CLOSURE PRIM TD REC } ; type-byte values +``` + Strings and bytevectors are the same type. `string?` ≡ `bytevector?`. No character type: `#\a` in source is sugar for fixnum 97. +### Primitive encoding + +An M1 label is 4 bytes. A primitive fits in a single 8-byte header: + +``` +bytes [0..4): &entry_label ; 4-byte hex2 label reference +bytes [4..6): arity ; @(arity) little-endian 2 bytes +byte [6] : type = %HDR.PRIM ; !(%HDR.PRIM) +byte [7] : 0 ; !(0) +``` + +This makes `apply_prim` a header-load plus `callr`: no dispatch table, +no switch. §Apply sketches the sequence. + +## Tag idioms + +A handful of short sequences do all of the runtime's bit-twiddling. +They are M1PP macros so the rest of the code reads at the Scheme +level. These are illustrative; exact op selection is finalized during +implementation. + +``` +%macro tagof(rd, rs) ; rd = rs & 7 +%andi(rd, rs, 7) +%endm + +%macro make_fixnum(rd, rs) ; rd = rs << 3 +%shli(rd, rs, 3) +%endm + +%macro untag_fixnum(rd, rs) ; rd = rs >> 3, arithmetic +%sari(rd, rs, 3) +%endm + +%macro car(rd, rs) ; pair tag is 001; raw = rs - 1 +%ld(rd, rs, -1) ; [raw + 0] = car +%endm + +%macro cdr(rd, rs) +%ld(rd, rs, 7) ; [raw + 8] = cdr +%endm + +%macro set_car(rs, rv) +%st(rv, rs, -1) +%endm + +%macro set_cdr(rs, rv) +%st(rv, rs, 7) +%endm + +%macro hdr_load(rd, rs) ; heap tag is 011; raw = rs - 3 +%ld(rd, rs, -3) +%endm + +%macro hdr_type(rd, rs) ; type byte is low byte of header +%lb(rd, rs, -3) +%endm + +%macro hdr_field(rd, rs, idx) ; word field idx, 0 counts the header word +%ld(rd, rs, (- (* idx 8) 3)) +%endm +``` + +`cons` is a bump + two stores + tag: + +``` +%macro cons(rd, rcar, rcdr, rtmp) +; assumes s0 = heap_next, s1 = heap_end +%addi(rtmp, s0, 16) +%la_br(heap_oom) +%bltu(s1, rtmp) ; if heap_end < next+16, OOM +%st(rcar, s0, 0) +%st(rcdr, s0, 8) +%addi(rd, s0, 1) ; tag 001 +%mov(s0, rtmp) +%endm +``` + +`alloc_hdr(rd, type_byte, bytes, rtmp)` is the corresponding idiom for +headered objects; it bumps, writes the header word, and returns the +tagged pointer. The header word is composed on the fly from the type +byte and any aux field. + +## Dispatch tables + +P1 has no indexed branch op, but it has `BR rs` / `CALLR rs` / `TAILR +rs` and one-word loads. That's enough to build a jump table for any +switch wider than ~3 cases, which we use for the special-form +cascade in `eval`. + +Each entry is 8 bytes: a 4-byte M1 label reference plus 4 bytes of +pad. P1v2-64 has no 4-byte load, only full-word `LD`, so padding to +a word is the cheap path. + +``` +%macro table_entry(lref) +lref %(0) ; label ref + 4 zero bytes = 8 bytes +%endm +``` + +Usage (per special form, in index order): + +``` +special_dispatch_table: +%table_entry(&eval_quote) +%table_entry(&eval_if) +%table_entry(&eval_lambda) +%table_entry(&eval_define) +%table_entry(&eval_set) +%table_entry(&eval_begin) +... +``` + +Dispatch is load + indirect branch: + +``` +; t0 = sym_idx, already range-checked (< N_SPECIAL) +%la(t1, special_dispatch_table) +%shli(t2, t0, 3) +%add(t1, t1, t2) +%ld(t1, t1, 0) +%tailr(t1) +``` + +Five ops and one memory load, regardless of how many special forms. + +**Reservation convention.** Special-form symbols are interned first at +startup, so their sym indices are `0..N_SPECIAL-1`. The dispatcher +checks `idx < N_SPECIAL` before using the table; out-of-range means +"ordinary application". + +## Allocation + +C sketch (target shape): + +```c +static val_t alloc(size_t bytes) { + char *p = heap_next; + heap_next += (bytes + 7) & ~7; + if (heap_next > heap_end) error("heap exhausted"); + return (val_t) p; +} +``` + +P1 translation: `s0` and `s1` are pinned to `heap_next` and `heap_end` +for the lifetime of the program. Allocation is `addi` + `bltu` + +`mov`. In practice almost every allocation site calls one of `%cons`, +`%alloc_hdr`, or an inline bytevector emitter, not a generic `alloc`. + +## Register assignment + +Four callee-saved registers carry the hot state: + +| Reg | Role | +|------|----------------------------------------------------| +| `s0` | `heap_next` | +| `s1` | `heap_end` | +| `s2` | current env pointer | +| `s3` | scratch held across one inner loop at a time | + +`eval(expr, env)` is `a0 = expr`, `a1 = env`, returns in `a0`. +Handlers copy `a1` into `s2` at entry. `a0..a3` and `t0..t2` are free +across calls. Leaf code inside a handler may use `s3` without saving, +provided it restores before the handler returns. + +Reader, writer, and primitive bodies follow the same discipline. Any +primitive that needs deep recursion (e.g. `equal?`) saves `s2` in its +frame even if it does not touch env, since its callees might. + ## Closures ``` @@ -188,6 +406,8 @@ functions works without any pre-pass. ## Eval +C sketch (target shape): + ```c val_t eval(val_t expr, val_t env) { switch (tag(expr)) { @@ -227,12 +447,54 @@ val_t eval(val_t expr, val_t env) { } ``` -The special-form symbols (`SYM_QUOTE`, `SYM_IF`, etc.) are -indices interned once at startup and stored in globals. Dispatch is a +The special-form symbols (`SYM_QUOTE`, `SYM_IF`, etc.) are interned +once at startup and stored in labeled word slots. Dispatch is a cascade of integer compares. +P1 translation. The outer tag switch is a short cascade of `beq`s on +the tag value (fixnum / imm / sym / heap return immediately; pair +falls through to form dispatch). Form dispatch uses the jump table +from §Dispatch tables: + +``` +; eval_pair path, a0 = expr, a1 = env already normalized +%car(t1, a0) ; t1 = head +%cdr(a1, a0) ; a1 = rest, set up for handlers + +; fall through to ordinary application if head is not a symbol +%andi(t2, t1, 7) +%la_br(eval_apply) +%addi(t3, t2, (- %TAG.SYM)) +%bnez(t3) ; tag != SYM -> apply + +; extract idx and bounds-check +%sari(t0, t1, 3) ; t0 = sym_idx +%la(t2, N_SPECIAL_const) +%ld(t2, t2, 0) +%la_br(eval_apply) +%bltu(t0, t2) ; invert: take table path iff idx < N +... + +; table dispatch +%la(t1, special_dispatch_table) +%shli(t2, t0, 3) +%add(t1, t1, t2) +%ld(t1, t1, 0) +%tailr(t1) +``` + +(The bounds-check polarity and the branch for "take table path" are +sketched loosely; the final shape is whatever combination of +`BLT`/`BLTU`/`BEQ` produces the fewest ops.) + +Each handler is its own P1 function. Tail positions inside handlers +end with `%la_br(target); %tail()`. An ordinary application is +`eval(head)`, `eval_list(rest)`, then `%la_br(apply); %tail()`. + ## Apply +C sketch: + ```c val_t apply(val_t fn, val_t args) { switch (header_type(fn)) { @@ -253,6 +515,36 @@ when params becomes a non-pair symbol, that symbol is bound to the remaining args list. When params becomes `NIL` with args remaining (or vice versa), error out with an arity message. +P1 translation. `hdr_type` reads one byte; dispatch is two compares: + +``` +; apply(fn=a0, args=a1) +%hdr_type(t0, a0) +%la(t1, %HDR.PRIM) +%la_br(apply_prim) +%beq(t0, t1) +%la(t1, %HDR.CLOSURE) +%la_br(apply_closure) +%beq(t0, t1) +%la_br(err_not_procedure) +%b() +``` + +`apply_prim` extracts the entry label from the primitive's header +(low 4 bytes) and `callr`s it with the arg list in `a0`: + +``` +apply_prim: +%hdr_load(t0, a0) ; t0 = header word +%andi(t0, t0, 0xFFFFFFFF) ; low 32 bits = &entry +%mov(a0, a1) ; arg list in a0 for the primitive +%callr(t0) +%ret() +``` + +`apply_closure` runs `bind_params`, extends env, then `%la_br`s +`eval_body` and `%tail()`s. + ## Eval_body ```c @@ -287,11 +579,11 @@ The `SYM_DEFINE` handler in `eval` itself handles only top-level ## Special forms -Implemented directly, ~15-30 lines of C each: +Implemented directly, each ~30-60 P1 ops: - `quote` — return the datum. - `if` — eval test; tail-eval the chosen branch. -- `lambda` — build closure. +- `lambda` — build closure (one `%alloc_hdr`). - `define` — top-level: write `symtab[idx].global_val`. Inner: handled by `eval_body`. - `set!` — walk env; first match mutates. Miss: write global slot. @@ -331,8 +623,7 @@ Patterns: | `(p1 p2 ...)` | proper list of exactly that length | | `(p1 ... . ptail)` | improper list; `ptail` binds the rest | -No compilation or caching — rewalk the patterns each call. ~200 LOC -total for matcher plus driver. +No compilation or caching — rewalk the patterns each call. ## define-record-type @@ -358,13 +649,13 @@ becomes ``` The `%` primitives are internal, not part of the user-facing -primitive list. ~100 LOC for the desugaring plus ~100 LOC for the -record primitives. +primitive list. ## Primitives -55 user-facing primitives plus 6 internal record ops. Dispatched by -code id through a switch in `call_prim`. +55 user-facing primitives plus 6 internal record ops. Each is its own +P1 function, reached via `callr` on the label packed into the +primitive object's header. **Predicates / equality** (13): `eq?`, `equal?`, `not`, `boolean?`, `pair?`, `null?`, `symbol?`, `integer?`, `string?` (≡ `bytevector?`), @@ -389,17 +680,24 @@ code id through a switch in `call_prim`. ### Calling convention -`call_prim` reads the args list from `apply`, copies the values into -`argbuf[]`, and switches on the primitive code id. Each primitive -reads its inputs from `argbuf`, returns a `val_t`. +`apply_prim` receives the arg list in `a0` and jumps via `callr` to +the primitive's entry label (extracted from the primitive header). +Each primitive's body: + +1. Walks the arg list, copying values into `argbuf[]` and counting + `argc`. +2. Checks arity (fixed-arity primitives; variadic ones do their own). +3. Computes result into `a0`. +4. `ret`s. -Fixed-arity primitives check argc at the dispatch boundary; variadic -primitives (`+`, `-`, `append`, etc.) do their own argc handling. +There is no code-id switch and no dispatch table — the table of +primitives is the set of labels referenced from `register_primitives` +at startup, and the interpreter keeps zero pointers to it afterwards. ## Prelude -Written in Scheme, embedded as a string constant, parsed and -evaluated at startup like any other source. Contents: +Written in Scheme, embedded as a bytevector literal in `scheme.P1`, +parsed and evaluated at startup like any other source. Contents: ```scheme (define (list . xs) xs) @@ -419,7 +717,7 @@ Runs before the user file, sharing the same symbol table and heap. Single-pass recursive descent over `readbuf` with an integer cursor. No token array. Conses directly onto the heap. -Handles the lexical syntax specified in scheme-subset.md: +Handles the lexical syntax specified in LISP.md: - lists with `.` dotted tail - `'x` → `(quote x)` (no quasiquote, unquote, unquote-splicing) @@ -449,51 +747,58 @@ Three entry points: - `format fmt args...` — supports `~a` (display), `~s` (write), `~d` (decimal fixnum), `~%` (newline). -Output buffered in `writebuf[]`; flushed on buffer-full or on newline -for `display`/`write`. `error` flushes before writing to fd 2. +Output buffered in `writebuf`; flushed via `sys_write(1, ...)` on +buffer-full or on newline for `display`/`write`. `error` flushes +before writing to fd 2. ## Error path `(error msg arg ...)`: 1. Flush writer buffer. -2. Write `"error: "` + message + formatted args to fd 2. -3. `exit(1)`. +2. Write `"error: "` + message + formatted args to fd 2 via + `sys_write`. +3. `sys_exit(1)`. No unwinding, no handlers. Primitive failure follows the same policy as the spec: undefined behavior, likely crash, no runtime check. ## Startup +P1 defines a portable program-entry model (see P1.md §Program Entry): +the arch backend emits `_start`, captures argc/argv off the native +entry stack, and calls the portable label `p1_main` with `a0 = argc`, +`a1 = argv`. On return, the backend does `sys_exit(a0)`. + +The Scheme interpreter therefore contains no per-target startup code. +It just defines `p1_main`: + ``` -_start: - parse argv - if argc < 2: error "usage: lisp <file.scm>" - init heap_next, heap_end - init symtab - intern the special-form symbols (SYM_QUOTE, SYM_IF, ...) - register primitives: for each (name, code_id, arity), intern - the name and bind a primitive object in symtab[idx].global_val - parse and eval prelude (embedded) - open argv[1], slurp into readbuf - loop: - read one top-level form - eval in NIL env - advance cursor - until EOF - exit 0 +p1_main: + ; a0 = argc, a1 = argv + init_heap ; s0 = &heap, s1 = &heap_end + init_symtab + intern_special_forms ; fill labeled slots sym_quote, sym_if, ... + ; (order reserves sym_idx 0..N_SPECIAL-1) + register_primitives ; emit %HDR.PRIM words with &entry labels + parse_and_eval_prelude ; prelude is an embedded bytevector literal + ; slurp argv[1] into readbuf + ; loop: read_one_top_level; eval(form, NIL); advance cursor; until EOF + li a0, 0 + ret ; backend _start does sys_exit(0) ``` -All state lives in BSS; no heap allocation before `_start`. +All state lives in labeled reservations; nothing is heap-allocated +before `p1_main`. ## File layout ``` lispcc/ src/ - lisp.c # single C source, target ~2–2.5k LOC + scheme.P1 # M1PP source, consumes P1.M1pp macro library lisp/ - prelude.scm # embedded at build time + prelude.scm # embedded into scheme.P1 as a bytevector literal tests/lisp/ 00-arith.scm 01-list.scm @@ -506,43 +811,45 @@ lispcc/ ... ``` +Build: prepend the arch-specific backend M1M to `p1/P1.M1pp`, prepend +that bundle to `src/scheme.P1`, run `m1pp` → `M0` → `hex2` → ELF. Tests are `.scm` files the interpreter runs; pass = exit 0 and -expected stdout. Build integrates into the existing Makefile. +expected stdout. ## Milestones Status legend: `[x]` done · `[~]` in progress · `[ ]` not started. -1. [ ] **Runtime skeleton.** `_start`, argv parsing, BSS layout, bump - allocator, `error` path. Exits with a placeholder status. - ~200 LOC. -2. [ ] **Tagged values.** Fixnum / pair / immediate encoding; - `cons`/`car`/`cdr`/tag predicates as C functions. Hand-build - `(cons 42 nil)` and exit with the decoded car. ~200 LOC. -3. [ ] **Symbols + interning.** 16k-slot open-addressing table; - `intern(name, len)` returns a tagged symbol index. Two - `intern("foo")` calls compare equal as integers. ~250 LOC. -4. [ ] **Bytevectors.** Headered heap object; `make-bytevector`, - `bytevector-u8-ref`, etc. as C helpers. ~200 LOC. -5. [ ] **Reader.** End-to-end source → s-expression for the full - lexical syntax, with source-location side table. ~500 LOC. -6. [ ] **Writer.** `display`, `write`, `format`. ~250 LOC. -7. [ ] **`eval` + `apply` core.** Self-eval, symbol lookup, `if`, - `lambda`, top-level `define`, application. No musttail yet. - ~300 LOC. -8. [ ] **Tail calls.** Mark all tail positions with musttail. Stress: - 10⁶-iteration named-let loop runs in constant host stack. +1. [ ] **Bootstrapping shim.** Per-target `_start`, argv capture, + labeled reservations for heap / symtab / readbuf / writebuf, + `sys_write` + `sys_exit` error path. Exits with a placeholder + status. +2. [ ] **Tag macros.** `%tagof`, `%make_fixnum`, `%untag_fixnum`, + `%car`, `%cdr`, `%hdr_*`. Verify constants at M1PP time with + `%select`. +3. [ ] **Bump allocator and cons.** `%cons` macro; OOM branch. + Hand-build `(cons 42 nil)` and exit with the decoded car. +4. [ ] **Symbols + interning.** 16k-slot open-addressing table; + `intern(name, len)` returns a tagged symbol val_t. Two interns of + the same name compare equal as integers. +5. [ ] **Bytevectors.** Headered heap object; `make-bytevector`, + `bytevector-u8-ref`, etc. +6. [ ] **Reader.** End-to-end source → s-expression for the full + lexical syntax, with source-location side table. +7. [ ] **Writer.** `display`, `write`, `format`. +8. [ ] **`eval` + `apply` core.** Self-eval, symbol lookup, `if`, + `lambda`, top-level `define`, application. Every tail position + uses `%tail`. 9. [ ] **Remaining special forms.** `set!`, `begin`, `let`/`let*`/`letrec`, named let, `cond`, `and`/`or`, inner - `define` in `eval_body`. ~300 LOC. + `define` in `eval_body`. 10. [ ] **Primitives.** All 55 user-facing + 6 internal. Batches: arith/bitwise, list-core, bv/string, I/O+apply+error, record - ops. ~1000 LOC. -11. [ ] **Prelude.** Embed as string; parse and eval at startup. - ~80 Scheme LOC. + ops. +11. [ ] **Prelude.** Embed as bytevector literal; parse and eval at + startup. 12. [ ] **`define-record-type`.** Desugaring + record primitives. - ~250 LOC. -13. [ ] **`pmatch`.** Pattern matcher + clause driver. ~200 LOC. +13. [ ] **`pmatch`.** Pattern matcher + clause driver. 14. [ ] **End-to-end.** Run the self-hosted compiler under it. -Rough total: **~2–2.5k LOC C + ~80 Scheme LOC**. +Rough total: **~4–5k P1 LOC + ~80 Scheme LOC**.