scheme1 → shell.scm TODO
Checklist for getting lisp/shell.scm running under scheme1.
Workflow: every item is red-green TDD. Add a failing
tests/scheme1/NN-*.scm (with .expected-exit and/or .expected) first,
run the suite to confirm it fails for the expected reason, then
implement until green. Multi-arch suite (make test SUITE=scheme1)
must stay clean before moving on.
Audit: deviations and known issues
Everything below is a real bug, hack, or spec gap that must be addressed before calling scheme1 shippable.
Open bugs
- Prelude spawn reached through run errors with "unbound variable" in the parent. (run prog) from user code fails even though (spawn prog) inline at user level with the identical body works. Root cause not identified (apply_build_args walking the variadic list, closure env capture, or env extension with the dotted-tail param args are all suspects). See test tests/scheme1/45-shell-spawn.scm, which works around the bug by redefining spawn at user level. Until this is understood, the prelude's spawn and run are effectively unverified.
- No heap-exhaustion check. cons, alloc_hdr, and alloc_bytes now compare heap_next + bytes against :heap_end (initialized to heap_buf_ptr + HEAP_CAP_BYTES at startup) and abort via runtime_error on overflow. load_source and eval_prelude reject sources that would overrun READBUF_CAP_BYTES.
- No symtab-name copy bound. Name copies still go through alloc_bytes, but that path now errors cleanly when the heap arena is exhausted instead of silently scribbling into the symtab. intern's 1024-slot count check remains and routes through the same runtime_error.
- Bytevector-u8-set! / -ref / -copy / -copy! have no bounds check. All four now check 0 <= idx < length (or 0 <= start <= end <= length, plus the dst-side range for bytevector-copy!) and abort via runtime_error. make-bytevector and bytevector-grow! reject negative arguments through the same path.
- car/cdr of non-pair, quotient/remainder of zero, etc., are silent UB: same policy as above, no abort path.
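As an illustration of the heap-exhaustion item above, here is a minimal C sketch of a bump allocator that checks the arena limit before handing out memory. The names heap_next, heap_end, and HEAP_CAP_BYTES come from the text; the types, the capacity value, the function name alloc_bytes_checked, and the NULL return (standing in for the runtime_error abort) are assumptions made for this sketch, not the actual runtime code.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed capacity; the real HEAP_CAP_BYTES value is not given in this file. */
#define HEAP_CAP_BYTES 4096u

/* Hypothetical stand-ins for the runtime's arena pointers. */
static uint8_t heap_buf[HEAP_CAP_BYTES];
static uint8_t *heap_next = heap_buf;
static uint8_t *const heap_end = heap_buf + HEAP_CAP_BYTES;

/* Returns a pointer to `bytes` fresh bytes, or NULL when the arena cannot
   satisfy the request. The real runtime aborts via runtime_error instead
   of returning NULL. */
static void *alloc_bytes_checked(size_t bytes)
{
    /* Compare against the remaining space rather than computing
       heap_next + bytes, which could wrap around for a huge request. */
    size_t remaining = (size_t)(heap_end - heap_next);
    if (bytes > remaining)
        return NULL;
    void *p = heap_next;
    heap_next += bytes;
    return p;
}
```

Phrasing the check as "bytes > remaining" is a design choice of the sketch: it gives the same answer as comparing heap_next + bytes against heap_end for ordinary sizes, but stays well-defined when the request is close to the maximum size value.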
Spec features still missing
Per LISP.md and LISP-C.md, but not implemented:
- Special forms missing: set!, pmatch, cond's => arrow form. pmatch is called out by LISP-C.md as a built-in special form needed by the self-hosted compiler.
- Primitives missing (LISP.md lists them as required):
  - Equality: eqv?, equal? (we only have eq?)
  - Predicates: boolean?, integer?, string?, procedure?, record?, record-type?
  - Numeric: quotient, remainder, modulo, <=, >=, >, positive?, negative?, abs, min, max, bit-xor, bit-not, number->string, string->number
  - Pair / list: set-car!, set-cdr!, length, list-ref, map, for-each as primitives (we provide them via the prelude only)
  - Bytevector: bytevector-append, bytevector=?, string->symbol, symbol->string
- +, -, *, =, and < are 2-arg only. R7RS allows any arity. apply is variadic on the trailing list but otherwise unverified for arity edge cases.
- Type names are not bound by define-record-type. The TD is reachable only via the parameterized prims that close over it; no record-type-of, no way to inspect a TD from user code. Spec is ambiguous on this; LISP-C.md's example uses a generated <point-td> binding.
- shell.scm's port record-type, stdin/stdout/stderr ports, open-input/open-output/read-line/read-bytes/read-all/bv-concat-reverse/write-bytes/write-line are NOT in the prelude. Only the process-management half of shell.scm is ported.
- scheme1/prelude.scm is a strict subset of lisp/prelude.scm. Active set: <=, >=, negative?, abs, caar/cadr/cdar/cddr/caddr, list?, assoc, member, filter, fold, plus the inherited list/shell helpers. Commented-out placeholders for positive? (needs >), vector->list/list->vector (need make-vector/vector-ref/vector-set!/vector-length), and equal? (needs string?/vector? plus their ref/length) wait on the corresponding primitives.
Hacks and fragile invariants
These work today but are easy to break.
- Bytevector NUL-termination via headroom. bv_capacity_for returns the smallest power of two strictly greater than n. The byte at index length is the zero-init NUL terminator and we hand the raw data_ptr directly to syscalls expecting C strings (sys-openat, sys-execve, the per-arg pointers in build_execve_argv). If user code calls bytevector-u8-set! past length, that NUL is gone and the next syscall reads garbage. Capacity is never reset by bytevector-copy! or any other op, so the invariant only protects fresh / never-overwritten bytevectors.
- bytevector-grow! is a public primitive (bv_grow) that's effectively only there to make the doubling path testable (test 34). Not in R7RS, not in LISP.md. Either expose it as part of a documented mutable-bytevector API or delete and demote bv_grow to internal.
- %record-* primitives are exposed publicly in prim_table alongside the parameterized record entries. LISP-C.md says "internal, not part of the user-facing primitive list".
- PRIM size grew from 16 to 24 bytes uniformly to fit the parameterized data slot used by record ctors / preds / accessors / mutators. Plain primitives (sys-exit, cons, +, …) waste those 8 bytes per instance.
- apply modification: prim ptr is now passed in a1 alongside args in a0. All existing primitives ignore a1, but any future primitive that uses a1 for anything else will silently break.
- Symbol-table linear scan. intern walks the table from idx 0 on every call. LISP-C.md describes a 16384-slot open-addressing hash; we have a 1024-slot linear scan that exits with code 5 on overflow.
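To make the headroom rule from the first item concrete, here is a small C sketch of "smallest power of two strictly greater than n". The function name matches the text, but the signature and the overflow guard are assumptions; the real bv_capacity_for may be written differently.

```c
#include <stddef.h>
#include <stdint.h>

/* Sketch of the capacity rule: the smallest power of two strictly greater
   than n, so the byte at index n always lies inside the allocation and can
   hold the zero-initialized NUL terminator. Returns 0 when no such power
   of two is representable in size_t (an assumption of this sketch). */
static size_t bv_capacity_for(size_t n)
{
    if (n > SIZE_MAX / 2)
        return 0;
    size_t cap = 1;
    while (cap <= n)
        cap <<= 1;
    return cap;
}
```

Under this rule a 4-byte bytevector gets a capacity of 8 rather than 4, which is what leaves index 4 available for the terminator; a length that is already a power of two always moves up to the next one.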
Test suite caveats
Issues in the test files themselves that need fixing or revisiting before the suite can be considered authoritative.
- tests/scheme1/15-dot-symbol.scm: defines .foo (a leading . identifier). LISP.md says "a lone . is not a symbol — it's reserved for dotted-pair syntax", but the spec is silent on whether .foo is admissible. Behavior depends on whether the byte after . is whitespace/paren (handled by parse_list's peek). Useful as a regression test for the dotted-tail detector but not necessarily desired surface syntax.
- tests/scheme1/19-letstar.scm: comment claims "outer x; let*'s x must shadow inside the body" but the test only checks the inner shadow path. Nothing exercises that the outer x is not affected after the let* body returns.
- tests/scheme1/20-letrec.scm: uses (if n n (f #t)) to test letrec self-reference. Recurses once (n=44 → truthy → returns 44) so it doesn't actually trigger the recursive case. The comment acknowledges the workaround ("Without numeric primitives we terminate by passing #t at the recursive call"). Needs a real recursion test now that the let family + arith primitives are available; 21-letrec-recursion.scm partially fills this.
- tests/scheme1/22-named-let.scm: recursion is bounded by a flag (first) flipping from #t to #f. Deep iteration not exercised.
- tests/scheme1/27-apply.scm: only tests 2-arg (apply f arglist) and 3-arg (apply f x arglist). (apply f) is unspecified; (apply f a b … last) for N>3 is unverified.
- tests/scheme1/40-sys-argv.scm: hard-codes expected-exit = 2, the count of argv entries the runner happens to pass (./binary tests/scheme1/40-sys-argv.scm). Any change to scripts/run-tests.sh's invocation or a wrapper that injects extra args breaks this test silently.
- tests/scheme1/41-fileio.scm: opens itself by reading (car (cdr (sys-argv))) and passing the bytevector as a path. Relies on the bv_capacity_for headroom invariant for NUL termination (no explicit chars->bv). Doesn't exercise the (#f . errno) branch of sys-openat (e.g., a non-existent path). Hard-codes O_RDONLY = 0 and mode = 0 instead of using named constants.
- tests/scheme1/42-clone-wait.scm: bypasses sys-wait/decode-wait-status entirely; reads siginfo_t.si_status (offset 24) directly from the buffer with bytevector-u8-ref. Encodes Linux-x86_64-and-aarch64 siginfo layout; non-portable to other Linux ABIs and to any non-Linux target.
- tests/scheme1/43-prelude.scm: verifies for-each only by running (for-each (lambda (x) x) ys) and checking it doesn't error; doesn't check that for-each actually invokes the lambda for each element (no side-effect verification).
- tests/scheme1/44-shell-run.scm: name is misleading. It tests sys-wait + decode-wait-status against a sys-clone child but never calls run. run is what test 45 was supposed to cover, and it doesn't because of the spawn-via-run bug.
- tests/scheme1/45-shell-spawn.scm: works around the prelude spawn bug by redefining spawn at user level. The prelude's spawn/run are therefore covered by zero passing tests.
- tests/scheme1/38-record-internal-prims.scm: the %record-* primitives are tested via the Scheme surface; the only way to invoke them is through the public binding, which conflicts with LISP-C.md's "internal" classification.
- No test verifies (set-car! …) / (set-cdr! …): the primitives don't exist; spec requires them.
- No test verifies that mutating a literal pair ('(1 2 3)) is UB. Undefined behavior is policy, but the policy isn't pinned down by a test.
- No test verifies tail-call correctness on deep recursion: named let, letrec, and the eval/apply tail positions all rely on %tail/%tailr, but nothing recurses thousands of times to confirm no host-stack growth.
- No (define x …) followed by (set! x …) test because set! doesn't exist.
- No quoted-pair test ('(1 . 2)): only quoted lists are tested. The reader handles dotted pairs but no test pins this.
- tests/scheme1/16-cond.scm: verifies short-circuit in the positive direction (later truthy clauses don't fire). Doesn't verify that a (cond) with no matching clause and no else returns UNSPEC (or whatever the policy is; currently it does, but the spec leaves this unspecified).
Suggested next steps before shipping
In rough priority order:
- Track down and fix the prelude spawn-via-run bug; remove the workaround in test 45.
- Fill in the spec-required primitives (equal?, eqv?, set-car!, set-cdr!, the comparison family, the bytevector family, the number/string converters).
- Implement the missing special forms set! and pmatch.
- Port shell.scm's port record + I/O wrappers.
- Replace the 1024-slot linear-scan symtab with an open-addressing hash per LISP-C.md.
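For the last item, a rough C sketch of what an open-addressing intern could look like. Only the 16384-slot figure comes from the text (quoting LISP-C.md); the slot layout, the FNV-1a hash, linear probing, and the -1 return on a full table are all assumptions chosen to keep the sketch short, and the real design may differ on every one of them.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SYMTAB_SLOTS 16384u  /* slot count quoted from LISP-C.md; a power of two */

/* Hypothetical slot layout: name == NULL marks an empty slot. */
struct sym_slot {
    const char *name;
    size_t len;
};

static struct sym_slot symtab[SYMTAB_SLOTS];
static size_t symtab_count;

/* FNV-1a, used here only because it is short; the hash function the
   spec calls for is not stated in this document. */
static uint32_t hash_name(const char *s, size_t len)
{
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < len; i++) {
        h ^= (uint8_t)s[i];
        h *= 16777619u;
    }
    return h;
}

/* Returns the slot index for the name, inserting it if absent, or -1 when
   the table is full (the real runtime would route that to runtime_error).
   The caller keeps the name bytes alive in this sketch; the real intern
   copies them through alloc_bytes. */
static long intern(const char *s, size_t len)
{
    size_t i = hash_name(s, len) & (SYMTAB_SLOTS - 1);
    for (size_t probes = 0; probes < SYMTAB_SLOTS; probes++) {
        struct sym_slot *slot = &symtab[i];
        if (slot->name == NULL) {
            slot->name = s;
            slot->len = len;
            symtab_count++;
            return (long)i;
        }
        if (slot->len == len && memcmp(slot->name, s, len) == 0)
            return (long)i;
        i = (i + 1) & (SYMTAB_SLOTS - 1);  /* linear probing (assumed) */
    }
    return -1;
}
```

Compared with the current scan from idx 0, a lookup here starts at the hashed slot and normally touches only a few neighbours; the probe counter bounds the loop so that a full table is reported rather than scanned forever.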