commit e05d709edfb9f6f42bff02097a80ebb58282b4e6
parent f0972c4527e533f9195944889ae6cb4440d04a36
Author: Ryan Sepassi <rsepassi@gmail.com>
Date: Wed, 22 Apr 2026 07:17:31 -0700
P1.md: document the CALL/PROLOGUE contract explicitly

The previous wording left room for a prologue-less function to do
CALL+RET; that pattern hangs silently on aarch64/riscv64 (where
native CALL writes the return address to a link register that the
inner CALL then clobbers) while happening to work on amd64 (native
CALL pushes to the stack). The lisp.M1 mark_push_recurse bug in the
prior commit was exactly this shape.

Spec it as a hard rule: a function that executes a CALL must wrap
its body in PROLOGUE/EPILOGUE; leaf functions may RET but not CALL.
Note the amd64-happens-to-work asymmetry as a portability footgun
rather than something to rely on, and point at the tail-branch
substitute (li_br &callee ; B) for the "leaf wants to dispatch and
inherit the caller's return" use case.
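
For illustration, the three shapes discussed above might look roughly
like this in P1 notation. This is a sketch only: the labels `helper`,
`bad_wrapper`, `good_wrapper` and `tail_wrapper` are hypothetical, and
the `:label` definition syntax and `#` comments are assumed from M1
conventions rather than quoted from P1.md.

```
:bad_wrapper          # hypothetical: no PROLOGUE, yet it executes a CALL
    CALL %helper      # aarch64/riscv64: overwrites the link register
    RET               # branches to the address after the CALL above, i.e. here

:good_wrapper         # hypothetical: the shape the new rule requires
    PROLOGUE          # spills the incoming return address into the frame
    CALL %helper
    EPILOGUE          # restores it so RET can find it
    RET

:tail_wrapper         # hypothetical: leaf that dispatches without CALL
    li_br &helper
    B                 # helper's RET returns to tail_wrapper's caller
```

On amd64 the first shape happens to work because the native CALL pushes
to the stack; the other two are intended to behave the same on all
three arches.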
Diffstat:
1 file changed, 28 insertions(+), 3 deletions(-)
diff --git a/docs/P1.md b/docs/P1.md
@@ -224,9 +224,34 @@ SYSCALL # num in r0, args r1-r6, ret in r0
current stack pointer into a GPR — used e.g. for stack-balance assertions
around a call tree). The reverse (`MOV sp, rA`) is not provided; `sp`
is only mutated by `PROLOGUE`/`EPILOGUE`.
-- `CALL %label` pushes a return address (via the arch's native mechanism
- or the caller-emitted `PROLOGUE`, see below) and jumps. `RET` pops and
- jumps.
+- `CALL %label` transfers control to `%label` with a return address
+ established such that a subsequent `RET` returns to the instruction
+ after the `CALL`. The storage location of that return address is
+ implementation-defined (stack on amd64, link register on
+ aarch64/riscv64) and **must be treated as volatile across any inner
+ `CALL`**.
+
+ Concrete rule: **a function that itself executes a `CALL` must wrap
+ its body in a matching `PROLOGUE`/`EPILOGUE` pair.** `PROLOGUE` is
+ what spills the incoming return address into the frame; `EPILOGUE`
+ restores it so `RET` can find it. A leaf function (no `PROLOGUE`) is
+ permitted, but it may execute only `RET`, never `CALL`. A bare
+ `CALL` in a prologue-less function clobbers its own return address
+ on arches where the native mechanism uses a register rather than a
+ stack push, so its `RET` lands just after that `CALL` and loops.
+
+ The failure mode is platform-asymmetric: amd64's native `CALL`
+ pushes onto the stack so a prologue-less `CALL ; RET` happens to
+ work; aarch64 and riscv64 write the return address to a link
+ register and hang silently. Don't write code that relies on the
+ amd64-happens-to-work behavior.
+
+ Tail-call substitute for the "leaf wants to dispatch to another
+ function and inherit the caller's return" pattern:
+ `li_br &callee ; B` (unconditional branch, not `CALL`). The callee's
+ `RET` then returns directly to the current function's caller.
+
+ `RET` returns through that address: a stack pop or a link-register branch.
- `PROLOGUE` / `EPILOGUE` set up and tear down a frame with **k
callee-private scratch slots**. `PROLOGUE` is shorthand for
`PROLOGUE_N1` (one slot); `PROLOGUE_Nk` for k = 2, 3, 4 reserves that