libp1pp
Scope
libp1pp is a small portable utility library for P1pp programs, written in P1pp itself. It provides:
- M1PP control-flow macros that wrap P1 branches into structured forms
- Common byte and string primitives
- Integer parsing and formatting
- Character predicates
- Thin syscall wrappers and higher-level IO helpers
- A single-arena bump allocator
- Panic and assertion helpers
libp1pp is a single source file, p1pp.P1pp, composed via catm after the
target backend header and before user source:
catm P1-aarch64.M1pp p1pp.P1pp usersrc.P1pp > program.M1
Because p1pp.P1pp passes through M1PP, it freely mixes P1 code and M1PP macro
definitions.
Target and conventions
Width
libp1pp targets P1-64 only. Word size is 8 bytes. Pointer values, integer results, and syscall arguments are all one word.
Syscall numbers
libp1pp does not hard-code syscall numbers. It relies on the backend header
to provide the p1_sys_<name> data-word macros already defined by existing
backends (e.g. %p1_sys_read(), %p1_sys_write()). User code should not
issue syscalls through raw numbers; it calls the libp1pp wrappers instead.
Error style
IO functions follow kernel conventions: a negative return indicates an error
(typically -errno), a non-negative return is a success value.
Functions that return a pointer use 0 (NULL) to indicate failure.
Parsers return two words under the two-word direct-result convention:
(value, consumed). consumed == 0 means the input did not begin with a
syntactically valid token; the value word is then unspecified.
Functions with no meaningful result return 0 in a0.
String representation
libp1pp uses two string conventions, distinguished by the function name:
- (buf, len) pair: an explicit pointer/length pair. The default.
- cstr: a NUL-terminated byte pointer. Only functions whose name includes cstr expect or emit NUL termination.
libp1pp never mutates a caller-provided buffer it was not passed as an output parameter.
Allocation
libp1pp functions do not allocate. Anything that produces bytes writes into a caller-provided buffer whose capacity is the caller's responsibility. The single exception is the bump allocator, which allocates only from a region the caller explicitly installed.
Internal label namespace
libp1pp reserves the global label prefix libp1pp__ for all internal
state and helper labels — bump allocator cursor/base/cap words, internal
scratch buffers used by print_int / print_hex, private helper routines.
User code must not define globals beginning with libp1pp__, and must not
reference them directly; everything libp1pp exposes is reachable through
its documented functions and macros.
Labels inside libp1pp functions are scope-local hex2++ dotted labels
(:.loop, &.done) and never appear in the global namespace, so they
cannot collide with user labels. The libp1pp__ prefix matters only for
file-scope data and helper functions.
Public entry points (the functions and macros listed in this document,
such as memcpy, bump_alloc, %if_eq) are unprefixed. A user who
sees an undefined-label error for a libp1pp__ name at link time has
almost certainly forgotten to catm p1pp.P1pp into the build.
Initialization
libp1pp requires no global init step at program entry. Subsystems are either self-initializing or require an explicit per-subsystem init call, documented with that subsystem.
The only subsystem that requires explicit init is the bump allocator:
bump_alloc called before bump_init returns 0 (the "arena
exhausted" sentinel) because no arena is installed yet. Every other
libp1pp function is callable from the first instruction of p1_main.
p1_main itself inherits the portable entry contract from P1:
a0 = argc, a1 = argv. libp1pp does not wrap or interpose on
p1_main.
Control-flow macros
All control-flow macros take braced blocks as arguments. The braces are M1PP argument delimiters; they are stripped on substitution.
There are two flavors:
- Unscoped forms (%if_<cc>, %while_<cc>, %for_lt, %loop) use M1PP per-expansion @-mangled labels for their internal targets. They emit no hex2++ .scope and do not interact with %break / %continue. Use these when the body does not need mid-body exit.
- Scoped forms (%loop_scoped, %while_scoped_<cc>, %for_lt_scoped) open a nested hex2++ .scope and define dotted labels .top and .end inside it. The generic %break and %continue macros resolve through hex2++'s innermost-out scope lookup, so they always target the nearest enclosing scoped loop.
Condition suffixes
Each conditional family is expanded once per condition. Suffixes:
- Two-operand: eq, ne, lt, ltu
- Zero-operand (implicit compare against zero): eqz, nez, ltz
lt and ltz are signed comparisons. ltu is unsigned. These mirror the
P1 branch opcodes BEQ, BNE, BLT, BLTU, BEQZ, BNEZ, BLTZ.
%if_<cc> / %ifelse_<cc>
%if_eq(ra, rb, { body })
%if_ne(ra, rb, { body })
%if_lt(ra, rb, { body })
%if_ltu(ra, rb, { body })
%if_eqz(ra, { body })
%if_nez(ra, { body })
%if_ltz(ra, { body })
%ifelse_eq(ra, rb, { tblk }, { fblk })
%ifelse_ne(ra, rb, { tblk }, { fblk })
%ifelse_lt(ra, rb, { tblk }, { fblk })
%ifelse_ltu(ra, rb, { tblk }, { fblk })
%ifelse_eqz(ra, { tblk }, { fblk })
%ifelse_nez(ra, { tblk }, { fblk })
%ifelse_ltz(ra, { tblk }, { fblk })
%if_<cc> executes the block when the condition is true and falls through
otherwise. %ifelse_<cc> executes tblk on true and fblk on false, then
falls through to the code after the macro.
Neither form establishes a new frame or changes sp. A block that issues
a CALL must sit inside a function that has already established a frame
with ENTER. Neither form opens a .scope, so %break / %continue
inside the body resolve through to the enclosing scoped loop (if any).
%while_<cc> / %do_while_<cc>
%while_eq(ra, rb, { body })
%while_ne(ra, rb, { body })
%while_lt(ra, rb, { body })
%while_ltu(ra, rb, { body })
%while_eqz(ra, { body })
%while_nez(ra, { body })
%while_ltz(ra, { body })
%do_while_eq(ra, rb, { body })
%do_while_ne(ra, rb, { body })
%do_while_lt(ra, rb, { body })
%do_while_ltu(ra, rb, { body })
%do_while_eqz(ra, { body })
%do_while_nez(ra, { body })
%do_while_ltz(ra, { body })
%while_<cc> tests the condition before the body; %do_while_<cc> after.
In both, the condition is stated in the positive sense ("continue while ra == rb").
The operand registers are re-read on every iteration, so body may update
them.
All %while_<cc> macros share a single lowering pattern so they work
uniformly across conditions, including lt, ltu, and ltz which have no
inverted P1 branches.
These are unscoped: they do not support %break / %continue. Use the
%while_scoped_<cc> family below if the body needs mid-body exit.
%for_lt
%for_lt(i_reg, n_reg, { body })
Counts i_reg from 0 up to but not including n_reg, with step +1,
under signed comparison. On entry, i_reg is set to 0; after each body
iteration, i_reg is incremented by 1; the loop exits once
i_reg < n_reg is false.
n_reg is re-read each iteration, so body may update the bound. Body may
read i_reg but must not modify it. If body issues a CALL,
the caller is responsible for keeping i_reg live across the call; in
practice, this means i_reg should be a callee-saved register (s0–s3)
or explicitly spilled.
libp1pp does not provide an unsigned variant, an immediate-bound variant, a
step-by-k variant, or a count-down variant. Pointer iteration and
other shapes are better expressed as %while_<cc> plus explicit
increments.
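The counting behavior described above can be modeled in C. This is a sketch of the control flow only, not libp1pp source: for_lt_model, for_lt_body, sum_body, and clamp_body are hypothetical names, and the model assumes the bound is tested before the first iteration.

```c
#include <stdint.h>

/* Hypothetical stand-in for the braced body; it may update the bound via n. */
typedef void (*for_lt_body)(int64_t i, int64_t *n, void *ctx);

/* Models %for_lt(i_reg, n_reg, { body }); returns the final value of i. */
int64_t for_lt_model(int64_t *n, for_lt_body body, void *ctx) {
    int64_t i = 0;          /* i_reg is set to 0 on entry */
    while (i < *n) {        /* signed compare; the bound is re-read each pass */
        body(i, n, ctx);
        i += 1;             /* step is always +1 */
    }
    return i;
}

/* Sample body: accumulates the loop index into *ctx. */
void sum_body(int64_t i, int64_t *n, void *ctx) {
    (void)n;
    *(int64_t *)ctx += i;
}

/* Sample body: lowers the bound mid-loop to show that it is re-read. */
void clamp_body(int64_t i, int64_t *n, void *ctx) {
    (void)ctx;
    if (i == 2) *n = 3;
}
```

Because the comparison is signed, a negative bound runs the body zero times in this model.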
%loop
%loop({ body })
An unconditional unscoped loop with no built-in exit. The body runs
forever unless it transfers control out by another mechanism. Use
%loop_scoped if the body needs %break.
Scoped loops: %loop_scoped, %while_scoped_<cc>, %for_lt_scoped
%loop_scoped({ body })
%while_scoped_eq(ra, rb, { body })
%while_scoped_ne(ra, rb, { body })
%while_scoped_lt(ra, rb, { body })
%while_scoped_ltu(ra, rb, { body })
%while_scoped_eqz(ra, { body })
%while_scoped_nez(ra, { body })
%while_scoped_ltz(ra, { body })
%for_lt_scoped(i_reg, n_reg, { body })
Each scoped loop opens a hex2++ .scope around its expansion and defines
two dotted labels inside it: :.top at the point where %continue should
land, and :.end immediately after the loop. For top-tested
%while_scoped_<cc>, .top names the condition test; for
%for_lt_scoped, .top names the increment-and-test block; for
%loop_scoped, .top names the head of the body.
%break
Emits B &.end. Transfers control to the .end of the innermost
enclosing scoped loop, resolved by hex2++'s scope-walk.
%continue
Emits B &.top. Transfers control to the innermost enclosing scoped
loop's re-test / increment point.
%break and %continue work from arbitrary depth inside a scoped loop,
including inside %if_<cc>, %ifelse_<cc>, or another nested loop's
body — because none of those forms open their own .scope, the lookup
walks past them to the enclosing scoped loop.
A nested scoped loop does open its own .scope and shadows the outer
.top / .end. Inside the inner loop, %break / %continue target the
inner loop. To break out of an outer loop from inside an inner one, use
a manual branch or a status flag; libp1pp does not provide
named-label break.
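The status-flag approach to leaving an outer loop has the same shape in any language whose break targets only the innermost loop. A minimal C sketch of that pattern follows; find_in_grid and its parameters are hypothetical and are not part of libp1pp.

```c
#include <stddef.h>

/* Searches a rows x cols grid stored row-major. Returns 1 and stores the
 * position if target is present, else returns 0. */
int find_in_grid(const int *grid, size_t rows, size_t cols, int target,
                 size_t *out_row, size_t *out_col) {
    int found = 0;                      /* status flag visible to both loops */
    for (size_t r = 0; r < rows; r++) {
        for (size_t c = 0; c < cols; c++) {
            if (grid[r * cols + c] == target) {
                *out_row = r;
                *out_col = c;
                found = 1;
                break;                  /* leaves the inner loop only */
            }
        }
        if (found) break;               /* outer loop checks the flag and exits */
    }
    return found;
}
```

In libp1pp terms, the inner break corresponds to %break inside the inner scoped loop, and the flag test corresponds to an %if_nez around a second %break in the outer loop body.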
Unscoped forms (%while_<cc>, %for_lt, %loop) are preferred when the
body does not need %break or %continue. They emit no .scope and use
per-expansion local labels that cannot collide.
Frame locals
libp1pp does not introduce a new local-variable macro. Use M1PP's %struct
directly: its 8-byte stride matches WORD on P1-64, and it already
synthesizes %name.SIZE for ENTER.
%struct parse_f { state cursor endp tmp }
:parse_one
    ENTER %parse_f.SIZE
    ST a0, [sp + %parse_f.state]
    ST a1, [sp + %parse_f.cursor]
    ...
    ERET
If the function stages stack-passed outgoing arguments for calls with more than four word arguments, reserve the low-addressed fields for that staging:
%struct parse_f { _o0 _o1 state cursor endp tmp }
The caller places outgoing argument word k at [sp + k * 8] immediately
before the CALL, then reads locals from higher offsets. libp1pp does not
otherwise enforce this convention.
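The offset arithmetic behind this layout can be written out in C. This is only an illustration: it assumes %struct assigns offsets in declaration order starting at 0 (consistent with the 8-byte stride and the low-addressed staging fields described above), and the PARSE_F_* names and outgoing_arg_offset are hypothetical.

```c
/* Hypothetical C mirror of: %struct parse_f { _o0 _o1 state cursor endp tmp } */
enum { WORD = 8 };                  /* P1-64 word size in bytes */

enum {
    PARSE_F_O0     = 0 * WORD,      /* staging slot for outgoing word 0 */
    PARSE_F_O1     = 1 * WORD,      /* staging slot for outgoing word 1 */
    PARSE_F_STATE  = 2 * WORD,      /* locals start above the staging slots */
    PARSE_F_CURSOR = 3 * WORD,
    PARSE_F_ENDP   = 4 * WORD,
    PARSE_F_TMP    = 5 * WORD,
    PARSE_F_SIZE   = 6 * WORD       /* the byte count handed to ENTER */
};

/* Offset from sp of staged outgoing argument word k. */
int outgoing_arg_offset(int k) {
    return k * WORD;
}
```

With two staging slots, the first real local lands at offset 16 and the frame is 48 bytes.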
Function definition
%fn(name, size, { body })
Defines a non-leaf function named name with size bytes of
frame-local storage. Expands to:
- a global label :name at the function entry,
- a .scope push, so dotted labels inside body (:.start, :.done) are local to the function and never collide with sibling functions,
- an %enter(size) prologue,
- the body,
- an %eret() epilogue,
- a matching .endscope.
%fn does not itself define .top or .end, so a bare %break /
%continue directly in body would resolve outside the function (or
fail to resolve) — they should appear only inside a nested scoped loop.
Example:
%struct parse_f { state cursor }
%fn(parse_number, %parse_f.SIZE, {
    ST a0, [sp + %parse_f.state]
    ST a1, [sp + %parse_f.cursor]
    ...
    BEQZ t0, &.done
    ...
:.done
    LD a0, [sp + %parse_f.state]
})
size may be a literal byte count, a %struct SIZE reference, or any
M1PP-time integer expression that the backend %enter macro accepts.
%fn2(name, { local1 local2 ... }, { body })
Like %fn, but the second argument is a braced list of local names
instead of a byte frame size. Synthesizes a name_FRAME %struct (one
8-byte slot per local), opens both a hex2++ .scope and an M1PP
%frame named after the function, and sizes the stack frame from
%name_FRAME.SIZE.
Inside the body these helpers resolve against the enclosing %frame:
%local(slot)        byte offset of local `slot`
%stl(reg, slot)     store reg into local `slot`
%ldl(reg, slot)     load local `slot` into reg
A zero-local function uses {} for the locals list.
Leaf functions that need no frame do not use %fn: they write the
entry label, body, and %ret() directly, and may optionally wrap the
body in .scope / .endscope if they want scope-local dotted labels.
Memory and strings
Byte-buffer primitives
memcpy(dst, src, n) -> dst
memmove(dst, src, n) -> dst
memset(dst, byte, n) -> dst
memcmp(a, b, n) -> sign # -1 / 0 / 1
These four entries are the canonical compiler-builtin mem* runtime for every build chain in this tree. cc.scm + libp1pp, cc-libc (libp1pp + libc), tcc-cc, and tcc-gcc all resolve bare extern memcpy against libp1pp here; the vendored mes-libc is flattened with its own copies omitted so the symbols are not duplicated at hex2++ time, and the gcc-built tcc-gcc binary links tcc/cc/mem.c for the same reason.
memcpy does not support overlapping ranges where dst > src && dst < src + n;
use memmove for overlap.
memmove picks the safe direction based on dst vs src.
memset stores only the low 8 bits of byte.
memcmp performs an unsigned byte-wise three-way compare and returns
-1, 0, or 1. It stops at the first differing byte.
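The direction choice in memmove and the three-way result of memcmp can be modeled in C. This is a reference sketch of the documented behavior, not the libp1pp implementation; the model_ prefix is hypothetical and only avoids clashing with the C library names.

```c
#include <stddef.h>
#include <stdint.h>

/* Copies n bytes, choosing a direction that is safe when the ranges overlap. */
void *model_memmove(void *dst, const void *src, size_t n) {
    unsigned char *d = dst;
    const unsigned char *s = src;
    uintptr_t da = (uintptr_t)dst, sa = (uintptr_t)src;
    if (da > sa && da < sa + n) {
        /* dst starts inside src's range: copy from the last byte backward */
        while (n > 0) {
            n--;
            d[n] = s[n];
        }
    } else {
        /* disjoint, or dst below src: a forward copy is safe */
        for (size_t i = 0; i < n; i++) d[i] = s[i];
    }
    return dst;
}

/* Unsigned byte-wise compare; returns -1, 0, or 1 at the first difference. */
int model_memcmp(const void *a, const void *b, size_t n) {
    const unsigned char *p = a, *q = b;
    for (size_t i = 0; i < n; i++) {
        if (p[i] != q[i]) return p[i] < q[i] ? -1 : 1;
    }
    return 0;
}
```

Note that byte 0x80 compares greater than 0x01 here, because the comparison is unsigned.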
NUL-terminated strings
strlen(cstr) -> n
streq(a_cstr, b_cstr) -> 0 or 1
strcmp(a_cstr, b_cstr) -> sign # -1 / 0 / 1
strlen returns the byte count up to but not including the terminating NUL.
streq returns 1 iff the two strings are byte-equal including length.
strcmp compares byte-wise until either a differing byte is found or one
side's NUL is reached, and returns the sign of the first difference (the
shorter string compares less when it is a prefix of the other).
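A C model of the comparison rule may help when checking edge cases such as prefixes. It is a sketch, not libp1pp source: the model_ names are hypothetical, and it assumes bytes compare as unsigned values, as documented for memcmp.

```c
/* Returns -1, 0, or 1: the sign of the first differing byte. */
int model_strcmp(const char *a, const char *b) {
    const unsigned char *p = (const unsigned char *)a;
    const unsigned char *q = (const unsigned char *)b;
    /* advance while both bytes match and neither string has ended */
    while (*p != 0 && *p == *q) {
        p++;
        q++;
    }
    if (*p == *q) return 0;      /* both reached NUL together */
    return *p < *q ? -1 : 1;     /* a NUL (0) sorts below any other byte */
}

/* Returns 1 iff the strings are byte-equal including length. */
int model_streq(const char *a, const char *b) {
    return model_strcmp(a, b) == 0;
}
```

Because the terminating NUL is the smallest byte value, a string that is a proper prefix of another compares less without any special case.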
Integer parsing and formatting
Parsers
parse_dec(buf, len) -> (value, consumed)
parse_hex(buf, len) -> (value, consumed)
Both use the two-word direct-result convention: a0 holds the parsed
integer value and a1 holds the number of bytes consumed. consumed == 0
means the input did not start with a valid literal.
parse_dec accepts an optional leading - followed by one or more decimal
digits. On overflow, the result wraps modulo 2^64;
overflow detection is not part of the portable contract.
parse_hex accepts one or more hex digits (0-9, a-f, A-F). It does
not consume a 0x prefix; callers handle any prefix themselves. The
result is the unsigned value of the parsed digits, truncated to 64 bits.
Parsers do not skip leading whitespace.
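The parser contract can be modeled in C as follows. This is a reference sketch, not libp1pp source: parse_result, hex_val, and the model_ functions are hypothetical names, and the struct stands in for the (a0, a1) result pair. The value word is unspecified when consumed is 0; this model happens to return 0 there.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct { uint64_t value; size_t consumed; } parse_result;

/* Optional '-' then one or more decimal digits; wraps modulo 2^64. */
parse_result model_parse_dec(const char *buf, size_t len) {
    parse_result r = {0, 0};
    size_t i = 0;
    int neg = 0;
    uint64_t v = 0;
    if (i < len && buf[i] == '-') { neg = 1; i++; }
    size_t first_digit = i;
    while (i < len && buf[i] >= '0' && buf[i] <= '9') {
        v = v * 10 + (uint64_t)(buf[i] - '0');   /* unsigned math wraps */
        i++;
    }
    if (i == first_digit) return r;   /* no digits, so nothing is consumed */
    r.value = neg ? (uint64_t)0 - v : v;
    r.consumed = i;
    return r;
}

/* Maps a hex digit to its value, or -1 if c is not a hex digit. */
static int hex_val(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* One or more hex digits, no 0x prefix handling; truncates to 64 bits. */
parse_result model_parse_hex(const char *buf, size_t len) {
    parse_result r = {0, 0};
    size_t i = 0;
    while (i < len && hex_val(buf[i]) >= 0) {
        r.value = (r.value << 4) | (uint64_t)hex_val(buf[i]);
        i++;
    }
    r.consumed = i;
    return r;
}
```

Under this model, passing "0x10" to the hex parser consumes only the leading "0", which is why callers strip any prefix first.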
Formatters
fmt_dec(buf, value) -> n_bytes
fmt_hex(buf, value) -> n_bytes
Both write a human-readable representation into buf, starting at offset
0, and return the number of bytes written. Neither writes a terminating
NUL.
fmt_dec emits a signed decimal representation: a leading - for
negative values, then one or more decimal digits. At most 20 bytes are
written.
fmt_hex emits an unsigned lowercase hex representation with no prefix
and no leading zeros (except that 0 is rendered as 0). At most 16
bytes are written.
Callers must provide a buffer at least as large as the documented maximum.
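The documented maxima follow from the digit counts, which a C model makes easy to check. This is a sketch of the stated behavior, not libp1pp source, and the model_ names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Signed decimal, no NUL terminator; writes at most 20 bytes. */
size_t model_fmt_dec(char *buf, int64_t value) {
    char tmp[20];
    size_t n = 0, out = 0;
    uint64_t mag = (uint64_t)value;
    if (value < 0) {
        buf[out++] = '-';
        mag = (uint64_t)0 - mag;    /* magnitude; also correct for INT64_MIN */
    }
    do {                            /* digits come out least significant first */
        tmp[n++] = (char)('0' + mag % 10);
        mag /= 10;
    } while (mag != 0);
    while (n > 0) buf[out++] = tmp[--n];
    return out;
}

/* Unsigned lowercase hex, no prefix, no leading zeros; at most 16 bytes. */
size_t model_fmt_hex(char *buf, uint64_t value) {
    char tmp[16];
    size_t n = 0, out = 0;
    do {
        tmp[n++] = "0123456789abcdef"[value & 15];
        value >>= 4;
    } while (value != 0);
    while (n > 0) buf[out++] = tmp[--n];
    return out;
}
```

The 20-byte worst case for decimal is the most negative 64-bit value: one sign byte plus 19 digits.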
Character predicates
All predicates take a single one-byte value (passed as a word; the high
bits are ignored) and return 1 or 0.
is_digit(c) -> 0 or 1 # '0'..'9'
is_hex_digit(c) -> 0 or 1 # 0-9, a-f, A-F
is_space(c) -> 0 or 1 # ' ', '\t', '\n', '\r', '\v', '\f'
is_alpha(c) -> 0 or 1 # a-z, A-Z
is_alnum(c) -> 0 or 1 # is_alpha OR is_digit
Predicates are functions.
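The predicate ranges, including the rule that high bits of the argument word are ignored, can be expressed in C. This is a reference sketch of the table above, not libp1pp source; the model_ names are hypothetical.

```c
#include <stdint.h>

/* Each predicate masks the argument word down to its low byte first. */
int model_is_digit(uint64_t c) {
    c &= 0xff;
    return c >= '0' && c <= '9';
}

int model_is_hex_digit(uint64_t c) {
    c &= 0xff;
    return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f') ||
           (c >= 'A' && c <= 'F');
}

int model_is_space(uint64_t c) {
    c &= 0xff;
    return c == ' ' || c == '\t' || c == '\n' || c == '\r' ||
           c == '\v' || c == '\f';
}

int model_is_alpha(uint64_t c) {
    c &= 0xff;
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

int model_is_alnum(uint64_t c) {
    return model_is_alpha(c) || model_is_digit(c);
}
```

A caller that loads a byte into a wider register without clearing the upper bits still gets the same answer under this model.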
IO
Raw syscall wrappers
sys_read(fd, buf, len) -> n # bytes read; 0 at EOF; <0 error
sys_write(fd, buf, len) -> n # bytes written; <0 error
sys_open(path_cstr, flags, mode)
-> fd # fd >= 0 on success; <0 error
sys_close(fd) -> r # 0 on success; <0 error
sys_exit(code) -> ! # does not return
These are thin wrappers over the P1 SYSCALL op. They set the syscall
number themselves using the backend's %p1_sys_<name> data-word macros,
marshal arguments into the syscall-argument registers, and return the raw
kernel return value unchanged.
sys_open is a logical open: the backend may implement it via open or
openat(AT_FDCWD, ...) as appropriate for the target.
sys_exit terminates the process with the low 8 bits of code as the
exit status. It never returns.
No wrapper interprets the negative return as a specific errno. Callers
that need such detail inspect a0 directly.
Print helpers
print(buf, len) -> r # 0 on success; <0 error
println(buf, len) -> r # writes buf then "\n"
print_cstr(cstr) -> r # writes strlen(cstr) bytes
print_int(value) -> r # decimal
print_hex(value) -> r # hex, no prefix
eprint(buf, len) -> r
eprintln(buf, len) -> r
eprint_cstr(cstr) -> r
print* helpers write to fd 1; eprint* to fd 2. All return 0 on a
successful write of all bytes, or a negative value if the underlying
sys_write reported an error. A partial write is retried until complete
or the kernel returns an error.
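The retry-on-partial-write rule can be sketched in C. This is a model under stated assumptions, not libp1pp source: write_fn, model_write_all, sink, and chunk_writer are hypothetical, and the writer is injected as a callback so that the loop can be exercised without real syscalls. The source does not say what happens if the kernel reports 0 bytes written; this model would simply retry.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for sys_write: returns bytes written, or a negative error. */
typedef long (*write_fn)(int fd, const void *buf, size_t len, void *ctx);

/* Writes all len bytes, retrying after partial writes.
 * Returns 0 on success, or the negative value the writer reported. */
long model_write_all(write_fn w, void *ctx, int fd, const char *buf, size_t len) {
    size_t done = 0;
    while (done < len) {
        long n = w(fd, buf + done, len - done, ctx);
        if (n < 0) return n;        /* propagate the error unchanged */
        done += (size_t)n;          /* partial write: continue after it */
    }
    return 0;
}

/* Test double: accepts at most 3 bytes per call into a fixed-size sink. */
typedef struct { char data[64]; size_t used; } sink;

long chunk_writer(int fd, const void *buf, size_t len, void *ctx) {
    sink *s = ctx;
    size_t n = len < 3 ? len : 3;
    (void)fd;
    if (s->used + n > sizeof s->data) return -28;   /* arbitrary negative code */
    memcpy(s->data + s->used, buf, n);
    s->used += n;
    return (long)n;
}
```

With the 3-byte writer, a 12-byte message takes four calls and still returns 0, matching the "all bytes or an error" contract.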
print_int and print_hex render into a small internal stack buffer,
then write. They allocate no heap memory.
File helpers
read_file(path_cstr, buf, cap)
-> n # bytes read, or -1
Opens path_cstr read-only, reads up to cap bytes into buf, and
closes the fd. Returns the number of bytes read on success, or -1 if
the file could not be opened, a read failed, or the file exceeds cap
(in which case buf may have been partially written).
write_file(path_cstr, buf, len)
-> r # 0 on success; -1 on error
Creates or truncates path_cstr, writes len bytes from buf, and
closes the fd. Returns 0 on success or -1 if any step failed. The
created file's mode is implementation-defined but intended to be a
reasonable default (typically 0644).
Bump allocator
libp1pp provides a single global bump allocator. Memory is carved from a caller-supplied region; libp1pp does not own or reserve storage itself.
bump_init(base, cap) -> 0
Installs [base, base + cap) as the live arena and sets the cursor to
base. Discards any prior state. base should be word-aligned; cap
should be a multiple of 8. libp1pp does not validate these.
bump_alloc(n) -> ptr # 0 on exhaustion
Advances the cursor by n bytes rounded up to the next multiple of 8 and
returns the pre-advance cursor. Returns 0 and leaves the cursor
unchanged if the rounded-up request would exceed the arena.
Returned memory is not zeroed. Callers that need zero-initialized memory must call memset themselves.
bump_mark() -> saved
Returns the current cursor value as an opaque word.
bump_release(saved) -> 0
Rewinds the cursor to a value previously returned by bump_mark. Any
pointers handed out since that mark become invalid. Passing a value that
was not produced by bump_mark against the currently installed arena is
undefined behavior.
bump_reset() -> 0
Rewinds the cursor to the arena's base.
libp1pp provides exactly one arena.
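The allocator contract can be captured in a small C model. This is a sketch of the documented behavior, not libp1pp source: the model_ names and the three static state words are hypothetical (the real state lives behind libp1pp__ labels). All state starts at zero, which reproduces the rule that bump_alloc returns 0 before bump_init.

```c
#include <stdint.h>

/* Arena state; zero means "no arena installed". */
static uintptr_t arena_base, arena_end, arena_cursor;

uintptr_t model_bump_init(void *base, uintptr_t cap) {
    arena_base = (uintptr_t)base;
    arena_end = arena_base + cap;
    arena_cursor = arena_base;          /* discards any prior state */
    return 0;
}

void *model_bump_alloc(uintptr_t n) {
    uintptr_t rounded = (n + 7) & ~(uintptr_t)7;   /* next multiple of 8 */
    uintptr_t p = arena_cursor;
    if (rounded < n) return 0;                     /* rounding overflowed */
    if (rounded > arena_end - arena_cursor) return 0;  /* cursor unchanged */
    arena_cursor += rounded;
    return (void *)p;                   /* the pre-advance cursor */
}

uintptr_t model_bump_mark(void) {
    return arena_cursor;
}

uintptr_t model_bump_release(uintptr_t saved) {
    arena_cursor = saved;               /* later allocations become invalid */
    return 0;
}

uintptr_t model_bump_reset(void) {
    arena_cursor = arena_base;
    return 0;
}
```

A typical pattern is to take a mark before a batch of temporary allocations and release it afterwards, so the arena returns to its earlier fill level without a full reset.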
Panic and assertions
panic
panic(msg_cstr) -> !
Writes msg_cstr followed by "\n" to fd 2, then calls sys_exit(1).
Does not return.
User code is encouraged to use panic for its own invariant violations.
%assert_<cc> macros
%assert_eq(ra, rb, msg_label)
%assert_ne(ra, rb, msg_label)
%assert_lt(ra, rb, msg_label)
%assert_ltu(ra, rb, msg_label)
%assert_eqz(ra, msg_label)
%assert_nez(ra, msg_label)
%assert_ltz(ra, msg_label)
Each macro asserts that the named condition holds and calls panic with
msg_label (a NUL-terminated string label in the program image) if it
does not. They lower to a B<cc> past an LA a0, msg_label /
CALL &panic sequence, so the non-failure path adds no runtime cost
beyond the original branch.
Because the failure path issues a CALL, %assert_* may be used only in
functions that have established a frame with ENTER.