boot2

Playing with the boostrap
git clone https://git.ryansepassi.com/git/boot2.git
Log | Files | Refs | README

commit dd05e292cf67df0b1d60b89a27a4e3b0182fd508
parent f65c910837279a40cbb6635a374d94c506a81d8f
Author: Ryan Sepassi <rsepassi@gmail.com>
Date:   Sat, 25 Apr 2026 22:50:53 -0700

cc: implement lex / pp / cg / parse modules

Four modules brought up to working state by parallel agents against
the contracts in docs/CC-INTERNALS.md and docs/CC-CONTRACTS.md.

cc/lex.scm (~650 lines, 16 fixtures):
  Bytestream → token list. Folds trigraph translation and `\<newline>`
  line splicing into a single `%lex-peek` primitive. Emits NL tokens
  at every source line per pp's contract.

cc/pp.scm (~700 lines, 22 hand-built-token fixtures + 1 e2e):
  Token list → expanded token list. Object/function/variadic macros,
  `#`-stringize, `##`-paste, full conditional stack, hide-set per C11
  6.10.3.4 (constructive), built-ins, full constant-expression
  evaluator. `#include` rejected per CC.md.

cc/cg.scm (~700 lines, 15 fixtures):
  Codegen state + emission API. Stack-machine to frame slots; no
  register allocator. Slot offsets emitted symbolically as
  `(+ %<fn>__SO N)` with `__SO` macro defined at fn-end carrying the
  outgoing-arg staging size — solves the staging-vs-locals layout
  ordering. Structured CFG via libp1pp `%fn` / `%ifelse_nez` /
  `%loop_tag` / `%break` / `%continue`. Mangling `:cc__<name>` per
  CC-CONTRACTS §5; entry stub `:p1_main` calls `:cc__main`.

  Stretch validation: 7 of 15 cg fixtures successfully passed through
  m1pp + the existing P1 toolchain to native aarch64 ELF and ran with
  correct exit codes — including the phase-1 milestone `return argc`
  binary.

cc/parse.scm (~900 lines, 15 fixtures via cg-trace mock):
  Recursive descent + Pratt expression parser; spiral declarator via
  continuation-passing. Owns type promotion / arith-conv / lval→rval
  per CC-CONTRACTS §4.1. Parser maintains its own loop-tag stack and
  passes typedef-name set in lockstep with the scope chain.

  tests/cc-parse/cg-trace.scm: swap-in mock of cg's API. Records each
  call as a Scheme list per CC-CONTRACTS §2.2.

  tests/cc-parse/mini-prelude.scm: 2KB subset of scheme1/prelude.scm
  used at parse test time. Full prelude + util + data + cg-trace +
  parse exceeds scheme1's 64KB source-buffer cap; parse tests don't
  use shell.scm bits.

Open contract drift (deferred):
  - cg-loop tag protocol: cg returns a tag from cg-loop, but parse
    needs a tag *before* body-thunk runs (for break/continue inside).
    Parse pre-allocates its own tag and ignores cg-loop's return.
    Fix: thread cg's tag into the body-thunk's signature.
  - cg-assign doesn't auto-cast rhs to lhs type; parser cannot peek
    deeper than vstack top to find lhs's type. Fix: either expose a
    cg-peek-deep API, or move the cast into cg-assign.
  Neither is exercised by the cg-trace mock; both will surface during
  end-to-end parse→cg integration. Tracked for the integration pass.

Diffstat:
Mcc/cg.scm | 710+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++------
Mcc/lex.scm | 650++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++---
Mcc/parse.scm | 1051+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++------
Mcc/pp.scm | 777++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-----
Atests/cc-cg/00-fn-empty.expected | 14++++++++++++++
Atests/cc-cg/01-return-imm.expected | 14++++++++++++++
Atests/cc-cg/01-return-imm.scm | 9+++++++++
Atests/cc-cg/02-one-param.expected | 17+++++++++++++++++
Atests/cc-cg/02-one-param.scm | 13+++++++++++++
Atests/cc-cg/03-two-params.expected | 18++++++++++++++++++
Atests/cc-cg/03-two-params.scm | 15+++++++++++++++
Atests/cc-cg/04-binop-add.expected | 18++++++++++++++++++
Atests/cc-cg/04-binop-add.scm | 11+++++++++++
Atests/cc-cg/05-load-binop-store.expected | 26++++++++++++++++++++++++++
Atests/cc-cg/05-load-binop-store.scm | 22++++++++++++++++++++++
Atests/cc-cg/06-if.expected | 20++++++++++++++++++++
Atests/cc-cg/06-if.scm | 14++++++++++++++
Atests/cc-cg/07-ifelse.expected | 21+++++++++++++++++++++
Atests/cc-cg/07-ifelse.scm | 14++++++++++++++
Atests/cc-cg/08-while-break-continue.expected | 20++++++++++++++++++++
Atests/cc-cg/08-while-break-continue.scm | 20++++++++++++++++++++
Atests/cc-cg/09-call.expected | 17+++++++++++++++++
Atests/cc-cg/09-call.scm | 13+++++++++++++
Atests/cc-cg/10-string.expected | 22++++++++++++++++++++++
Atests/cc-cg/10-string.scm | 15+++++++++++++++
Atests/cc-cg/11-global-var.expected | 23+++++++++++++++++++++++
Atests/cc-cg/11-global-var.scm | 12++++++++++++
Atests/cc-cg/12-entry-stub.expected | 4++++
Atests/cc-cg/12-entry-stub.scm | 5+++++
Atests/cc-cg/13-call-5args.expected | 22++++++++++++++++++++++
Atests/cc-cg/13-call-5args.scm | 19+++++++++++++++++++
Atests/cc-cg/14-take-addr.expected | 23+++++++++++++++++++++++
Atests/cc-cg/14-take-addr.scm | 16++++++++++++++++
Mtests/cc-lex/01-keywords.expected-toks | 1+
Atests/cc-lex/02-integers.c | 1+
Atests/cc-lex/02-integers.expected-toks | 11+++++++++++
Atests/cc-lex/03-strings.c | 1+
Atests/cc-lex/03-strings.expected-toks | 7+++++++
Atests/cc-lex/04-chars.c | 1+
Atests/cc-lex/04-chars.expected-toks | 10++++++++++
Atests/cc-lex/05-comments.c | 4++++
Atests/cc-lex/05-comments.expected-toks | 13+++++++++++++
Atests/cc-lex/06-line-splice.c | 2++
Atests/cc-lex/06-line-splice.expected-toks | 7+++++++
Atests/cc-lex/07-punctuators.c | 3+++
Atests/cc-lex/07-punctuators.expected-toks | 52++++++++++++++++++++++++++++++++++++++++++++++++++++
Atests/cc-lex/08-digraphs.c | 1+
Atests/cc-lex/08-digraphs.expected-toks | 16++++++++++++++++
Atests/cc-lex/09-kw-vs-ident.c | 1+
Atests/cc-lex/09-kw-vs-ident.expected-toks | 10++++++++++
Atests/cc-lex/10-nl-tokens.c | 3+++
Atests/cc-lex/10-nl-tokens.expected-toks | 6++++++
Atests/cc-lex/11-trigraphs.c | 1+
Atests/cc-lex/11-trigraphs.expected-toks | 8++++++++
Atests/cc-lex/12-program.c | 3+++
Atests/cc-lex/12-program.expected-toks | 20++++++++++++++++++++
Atests/cc-lex/13-keywords-all.c | 6++++++
Atests/cc-lex/13-keywords-all.expected-toks | 33+++++++++++++++++++++++++++++++++
Atests/cc-lex/14-reject-float.c | 1+
Atests/cc-lex/14-reject-float.expected-exit | 1+
Atests/cc-lex/14-reject-float.expected-toks | 4++++
Atests/cc-lex/15-reject-multichar.c | 1+
Atests/cc-lex/15-reject-multichar.expected-exit | 1+
Atests/cc-lex/15-reject-multichar.expected-toks | 4++++
Atests/cc-lex/run-lex.scm | 117+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Atests/cc-lex/run.sh | 104+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Mtests/cc-parse/00-empty-main.expected-trace | 1+
Atests/cc-parse/00-empty-main.scm | 27+++++++++++++++++++++++++++
Atests/cc-parse/01-return-argc.c | 1+
Atests/cc-parse/01-return-argc.expected-trace | 6++++++
Atests/cc-parse/01-return-argc.scm | 17+++++++++++++++++
Atests/cc-parse/02-add-const.c | 1+
Atests/cc-parse/02-add-const.expected-trace | 10++++++++++
Atests/cc-parse/02-add-const.scm | 15+++++++++++++++
Atests/cc-parse/03-local-assign.c | 1+
Atests/cc-parse/03-local-assign.expected-trace | 10++++++++++
Atests/cc-parse/03-local-assign.scm | 16++++++++++++++++
Atests/cc-parse/04-if-else.c | 1+
Atests/cc-parse/04-if-else.expected-trace | 13+++++++++++++
Atests/cc-parse/04-if-else.scm | 18++++++++++++++++++
Atests/cc-parse/05-while-break.c | 1+
Atests/cc-parse/05-while-break.expected-trace | 11+++++++++++
Atests/cc-parse/05-while-break.scm | 19+++++++++++++++++++
Atests/cc-parse/06-call-no-args.c | 2++
Atests/cc-parse/06-call-no-args.expected-trace | 6++++++
Atests/cc-parse/06-call-no-args.scm | 18++++++++++++++++++
Atests/cc-parse/07-call-with-args.c | 2++
Atests/cc-parse/07-call-with-args.expected-trace | 8++++++++
Atests/cc-parse/07-call-with-args.scm | 19+++++++++++++++++++
Atests/cc-parse/08-pointer-deref.c | 1+
Atests/cc-parse/08-pointer-deref.expected-trace | 8++++++++
Atests/cc-parse/08-pointer-deref.scm | 16++++++++++++++++
Atests/cc-parse/09-address-of.c | 1+
Atests/cc-parse/09-address-of.expected-trace | 6++++++
Atests/cc-parse/09-address-of.scm | 15+++++++++++++++
Atests/cc-parse/10-typedef.c | 2++
Atests/cc-parse/10-typedef.expected-trace | 6++++++
Atests/cc-parse/10-typedef.scm | 16++++++++++++++++
Atests/cc-parse/11-two-params.c | 1+
Atests/cc-parse/11-two-params.expected-trace | 12++++++++++++
Atests/cc-parse/11-two-params.scm | 17+++++++++++++++++
Atests/cc-parse/12-comparison.c | 1+
Atests/cc-parse/12-comparison.expected-trace | 12++++++++++++
Atests/cc-parse/12-comparison.scm | 17+++++++++++++++++
Atests/cc-parse/13-while-continue.c | 1+
Atests/cc-parse/13-while-continue.expected-trace | 10++++++++++
Atests/cc-parse/13-while-continue.scm | 19+++++++++++++++++++
Atests/cc-parse/14-mul-paren.c | 1+
Atests/cc-parse/14-mul-paren.expected-trace | 18++++++++++++++++++
Atests/cc-parse/14-mul-paren.scm | 21+++++++++++++++++++++
Atests/cc-parse/cg-trace.scm | 153+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Atests/cc-parse/mini-prelude.scm | 66++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Atests/cc-pp/01-obj-macro.expected-exit | 1+
Atests/cc-pp/01-obj-macro.scm | 18++++++++++++++++++
Atests/cc-pp/02-obj-macro-multi-tok.expected-exit | 1+
Atests/cc-pp/02-obj-macro-multi-tok.scm | 24++++++++++++++++++++++++
Atests/cc-pp/03-fn-macro.expected-exit | 1+
Atests/cc-pp/03-fn-macro.scm | 26++++++++++++++++++++++++++
Atests/cc-pp/04-fn-macro-2args.expected-exit | 1+
Atests/cc-pp/04-fn-macro-2args.scm | 29+++++++++++++++++++++++++++++
Atests/cc-pp/05-variadic.expected-exit | 1+
Atests/cc-pp/05-variadic.scm | 35+++++++++++++++++++++++++++++++++++
Atests/cc-pp/06-stringize.expected-exit | 1+
Atests/cc-pp/06-stringize.scm | 25+++++++++++++++++++++++++
Atests/cc-pp/07-paste.expected-exit | 1+
Atests/cc-pp/07-paste.scm | 30++++++++++++++++++++++++++++++
Atests/cc-pp/08-nested-expansion.expected-exit | 1+
Atests/cc-pp/08-nested-expansion.scm | 24++++++++++++++++++++++++
Atests/cc-pp/09-hideset-self.expected-exit | 1+
Atests/cc-pp/09-hideset-self.scm | 26++++++++++++++++++++++++++
Atests/cc-pp/10-if-defined.expected-exit | 1+
Atests/cc-pp/10-if-defined.scm | 30++++++++++++++++++++++++++++++
Atests/cc-pp/11-if-arith.expected-exit | 1+
Atests/cc-pp/11-if-arith.scm | 21+++++++++++++++++++++
Atests/cc-pp/12-ifdef-ifndef.expected-exit | 1+
Atests/cc-pp/12-ifdef-ifndef.scm | 46++++++++++++++++++++++++++++++++++++++++++++++
Atests/cc-pp/13-elif-chain.expected-exit | 1+
Atests/cc-pp/13-elif-chain.scm | 38++++++++++++++++++++++++++++++++++++++
Atests/cc-pp/14-nested-if.expected-exit | 1+
Atests/cc-pp/14-nested-if.scm | 43+++++++++++++++++++++++++++++++++++++++++++
Atests/cc-pp/15-undef.expected-exit | 1+
Atests/cc-pp/15-undef.scm | 28++++++++++++++++++++++++++++
Atests/cc-pp/16-error.expected-exit | 1+
Atests/cc-pp/16-error.scm | 10++++++++++
Atests/cc-pp/17-include-rejected.expected-exit | 1+
Atests/cc-pp/17-include-rejected.scm | 10++++++++++
Atests/cc-pp/18-builtin-stdc.expected-exit | 1+
Atests/cc-pp/18-builtin-stdc.scm | 12++++++++++++
Atests/cc-pp/19-pragma-dropped.expected-exit | 1+
Atests/cc-pp/19-pragma-dropped.scm | 14++++++++++++++
Atests/cc-pp/20-cexpr-ops.expected-exit | 1+
Atests/cc-pp/20-cexpr-ops.scm | 26++++++++++++++++++++++++++
Atests/cc-pp/21-undefined-id-zero.expected-exit | 1+
Atests/cc-pp/21-undefined-id-zero.scm | 20++++++++++++++++++++
Atests/cc-pp/22-initial-defines.expected-exit | 1+
Atests/cc-pp/22-initial-defines.scm | 15+++++++++++++++
Atests/cc-pp/30-define-end-to-end.c | 2++
Atests/cc-pp/30-define-end-to-end.expected-toks | 5+++++
Atests/cc-pp/run-pp.scm | 59+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
159 files changed, 5215 insertions(+), 185 deletions(-)

diff --git a/cc/cg.scm b/cc/cg.scm @@ -1,97 +1,705 @@ ;; cc/cg.scm — codegen state and emission API. +;; Realization of docs/CC-INTERNALS.md §cg.scm. +;; Conversion split per CC-CONTRACTS §4: parse owns promotion etc; +;; cg owns sign extension, signed/unsigned dispatch, pointer scaling. ;; -;; Realization of docs/CC-INTERNALS.md §cg.scm. Conversion- -;; responsibility split per CC-CONTRACTS §4: parse owns -;; promotion/arith-conv/lval-to-rval; cg owns sign extension, -;; signed/unsigned variant selection, and pointer scaling. +;; Output uses libp1pp's structured macros (%fn, %ifelse_nez, +;; %loop_tag, %break, %continue) per docs/LIBP1PP.md. ;; -;; Output goes through libp1pp's structured macros (%fn, %ifelse_nez, -;; %loop_tag, %break, %continue) — see docs/LIBP1PP.md. -;; -;; Owner: <unassigned> +;; Frame layout (CC-CONTRACTS §3): +;; [sp + 0 .. staging*8) outgoing-arg staging +;; [sp + staging*8 ..) locals + spilled vstack values +;; Slot offsets are emitted symbolically as `(+ %<fn>__SO N)` so the +;; staging size, only known at fn-end, can be filled in via a 0-arg +;; M1pp macro `<fn>__SO` defined just before the `%fn(...)` block. + +(define (%cg-emit-buf cg) + (let ((fb (cg-fn-buf cg))) (if fb fb (cg-text cg)))) + +(define (%cg-emit cg bv) + (buf-push! (%cg-emit-buf cg) bv)) + +(define (%cg-emit-many cg bvs) + (for-each (lambda (b) (%cg-emit cg b)) bvs)) + +(define (%n n) (number->string n 10)) + +;; Per-fn metadata (name, ret-slot, ret-type) is stashed on cg-globals +;; under symbol keys that don't collide with bv name keys. +(define (%cg-fn-set! cg key val) + (cg-globals-set! cg (alist-update key (lambda (_) val) (cg-globals cg)))) + +(define (%cg-fn-get cg key) (alist-ref/eq key (cg-globals cg))) + +(define (%cg-fresh-label cg prefix) + (let* ((n (cg-label-ctr cg)) + (bv (bytevector-append prefix (%n n)))) + (cg-label-ctr-set! cg (+ n 1)) + bv)) + +(define (%cg-fresh-loop-tag cg) (%cg-fresh-label cg "L")) +(define (%cg-fresh-lbl cg) (%cg-fresh-label cg "lbl_")) + +(define (%cg-bump-outgoing! cg n) + (if (< (cg-max-outgoing cg) n) (cg-max-outgoing-set! cg n) 0)) + +(define (%cg-slot-expr cg logical-off) + (let ((nm (%cg-fn-get cg '%fn-name))) + (bv-cat (list "(+ %" nm "__SO " (%n logical-off) ")")))) + +(define (%cg-mangle-global name-bv) + (bytevector-append "cc__" name-bv)) + +(define (%cg-reg->bv r) (symbol->string r)) + +(define (%cg-emit-li cg reg n) + (%cg-emit-many cg (list "%li(" (%cg-reg->bv reg) ", " (%n n) ")\n"))) + +(define (%cg-emit-la cg reg label-bv) + (%cg-emit-many cg (list "%la(" (%cg-reg->bv reg) ", &" label-bv ")\n"))) + +(define (%cg-emit-ld-slot cg reg logical-off) + (%cg-emit-many cg (list "%ld(" (%cg-reg->bv reg) ", sp, " + (%cg-slot-expr cg logical-off) ")\n"))) + +(define (%cg-emit-st-slot cg reg logical-off) + (%cg-emit-many cg (list "%st(" (%cg-reg->bv reg) ", sp, " + (%cg-slot-expr cg logical-off) ")\n"))) + +(define (%cg-emit-ld cg reg base off) + (%cg-emit-many cg (list "%ld(" (%cg-reg->bv reg) ", " + (%cg-reg->bv base) ", " (%n off) ")\n"))) + +(define (%cg-emit-st cg reg base off) + (%cg-emit-many cg (list "%st(" (%cg-reg->bv reg) ", " + (%cg-reg->bv base) ", " (%n off) ")\n"))) + +(define (%cg-load-opnd-into cg op reg) + (let ((kind (opnd-kind op)) (lv? (opnd-lval? op)) (ext (opnd-ext op))) + (cond + ((eq? kind 'imm) (%cg-emit-li cg reg ext)) + ((eq? kind 'frame) (%cg-emit-ld-slot cg reg ext)) + ((eq? kind 'global) + (cond + ((not lv?) (%cg-emit-la cg reg ext)) + (else (%cg-emit-la cg reg ext) (%cg-emit-ld cg reg reg 0)))) + (else (die #f "cg internal: unknown opnd-kind" kind))))) + +(define (%cg-spill-reg cg reg ty) + (let* ((off (cg-alloc-slot cg 8 8)) + (op (%opnd 'frame ty off #f))) + (%cg-emit-st-slot cg reg off) + (cg-vstack-set! cg (cons op (cg-vstack cg))) + op)) + +(define (%ctype-ptr? t) + (let ((k (ctype-kind t))) + (if (eq? k 'ptr) #t (eq? k 'arr)))) + +(define (%ctype-pointee t) + (cond ((eq? (ctype-kind t) 'ptr) (ctype-ext t)) + ((eq? (ctype-kind t) 'arr) (car (ctype-ext t))) + (else #f))) + +(define (%ctype-unsigned? t) + (let ((k (ctype-kind t))) + (cond ((eq? k 'u8) #t) ((eq? k 'u16) #t) ((eq? k 'u32) #t) + ((eq? k 'u64) #t) ((eq? k 'bool) #t) + ((eq? k 'ptr) #t) ((eq? k 'arr) #t) ((eq? k 'fn) #t) + (else #f)))) + +(define (%ctype-size t) (ctype-size t)) + +(define (%reg-by-idx i) + (cond ((= i 0) 'a0) ((= i 1) 'a1) ((= i 2) 'a2) ((= i 3) 'a3) + (else (die #f "cg: param idx > 3 needs ldarg path" i)))) ;; -------------------------------------------------------------------- ;; Lifecycle ;; -------------------------------------------------------------------- -(define (cg-init) (error "TBD: cg-init")) -(define (cg-finish cg) (error "TBD: cg-finish")) -(define (cg-fn-begin cg name params return-type) (error "TBD: cg-fn-begin")) -(define (cg-fn-end cg) (error "TBD: cg-fn-end")) +(define (cg-init) + (%cg (make-buf) (make-buf) (make-buf) '() 0 0 '() '() #f #f 0)) + +(define (cg-finish cg) + (let ((stub (bv-cat (list + "# entry stub\n" + "%fn(p1_main, 16, {\n" + "%call(&cc__main)\n" + "})\n")))) + (buf-push! (cg-text cg) stub)) + (bv-cat (list (buf-flush (cg-text cg)) + (buf-flush (cg-data cg)) + (buf-flush (cg-bss cg))))) + +(define (cg-fn-begin cg name params return-type) + (cg-fn-buf-set! cg (make-buf)) + (cg-prologue-buf-set! cg (make-buf)) + (cg-vstack-set! cg '()) + (cg-frame-hi-set! cg 0) + (cg-label-ctr-set! cg 0) + (cg-max-outgoing-set! cg 0) + (%cg-fn-set! cg '%fn-name name) + (%cg-fn-set! cg '%fn-ret-type return-type) + (%cg-fn-set! cg '%indirect-slots '()) + (let ((ret-slot (cg-alloc-slot cg 8 8))) + (%cg-fn-set! cg '%fn-ret-slot ret-slot)) + (let walk ((ps params) (idx 0) (out '())) + (cond + ((null? ps) (reverse out)) + (else + (let* ((p (car ps)) + (ty (sym-type p)) + (off (cg-alloc-slot cg 8 8))) + (cond + ((< idx 4) + (let ((ar (%reg-by-idx idx))) + (buf-push! (cg-prologue-buf cg) + (bv-cat (list "%st(" (%cg-reg->bv ar) + ", sp, " (%cg-slot-expr cg off) ")\n"))))) + (else + (buf-push! (cg-prologue-buf cg) + (bv-cat (list "%ldarg(t0, " (%n (- idx 4)) ")\n" + "%st(t0, sp, " (%cg-slot-expr cg off) ")\n"))))) + (walk (cdr ps) (+ idx 1) + (cons (%sym (sym-name p) 'param #f ty off) out))))))) + +(define (cg-fn-end cg) + (let* ((name (%cg-fn-get cg '%fn-name)) + (ret-slot (%cg-fn-get cg '%fn-ret-slot)) + (ret-type (%cg-fn-get cg '%fn-ret-type)) + (locals-hi (cg-frame-hi cg)) + (staging-bytes (* 8 (cg-max-outgoing cg))) + (raw-size (+ staging-bytes locals-hi)) + (frame-size (align-up raw-size 16)) + (ret-block + (cond + ((eq? (ctype-kind ret-type) 'void) + (bv-cat (list "::ret\n%li(a0, 0)\n"))) + (else + (bv-cat (list "::ret\n%ld(a0, sp, " + (%cg-slot-expr cg ret-slot) ")\n"))))) + (so-macro + (bv-cat (list "%macro " name "__SO()\n" + (%n staging-bytes) "\n%endm\n"))) + (prologue (buf-flush (cg-prologue-buf cg))) + (body (buf-flush (cg-fn-buf cg))) + (mangled (%cg-mangle-global name)) + (fn-block (bv-cat (list + so-macro + "%fn(" mangled ", " (%n frame-size) ", {\n" + prologue body ret-block + "})\n")))) + (buf-push! (cg-text cg) fn-block) + (cg-fn-buf-set! cg #f) + (cg-prologue-buf-set! cg #f) + (cg-vstack-set! cg '()) + (cg-frame-hi-set! cg 0) + (cg-max-outgoing-set! cg 0) + 0)) ;; -------------------------------------------------------------------- -;; Vstack — push / pop / inspect +;; Vstack ;; -------------------------------------------------------------------- -(define (cg-push cg op) (error "TBD: cg-push")) -(define (cg-pop cg) (error "TBD: cg-pop")) -(define (cg-top cg) (error "TBD: cg-top")) -(define (cg-depth cg) (error "TBD: cg-depth")) +(define (cg-push cg op) + (cg-vstack-set! cg (cons op (cg-vstack cg))) + op) + +(define (cg-pop cg) + (let ((s (cg-vstack cg))) + (cond ((null? s) (die #f "cg-pop: empty vstack")) + (else (cg-vstack-set! cg (cdr s)) (car s))))) + +(define (cg-top cg) + (let ((s (cg-vstack cg))) + (cond ((null? s) (die #f "cg-top: empty vstack")) (else (car s))))) + +(define (cg-depth cg) (length (cg-vstack cg))) ;; -------------------------------------------------------------------- ;; Materialize ;; -------------------------------------------------------------------- -(define (cg-push-imm cg ctype value) (error "TBD: cg-push-imm")) -(define (cg-push-string cg bv-content) (error "TBD: cg-push-string")) -(define (cg-push-sym cg sym) (error "TBD: cg-push-sym")) -(define (cg-push-deref cg) (error "TBD: cg-push-deref")) +(define (cg-push-imm cg ctype value) + (cg-push cg (%opnd 'imm ctype value #f))) + +(define (cg-push-string cg bv-content) + (let* ((label (cg-intern-string cg bv-content)) + (cp-ty (%ctype 'ptr 8 8 %t-i8))) + (cg-push cg (%opnd 'global cp-ty label #f)))) + +(define (cg-push-sym cg sym) + (let ((k (sym-kind sym)) (ty (sym-type sym))) + (cond + ((eq? k 'fn) + (cg-push cg (%opnd 'global ty (%cg-mangle-global (sym-name sym)) #f))) + ((eq? k 'enum-const) + (cg-push cg (%opnd 'imm ty (sym-slot sym) #f))) + ((eq? k 'var) + (let ((stg (sym-storage sym))) + (cond + ((eq? stg 'extern) + (cg-push cg (%opnd 'global ty (%cg-mangle-global (sym-name sym)) #t))) + ((eq? stg 'static) + (cg-push cg (%opnd 'global ty (%cg-mangle-global (sym-name sym)) #t))) + (else + (cg-push cg (%opnd 'frame ty (sym-slot sym) #t)))))) + ((eq? k 'param) + (cg-push cg (%opnd 'frame ty (sym-slot sym) #t))) + (else (die #f "cg-push-sym: unsupported sym-kind" k))))) + +;; A cg-push-deref result is a frame-lval whose slot HOLDS THE ADDRESS +;; (not the value). To distinguish from ordinary frame-lvals (whose +;; slot holds the value directly), we tag indirect slots in +;; %indirect-slots so cg-load and cg-assign can do the extra +;; indirection. +(define (%cg-mark-indirect! cg off) + (let ((cur (or (%cg-fn-get cg '%indirect-slots) '()))) + (%cg-fn-set! cg '%indirect-slots (cons off cur)))) + +(define (%cg-indirect? cg off) + (let ((cur (or (%cg-fn-get cg '%indirect-slots) '()))) + (let loop ((xs cur)) + (cond ((null? xs) #f) ((= (car xs) off) #t) (else (loop (cdr xs))))))) + +(define (cg-push-deref cg) + (let* ((p (cg-pop cg)) + (pt (opnd-type p)) + (pe (cond ((eq? (ctype-kind pt) 'ptr) (ctype-ext pt)) + ((eq? (ctype-kind pt) 'arr) (car (ctype-ext pt))) + (else #f)))) + (cond + ((not pe) (die #f "cg-push-deref: not a pointer" pt)) + (else + (%cg-load-opnd-into cg p 't0) + (let ((off (cg-alloc-slot cg 8 8))) + (%cg-emit-st-slot cg 't0 off) + (%cg-mark-indirect! cg off) + (cg-push cg (%opnd 'frame pe off #t))))))) ;; -------------------------------------------------------------------- -;; Address & deref operators +;; Address & deref ;; -------------------------------------------------------------------- -(define (cg-take-addr cg) (error "TBD: cg-take-addr")) -(define (cg-load cg) (error "TBD: cg-load")) +(define (cg-take-addr cg) + (let* ((p (cg-pop cg)) + (ty (opnd-type p)) + (pty (%ctype 'ptr 8 8 ty))) + (cond + ((not (opnd-lval? p)) + (die #f "cg-take-addr: not an lvalue")) + ((eq? (opnd-kind p) 'frame) + (cond + ((%cg-indirect? cg (opnd-ext p)) + ;; The address itself lives at sp+slot — &*p degenerates to p. + (%cg-emit-ld-slot cg 't0 (opnd-ext p)) + (%cg-spill-reg cg 't0 pty)) + (else + ;; %mov(rd, sp) gives the portable-sp pointer (the backend + ;; handles any hidden frame-header offset). Then add slot. + (%cg-emit-many cg (list "%mov(t0, sp)\n" + "%addi(t0, t0, " + (%cg-slot-expr cg (opnd-ext p)) ")\n")) + (%cg-spill-reg cg 't0 pty)))) + ((eq? (opnd-kind p) 'global) + (%cg-emit-la cg 't0 (opnd-ext p)) + (%cg-spill-reg cg 't0 pty)) + (else (die #f "cg-take-addr: non-addressable" (opnd-kind p)))))) + +(define (cg-load cg) + (let* ((p (cg-pop cg)) (ty (opnd-type p))) + (cond + ((not (opnd-lval? p)) (die #f "cg-load: not an lvalue")) + ((and (eq? (opnd-kind p) 'frame) + (%cg-indirect? cg (opnd-ext p))) + ;; Indirect frame-lval: slot holds the address. Load addr → t0, + ;; then load value through t0. + (%cg-emit-ld-slot cg 't0 (opnd-ext p)) + (%cg-emit-ld cg 't0 't0 0) + (%cg-spill-reg cg 't0 ty)) + (else (%cg-load-opnd-into cg p 't0) (%cg-spill-reg cg 't0 ty))))) ;; -------------------------------------------------------------------- ;; Type conversions ;; -------------------------------------------------------------------- -(define (cg-cast cg to-type) (error "TBD: cg-cast")) -(define (cg-promote cg) (error "TBD: cg-promote")) -(define (cg-arith-conv cg) (error "TBD: cg-arith-conv")) +(define (cg-cast cg to-type) + (let* ((p (cg-pop cg)) + (from-ty (opnd-type p)) + (from-sz (%ctype-size from-ty)) + (to-sz (%ctype-size to-type)) + (to-kind (ctype-kind to-type))) + (cond + ((eq? to-kind 'bool) + (%cg-load-opnd-into cg p 't0) + (%cg-emit-many cg (list + "%ifelse_eqz(t0, { %li(t0, 0) }, { %li(t0, 1) })\n")) + (%cg-spill-reg cg 't0 to-type)) + ((or (eq? to-kind 'ptr) + (and (or (eq? to-kind 'i64) (eq? to-kind 'u64)) + (or (eq? (ctype-kind from-ty) 'ptr) + (eq? (ctype-kind from-ty) 'arr)))) + (cg-push cg (%opnd (opnd-kind p) to-type (opnd-ext p) (opnd-lval? p)))) + ((>= to-sz from-sz) + (cg-push cg (%opnd (opnd-kind p) to-type (opnd-ext p) (opnd-lval? p)))) + (else + (%cg-load-opnd-into cg p 't0) + (cond + ((= to-sz 1) (%cg-emit-many cg (list "%andi(t0, t0, 255)\n"))) + ((= to-sz 2) + (%cg-emit-many cg (list "%li(t1, 65535)\n%and(t0, t0, t1)\n"))) + ((= to-sz 4) + (%cg-emit-many cg (list "%li(t1, 4294967295)\n%and(t0, t0, t1)\n"))) + (else 0)) + (%cg-spill-reg cg 't0 to-type))))) + +(define (cg-promote cg) + (let* ((p (cg-pop cg)) + (ty (opnd-type p)) + (sz (%ctype-size ty))) + (cond + ((< sz 4) + (cond + ((%ctype-unsigned? ty) + (cg-push cg (%opnd (opnd-kind p) %t-u32 (opnd-ext p) (opnd-lval? p)))) + (else + (cg-push cg (%opnd (opnd-kind p) %t-i32 (opnd-ext p) (opnd-lval? p)))))) + (else (cg-push cg p))))) + +(define (cg-arith-conv cg) + (let* ((b (cg-pop cg)) + (a (cg-pop cg)) + (ta (opnd-type a)) + (tb (opnd-type b)) + (sa (%ctype-size ta)) + (sb (%ctype-size tb)) + (common (cond + ((> sa sb) ta) + ((> sb sa) tb) + ((%ctype-unsigned? ta) ta) + ((%ctype-unsigned? tb) tb) + (else ta)))) + (cg-push cg (%opnd (opnd-kind a) common (opnd-ext a) (opnd-lval? a))) + (cg-push cg (%opnd (opnd-kind b) common (opnd-ext b) (opnd-lval? b))))) ;; -------------------------------------------------------------------- -;; Operators (CC-CONTRACTS §1.11) +;; Operators ;; -------------------------------------------------------------------- -(define (cg-binop cg op) (error "TBD: cg-binop")) -(define (cg-unop cg op) (error "TBD: cg-unop")) -(define (cg-assign cg) (error "TBD: cg-assign")) +(define (%cg-emit-rrr cg op rd ra rb) + (%cg-emit-many cg (list "%" op "(" (%cg-reg->bv rd) ", " + (%cg-reg->bv ra) ", " (%cg-reg->bv rb) ")\n"))) + +(define (%cg-emit-cmp cg cc ra rb rd) + (%cg-emit-many cg (list "%ifelse_" cc "(" + (%cg-reg->bv ra) ", " (%cg-reg->bv rb) + ", { %li(" (%cg-reg->bv rd) ", 1) }, " + "{ %li(" (%cg-reg->bv rd) ", 0) })\n"))) + +(define (cg-binop cg op) + (let* ((b (cg-pop cg)) + (a (cg-pop cg)) + (ta (opnd-type a)) + (tb (opnd-type b)) + (unsigned? (or (%ctype-unsigned? ta) (%ctype-unsigned? tb))) + (a-ptr? (%ctype-ptr? ta)) + (b-ptr? (%ctype-ptr? tb)) + (result-ty + (cond + ((or (eq? op 'eq) (eq? op 'ne) + (eq? op 'lt) (eq? op 'le) (eq? op 'gt) (eq? op 'ge)) + %t-i32) + ((and a-ptr? b-ptr? (eq? op 'sub)) %t-i64) + (a-ptr? ta) + (b-ptr? tb) + (else ta)))) + (cond + ((and a-ptr? (or (eq? op 'add) (eq? op 'sub)) (not b-ptr?)) + (%cg-load-opnd-into cg a 'a0) + (%cg-load-opnd-into cg b 'a1) + (let ((sz (%ctype-size (%ctype-pointee ta)))) + (cond ((> sz 1) (%cg-emit-many cg (list "%li(t0, " (%n sz) ")\n")) + (%cg-emit-rrr cg "mul" 'a1 'a1 't0)) + (else 0))) + (%cg-emit-rrr cg (if (eq? op 'add) "add" "sub") 't0 'a0 'a1) + (%cg-spill-reg cg 't0 result-ty)) + ((and b-ptr? (eq? op 'add) (not a-ptr?)) + (%cg-load-opnd-into cg a 'a0) + (%cg-load-opnd-into cg b 'a1) + (let ((sz (%ctype-size (%ctype-pointee tb)))) + (cond ((> sz 1) (%cg-emit-many cg (list "%li(t0, " (%n sz) ")\n")) + (%cg-emit-rrr cg "mul" 'a0 'a0 't0)) + (else 0))) + (%cg-emit-rrr cg "add" 't0 'a0 'a1) + (%cg-spill-reg cg 't0 result-ty)) + ((and a-ptr? b-ptr? (eq? op 'sub)) + (%cg-load-opnd-into cg a 'a0) + (%cg-load-opnd-into cg b 'a1) + (%cg-emit-rrr cg "sub" 't0 'a0 'a1) + (let ((sz (%ctype-size (%ctype-pointee ta)))) + (cond ((> sz 1) (%cg-emit-many cg (list "%li(t1, " (%n sz) ")\n")) + (%cg-emit-rrr cg "div" 't0 't0 't1)) + (else 0))) + (%cg-spill-reg cg 't0 result-ty)) + (else + (%cg-load-opnd-into cg a 'a0) + (%cg-load-opnd-into cg b 'a1) + (cond + ((eq? op 'add) (%cg-emit-rrr cg "add" 't0 'a0 'a1)) + ((eq? op 'sub) (%cg-emit-rrr cg "sub" 't0 'a0 'a1)) + ((eq? op 'mul) (%cg-emit-rrr cg "mul" 't0 'a0 'a1)) + ((eq? op 'and) (%cg-emit-rrr cg "and" 't0 'a0 'a1)) + ((eq? op 'or) (%cg-emit-rrr cg "or" 't0 'a0 'a1)) + ((eq? op 'xor) (%cg-emit-rrr cg "xor" 't0 'a0 'a1)) + ((eq? op 'shl) (%cg-emit-rrr cg "shl" 't0 'a0 'a1)) + ((eq? op 'shr) + (if unsigned? (%cg-emit-rrr cg "shr" 't0 'a0 'a1) + (%cg-emit-rrr cg "sar" 't0 'a0 'a1))) + ((eq? op 'div) (%cg-emit-rrr cg "div" 't0 'a0 'a1)) + ((eq? op 'rem) (%cg-emit-rrr cg "rem" 't0 'a0 'a1)) + ((eq? op 'eq) (%cg-emit-cmp cg "eq" 'a0 'a1 't0)) + ((eq? op 'ne) (%cg-emit-cmp cg "ne" 'a0 'a1 't0)) + ((eq? op 'lt) (%cg-emit-cmp cg (if unsigned? "ltu" "lt") 'a0 'a1 't0)) + ((eq? op 'gt) (%cg-emit-cmp cg (if unsigned? "ltu" "lt") 'a1 'a0 't0)) + ((eq? op 'le) + (%cg-emit-cmp cg (if unsigned? "ltu" "lt") 'a1 'a0 't0) + (%cg-emit-many cg (list "%xori(t0, t0, 1)\n"))) + ((eq? op 'ge) + (%cg-emit-cmp cg (if unsigned? "ltu" "lt") 'a0 'a1 't0) + (%cg-emit-many cg (list "%xori(t0, t0, 1)\n"))) + (else (die #f "cg-binop: unknown op" op))) + (%cg-spill-reg cg 't0 result-ty))))) + +(define (cg-unop cg op) + (let* ((p (cg-pop cg)) (ty (opnd-type p))) + (%cg-load-opnd-into cg p 't0) + (cond + ((eq? op 'neg) + (%cg-emit-many cg (list "%li(t1, 0)\n%sub(t0, t1, t0)\n")) + (%cg-spill-reg cg 't0 ty)) + ((eq? op 'bnot) + (%cg-emit-many cg (list "%li(t1, -1)\n%xor(t0, t0, t1)\n")) + (%cg-spill-reg cg 't0 ty)) + ((eq? op 'lnot) + (%cg-emit-many cg (list "%ifelse_eqz(t0, { %li(t0, 1) }, { %li(t0, 0) })\n")) + (%cg-spill-reg cg 't0 %t-i32)) + (else (die #f "cg-unop: unknown op" op))))) + +(define (cg-assign cg) + (let* ((rhs (cg-pop cg)) + (lhs (cg-pop cg)) + (ty (opnd-type lhs))) + (cond ((not (opnd-lval? lhs)) (die #f "cg-assign: lhs not lvalue"))) + (%cg-load-opnd-into cg rhs 'a0) + (cond + ((eq? (opnd-kind lhs) 'frame) + (cond + ((%cg-indirect? cg (opnd-ext lhs)) + (%cg-emit-ld-slot cg 't0 (opnd-ext lhs)) + (%cg-emit-st cg 'a0 't0 0)) + (else + (%cg-emit-st-slot cg 'a0 (opnd-ext lhs))))) + ((eq? (opnd-kind lhs) 'global) + (%cg-emit-la cg 't0 (opnd-ext lhs)) + (%cg-emit-st cg 'a0 't0 0)) + (else (die #f "cg-assign: unsupported lhs kind" (opnd-kind lhs)))) + (%cg-spill-reg cg 'a0 ty))) ;; -------------------------------------------------------------------- -;; Calls — pops fn and args from vstack; pushes result if has-result? +;; Calls ;; -------------------------------------------------------------------- -(define (cg-call cg arity has-result?) (error "TBD: cg-call")) +(define (cg-call cg arity has-result?) + (let* ((args (let loop ((i 0) (acc '())) + (cond ((= i arity) acc) + (else (loop (+ i 1) (cons (cg-pop cg) acc)))))) + (fn-op (cg-pop cg))) + (let stage ((xs args) (idx 0)) + (cond + ((null? xs) 0) + ((< idx 4) + (%cg-load-opnd-into cg (car xs) (%reg-by-idx idx)) + (stage (cdr xs) (+ idx 1))) + (else + (%cg-load-opnd-into cg (car xs) 't0) + (%cg-emit-st cg 't0 'sp (* 8 (- idx 4))) + (stage (cdr xs) (+ idx 1))))) + (cond ((> arity 4) (%cg-bump-outgoing! cg (- arity 4))) (else 0)) + (cond + ((and (eq? (opnd-kind fn-op) 'global) (not (opnd-lval? fn-op))) + (%cg-emit-many cg (list "%call(&" (opnd-ext fn-op) ")\n"))) + (else + (%cg-load-opnd-into cg fn-op 't0) + (%cg-emit-many cg (list "%callr(t0)\n")))) + (cond + (has-result? + (let* ((fty (opnd-type fn-op)) + (rty (cond + ((eq? (ctype-kind fty) 'fn) (car (ctype-ext fty))) + ((eq? (ctype-kind fty) 'ptr) + (let ((p (ctype-ext fty))) + (if (eq? (ctype-kind p) 'fn) (car (ctype-ext p)) %t-i64))) + (else %t-i64)))) + (%cg-spill-reg cg 'a0 rty))) + (else #f)))) ;; -------------------------------------------------------------------- -;; Return — pops rval, stores into return slot, branches to ::ret +;; Return ;; -------------------------------------------------------------------- -(define (cg-return cg) (error "TBD: cg-return")) +(define (cg-return cg) + (let ((ret-slot (%cg-fn-get cg '%fn-ret-slot)) + (ret-type (%cg-fn-get cg '%fn-ret-type))) + (cond + ((eq? (ctype-kind ret-type) 'void) + (%cg-emit-many cg (list "%b(&::ret)\n"))) + (else + (let ((p (cg-pop cg))) + (%cg-load-opnd-into cg p 'a0) + (%cg-emit-st-slot cg 'a0 ret-slot) + (%cg-emit-many cg (list "%b(&::ret)\n"))))))) ;; -------------------------------------------------------------------- -;; Structured control flow — thunks let parse recurse into the body. +;; Structured control flow ;; -------------------------------------------------------------------- -(define (cg-if cg then-thunk) (error "TBD: cg-if")) -(define (cg-ifelse cg then-thunk else-thunk) (error "TBD: cg-ifelse")) +(define (cg-if cg then-thunk) + (let ((p (cg-pop cg))) + (%cg-load-opnd-into cg p 't0) + (%cg-emit-many cg (list "%if_nez(t0, {\n")) + (then-thunk) + (%cg-emit-many cg (list "})\n")))) + +(define (cg-ifelse cg then-thunk else-thunk) + (let ((p (cg-pop cg))) + (%cg-load-opnd-into cg p 't0) + (%cg-emit-many cg (list "%ifelse_nez(t0, {\n")) + (then-thunk) + (%cg-emit-many cg (list "}, {\n")) + (else-thunk) + (%cg-emit-many cg (list "})\n")))) + +(define (cg-loop cg head-thunk body-thunk) + (let ((tag (%cg-fresh-loop-tag cg))) + (%cg-emit-many cg (list "%loop_tag(" tag ", {\n")) + (head-thunk) + (cond + ((zero? (cg-depth cg)) 0) + (else + (let ((c (cg-pop cg))) + (%cg-load-opnd-into cg c 't0) + (%cg-emit-many cg (list "%if_eqz(t0, { %break(" tag ") })\n"))))) + (body-thunk) + (%cg-emit-many cg (list "})\n")) + tag)) + +(define (cg-loop-end cg tag) 0) + +(define (cg-break cg tag) + (%cg-emit-many cg (list "%break(" tag ")\n"))) -(define (cg-loop cg head-thunk body-thunk) (error "TBD: cg-loop")) -(define (cg-loop-end cg tag) (error "TBD: cg-loop-end")) -(define (cg-break cg tag) (error "TBD: cg-break")) -(define (cg-continue cg tag) (error "TBD: cg-continue")) +(define (cg-continue cg tag) + (%cg-emit-many cg (list "%continue(" tag ")\n"))) ;; -------------------------------------------------------------------- ;; switch ;; -------------------------------------------------------------------- -(define (cg-switch-begin cg) (error "TBD: cg-switch-begin")) -(define (cg-switch-case cg swctx const-int) (error "TBD: cg-switch-case")) -(define (cg-switch-default cg swctx) (error "TBD: cg-switch-default")) -(define (cg-switch-end cg swctx) (error "TBD: cg-switch-end")) +(define-record-type swctx + (%swctx ctrl-slot end-tag default-lbl) + swctx? + (ctrl-slot swctx-ctrl-slot) + (end-tag swctx-end-tag) + (default-lbl swctx-default-lbl swctx-default-lbl-set!)) + +(define (cg-switch-begin cg) + (let* ((p (cg-pop cg)) + (off (cg-alloc-slot cg 8 8)) + (tag (%cg-fresh-loop-tag cg)) + (disp-lbl (bytevector-append "sw_disp_" tag))) + (%cg-load-opnd-into cg p 't0) + (%cg-emit-st-slot cg 't0 off) + (%cg-emit-many cg (list "%loop_tag(" tag ", {\n" + "%b(&::" disp-lbl ")\n")) + (%swctx off tag #f))) + +(define (cg-switch-case cg sw const-int) + (let* ((lbl (%cg-fresh-lbl cg)) + (key (string->symbol + (bytevector-append "%sw_cases__" (swctx-end-tag sw)))) + (cur (or (alist-ref/eq key (cg-globals cg)) '())) + (entry (cons const-int lbl))) + (%cg-fn-set! cg key (cons entry cur)) + (%cg-emit-many cg (list "::" lbl "\n")))) + +(define (cg-switch-default cg sw) + (let ((lbl (%cg-fresh-lbl cg))) + (swctx-default-lbl-set! sw lbl) + (%cg-emit-many cg (list "::" lbl "\n")))) + +(define (cg-switch-end cg sw) + (let* ((tag (swctx-end-tag sw)) + (key (string->symbol (bytevector-append "%sw_cases__" tag))) + (cases (reverse (or (alist-ref/eq key (cg-globals cg)) '()))) + (default-lbl (swctx-default-lbl sw)) + (disp-lbl (bytevector-append "sw_disp_" tag))) + (%cg-emit-many cg (list "%break(" tag ")\n" + "::" disp-lbl "\n")) + (%cg-emit-many cg (list "%ld(t0, sp, " + (%cg-slot-expr cg (swctx-ctrl-slot sw)) ")\n")) + (for-each + (lambda (c) + (%cg-emit-many cg (list "%li(t1, " (%n (car c)) ")\n" + "%beq(t0, t1, &::" (cdr c) ")\n"))) + cases) + (cond + (default-lbl (%cg-emit-many cg (list "%b(&::" default-lbl ")\n"))) + (else 0)) + (%cg-emit-many cg (list "%break(" tag ")\n" + "})\n")))) ;; -------------------------------------------------------------------- ;; Globals and data ;; -------------------------------------------------------------------- -(define (cg-emit-global cg sym init-bv-or-#f) (error "TBD: cg-emit-global")) -(define (cg-emit-extern cg sym) (error "TBD: cg-emit-extern")) -(define (cg-intern-string cg bv-content) (error "TBD: cg-intern-string")) +(define (cg-emit-global cg sym init-bv-or-false) + (let* ((nm (sym-name sym)) + (lbl (%cg-mangle-global nm)) + (sz (ctype-size (sym-type sym))) + (size (if (< sz 0) 8 sz))) + (cond + (init-bv-or-false + (buf-push! (cg-data cg) + (bv-cat (list "\n:" lbl "\n" + "\"" init-bv-or-false "\"\n")))) + (else + (buf-push! (cg-bss cg) + (bv-cat (list "\n:" lbl "\n" + (let zero-loop ((rem size) (acc '())) + (cond + ((<= rem 0) (bv-cat (reverse acc))) + ((>= rem 8) + (zero-loop (- rem 8) (cons "$(0)\n" acc))) + (else + (zero-loop (- rem 1) (cons "!(0)\n" acc)))))))))) + (cg-globals-set! cg (alist-set (sym-name sym) sym (cg-globals cg))) + 0)) + +(define (cg-emit-extern cg sym) + (cg-globals-set! cg (alist-set (sym-name sym) sym (cg-globals cg))) + 0) + +(define (cg-intern-string cg bv-content) + (let ((p (alist-ref bv-content (cg-str-pool cg)))) + (cond + (p p) + (else + (let* ((n (length (cg-str-pool cg))) + (lbl (bytevector-append "cc__str_" (%n n)))) + (cg-str-pool-set! cg + (alist-set bv-content lbl (cg-str-pool cg))) + (buf-push! (cg-data cg) + (bv-cat (list "\n:" lbl "\n" + "\"" bv-content "\"\n" + "!(0)\n"))) + lbl))))) ;; -------------------------------------------------------------------- -;; Frame — used internally by cg and by parse for locals +;; Frame ;; -------------------------------------------------------------------- -(define (cg-alloc-slot cg bytes align) (error "TBD: cg-alloc-slot")) +(define (cg-alloc-slot cg bytes align) + (let* ((aligned (align-up (cg-frame-hi cg) align)) + (new-hi (+ aligned bytes))) + (cg-frame-hi-set! cg new-hi) + aligned)) diff --git a/cc/lex.scm b/cc/lex.scm @@ -5,35 +5,649 @@ ;; (KW, PUNCT, tok-kind) live in cc/data.scm; do not duplicate. ;; ;; Owner: <unassigned> +;; +;; Implementation notes: +;; +;; - The lexer walks `src` byte-by-byte, threading (pos, line, col) +;; explicitly through every helper (no mutable state). Each token +;; captures its starting loc; helpers return (tok npos nline ncol). +;; - Trigraphs and `\<newline>` line splicing are handled via a single +;; logical-byte primitive `%lex-peek`: it advances over splices and +;; translates trigraphs in-place, so downstream code only ever sees +;; the "translation phase 2" stream. +;; - Comments are stripped at the same level as whitespace. +;; - NL tokens are emitted at every physical newline so pp can use +;; them to terminate directives. ;; -------------------------------------------------------------------- -;; lex-tokenize -;; src : bv (the C bytestream; may be very large) -;; file : bv (filename, recorded into tok-loc) -;; -> : list of tok ending in a single (tok kind=EOF); never #f -;; aborts via util's `die` on unrecognized byte sequences +;; Byte-class predicates (raw u8 values, not chars). +;; -------------------------------------------------------------------- +(define (%digit? b) (if (< b 48) #f (if (< 57 b) #f #t))) ; '0'..'9' +(define (%hex? b) + (cond ((%digit? b) #t) + ((if (< b 65) #f (if (< 70 b) #f #t)) #t) ; 'A'..'F' + ((if (< b 97) #f (if (< 102 b) #f #t)) #t) ; 'a'..'f' + (else #f))) +(define (%octal? b) (if (< b 48) #f (if (< 55 b) #f #t))) ; '0'..'7' +(define (%alpha? b) + (cond ((if (< b 65) #f (if (< 90 b) #f #t)) #t) ; 'A'..'Z' + ((if (< b 97) #f (if (< 122 b) #f #t)) #t) ; 'a'..'z' + (else #f))) +(define (%ident-start? b) (or (%alpha? b) (= b 95))) ; '_' +(define (%ident-cont? b) (or (%ident-start? b) (%digit? b))) +(define (%hspace? b) (or (= b 32) (= b 9) (= b 11) (= b 12))) ; SP TAB VT FF +(define (%newline? b) (= b 10)) ; '\n' + +;; -------------------------------------------------------------------- +;; Logical byte access. %lex-peek returns +;; (byte npos nline ncol) +;; where (npos, nline, ncol) points *just past* the consumed physical +;; bytes. On EOF it returns (#f pos line col). +;; +;; Two transformations folded in here: +;; +;; - Trigraphs: ??= ??( ??/ ??) ??' ??< ??! ??> ??- +;; # [ \ ] ^ { | } ~ +;; The pair `??` followed by one of the nine trigraph completers +;; produces the translated byte and advances 3 source bytes. +;; - Line splice: a backslash immediately followed by `\n` is removed +;; as a unit (incrementing line, resetting col to 1) and we recurse +;; to fetch the next logical byte. +;; +;; Other escapes (e.g. `\<not-newline>`) are returned as-is — string and +;; char literals do their own escape-handling. +;; -------------------------------------------------------------------- +(define (%trigraph-byte b) + ;; Map the third trigraph byte to its replacement, or #f. + (cond ((= b 61) 35) ; '=' -> '#' + ((= b 40) 91) ; '(' -> '[' + ((= b 47) 92) ; '/' -> '\\' + ((= b 41) 93) ; ')' -> ']' + ((= b 39) 94) ; '\'' -> '^' + ((= b 60) 123) ; '<' -> '{' + ((= b 33) 124) ; '!' -> '|' + ((= b 62) 125) ; '>' -> '}' + ((= b 45) 126) ; '-' -> '~' + (else #f))) + +(define (%lex-peek src pos line col) + (let ((n (bytevector-length src))) + (cond + ((>= pos n) (list #f pos line col)) + (else + (let ((b (bytevector-u8-ref src pos))) + (cond + ;; Trigraph: ?? + completer + ((and (= b 63) + (< (+ pos 2) n) + (= (bytevector-u8-ref src (+ pos 1)) 63)) + (let ((tr (%trigraph-byte (bytevector-u8-ref src (+ pos 2))))) + (if tr + (list tr (+ pos 3) line (+ col 3)) + (list b (+ pos 1) line (+ col 1))))) + ;; Line splice: backslash + newline (consume both, no token) + ((and (= b 92) + (< (+ pos 1) n) + (= (bytevector-u8-ref src (+ pos 1)) 10)) + (%lex-peek src (+ pos 2) (+ line 1) 1)) + ;; Newline: pass through but caller decides line/col bump + ((%newline? b) + (list b (+ pos 1) (+ line 1) 1)) + (else + (list b (+ pos 1) line (+ col 1))))))))) + +;; Convenience accessors over the 4-list. +(define (%pk-byte p) (car p)) +(define (%pk-pos p) (car (cdr p))) +(define (%pk-line p) (car (cdr (cdr p)))) +(define (%pk-col p) (car (cdr (cdr (cdr p))))) + +;; -------------------------------------------------------------------- +;; Build a fresh logical-byte snapshot of `src`. Used by lex-read-* +;; helpers (which consume from a bv with translations already applied, +;; rather than re-walking through %lex-peek for each character class +;; check). The translated bv mirrors the original 1:1 for ASCII text +;; that contains no trigraphs and no line splices, which is the common +;; case; tests rarely exercise the slow path. +;; +;; Returns a fresh bv. Original line/col tracking is unaffected because +;; the main tokenizer walks via %lex-peek directly; helpers only use +;; the snapshot for value-extraction (number/string/char/ident bytes). +;; +;; We do not currently materialize a snapshot — every helper uses +;; %lex-peek directly. The function is reserved for later if profiling +;; shows it's worth it. ;; -------------------------------------------------------------------- -(define (lex-tokenize src file) - (error "TBD: lex-tokenize")) ;; -------------------------------------------------------------------- -;; Internal helpers exposed for unit tests in tests/cc-lex/. -;; Each consumes one logical token starting at byte offset `pos` and -;; returns a (tok . pos') pair; pos' is the byte just past the token. +;; Whitespace + comment skipper. Returns (pos line col). +;; Handles spaces/tabs, // line comments, /* block */ comments. Does +;; *not* consume `\n` — newlines are tokens. +;; -------------------------------------------------------------------- +(define (%skip-ws-and-comments src pos line col file) + (let* ((p (%lex-peek src pos line col)) + (b (%pk-byte p))) + (cond + ((not b) (list pos line col)) + ((%hspace? b) + (%skip-ws-and-comments src (%pk-pos p) (%pk-line p) (%pk-col p) file)) + ;; '/' starts a possible comment + ((= b 47) + (let* ((q (%lex-peek src (%pk-pos p) (%pk-line p) (%pk-col p))) + (b2 (%pk-byte q))) + (cond + ;; "//" line comment — eat to but not through '\n' + ((and b2 (= b2 47)) + (%skip-line-comment src (%pk-pos q) (%pk-line q) (%pk-col q) file)) + ;; "/*" block comment — eat until closing "*/" + ((and b2 (= b2 42)) + (%skip-block-comment src (%pk-pos q) (%pk-line q) (%pk-col q) + file line col)) + (else (list pos line col))))) + (else (list pos line col))))) + +(define (%skip-line-comment src pos line col file) + ;; Consume bytes until end-of-stream or until we *see* '\n' (do not + ;; consume the newline itself; outer loop emits the NL). + (let* ((p (%lex-peek src pos line col)) + (b (%pk-byte p))) + (cond + ((not b) (%skip-ws-and-comments src pos line col file)) + ((%newline? b) (%skip-ws-and-comments src pos line col file)) + (else + (%skip-line-comment src (%pk-pos p) (%pk-line p) (%pk-col p) file))))) + +(define (%skip-block-comment src pos line col file start-line start-col) + (let* ((p (%lex-peek src pos line col)) + (b1 (%pk-byte p))) + (cond + ((not b1) + (die (%loc file start-line start-col) + "unterminated /* block comment")) + ((= b1 42) ; '*' + (let* ((q (%lex-peek src (%pk-pos p) (%pk-line p) (%pk-col p))) + (b2 (%pk-byte q))) + (cond + ((not b2) + (die (%loc file start-line start-col) + "unterminated /* block comment")) + ((= b2 47) ; '*' '/' + (%skip-ws-and-comments src (%pk-pos q) (%pk-line q) (%pk-col q) + file)) + (else + ;; Re-scan starting at the byte after '*'; the '*' was not + ;; the closer, but the next byte might itself be '*'. + (%skip-block-comment src (%pk-pos p) (%pk-line p) (%pk-col p) + file start-line start-col))))) + (else + (%skip-block-comment src (%pk-pos p) (%pk-line p) (%pk-col p) + file start-line start-col))))) + +;; -------------------------------------------------------------------- +;; Identifier / keyword reader. +;; +;; Returns (tok npos nline ncol). Caller has already verified that the +;; first byte at `pos` satisfies %ident-start?. +;; -------------------------------------------------------------------- +(define (%collect-ident src pos line col) + ;; Walk %ident-cont? bytes; return (chunks-rev npos nline ncol) where + ;; chunks-rev is a reversed list of one-byte bvs. We reuse make-tok + ;; loc from the caller. + (let loop ((pos pos) (line line) (col col) (acc '())) + (let* ((p (%lex-peek src pos line col)) + (b (%pk-byte p))) + (if (and b (%ident-cont? b)) + (loop (%pk-pos p) (%pk-line p) (%pk-col p) + (cons (bv-of-byte b) acc)) + (list (reverse acc) pos line col))))) + +(define (lex-read-ident src pos file) + ;; Public for tests. Threads line/col from a fresh start. + (%lex-read-ident src pos 1 (+ pos 1) file)) + +(define (%lex-read-ident src pos line col file) + (let* ((start-loc (%loc file line col)) + (r (%collect-ident src pos line col)) + (chs (car r)) + (npos (car (cdr r))) + (nline (car (cdr (cdr r)))) + (ncol (car (cdr (cdr (cdr r))))) + (name (bv-cat chs)) + (kw (alist-ref name %keyword-alist))) + (cons (if kw + (make-tok 'KW kw start-loc) + (make-tok 'IDENT name start-loc)) + (list npos nline ncol)))) + +;; -------------------------------------------------------------------- +;; Number reader. +;; +;; Decimal: [1-9][0-9]* (suffix: u U l L ll LL combinations) +;; Hex: 0x[0-9a-fA-F]+ | 0X... +;; Octal: 0[0-7]* +;; Float: anything looking like 1.0, 1e3, .5 → die crisply. +;; +;; Returns (tok npos nline ncol) on success. Aborts via `die` on float. ;; -------------------------------------------------------------------- (define (lex-read-number src pos file) - (error "TBD: lex-read-number")) + (%lex-read-number src pos 1 (+ pos 1) file)) + +(define (%collect-while pred src pos line col) + ;; Generic byte collector. Returns (chunks-rev-list npos nline ncol). + (let loop ((pos pos) (line line) (col col) (acc '())) + (let* ((p (%lex-peek src pos line col)) + (b (%pk-byte p))) + (if (and b (pred b)) + (loop (%pk-pos p) (%pk-line p) (%pk-col p) + (cons b acc)) + (list (reverse acc) pos line col))))) + +(define (%bv-of-bytes bs) + ;; Build a fresh bv from a list of byte ints. Allocates one bv. + (let* ((n (length bs)) + (out (make-bytevector n 0))) + (let loop ((i 0) (xs bs)) + (cond ((null? xs) out) + (else (bytevector-u8-set! out i (car xs)) + (loop (+ i 1) (cdr xs))))))) +(define (%digits-value bs base) + ;; Convert list of ASCII digit bytes (already validated) to fixnum + ;; under the given base. Big-endian (leftmost = most significant). + (let loop ((xs bs) (acc 0)) + (cond ((null? xs) acc) + (else + (let* ((b (car xs)) + (d (cond ((%digit? b) (- b 48)) + ((if (< b 65) #f (if (< 70 b) #f #t)) + (+ (- b 65) 10)) + ((if (< b 97) #f (if (< 102 b) #f #t)) + (+ (- b 97) 10)) + (else 0)))) + (loop (cdr xs) (+ (* acc base) d))))))) + +(define (%lex-read-number src pos line col file) + (let* ((start-loc (%loc file line col)) + (p (%lex-peek src pos line col)) + (b (%pk-byte p))) + (cond + ;; '0x' / '0X' hex prefix + ((and (= b 48) + (let* ((q (%lex-peek src (%pk-pos p) (%pk-line p) (%pk-col p))) + (b2 (%pk-byte q))) + (and b2 (or (= b2 120) (= b2 88))))) ; 'x' or 'X' + (let* ((q (%lex-peek src (%pk-pos p) (%pk-line p) (%pk-col p))) + (after-x-pos (%pk-pos q)) + (after-x-line (%pk-line q)) + (after-x-col (%pk-col q)) + (r (%collect-while %hex? src after-x-pos after-x-line after-x-col)) + (digs (car r)) + (pos2 (car (cdr r))) + (line2 (car (cdr (cdr r)))) + (col2 (car (cdr (cdr (cdr r)))))) + (if (null? digs) + (die start-loc "expected hex digits after 0x") + (let* ((after (%lex-strip-int-suffix src pos2 line2 col2 file))) + (cons (make-tok 'INT (%digits-value digs 16) start-loc) + after))))) + ;; '0' alone → octal sequence (could be just zero) + ((= b 48) + (let* ((r (%collect-while %octal? src (%pk-pos p) (%pk-line p) (%pk-col p))) + (digs (car r)) + (pos2 (car (cdr r))) + (line2 (car (cdr (cdr r)))) + (col2 (car (cdr (cdr (cdr r)))))) + ;; Reject '.' / 'e' / 'E' immediately after the octal run — float. + (%check-no-float src pos2 line2 col2 file start-loc) + ;; Reject stray digits 8/9 in an octal context (e.g. 089). + (let* ((p3 (%lex-peek src pos2 line2 col2)) + (b3 (%pk-byte p3))) + (if (and b3 (%digit? b3)) + (die start-loc "invalid octal digit" (bv-of-byte b3)) + (let* ((after (%lex-strip-int-suffix src pos2 line2 col2 file)) + (val (%digits-value digs 8))) + (cons (make-tok 'INT val start-loc) after)))))) + ;; '1'-'9' → decimal + ((%digit? b) + (let* ((r (%collect-while %digit? src pos line col)) + (digs (car r)) + (pos2 (car (cdr r))) + (line2 (car (cdr (cdr r)))) + (col2 (car (cdr (cdr (cdr r)))))) + (%check-no-float src pos2 line2 col2 file start-loc) + (let* ((after (%lex-strip-int-suffix src pos2 line2 col2 file)) + (val (%digits-value digs 10))) + (cons (make-tok 'INT val start-loc) after)))) + ;; '.' followed by a digit = float-style literal — reject. + ((= b 46) + (let* ((q (%lex-peek src (%pk-pos p) (%pk-line p) (%pk-col p))) + (b2 (%pk-byte q))) + (if (and b2 (%digit? b2)) + (die start-loc "floating-point literal not supported") + ;; Otherwise '.' was a punctuator — caller wouldn't have + ;; routed here unless it was a digit-led prefix. + (die start-loc "internal: number reader on non-number")))) + (else + (die start-loc "internal: number reader on non-number"))))) + +(define (%check-no-float src pos line col file start-loc) + ;; If the byte at pos starts a fractional/exponent part, abort. + (let* ((p (%lex-peek src pos line col)) + (b (%pk-byte p))) + (cond + ((not b) #t) + ((= b 46) ; '.' + (die start-loc "floating-point literal not supported")) + ((or (= b 101) (= b 69)) ; 'e' / 'E' + ;; Only a float exponent if followed by [+-]?digit. + (let* ((q (%lex-peek src (%pk-pos p) (%pk-line p) (%pk-col p))) + (b2 (%pk-byte q))) + (cond + ((and b2 (%digit? b2)) + (die start-loc "floating-point literal not supported")) + ((and b2 (or (= b2 43) (= b2 45))) + (let* ((r (%lex-peek src (%pk-pos q) (%pk-line q) (%pk-col q))) + (b3 (%pk-byte r))) + (if (and b3 (%digit? b3)) + (die start-loc "floating-point literal not supported") + #t))) + (else #t)))) + (else #t)))) + +(define (%lex-strip-int-suffix src pos line col file) + ;; Consume any combination of u U l L (the long can be doubled). We + ;; don't validate orderings strictly; tcc.c uses the canonical forms. + ;; Returns (npos nline ncol). + (let loop ((pos pos) (line line) (col col)) + (let* ((p (%lex-peek src pos line col)) + (b (%pk-byte p))) + (cond + ((not b) (list pos line col)) + ((or (= b 117) (= b 85) ; u U + (= b 108) (= b 76)) ; l L + (loop (%pk-pos p) (%pk-line p) (%pk-col p))) + (else (list pos line col)))))) + +;; -------------------------------------------------------------------- +;; String reader. +;; +;; Caller has verified src[pos] == '"' (raw byte 34). Returns +;; (tok npos nline ncol) with the raw decoded bytes (no NUL appended). +;; -------------------------------------------------------------------- (define (lex-read-string src pos file) - (error "TBD: lex-read-string")) + (%lex-read-string src pos 1 (+ pos 1) file)) + +(define (%lex-read-string src pos line col file) + (let* ((start-loc (%loc file line col)) + ;; pos points to the opening quote — step over it first. + (p0 (%lex-peek src pos line col)) + (b0 (%pk-byte p0))) + (if (not (and b0 (= b0 34))) + (die start-loc "internal: string reader on non-quote") + (%collect-string src (%pk-pos p0) (%pk-line p0) (%pk-col p0) + file start-loc '())))) +(define (%collect-string src pos line col file start-loc acc) + (let* ((p (%lex-peek src pos line col)) + (b (%pk-byte p))) + (cond + ((not b) + (die start-loc "unterminated string literal")) + ((= b 34) ; closing '"' + (let ((bv (%bv-of-bytes (reverse acc)))) + (cons (make-tok 'STR bv start-loc) + (list (%pk-pos p) (%pk-line p) (%pk-col p))))) + ((%newline? b) + (die start-loc "newline in string literal")) + ((= b 92) ; '\\' — escape + (%read-escape src (%pk-pos p) (%pk-line p) (%pk-col p) + file start-loc acc 'string)) + (else + (%collect-string src (%pk-pos p) (%pk-line p) (%pk-col p) + file start-loc (cons b acc)))))) + +;; -------------------------------------------------------------------- +;; Char reader. +;; +;; Caller has verified src[pos] == '\''. Multi-character constants +;; ('AB') are rejected via die. +;; -------------------------------------------------------------------- (define (lex-read-char src pos file) - (error "TBD: lex-read-char")) + (%lex-read-char src pos 1 (+ pos 1) file)) -(define (lex-read-ident src pos file) - ;; Produces IDENT or KW after consulting %keyword-alist (cc/data.scm). - (error "TBD: lex-read-ident")) +(define (%lex-read-char src pos line col file) + (let* ((start-loc (%loc file line col)) + (p0 (%lex-peek src pos line col)) + (b0 (%pk-byte p0))) + (if (not (and b0 (= b0 39))) + (die start-loc "internal: char reader on non-quote") + (%collect-char src (%pk-pos p0) (%pk-line p0) (%pk-col p0) + file start-loc)))) + +(define (%collect-char src pos line col file start-loc) + ;; Read exactly one byte (handling escapes), then expect closing '\''. + (let* ((p (%lex-peek src pos line col)) + (b (%pk-byte p))) + (cond + ((not b) (die start-loc "unterminated char literal")) + ((= b 39) (die start-loc "empty char literal")) + ((%newline? b) (die start-loc "newline in char literal")) + ((= b 92) ; escape + (let* ((r (%read-escape src (%pk-pos p) (%pk-line p) (%pk-col p) + file start-loc '() 'char)) + ;; r is a list (val npos nline ncol) for char-mode escapes. + (val (car r)) + (pos2 (car (cdr r))) + (line2 (car (cdr (cdr r)))) + (col2 (car (cdr (cdr (cdr r)))))) + (%expect-char-close src pos2 line2 col2 file start-loc val))) + (else + (%expect-char-close src (%pk-pos p) (%pk-line p) (%pk-col p) + file start-loc b))))) + +(define (%expect-char-close src pos line col file start-loc val) + (let* ((p (%lex-peek src pos line col)) + (b (%pk-byte p))) + (cond + ((not b) (die start-loc "unterminated char literal")) + ((= b 39) + (cons (make-tok 'CHAR val start-loc) + (list (%pk-pos p) (%pk-line p) (%pk-col p)))) + (else + (die start-loc "multi-character char constant not supported"))))) + +;; -------------------------------------------------------------------- +;; Escape sequence reader. +;; +;; mode = 'string : returns (%collect-string ...) tail (recurse) +;; mode = 'char : returns (val npos nline ncol) +;; -------------------------------------------------------------------- +(define (%read-escape src pos line col file start-loc acc mode) + (let* ((p (%lex-peek src pos line col)) + (b (%pk-byte p))) + (cond + ((not b) (die start-loc "unterminated escape sequence")) + ;; \xNN — 1+ hex digits (tcc.c uses 1- and 2-digit forms). + ((or (= b 120) (= b 88)) ; 'x' / 'X' + (let* ((r (%collect-while %hex? src (%pk-pos p) (%pk-line p) (%pk-col p))) + (digs (car r)) + (pos2 (car (cdr r))) + (line2 (car (cdr (cdr r)))) + (col2 (car (cdr (cdr (cdr r)))))) + (if (null? digs) + (die start-loc "expected hex digits after \\x") + (let ((val (bit-and (%digits-value digs 16) 255))) + (%escape-result mode val pos2 line2 col2 + src file start-loc acc))))) + ;; \NNN — 1..3 octal digits. + ((%octal? b) + (let* ((r (%collect-octals src pos line col 3 '())) + (digs (car r)) + (pos2 (car (cdr r))) + (line2 (car (cdr (cdr r)))) + (col2 (car (cdr (cdr (cdr r))))) + (val (bit-and (%digits-value digs 8) 255))) + (%escape-result mode val pos2 line2 col2 src file start-loc acc))) + (else + (let ((val (cond ((= b 110) 10) ; n + ((= b 116) 9) ; t + ((= b 114) 13) ; r + ((= b 92) 92) ; \\ + ((= b 39) 39) ; ' + ((= b 34) 34) ; " + ((= b 48) 0) ; 0 (already handled by octal but be safe) + ((= b 97) 7) ; \a -> BEL + ((= b 98) 8) ; \b + ((= b 102) 12) ; \f + ((= b 118) 11) ; \v + ((= b 63) 63) ; \? + (else + (die start-loc "unknown escape" (bv-of-byte b)))))) + (%escape-result mode val (%pk-pos p) (%pk-line p) (%pk-col p) + src file start-loc acc)))))) + +(define (%collect-octals src pos line col k acc) + ;; Collect up to k octal-digit bytes into acc (as a forward list of + ;; ASCII bytes), returning (digits npos nline ncol). + (cond + ((zero? k) (list (reverse acc) pos line col)) + (else + (let* ((p (%lex-peek src pos line col)) + (b (%pk-byte p))) + (if (and b (%octal? b)) + (%collect-octals src (%pk-pos p) (%pk-line p) (%pk-col p) + (- k 1) (cons b acc)) + (list (reverse acc) pos line col)))))) +(define (%escape-result mode val pos line col src file start-loc acc) + (cond + ((eq? mode 'string) + (%collect-string src pos line col file start-loc (cons val acc))) + ((eq? mode 'char) + (list val pos line col)) + (else (die start-loc "internal: bad escape mode")))) + +;; -------------------------------------------------------------------- +;; Punctuator reader. +;; +;; Greedy longest-match against %punct-alist (cc/data.scm). The alist +;; is already ordered longest-first. +;; -------------------------------------------------------------------- (define (lex-read-punct src pos file) - ;; Greedy longest-match against %punct-alist (cc/data.scm). - (error "TBD: lex-read-punct")) + (%lex-read-punct src pos 1 (+ pos 1) file)) + +(define (%lex-read-punct src pos line col file) + (let ((start-loc (%loc file line col))) + (%punct-loop src pos line col file start-loc %punct-alist))) + +(define (%punct-loop src pos line col file start-loc al) + (cond + ((null? al) + (let* ((p (%lex-peek src pos line col))) + (die start-loc "unrecognized byte" + (if (%pk-byte p) (bv-of-byte (%pk-byte p)) "EOF")))) + (else + (let* ((entry (car al)) + (pat (car entry)) + (sym (cdr entry)) + (m (%match-bytes src pos line col pat 0))) + (if m + (cons (make-tok 'PUNCT sym start-loc) m) + (%punct-loop src pos line col file start-loc (cdr al))))))) + +(define (%match-bytes src pos line col pat i) + ;; If the next bytes from (pos line col), in logical-byte stream + ;; order, equal `pat[i..]`, return (npos nline ncol) after the + ;; match. Otherwise #f. + (cond + ((= i (bytevector-length pat)) (list pos line col)) + (else + (let* ((p (%lex-peek src pos line col)) + (b (%pk-byte p))) + (cond + ((not b) #f) + ((= b (bytevector-u8-ref pat i)) + (%match-bytes src (%pk-pos p) (%pk-line p) (%pk-col p) pat (+ i 1))) + (else #f)))))) + +;; -------------------------------------------------------------------- +;; lex-tokenize src file -> list of tok ending in EOF. +;; -------------------------------------------------------------------- +(define (lex-tokenize src file) + (%lex-loop src 0 1 1 file '())) + +(define (%lex-loop src pos line col file acc) + ;; Skip non-newline whitespace + comments first, then dispatch on + ;; the next logical byte. + (let* ((sw (%skip-ws-and-comments src pos line col file)) + (pos1 (car sw)) + (line1 (car (cdr sw))) + (col1 (car (cdr (cdr sw)))) + (p (%lex-peek src pos1 line1 col1)) + (b (%pk-byte p))) + (cond + ;; EOF + ((not b) + (let* ((eof-tok (make-tok 'EOF #f (%loc file line1 col1)))) + (reverse (cons eof-tok acc)))) + ;; Newline → emit NL, keep going. + ((%newline? b) + (let ((nl (make-tok 'NL #f (%loc file line1 col1)))) + (%lex-loop src (%pk-pos p) (%pk-line p) (%pk-col p) + file (cons nl acc)))) + ;; Identifier / keyword + ((%ident-start? b) + (let* ((r (%lex-read-ident src pos1 line1 col1 file)) + (tok (car r)) + (npos (car (cdr r))) + (nline (car (cdr (cdr r)))) + (ncol (car (cdr (cdr (cdr r)))))) + (%lex-loop src npos nline ncol file (cons tok acc)))) + ;; Number (digit start) + ((%digit? b) + (let* ((r (%lex-read-number src pos1 line1 col1 file)) + (tok (car r)) + (npos (car (cdr r))) + (nline (car (cdr (cdr r)))) + (ncol (car (cdr (cdr (cdr r)))))) + (%lex-loop src npos nline ncol file (cons tok acc)))) + ;; '.' might start a number (1.0 actually starts with digit; .5 + ;; would route here). We keep this as a punctuator unless followed + ;; by a digit, in which case the lexer rejects per spec. + ((= b 46) + (let* ((q (%lex-peek src (%pk-pos p) (%pk-line p) (%pk-col p))) + (b2 (%pk-byte q))) + (cond + ((and b2 (%digit? b2)) + (die (%loc file line1 col1) "floating-point literal not supported")) + (else + (let* ((r (%lex-read-punct src pos1 line1 col1 file)) + (tok (car r)) + (npos (car (cdr r))) + (nline (car (cdr (cdr r)))) + (ncol (car (cdr (cdr (cdr r)))))) + (%lex-loop src npos nline ncol file (cons tok acc))))))) + ;; String + ((= b 34) + (let* ((r (%lex-read-string src pos1 line1 col1 file)) + (tok (car r)) + (npos (car (cdr r))) + (nline (car (cdr (cdr r)))) + (ncol (car (cdr (cdr (cdr r)))))) + (%lex-loop src npos nline ncol file (cons tok acc)))) + ;; Char + ((= b 39) + (let* ((r (%lex-read-char src pos1 line1 col1 file)) + (tok (car r)) + (npos (car (cdr r))) + (nline (car (cdr (cdr r)))) + (ncol (car (cdr (cdr (cdr r)))))) + (%lex-loop src npos nline ncol file (cons tok acc)))) + ;; Punctuator (default) + (else + (let* ((r (%lex-read-punct src pos1 line1 col1 file)) + (tok (car r)) + (npos (car (cdr r))) + (nline (car (cdr (cdr r)))) + (ncol (car (cdr (cdr (cdr r)))))) + (%lex-loop src npos nline ncol file (cons tok acc))))))) diff --git a/cc/parse.scm b/cc/parse.scm @@ -1,74 +1,979 @@ -;; cc/parse.scm — recursive-descent + Pratt expression parser. -;; xcc-style direct emit: no AST. Calls cg as it parses. -;; -;; Realization of docs/CC-INTERNALS.md §parse.scm. Conversion- -;; responsibility split per CC-CONTRACTS §4. -;; -;; Owner: <unassigned> - -;; -------------------------------------------------------------------- -;; parse-translation-unit — single entry point -;; ps : pstate with ps-toks set (output of pp-expand) and ps-cg -;; initialized via (cg-init). All other ps-fields start empty. -;; Mutates ps until ps-toks reaches a single (tok kind=EOF). -;; -------------------------------------------------------------------- +;; cc/parse.scm — recursive-descent + Pratt parser. Minimal scheme1. + +(define (make-pstate toks cg) + (%pstate toks (list '()) (list '()) '() #f '() cg)) + +(define (peek ps) (car (ps-toks ps))) +(define (peek2 ps) + (let ((r (cdr (ps-toks ps)))) + (if (null? r) (car (ps-toks ps)) (car r)))) +(define (advance ps) + (let ((t (peek ps))) (ps-toks-set! ps (cdr (ps-toks ps))) t)) +(define (at-kw? ps s) + (let ((t (peek ps))) + (and (eq? (tok-kind t) 'KW) (eq? (tok-value t) s)))) +(define (at-punct? ps s) + (let ((t (peek ps))) + (and (eq? (tok-kind t) 'PUNCT) (eq? (tok-value t) s)))) +(define (expect-kw ps s) + (let ((t (peek ps))) + (if (and (eq? (tok-kind t) 'KW) (eq? (tok-value t) s)) + (advance ps) (die (tok-loc t) "expected kw" s)))) +(define (expect-punct ps s) + (let ((t (peek ps))) + (if (and (eq? (tok-kind t) 'PUNCT) (eq? (tok-value t) s)) + (advance ps) (die (tok-loc t) "expected punct" s)))) + +(define (scope-enter! ps) + (ps-scope-set! ps (cons '() (ps-scope ps))) + (ps-tags-set! ps (cons '() (ps-tags ps)))) +(define (scope-leave! ps) + (ps-scope-set! ps (cdr (ps-scope ps))) + (ps-tags-set! ps (cdr (ps-tags ps)))) +(define (scope-bind! ps n s) + (let* ((f (ps-scope ps)) (top (car f)) (r (cdr f))) + (if (alist-ref n top) (die #f "dup decl" n) + (ps-scope-set! ps (cons (alist-set n s top) r))))) +(define (scope-lookup ps n) + (let loop ((f (ps-scope ps))) + (cond ((null? f) #f) + (else + (let ((v (alist-ref n (car f)))) + (if v v (loop (cdr f)))))))) +(define (tag-bind! ps n c) + (let* ((f (ps-tags ps)) (top (car f)) (r (cdr f))) + (ps-tags-set! ps (cons (alist-set n c top) r)))) +(define (tag-lookup ps n) + (let loop ((f (ps-tags ps))) + (cond ((null? f) #f) + (else (let ((v (alist-ref n (car f)))) + (if v v (loop (cdr f)))))))) +(define (typedef-add! ps n) + (ps-typedefs-set! ps (alist-set n #t (ps-typedefs ps)))) +(define (typedef? ps n) + (if (alist-ref n (ps-typedefs ps)) #t #f)) + +(define (%mk-ptr p) (%ctype 'ptr 8 8 p)) +(define (%mk-arr e n) + (%ctype 'arr (if (< n 0) -1 (* n (ctype-size e))) + (ctype-align e) (cons e n))) +(define (%mk-fn r p v) (%ctype 'fn -1 -1 (list r p v))) +(define (ctype-is-ptr? t) (eq? (ctype-kind t) 'ptr)) +(define (ctype-is-fn? t) (eq? (ctype-kind t) 'fn)) +(define (ctype-is-arr? t) (eq? (ctype-kind t) 'arr)) + +(define (eat-cv-quals! ps) + (cond ((or (at-kw? ps 'const) (at-kw? ps 'volatile) + (at-kw? ps 'restrict)) + (advance ps) (eat-cv-quals! ps)) + (else #t))) + +(define (parse-decl-spec ps) + (let loop ((sto #f) (sn #f) (lg 0) (b #f) (saw #f)) + (let ((t (peek ps))) + (cond + ((or (at-kw? ps 'auto) (at-kw? ps 'register)) + (advance ps) (loop sto sn lg b #t)) + ((at-kw? ps 'static) (advance ps) (loop 'static sn lg b #t)) + ((at-kw? ps 'extern) (advance ps) (loop 'extern sn lg b #t)) + ((at-kw? ps 'typedef) (advance ps) (loop 'typedef sn lg b #t)) + ((or (at-kw? ps 'const) (at-kw? ps 'volatile) + (at-kw? ps 'restrict) (at-kw? ps 'inline)) + (advance ps) (loop sto sn lg b #t)) + ((at-kw? ps 'signed) (advance ps) (loop sto 'signed lg b #t)) + ((at-kw? ps 'unsigned) (advance ps) (loop sto 'unsigned lg b #t)) + ((at-kw? ps 'short) (advance ps) (loop sto sn -1 b #t)) + ((at-kw? ps 'long) (advance ps) (loop sto sn (+ lg 1) b #t)) + ((at-kw? ps 'void) (advance ps) (loop sto sn lg 'void #t)) + ((at-kw? ps 'char) (advance ps) (loop sto sn lg 'char #t)) + ((at-kw? ps 'int) (advance ps) (loop sto sn lg 'int #t)) + ((at-kw? ps '_Bool) (advance ps) (loop sto sn lg 'bool #t)) + ((or (at-kw? ps 'float) (at-kw? ps 'double) + (at-kw? ps '_Complex) (at-kw? ps '_Imaginary)) + (die (tok-loc t) "no float" (tok-value t))) + ((or (at-kw? ps '_Atomic) (at-kw? ps '_Thread_local) + (at-kw? ps '_Alignas) (at-kw? ps '_Generic) + (at-kw? ps '_Alignof) (at-kw? ps '_Static_assert)) + (die (tok-loc t) "rejected" (tok-value t))) + ((at-kw? ps 'struct) + (loop sto sn lg (parse-aggregate-spec ps 'struct) #t)) + ((at-kw? ps 'union) + (loop sto sn lg (parse-aggregate-spec ps 'union) #t)) + ((at-kw? ps 'enum) + (loop sto sn lg (parse-enum-spec ps) #t)) + ((and (not b) (eq? (tok-kind t) 'IDENT) + (typedef? ps (tok-value t))) + (let* ((tk (advance ps)) (sm (scope-lookup ps (tok-value tk)))) + (if (and sm (eq? (sym-kind sm) 'typedef)) + (loop sto sn lg (sym-type sm) #t) + (die (tok-loc tk) "typedef no sym" (tok-value tk))))) + (else + (cond ((not saw) (die (tok-loc t) "expected decl-spec" + (tok-value t))) + (else (cons sto (resolve-base t sn lg b))))))))) + +(define (resolve-base loc sn lg b) + (cond + ((eq? b 'void) + (if (or sn (not (zero? lg))) (die loc "void+qual") %t-void)) + ((eq? b 'bool) + (if (or sn (not (zero? lg))) (die loc "bool+qual") %t-bool)) + ((eq? b 'char) + (cond ((eq? sn 'unsigned) %t-u8) (else %t-i8))) + ((or (eq? b 'int) (and (not b) (or sn (not (zero? lg))))) + (cond ((= lg -1) (if (eq? sn 'unsigned) %t-u16 %t-i16)) + ((= lg 0) (if (eq? sn 'unsigned) %t-u32 %t-i32)) + (else (if (eq? sn 'unsigned) %t-u64 %t-i64)))) + ((ctype? b) + (if (or sn (not (zero? lg))) (die loc "type+qual") b)) + (else (die loc "unknown decl-spec")))) + +(define (parse-aggregate-spec ps kind) + (advance ps) + (let ((tag (cond ((eq? (tok-kind (peek ps)) 'IDENT) + (tok-value (advance ps))) (else #f)))) + (cond + ((at-punct? ps 'lbrace) + (advance ps) + (let* ((ex (and tag (tag-lookup ps tag))) + (ct (cond ((and ex (eq? (ctype-kind ex) kind)) ex) + (else (let ((c (%ctype kind -1 -1 + (list (or tag #f) #f '())))) + (if tag (tag-bind! ps tag c)) c)))) + (fields (parse-struct-fields ps))) + (expect-punct ps 'rbrace) + (complete-agg! ct kind tag fields) ct)) + (tag (let ((ex (tag-lookup ps tag))) + (cond (ex ex) + (else (let ((c (%ctype kind -1 -1 + (list tag #f '())))) + (tag-bind! ps tag c) c))))) + (else (die (tok-loc (peek ps)) "anon agg"))))) + +(define (parse-struct-fields ps) + (let loop ((acc '()) (off 0)) + (cond + ((at-punct? ps 'rbrace) (reverse acc)) + (else + (let ((spec (parse-decl-spec ps))) + (let dl ((acc2 acc) (o2 off)) + (let* ((p (parse-declarator ps (cdr spec))) + (nm (car p)) (ty (cdr p)) + (al (max (ctype-align ty) 1)) + (sz (ctype-size ty)) + (oa (align-up o2 al))) + (cond + ((at-punct? ps 'comma) + (advance ps) + (dl (cons (list nm ty oa) acc2) + (+ oa (max sz 0)))) + ((at-punct? ps 'semi) + (advance ps) + (loop (cons (list nm ty oa) acc2) + (+ oa (max sz 0)))) + (else (die (tok-loc (peek ps)) "field")))))))))) + +(define (complete-agg! ct k tag fs) + (let* ((ma (let m ((xs fs) (a 1)) + (if (null? xs) a + (m (cdr xs) (max a (ctype-align (cadr (car xs)))))))) + (last (let l ((xs fs) (e 0)) + (if (null? xs) e + (let* ((f (car xs)) (off (car (cddr f))) + (sz (ctype-size (cadr f)))) + (l (cdr xs) (max e (+ off (max sz 0)))))))) + (sz (cond ((eq? k 'union) + (let u ((xs fs) (s 0)) + (if (null? xs) s + (u (cdr xs) + (max s (ctype-size (cadr (car xs)))))))) + (else (align-up last ma))))) + (ctype-size-set! ct sz) + (ctype-align-set! ct ma) + (ctype-ext-set! ct (list tag #t fs)))) + +(define (parse-enum-spec ps) + (advance ps) + (let ((tag (cond ((eq? (tok-kind (peek ps)) 'IDENT) + (tok-value (advance ps))) (else #f)))) + (cond + ((at-punct? ps 'lbrace) + (advance ps) + (let ((ct (%ctype 'enum 4 4 (list tag '())))) + (if tag (tag-bind! ps tag ct)) + (let loop ((vs '()) (nv 0)) + (cond + ((at-punct? ps 'rbrace) + (advance ps) + (ctype-ext-set! ct (list tag (reverse vs))) ct) + (else + (let* ((nt (advance ps)) (nm (tok-value nt)) + (val (cond ((at-punct? ps 'assign) + (advance ps) (parse-const-int ps)) + (else nv)))) + (scope-bind! ps nm + (%sym nm 'enum-const #f %t-i32 val)) + (cond ((at-punct? ps 'comma) (advance ps)) + ((at-punct? ps 'rbrace) #t) + (else (die (tok-loc (peek ps)) "enum"))) + (loop (cons (cons nm val) vs) (+ val 1)))))))) + (tag (let ((e (tag-lookup ps tag))) + (cond (e e) + (else (let ((c (%ctype 'enum 4 4 (list tag '())))) + (tag-bind! ps tag c) c))))) + (else (die (tok-loc (peek ps)) "enum"))))) + +(define (parse-const-int ps) + (let ((t (peek ps))) + (cond + ((eq? (tok-kind t) 'INT) (tok-value (advance ps))) + ((and (eq? (tok-kind t) 'PUNCT) (eq? (tok-value t) 'minus)) + (advance ps) (- 0 (parse-const-int ps))) + ((and (eq? (tok-kind t) 'PUNCT) (eq? (tok-value t) 'plus)) + (advance ps) (parse-const-int ps)) + ((eq? (tok-kind t) 'IDENT) + (let ((sm (scope-lookup ps (tok-value t)))) + (cond ((and sm (eq? (sym-kind sm) 'enum-const)) + (advance ps) (sym-slot sm)) + (else (die (tok-loc t) "const?" (tok-value t)))))) + (else (die (tok-loc t) "const?" (tok-value t)))))) + +(define (parse-declarator ps base) + ((cdr (parse-decl-cont ps)) base + (lambda (n t) (cons n t)))) + +(define (parse-decl-cont ps) + (cond + ((at-punct? ps 'star) + (advance ps) (eat-cv-quals! ps) + (let* ((r (parse-decl-cont ps)) (rf (cdr r))) + (cons (car r) (lambda (b k) (rf (%mk-ptr b) k))))) + ((and (at-punct? ps 'lparen) (paren-is-group? ps)) + (advance ps) + (let* ((i (parse-decl-cont ps)) (if- (cdr i))) + (expect-punct ps 'rparen) + (let ((s (parse-decl-suf-cont ps))) + (cons (car i) (lambda (b k) (if- (s b) k)))))) + ((eq? (tok-kind (peek ps)) 'IDENT) + (let* ((tk (advance ps)) (n (tok-value tk)) + (s (parse-decl-suf-cont ps))) + (cons n (lambda (b k) (k n (s b)))))) + (else + (let ((s (parse-decl-suf-cont ps))) + (cons #f (lambda (b k) (k #f (s b)))))))) + +(define (parse-decl-suf-cont ps) + (cond + ((at-punct? ps 'lbrack) + (advance ps) + (let* ((ln (cond ((at-punct? ps 'rbrack) -1) + (else (parse-const-int ps)))) + (_ (expect-punct ps 'rbrack)) + (r (parse-decl-suf-cont ps))) + (lambda (b) (r (%mk-arr b ln))))) + ((at-punct? ps 'lparen) + (advance ps) + (let* ((res (parse-fn-params ps)) + (p (car res)) (v (cdr res))) + (expect-punct ps 'rparen) + (let ((r (parse-decl-suf-cont ps))) + (lambda (b) (r (%mk-fn b p v)))))) + (else (lambda (b) b)))) + +(define (paren-is-group? ps) + (let ((t (peek2 ps))) + (cond + ((eq? (tok-kind t) 'KW) + (let ((v (tok-value t))) + (cond ((or (eq? v 'void) (eq? v 'char) (eq? v 'short) + (eq? v 'int) (eq? v 'long) (eq? v 'signed) + (eq? v 'unsigned) (eq? v '_Bool) + (eq? v 'struct) (eq? v 'union) (eq? v 'enum) + (eq? v 'const) (eq? v 'volatile) + (eq? v 'restrict) (eq? v 'static) + (eq? v 'extern) (eq? v 'register)) #f) + (else #t)))) + ((eq? (tok-kind t) 'IDENT) + (cond ((typedef? ps (tok-value t)) #f) (else #t))) + ((eq? (tok-kind t) 'PUNCT) + (let ((v (tok-value t))) + (cond ((eq? v 'rparen) #f) + ((or (eq? v 'star) (eq? v 'lparen) (eq? v 'lbrack)) #t) + (else #f)))) + (else #f)))) + +(define (parse-fn-params ps) + (cond + ((at-punct? ps 'rparen) (cons '() #f)) + ((and (at-kw? ps 'void) + (eq? (tok-kind (peek2 ps)) 'PUNCT) + (eq? (tok-value (peek2 ps)) 'rparen)) + (advance ps) (cons '() #f)) + (else + (let loop ((acc '())) + (cond + ((at-punct? ps 'ellipsis) + (advance ps) (cons (reverse acc) #t)) + (else + (let* ((sp (parse-decl-spec ps)) + (p (parse-declarator ps (cdr sp))) + (nm (car p)) (ty (cdr p)) + (ty2 (cond ((ctype-is-arr? ty) + (%mk-ptr (car (ctype-ext ty)))) + ((ctype-is-fn? ty) (%mk-ptr ty)) + (else ty)))) + (cond + ((at-punct? ps 'comma) + (advance ps) (loop (cons (cons nm ty2) acc))) + ((at-punct? ps 'rparen) + (cons (reverse (cons (cons nm ty2) acc)) #f)) + (else (die (tok-loc (peek ps)) "param")))))))))) + (define (parse-translation-unit ps) - (error "TBD: parse-translation-unit")) - -;; -------------------------------------------------------------------- -;; Top-level alternates -;; -------------------------------------------------------------------- -(define (parse-decl-or-fn ps) (error "TBD: parse-decl-or-fn")) -(define (parse-decl-spec ps) (error "TBD: parse-decl-spec")) -(define (parse-declarator ps base-type) (error "TBD: parse-declarator")) -(define (parse-init-list ps target-type) (error "TBD: parse-init-list")) -(define (parse-fn-body ps name decl-type) (error "TBD: parse-fn-body")) - -;; -------------------------------------------------------------------- -;; Statements -;; -------------------------------------------------------------------- -(define (parse-stmt ps) (error "TBD: parse-stmt")) -(define (parse-compound-stmt ps) (error "TBD: parse-compound-stmt")) -(define (parse-if-stmt ps) (error "TBD: parse-if-stmt")) -(define (parse-while-stmt ps) (error "TBD: parse-while-stmt")) -(define (parse-do-stmt ps) (error "TBD: parse-do-stmt")) -(define (parse-for-stmt ps) (error "TBD: parse-for-stmt")) -(define (parse-switch-stmt ps) (error "TBD: parse-switch-stmt")) -(define (parse-return-stmt ps) (error "TBD: parse-return-stmt")) -(define (parse-goto-stmt ps) (error "TBD: parse-goto-stmt")) -(define (parse-expr-stmt ps) (error "TBD: parse-expr-stmt")) - -;; -------------------------------------------------------------------- -;; Expressions — Pratt climber -;; -------------------------------------------------------------------- -(define (parse-expr ps) (parse-expr-bp ps 0)) -(define (parse-expr-bp ps min-bp) (error "TBD: parse-expr-bp")) -(define (parse-primary ps) (error "TBD: parse-primary")) -(define (parse-postfix ps) (error "TBD: parse-postfix")) -(define (parse-unary ps) (error "TBD: parse-unary")) -(define (parse-cast-or-unary ps) (error "TBD: parse-cast-or-unary")) - -;; -------------------------------------------------------------------- -;; Token stream — peek / advance / expect (private convention) -;; -------------------------------------------------------------------- -(define (peek ps) (error "TBD: peek")) -(define (peek2 ps) (error "TBD: peek2")) -(define (advance ps) (error "TBD: advance")) -(define (expect-kw ps sym) (error "TBD: expect-kw")) -(define (expect-punct ps sym) (error "TBD: expect-punct")) -(define (at-kw? ps sym) (error "TBD: at-kw?")) -(define (at-punct? ps sym) (error "TBD: at-punct?")) - -;; -------------------------------------------------------------------- -;; Scope helpers -;; -------------------------------------------------------------------- -(define (scope-enter! ps) (error "TBD: scope-enter!")) -(define (scope-leave! ps) (error "TBD: scope-leave!")) -(define (scope-bind! ps name sym) (error "TBD: scope-bind!")) -(define (scope-lookup ps name) (error "TBD: scope-lookup")) - -(define (tag-bind! ps name ctype) (error "TBD: tag-bind!")) -(define (tag-lookup ps name) (error "TBD: tag-lookup")) - -(define (typedef-add! ps name) (error "TBD: typedef-add!")) -(define (typedef? ps name) (error "TBD: typedef?")) + (cond + ((eq? (tok-kind (peek ps)) 'EOF) #t) + (else (parse-decl-or-fn ps) (parse-translation-unit ps)))) + +(define (parse-decl-or-fn ps) + (let* ((sp (parse-decl-spec ps)) + (sto (car sp)) (b (cdr sp))) + (cond + ((at-punct? ps 'semi) (advance ps) 'decl) + (else + (let* ((p (parse-declarator ps b)) + (n (car p)) (t (cdr p))) + (cond + ((and (ctype-is-fn? t) (at-punct? ps 'lbrace)) + (parse-fn-body ps n t) 'fn) + (else + (handle-decl ps sto n t) + (let lp () + (cond + ((at-punct? ps 'comma) + (advance ps) + (let* ((p2 (parse-declarator ps b)) + (n2 (car p2)) (t2 (cdr p2))) + (handle-decl ps sto n2 t2) (lp))) + (else (expect-punct ps 'semi) 'decl)))))))))) + +(define (handle-decl ps sto n ty) + (cond + ((not n) (die #f "no name")) + ((eq? sto 'typedef) + (typedef-add! ps n) + (scope-bind! ps n (%sym n 'typedef #f ty #f))) + ((ctype-is-fn? ty) + (scope-bind! ps n + (%sym n 'fn (or sto 'extern) ty + (bytevector-append "cc__" n)))) + (else + (cond + ((not (ps-fn-ctx ps)) + (let ((sm (%sym n 'var (or sto 'extern) ty + (bytevector-append "cc__" n)))) + (scope-bind! ps n sm) + (cond + ((at-punct? ps 'assign) + (advance ps) + (cg-emit-global (ps-cg ps) sm + (parse-init-list ps ty))) + ((eq? sto 'extern) (cg-emit-extern (ps-cg ps) sm)) + (else (cg-emit-global (ps-cg ps) sm #f))))) + (else + (let* ((sz (max (ctype-size ty) 1)) + (al (max (ctype-align ty) 1)) + (sl (cg-alloc-slot (ps-cg ps) sz al)) + (sm (%sym n 'var (or sto 'auto) ty sl))) + (scope-bind! ps n sm) + (cond + ((at-punct? ps 'assign) + (advance ps) + (cg-push-sym (ps-cg ps) sm) + (parse-expr-bp ps 4) (rval! ps) + (cg-cast (ps-cg ps) ty) + (cg-assign (ps-cg ps)) + (cg-pop (ps-cg ps))) + (else #t)))))))) + +(define (parse-init-list ps ty) + (cond + ((at-punct? ps 'lbrace) + (advance ps) + (let lp ((d 1)) + (cond + ((<= d 0) #f) + (else + (let ((t (advance ps))) + (cond + ((eq? (tok-kind t) 'EOF) (die (tok-loc t) "EOF init")) + ((and (eq? (tok-kind t) 'PUNCT) + (eq? (tok-value t) 'lbrace)) (lp (+ d 1))) + ((and (eq? (tok-kind t) 'PUNCT) + (eq? (tok-value t) 'rbrace)) (lp (- d 1))) + (else (lp d)))))))) + (else (parse-const-int ps) #f))) + +(define (parse-fn-body ps name dt) + (let* ((e (ctype-ext dt)) (ret (car e)) + (par (cadr e)) (var (car (cddr e)))) + (cond ((not (scope-lookup ps name)) + (scope-bind! ps name + (%sym name 'fn 'extern dt + (bytevector-append "cc__" name))))) + (let ((psyms (cg-fn-begin (ps-cg ps) name par ret))) + (ps-fn-ctx-set! ps + (%fn-ctx name ret (map cdr psyms) var '())) + (scope-enter! ps) + (for-each (lambda (p) (scope-bind! ps (car p) (cdr p))) + psyms) + (expect-punct ps 'lbrace) + (parse-cstmt-body ps) + (expect-punct ps 'rbrace) + (scope-leave! ps) + (ps-fn-ctx-set! ps #f) + (cg-fn-end (ps-cg ps))))) + +(define (parse-stmt ps) + (cond + ((at-punct? ps 'lbrace) (parse-cstmt ps)) + ((at-kw? ps 'if) (parse-if-stmt ps)) + ((at-kw? ps 'while) (parse-while-stmt ps)) + ((at-kw? ps 'do) (parse-do-stmt ps)) + ((at-kw? ps 'for) (parse-for-stmt ps)) + ((at-kw? ps 'switch) (parse-switch-stmt ps)) + ((at-kw? ps 'return) (parse-return-stmt ps)) + ((at-kw? ps 'goto) (parse-goto-stmt ps)) + ((at-kw? ps 'break) + (advance ps) (expect-punct ps 'semi) (do-break ps)) + ((at-kw? ps 'continue) + (advance ps) (expect-punct ps 'semi) (do-continue ps)) + ((at-kw? ps 'case) (parse-case-stmt ps)) + ((at-kw? ps 'default) (parse-default-stmt ps)) + ((and (eq? (tok-kind (peek ps)) 'IDENT) + (eq? (tok-kind (peek2 ps)) 'PUNCT) + (eq? (tok-value (peek2 ps)) 'colon)) + (parse-labelled-stmt ps)) + ((stmt-starts-decl? ps) (parse-local-decl ps)) + (else (parse-expr-stmt ps)))) + +(define (stmt-starts-decl? ps) + (let ((t (peek ps))) + (cond + ((eq? (tok-kind t) 'KW) + (let ((v (tok-value t))) + (or (eq? v 'auto) (eq? v 'register) (eq? v 'static) + (eq? v 'extern) (eq? v 'typedef) (eq? v 'const) + (eq? v 'volatile) (eq? v 'restrict) (eq? v 'inline) + (eq? v 'void) (eq? v 'char) (eq? v 'short) (eq? v 'int) + (eq? v 'long) (eq? v 'signed) (eq? v 'unsigned) + (eq? v '_Bool) (eq? v 'struct) (eq? v 'union) + (eq? v 'enum)))) + ((eq? (tok-kind t) 'IDENT) (typedef? ps (tok-value t))) + (else #f)))) + +(define (parse-local-decl ps) + (let* ((sp (parse-decl-spec ps)) + (sto (car sp)) (b (cdr sp))) + (cond + ((at-punct? ps 'semi) (advance ps) #t) + (else + (let lp () + (let* ((p (parse-declarator ps b)) + (n (car p)) (t (cdr p))) + (handle-decl ps sto n t) + (cond ((at-punct? ps 'comma) (advance ps) (lp)) + (else (expect-punct ps 'semi) #t)))))))) + +(define (parse-cstmt ps) + (expect-punct ps 'lbrace) + (scope-enter! ps) + (parse-cstmt-body ps) + (scope-leave! ps) + (expect-punct ps 'rbrace) #t) + +(define (parse-cstmt-body ps) + (cond + ((at-punct? ps 'rbrace) #t) + ((eq? (tok-kind (peek ps)) 'EOF) + (die (tok-loc (peek ps)) "EOF in cstmt")) + (else (parse-stmt ps) (parse-cstmt-body ps)))) + +(define (parse-compound-stmt ps) (parse-cstmt ps)) + +(define (parse-if-stmt ps) + (expect-kw ps 'if) + (expect-punct ps 'lparen) + (parse-expr ps) (rval! ps) + (expect-punct ps 'rparen) + (cg-ifelse (ps-cg ps) + (lambda () (parse-stmt ps)) + (lambda () + (cond ((at-kw? ps 'else) + (advance ps) (parse-stmt ps)) + (else #t))))) + +(define %parse-tag-namer (make-namer "Lp")) + +(define (parse-while-stmt ps) + (expect-kw ps 'while) + (expect-punct ps 'lparen) + (let ((tag (%parse-tag-namer))) + (cg-loop (ps-cg ps) + (lambda () (parse-expr ps) (rval! ps)) + (lambda () + (expect-punct ps 'rparen) + (push-loop-ctx! ps 'while tag #t) + (parse-stmt ps) + (pop-loop-ctx! ps))) #t)) + +(define (parse-do-stmt ps) + (expect-kw ps 'do) + (let ((tag (%parse-tag-namer))) + (push-loop-ctx! ps 'do tag #t) + (cg-loop (ps-cg ps) + (lambda () #t) + (lambda () + (parse-stmt ps) + (expect-kw ps 'while) (expect-punct ps 'lparen) + (parse-expr ps) (rval! ps) + (expect-punct ps 'rparen) (expect-punct ps 'semi) + (cg-unop (ps-cg ps) 'lnot) + (cg-if (ps-cg ps) + (lambda () (cg-break (ps-cg ps) tag))))) + (pop-loop-ctx! ps) #t)) + +(define (parse-for-stmt ps) + (expect-kw ps 'for) (expect-punct ps 'lparen) + (scope-enter! ps) + (cond + ((at-punct? ps 'semi) (advance ps)) + ((stmt-starts-decl? ps) (parse-local-decl ps)) + (else (parse-expr ps) (cg-pop (ps-cg ps)) + (expect-punct ps 'semi))) + (let ((tag (%parse-tag-namer))) + (cg-loop (ps-cg ps) + (lambda () + (cond ((at-punct? ps 'semi) + (cg-push-imm (ps-cg ps) %t-i32 1)) + (else (parse-expr ps) (rval! ps))) + (expect-punct ps 'semi)) + (lambda () + (let ((stk (collect-til-rparen ps))) + (expect-punct ps 'rparen) + (push-loop-ctx! ps 'for tag #t) + (parse-stmt ps) + (pop-loop-ctx! ps) + (cond + ((null? stk) #t) + (else + (let ((sv (ps-toks ps))) + (ps-toks-set! ps + (append stk (list (make-tok 'EOF #f #f)))) + (parse-expr ps) (cg-pop (ps-cg ps)) + (ps-toks-set! ps sv)))))))) + (scope-leave! ps) #t) + +(define (collect-til-rparen ps) + (let loop ((acc '()) (d 0)) + (let ((t (peek ps))) + (cond + ((eq? (tok-kind t) 'EOF) + (die (tok-loc t) "EOF in for-step")) + ((and (zero? d) (eq? (tok-kind t) 'PUNCT) + (eq? (tok-value t) 'rparen)) (reverse acc)) + (else + (let ((nt (advance ps))) + (loop (cons nt acc) + (cond ((not (eq? (tok-kind nt) 'PUNCT)) d) + ((eq? (tok-value nt) 'lparen) (+ d 1)) + ((eq? (tok-value nt) 'rparen) (- d 1)) + (else d))))))))) + +(define (parse-switch-stmt ps) + (expect-kw ps 'switch) (expect-punct ps 'lparen) + (parse-expr ps) (rval! ps) + (expect-punct ps 'rparen) + (let ((sw (cg-switch-begin (ps-cg ps))) (tg (%parse-tag-namer))) + (push-loop-ctx-sw! ps 'switch tg sw) + (parse-stmt ps) + (pop-loop-ctx! ps) + (cg-switch-end (ps-cg ps) sw))) + +(define (parse-case-stmt ps) + (expect-kw ps 'case) + (let ((v (parse-const-int ps))) + (expect-punct ps 'colon) + (cg-switch-case (ps-cg ps) (innermost-sw ps) v) + (parse-stmt ps))) + +(define (parse-default-stmt ps) + (expect-kw ps 'default) (expect-punct ps 'colon) + (cg-switch-default (ps-cg ps) (innermost-sw ps)) + (parse-stmt ps)) + +(define (parse-return-stmt ps) + (expect-kw ps 'return) + (cond + ((at-punct? ps 'semi) (advance ps) (cg-return (ps-cg ps))) + (else + (parse-expr ps) (rval! ps) + (let ((fc (ps-fn-ctx ps))) + (cond + ((and fc (not (eq? (ctype-kind (fn-ctx-return-type fc)) 'void))) + (cg-cast (ps-cg ps) (fn-ctx-return-type fc))) + (else #t))) + (cg-return (ps-cg ps)) + (expect-punct ps 'semi)))) + +(define (parse-goto-stmt ps) + (expect-kw ps 'goto) + (let ((t (advance ps))) + (cond ((eq? (tok-kind t) 'IDENT) + (cg-break (ps-cg ps) + (bytevector-append "user_" (tok-value t)))) + (else (die (tok-loc t) "label?")))) + (expect-punct ps 'semi)) + +(define (parse-labelled-stmt ps) + (advance ps) (expect-punct ps 'colon) (parse-stmt ps)) + +(define (parse-expr-stmt ps) + (cond + ((at-punct? ps 'semi) (advance ps) #t) + (else (parse-expr ps) (cg-pop (ps-cg ps)) + (expect-punct ps 'semi)))) + +(define (push-loop-ctx! ps k tg hc) + (ps-loops-set! ps (cons (%loop-ctx k tg hc) (ps-loops ps)))) +(define (push-loop-ctx-sw! ps k tg sw) + (ps-loops-set! ps + (cons (%loop-ctx k (cons tg sw) #f) (ps-loops ps)))) +(define (pop-loop-ctx! ps) + (ps-loops-set! ps (cdr (ps-loops ps)))) +(define (do-break ps) + (let ((c (innermost-loop ps))) + (cond + ((not c) (die #f "break outside")) + ((eq? (loop-ctx-kind c) 'switch) + (cg-break (ps-cg ps) (car (loop-ctx-tag c)))) + (else (cg-break (ps-cg ps) (loop-ctx-tag c)))))) +(define (do-continue ps) + (let ((c (innermost-cont ps))) + (cond ((not c) (die #f "cont outside")) + (else (cg-continue (ps-cg ps) (loop-ctx-tag c)))))) +(define (innermost-loop ps) + (cond ((null? (ps-loops ps)) #f) (else (car (ps-loops ps))))) +(define (innermost-cont ps) + (let lp ((xs (ps-loops ps))) + (cond ((null? xs) #f) + ((eq? (loop-ctx-kind (car xs)) 'switch) (lp (cdr xs))) + (else (car xs))))) +(define (innermost-sw ps) + (let lp ((xs (ps-loops ps))) + (cond ((null? xs) (die #f "case outside switch")) + ((eq? (loop-ctx-kind (car xs)) 'switch) + (cdr (loop-ctx-tag (car xs)))) + (else (lp (cdr xs)))))) + +(define %binop-bp + (list + (cons 'assign (cons 4 3)) (cons 'plus-eq (cons 4 3)) + (cons 'minus-eq (cons 4 3)) (cons 'star-eq (cons 4 3)) + (cons 'slash-eq (cons 4 3)) (cons 'pct-eq (cons 4 3)) + (cons 'shl-eq (cons 4 3)) (cons 'shr-eq (cons 4 3)) + (cons 'amp-eq (cons 4 3)) (cons 'caret-eq (cons 4 3)) + (cons 'bar-eq (cons 4 3)) (cons 'qmark (cons 6 5)) + (cons 'lor (cons 10 11)) (cons 'land (cons 20 21)) + (cons 'bar (cons 30 31)) (cons 'caret (cons 40 41)) + (cons 'amp (cons 50 51)) + (cons 'eq2 (cons 60 61)) (cons 'ne (cons 60 61)) + (cons 'lt (cons 70 71)) (cons 'le (cons 70 71)) + (cons 'gt (cons 70 71)) (cons 'ge (cons 70 71)) + (cons 'shl (cons 80 81)) (cons 'shr (cons 80 81)) + (cons 'plus (cons 90 91)) (cons 'minus (cons 90 91)) + (cons 'star (cons 100 101)) (cons 'slash (cons 100 101)) + (cons 'pct (cons 100 101)))) + +(define (binop-bp-of s) (alist-ref/eq s %binop-bp)) + +(define (punct-to-cgop s) + (cond ((eq? s 'plus) 'add) ((eq? s 'minus) 'sub) + ((eq? s 'star) 'mul) ((eq? s 'slash) 'div) + ((eq? s 'pct) 'rem) ((eq? s 'amp) 'and) + ((eq? s 'bar) 'or) ((eq? s 'caret) 'xor) + ((eq? s 'shl) 'shl) ((eq? s 'shr) 'shr) + ((eq? s 'eq2) 'eq) ((eq? s 'ne) 'ne) + ((eq? s 'lt) 'lt) ((eq? s 'le) 'le) + ((eq? s 'gt) 'gt) ((eq? s 'ge) 'ge) + (else (die #f "binop" s)))) + +(define (compound-op s) + (cond ((eq? s 'plus-eq) 'add) ((eq? s 'minus-eq) 'sub) + ((eq? s 'star-eq) 'mul) ((eq? s 'slash-eq) 'div) + ((eq? s 'pct-eq) 'rem) ((eq? s 'shl-eq) 'shl) + ((eq? s 'shr-eq) 'shr) ((eq? s 'amp-eq) 'and) + ((eq? s 'caret-eq) 'xor) ((eq? s 'bar-eq) 'or) + (else #f))) + +(define (parse-expr ps) (parse-expr-bp ps 0)) + +(define (parse-expr-bp ps mn) + (parse-unary ps) (parse-binary-rhs ps mn)) + +(define (parse-binary-rhs ps mn) + (let ((t (peek ps))) + (cond + ((not (eq? (tok-kind t) 'PUNCT)) #t) + (else + (let ((bp (binop-bp-of (tok-value t)))) + (cond + ((not bp) #t) + ((< (car bp) mn) #t) + (else + (let ((op (tok-value t)) (rb (cdr bp))) + (advance ps) + (cond + ((eq? op 'assign) + (parse-expr-bp ps rb) (rval! ps) + (cg-assign (ps-cg ps))) + ((compound-op op) + (let ((b (compound-op op))) + (cg-take-addr (ps-cg ps)) + (cg-push-deref (ps-cg ps)) + (cg-load (ps-cg ps)) + (parse-expr-bp ps rb) (rval! ps) + (cg-arith-conv (ps-cg ps)) + (cg-binop (ps-cg ps) b) + (cg-assign (ps-cg ps)))) + ((eq? op 'qmark) + (rval! ps) + (cg-ifelse (ps-cg ps) + (lambda () + (parse-expr-bp ps 0) (rval! ps)) + (lambda () + (expect-punct ps 'colon) + (parse-expr-bp ps rb) (rval! ps)))) + ((eq? op 'land) + (rval! ps) + (cg-ifelse (ps-cg ps) + (lambda () + (parse-expr-bp ps rb) (rval! ps)) + (lambda () + (cg-push-imm (ps-cg ps) %t-i32 0)))) + ((eq? op 'lor) + (rval! ps) + (cg-ifelse (ps-cg ps) + (lambda () + (cg-push-imm (ps-cg ps) %t-i32 1)) + (lambda () + (parse-expr-bp ps rb) (rval! ps)))) + (else + (rval! ps) (cg-promote (ps-cg ps)) + (parse-expr-bp ps rb) (rval! ps) + (cg-promote (ps-cg ps)) + (cg-arith-conv (ps-cg ps)) + (cg-binop (ps-cg ps) (punct-to-cgop op)))) + (parse-binary-rhs ps mn))))))))) + +(define (parse-unary ps) + (let ((t (peek ps))) + (cond + ((eq? (tok-kind t) 'PUNCT) + (let ((v (tok-value t))) + (cond + ((eq? v 'amp) + (advance ps) (parse-unary ps) + (cg-take-addr (ps-cg ps))) + ((eq? v 'star) + (advance ps) (parse-unary ps) (rval! ps) + (cg-push-deref (ps-cg ps))) + ((eq? v 'plus) + (advance ps) (parse-unary ps) + (rval! ps) (cg-promote (ps-cg ps))) + ((eq? v 'minus) + (advance ps) (parse-unary ps) + (rval! ps) (cg-promote (ps-cg ps)) + (cg-unop (ps-cg ps) 'neg)) + ((eq? v 'tilde) + (advance ps) (parse-unary ps) + (rval! ps) (cg-promote (ps-cg ps)) + (cg-unop (ps-cg ps) 'bnot)) + ((eq? v 'bang) + (advance ps) (parse-unary ps) (rval! ps) + (cg-unop (ps-cg ps) 'lnot)) + ((eq? v 'inc) (advance ps) (parse-unary ps) + (cg-take-addr (ps-cg ps)) (cg-push-deref (ps-cg ps)) + (cg-load (ps-cg ps)) + (cg-push-imm (ps-cg ps) %t-i32 1) + (cg-binop (ps-cg ps) 'add) (cg-assign (ps-cg ps))) + ((eq? v 'dec) (advance ps) (parse-unary ps) + (cg-take-addr (ps-cg ps)) (cg-push-deref (ps-cg ps)) + (cg-load (ps-cg ps)) + (cg-push-imm (ps-cg ps) %t-i32 1) + (cg-binop (ps-cg ps) 'sub) (cg-assign (ps-cg ps))) + ((eq? v 'lparen) (parse-cast-or-unary ps)) + (else (parse-postfix ps))))) + ((and (eq? (tok-kind t) 'KW) (eq? (tok-value t) 'sizeof)) + (advance ps) + (cond + ((at-punct? ps 'lparen) + (advance ps) + (cond + ((token-is-decl? ps) + (let* ((sp (parse-decl-spec ps)) + (p (parse-declarator ps (cdr sp))) + (ty (cdr p))) + (expect-punct ps 'rparen) + (cg-push-imm (ps-cg ps) %t-u64 + (max (ctype-size ty) 0)))) + (else + (parse-expr ps) (expect-punct ps 'rparen) + (cg-pop (ps-cg ps)) + (cg-push-imm (ps-cg ps) %t-u64 8)))) + (else (parse-unary ps) (cg-pop (ps-cg ps)) + (cg-push-imm (ps-cg ps) %t-u64 8)))) + (else (parse-postfix ps))))) + +(define (token-is-decl? ps) + (let ((t (peek ps))) + (cond + ((eq? (tok-kind t) 'KW) + (let ((v (tok-value t))) + (or (eq? v 'void) (eq? v 'char) (eq? v 'short) (eq? v 'int) + (eq? v 'long) (eq? v 'signed) (eq? v 'unsigned) + (eq? v '_Bool) (eq? v 'struct) (eq? v 'union) + (eq? v 'enum) (eq? v 'const) (eq? v 'volatile) + (eq? v 'restrict) (eq? v 'inline)))) + ((eq? (tok-kind t) 'IDENT) (typedef? ps (tok-value t))) + (else #f)))) + +(define (parse-cast-or-unary ps) + (let ((t (peek2 ps))) + (cond + ((and (eq? (tok-kind t) 'KW) + (let ((v (tok-value t))) + (or (eq? v 'void) (eq? v 'char) (eq? v 'short) + (eq? v 'int) (eq? v 'long) (eq? v 'signed) + (eq? v 'unsigned) (eq? v '_Bool) + (eq? v 'struct) (eq? v 'union) (eq? v 'enum) + (eq? v 'const) (eq? v 'volatile) + (eq? v 'restrict)))) + (advance ps) + (let* ((sp (parse-decl-spec ps)) + (p (parse-declarator ps (cdr sp))) + (ty (cdr p))) + (expect-punct ps 'rparen) + (parse-unary ps) + (cond ((not (ctype-is-ptr? ty)) (rval! ps)) (else #t)) + (cg-cast (ps-cg ps) ty))) + ((and (eq? (tok-kind t) 'IDENT) (typedef? ps (tok-value t))) + (advance ps) + (let* ((sp (parse-decl-spec ps)) + (p (parse-declarator ps (cdr sp))) + (ty (cdr p))) + (expect-punct ps 'rparen) + (parse-unary ps) + (cond ((not (ctype-is-ptr? ty)) (rval! ps)) (else #t)) + (cg-cast (ps-cg ps) ty))) + (else (advance ps) (parse-expr ps) + (expect-punct ps 'rparen) + (parse-postfix-rest ps))))) + +(define (parse-postfix ps) + (parse-primary ps) (parse-postfix-rest ps)) + +(define (parse-postfix-rest ps) + (let lp () + (let ((t (peek ps))) + (cond + ((not (eq? (tok-kind t) 'PUNCT)) #t) + (else + (let ((v (tok-value t))) + (cond + ((eq? v 'lbrack) + (advance ps) (rval! ps) + (parse-expr ps) (rval! ps) + (expect-punct ps 'rbrack) + (cg-binop (ps-cg ps) 'add) + (cg-push-deref (ps-cg ps)) (lp)) + ((eq? v 'lparen) + (advance ps) (rval-not-fn! ps) + (let ((n (parse-call-args ps))) + (expect-punct ps 'rparen) + (cg-call (ps-cg ps) n #t) + (lp))) + ((eq? v 'dot) + (advance ps) (advance ps) + (cg-push-imm (ps-cg ps) %t-i64 0) + (cg-binop (ps-cg ps) 'add) + (cg-push-deref (ps-cg ps)) (lp)) + ((eq? v 'arrow) + (advance ps) (advance ps) (rval! ps) + (cg-push-imm (ps-cg ps) %t-i64 0) + (cg-binop (ps-cg ps) 'add) + (cg-push-deref (ps-cg ps)) (lp)) + ((eq? v 'inc) + (advance ps) + (cg-take-addr (ps-cg ps)) (cg-push-deref (ps-cg ps)) + (cg-load (ps-cg ps)) + (cg-push-imm (ps-cg ps) %t-i32 1) + (cg-binop (ps-cg ps) 'add) + (cg-assign (ps-cg ps)) (lp)) + ((eq? v 'dec) + (advance ps) + (cg-take-addr (ps-cg ps)) (cg-push-deref (ps-cg ps)) + (cg-load (ps-cg ps)) + (cg-push-imm (ps-cg ps) %t-i32 1) + (cg-binop (ps-cg ps) 'sub) + (cg-assign (ps-cg ps)) (lp)) + (else #t)))))))) + +(define (parse-call-args ps) + (cond + ((at-punct? ps 'rparen) 0) + (else + (let lp ((n 0)) + (parse-expr-bp ps 4) (rval! ps) + (let ((m (+ n 1))) + (cond ((at-punct? ps 'comma) (advance ps) (lp m)) + (else m))))))) + +(define (parse-primary ps) + (let ((t (peek ps))) + (cond + ((eq? (tok-kind t) 'INT) + (advance ps) + (cg-push-imm (ps-cg ps) %t-i32 (tok-value t))) + ((eq? (tok-kind t) 'CHAR) + (advance ps) + (cg-push-imm (ps-cg ps) %t-i8 (tok-value t))) + ((eq? (tok-kind t) 'STR) + (advance ps) + (cg-push-string (ps-cg ps) (tok-value t))) + ((eq? (tok-kind t) 'IDENT) + (let ((sm (scope-lookup ps (tok-value t)))) + (advance ps) + (cond + ((not sm) (die (tok-loc t) "undecl" (tok-value t))) + ((eq? (sym-kind sm) 'enum-const) + (cg-push-imm (ps-cg ps) %t-i32 (sym-slot sm))) + (else (cg-push-sym (ps-cg ps) sm))))) + ((eq? (tok-kind t) 'PUNCT) + (cond + ((eq? (tok-value t) 'lparen) + (advance ps) (parse-expr ps) (expect-punct ps 'rparen)) + (else (die (tok-loc t) "unexp" (tok-value t))))) + (else (die (tok-loc t) "unexp" (tok-value t)))))) + +(define (rval! ps) + (let ((tp (cg-top (ps-cg ps)))) + (cond ((and tp (opnd? tp) (opnd-lval? tp)) + (cg-load (ps-cg ps))) + (else #t)))) + +(define (rval-not-fn! ps) + (let ((tp (cg-top (ps-cg ps)))) + (cond ((and tp (opnd? tp) (opnd-lval? tp) + (not (ctype-is-fn? (opnd-type tp)))) + (cg-load (ps-cg ps))) + (else #t)))) diff --git a/cc/pp.scm b/cc/pp.scm @@ -1,46 +1,737 @@ -;; cc/pp.scm — token list → expanded token list. -;; -;; Realization of docs/CC-INTERNALS.md §pp.scm. Hide-set discipline -;; per C11 6.10.3.4. #include is rejected (see CC.md §Toolchain -;; envelope). -;; -;; Owner: <unassigned> - -;; -------------------------------------------------------------------- -;; pp-expand -;; toks : list of tok (output of lex-tokenize) -;; initial-defines : alist (bv . macro) — typically from -D flags -;; -> : list of tok with HASH and NL stripped -;; KW / IDENT / INT / STR / CHAR / PUNCT / EOF only -;; aborts via die on directive errors -;; -------------------------------------------------------------------- +;; cc/pp.scm — token list -> expanded token list. +;; Realizes docs/CC-INTERNALS.md §pp.scm. Hide-set per C11 6.10.3.4. +;; #include rejected (CC.md §Toolchain envelope). + +;; --- helpers (TODO: promote to util.scm if shared more broadly) --- +(define (%pp-bv-mem? x xs) + (cond ((null? xs) #f) + ((bv= x (car xs)) #t) + (else (%pp-bv-mem? x (cdr xs))))) + +(define (%pp-bv-union a b) + (cond ((null? a) b) + ((%pp-bv-mem? (car a) b) (%pp-bv-union (cdr a) b)) + (else (cons (car a) (%pp-bv-union (cdr a) b))))) + +(define (%pp-with-hide t hide) + (%tok (tok-kind t) (tok-value t) (tok-loc t) hide)) +(define (%pp-with-loc t loc) + (%tok (tok-kind t) (tok-value t) loc (tok-hide t))) + +;; --- pp-state (private record) --- +;; cond-stack: list of (active? . has-taken?). Outer-active gating is +;; computed by walking the stack rather than encoding it in frames. +(define-record-type pp-state + (%pp-state macros cond-stack cur-file line-delta) + pp-state? + (macros pps-macros pps-macros-set!) + (cond-stack pps-cond-stack pps-cond-stack-set!) + (cur-file pps-cur-file pps-cur-file-set!) + (line-delta pps-line-delta pps-line-delta-set!)) + +(define (%pp-make-state defs) (%pp-state defs '() #f 0)) + +(define (%pp-active? state) + (let loop ((xs (pps-cond-stack state))) + (cond ((null? xs) #t) + ((not (car (car xs))) #f) + (else (loop (cdr xs)))))) + +;; Active for the *parent* of the top frame (used by elif/else). +(define (%pp-parent-active? state) + (let ((cs (pps-cond-stack state))) + (cond ((null? cs) #t) + (else + (let loop ((xs (cdr cs))) + (cond ((null? xs) #t) + ((not (car (car xs))) #f) + (else (loop (cdr xs))))))))) + +;; --- token classification --- +(define (%pp-eof? t) (eq? (tok-kind t) 'EOF)) +(define (%pp-nl? t) (eq? (tok-kind t) 'NL)) +(define (%pp-hash? t) (eq? (tok-kind t) 'HASH)) +(define (%pp-ident? t) (eq? (tok-kind t) 'IDENT)) +(define (%pp-int? t) (eq? (tok-kind t) 'INT)) +(define (%pp-punct? t pname) + (and (eq? (tok-kind t) 'PUNCT) (eq? (tok-value t) pname))) +(define (%pp-ident-name? t name-bv) + (and (%pp-ident? t) (bv= (tok-value t) name-bv))) +(define (%pp-skip-ws toks) toks) + +;; --- built-in macro names --- +(define %pp-bv-FILE "__FILE__") +(define %pp-bv-LINE "__LINE__") +(define %pp-bv-STDC "__STDC__") +(define %pp-bv-LISPCC "__LISPCC__") +(define %pp-bv-VA-ARGS "__VA_ARGS__") +(define %pp-bv-defined "defined") + +(define (%pp-builtin? name) + (or (bv= name %pp-bv-FILE) (bv= name %pp-bv-LINE) + (bv= name %pp-bv-STDC) (bv= name %pp-bv-LISPCC))) + +(define (%pp-expand-builtin name loc state) + (let* ((file (or (pps-cur-file state) (loc-file loc))) + (line (+ (loc-line loc) (pps-line-delta state))) + (col (loc-col loc)) + (here (%loc file line col))) + (cond + ((bv= name %pp-bv-FILE) (list (%tok 'STR file here '()))) + ((bv= name %pp-bv-LINE) (list (%tok 'INT line here '()))) + ((bv= name %pp-bv-STDC) (list (%tok 'INT 1 here '()))) + ((bv= name %pp-bv-LISPCC) (list (%tok 'INT 1 here '()))) + (else (die loc "internal: not a builtin" name))))) + +;; --- buf-list: simple reversed-list buffer of toks --- +(define-record-type buf-list + (%buf-list xs) + buf-list? + (xs buf-list-xs buf-list-xs-set!)) +(define (make-buf-list) (%buf-list '())) +(define (buf-list-push! b t) (buf-list-xs-set! b (cons t (buf-list-xs b)))) +(define (buf-list-push-many! b ts) + (let loop ((ts ts)) + (cond ((null? ts) #t) + (else (buf-list-push! b (car ts)) (loop (cdr ts)))))) +(define (buf-list-flush b) (reverse (buf-list-xs b))) + +;; --- pp-expand: top-level driver --- (define (pp-expand toks initial-defines) - (error "TBD: pp-expand")) + (let ((state (%pp-make-state initial-defines)) + (out (make-buf-list))) + (let loop ((toks toks)) + (cond + ((null? toks) (die #f "pp-expand: missing EOF token")) + ((%pp-eof? (car toks)) + (cond ((not (null? (pps-cond-stack state))) + (die (tok-loc (car toks)) "unterminated #if/#ifdef/#ifndef")) + (else + (buf-list-push! out (car toks)) + (buf-list-flush out)))) + ((%pp-nl? (car toks)) (loop (cdr toks))) + ((%pp-hash? (car toks)) + (let* ((lr (%pp-take-line (cdr toks))) + (line (car lr)) (rest (cdr lr))) + (%pp-dispatch-directive (car toks) line state out) + (loop rest))) + (else + (let* ((lr (%pp-take-line toks)) + (line (car lr)) (rest (cdr lr))) + (cond ((%pp-active? state) + (%pp-emit-expanded line state out)) + (else #t)) + (loop rest))))))) + +;; Take tokens up to (not including) the next NL or EOF. NL is consumed; +;; EOF is left in the stream so the driver sees it next. +(define (%pp-take-line toks) + (let loop ((toks toks) (acc '())) + (cond + ((null? toks) (cons (reverse acc) toks)) + ((%pp-eof? (car toks)) (cons (reverse acc) toks)) + ((%pp-nl? (car toks)) (cons (reverse acc) (cdr toks))) + (else (loop (cdr toks) (cons (car toks) acc)))))) + +;; --- directive dispatch --- +;; pmatch-based on the directive name bv. bv literals match by equal?. +(define (%pp-dispatch-directive hash-tok line state out) + (let ((line (%pp-skip-ws line))) + (cond + ((null? line) #t) ; bare `#` line — null directive + ((%pp-ident? (car line)) + (let ((name (tok-value (car line))) + (rest (cdr line)) + (loc (tok-loc (car line)))) + (pmatch name + ("define" (cond ((%pp-active? state) (%pp-do-define rest state)) (else #t))) + ("undef" (cond ((%pp-active? state) (%pp-do-undef rest state)) (else #t))) + ("if" (%pp-do-if rest state)) + ("ifdef" (%pp-do-ifdef rest state)) + ("ifndef" (%pp-do-ifndef rest state)) + ("elif" (%pp-do-elif rest state)) + ("else" (%pp-do-else rest state)) + ("endif" (%pp-do-endif rest state)) + ("error" (cond ((%pp-active? state) + (%pp-do-error (cons (car line) rest) state)) + (else #t))) + ("line" (cond ((%pp-active? state) (%pp-do-line rest state)) (else #t))) + ("pragma" (cond ((%pp-active? state) (%pp-do-pragma rest state)) (else #t))) + ("include" (cond ((%pp-active? state) (%pp-do-include rest state)) (else #t))) + (else (die loc "unknown preprocessor directive" name))))) + (else + (die (tok-loc (car line)) "expected directive name after '#'" + (tok-kind (car line))))))) + +;; --- #define --- +;; function-like vs object-like is decided by an immediately-adjacent `(`. +;; "Adjacent" = column of `(` equals column of name + length of name. +(define (%pp-do-define line state) + (cond + ((null? line) (die #f "#define requires a macro name")) + ((not (%pp-ident? (car line))) + (die (tok-loc (car line)) "#define: expected identifier")) + (else + (let* ((nt (car line)) (name (tok-value nt)) (rest (cdr line))) + (cond + ((and (not (null? rest)) + (%pp-punct? (car rest) 'lparen) + (= (loc-col (tok-loc (car rest))) + (+ (loc-col (tok-loc nt)) + (bytevector-length name)))) + (%pp-define-fn name (cdr rest) (tok-loc nt) state)) + (else + (let ((m (%macro 'obj '() rest))) + (pps-macros-set! state (alist-set name m (pps-macros state)))))))))) + +(define (%pp-define-fn name post-lparen nloc state) + (let loop ((toks post-lparen) (params '()) (variadic? #f)) + (cond + ((null? toks) (die nloc "#define: unterminated parameter list")) + ((%pp-punct? (car toks) 'rparen) + (let* ((body (cdr toks)) + (kind (if variadic? 'fn-vararg 'fn)) + (m (%macro kind (reverse params) body))) + (pps-macros-set! state (alist-set name m (pps-macros state))))) + ((%pp-punct? (car toks) 'ellipsis) + (let ((rest (cdr toks))) + (cond + ((null? rest) (die (tok-loc (car toks)) "#define: '...' must precede ')'")) + ((%pp-punct? (car rest) 'rparen) (loop rest params #t)) + (else (die (tok-loc (car rest)) "#define: garbage after '...'"))))) + ((null? params) + (cond + ((%pp-ident? (car toks)) + (loop (cdr toks) (cons (tok-value (car toks)) params) #f)) + (else (die (tok-loc (car toks)) "#define: expected parameter name")))) + (else + (cond + ((%pp-punct? (car toks) 'comma) + (let ((after (cdr toks))) + (cond + ((null? after) (die (tok-loc (car toks)) "#define: trailing ','")) + ((%pp-punct? (car after) 'ellipsis) + (let ((aa (cdr after))) + (cond + ((and (not (null? aa)) (%pp-punct? (car aa) 'rparen)) + (loop aa params #t)) + (else (die (tok-loc (car after)) + "#define: '...' must precede ')'"))))) + ((%pp-ident? (car after)) + (loop (cdr after) (cons (tok-value (car after)) params) #f)) + (else + (die (tok-loc (car after)) + "#define: expected parameter name after ','"))))) + (else (die (tok-loc (car toks)) + "#define: expected ',' or ')' in parameter list"))))))) + +;; --- #undef --- +(define (%pp-do-undef line state) + (cond + ((null? line) (die #f "#undef requires a macro name")) + ((not (%pp-ident? (car line))) + (die (tok-loc (car line)) "#undef: expected identifier")) + (else + (pps-macros-set! state + (%pp-alist-drop (tok-value (car line)) (pps-macros state)))))) + +(define (%pp-alist-drop key al) + (cond ((null? al) '()) + ((bv= (car (car al)) key) (cdr al)) + (else (cons (car al) (%pp-alist-drop key (cdr al)))))) + +;; --- #if / #ifdef / #ifndef / #elif / #else / #endif --- +(define (%pp-do-if line state) + (cond + ((not (%pp-active? state)) + (pps-cond-stack-set! state (cons (cons #f #f) (pps-cond-stack state)))) + (else + (let* ((v (pp-eval-cexpr line (pps-macros state))) + (a? (not (= v 0)))) + (pps-cond-stack-set! state (cons (cons a? a?) (pps-cond-stack state))))))) + +(define (%pp-do-ifdef line state) + (cond + ((not (%pp-active? state)) + (pps-cond-stack-set! state (cons (cons #f #f) (pps-cond-stack state)))) + (else + (let ((d? (%pp-defined? (%pp-name-of-single line) state))) + (pps-cond-stack-set! state + (cons (cons d? d?) (pps-cond-stack state))))))) + +(define (%pp-do-ifndef line state) + (cond + ((not (%pp-active? state)) + (pps-cond-stack-set! state (cons (cons #f #f) (pps-cond-stack state)))) + (else + (let ((a? (not (%pp-defined? (%pp-name-of-single line) state)))) + (pps-cond-stack-set! state + (cons (cons a? a?) (pps-cond-stack state))))))) + +(define (%pp-name-of-single line) + (cond + ((null? line) (die #f "#ifdef/#ifndef: missing identifier")) + ((not (%pp-ident? (car line))) + (die (tok-loc (car line)) "#ifdef/#ifndef: expected identifier")) + (else (tok-value (car line))))) + +(define (%pp-defined? name state) + (or (alist-ref name (pps-macros state)) + (%pp-builtin? name) + #f)) + +(define (%pp-do-elif line state) + (let ((cs (pps-cond-stack state))) + (cond + ((null? cs) (die #f "#elif outside #if")) + (else + (let* ((top (car cs)) (rest (cdr cs)) + (taken? (cdr top)) + (par? (%pp-parent-active? state))) + (cond + ((or (not par?) taken?) + (pps-cond-stack-set! state (cons (cons #f taken?) rest))) + (else + (let* ((v (pp-eval-cexpr line (pps-macros state))) + (a? (not (= v 0)))) + (pps-cond-stack-set! state + (cons (cons a? (or a? taken?)) rest)))))))))) + +(define (%pp-do-else line state) + (let ((cs (pps-cond-stack state))) + (cond + ((null? cs) (die #f "#else outside #if")) + (else + (let* ((top (car cs)) (rest (cdr cs)) + (taken? (cdr top)) + (par? (%pp-parent-active? state))) + (cond + ((not par?) + (pps-cond-stack-set! state (cons (cons #f taken?) rest))) + (taken? + (pps-cond-stack-set! state (cons (cons #f #t) rest))) + (else + (pps-cond-stack-set! state (cons (cons #t #t) rest))))))))) + +(define (%pp-do-endif line state) + (let ((cs (pps-cond-stack state))) + (cond ((null? cs) (die #f "#endif outside #if")) + (else (pps-cond-stack-set! state (cdr cs)))))) + +;; --- #error --- +;; line[0] is the directive name "error"; the rest is the user message. +(define (%pp-do-error line state) + (let* ((msg-toks (if (null? line) '() (cdr line))) + (loc (if (null? line) #f (tok-loc (car line)))) + (msg (%pp-toks->display msg-toks))) + (die loc "#error" msg))) + +(define (%pp-toks->display toks) + (let loop ((toks toks) (acc '()) (first? #t)) + (cond + ((null? toks) (bv-cat (reverse acc))) + (else + (let ((p (%pp-tok->bv (car toks)))) + (loop (cdr toks) + (if first? (cons p acc) (cons p (cons " " acc))) + #f)))))) + +(define (%pp-tok->bv t) + (let ((k (tok-kind t)) (v (tok-value t))) + (cond + ((eq? k 'IDENT) v) + ((eq? k 'INT) (fixnum->bv v 10)) + ((eq? k 'STR) (bytevector-append "\"" (bytevector-append v "\""))) + ((eq? k 'CHAR) (bytevector-append "'" (bytevector-append (bv-of-byte v) "'"))) + ((eq? k 'KW) (symbol->string v)) + ((eq? k 'PUNCT) (symbol->string v)) + (else "?")))) + +;; --- #line / #pragma / #include --- +;; Approximate #line: subsequent toks have line = (orig-line + delta), +;; where delta = (N - here-line - 1). Good enough for most cases. +(define (%pp-do-line line state) + (cond + ((null? line) (die #f "#line requires a line number")) + ((not (%pp-int? (car line))) + (die (tok-loc (car line)) "#line: expected integer")) + (else + (let* ((nt (car line)) (n (tok-value nt)) + (rest (cdr line)) + (here (loc-line (tok-loc nt)))) + (pps-line-delta-set! state (- n here 1)) + (cond + ((null? rest) #t) + ((eq? (tok-kind (car rest)) 'STR) + (pps-cur-file-set! state (tok-value (car rest)))) + (else (die (tok-loc (car rest)) + "#line: expected string after number"))))))) + +(define (%pp-do-pragma line state) #t) -;; -------------------------------------------------------------------- -;; Constant-expression evaluator for #if / #elif. Tokens are the -;; expression body *after* macro expansion; identifiers that aren't -;; macros are zero per the C standard. -;; -------------------------------------------------------------------- -(define (pp-eval-cexpr toks macros) - (error "TBD: pp-eval-cexpr")) - -;; -------------------------------------------------------------------- -;; Directive handlers — internal. Each takes the token list of one -;; logical directive line (between HASH and the next NL) plus a -;; private pp-state record (defined inside this module, not exposed). -;; -------------------------------------------------------------------- -(define (%pp-do-define line state) (error "TBD: %pp-do-define")) -(define (%pp-do-undef line state) (error "TBD: %pp-do-undef")) -(define (%pp-do-if line state) (error "TBD: %pp-do-if")) -(define (%pp-do-ifdef line state) (error "TBD: %pp-do-ifdef")) -(define (%pp-do-ifndef line state) (error "TBD: %pp-do-ifndef")) -(define (%pp-do-elif line state) (error "TBD: %pp-do-elif")) -(define (%pp-do-else line state) (error "TBD: %pp-do-else")) -(define (%pp-do-endif line state) (error "TBD: %pp-do-endif")) -(define (%pp-do-error line state) (error "TBD: %pp-do-error")) -(define (%pp-do-line line state) (error "TBD: %pp-do-line")) -(define (%pp-do-pragma line state) (error "TBD: %pp-do-pragma")) (define (%pp-do-include line state) - ;; Always dies — pre-flatten step is upstream of cc. - (error "TBD: %pp-do-include (must die per CC.md)")) + (die (if (null? line) #f (tok-loc (car line))) + "#include: file inclusion is handled upstream by pre-flatten")) + +;; --- macro expansion engine --- +;; Walk toks; for each IDENT, look up in macros / builtins. Hide-set: +;; if the name is in t.hide, don't expand. Otherwise expand and rescan +;; the produced body, with hide += {name}. +(define (%pp-emit-expanded toks state out) + (let loop ((toks toks)) + (cond + ((null? toks) #t) + (else + (let* ((t (car toks)) (rest (cdr toks))) + (cond + ((not (%pp-ident? t)) + (buf-list-push! out (%pp-relocate t state)) + (loop rest)) + (else + (let ((name (tok-value t))) + (cond + ((%pp-bv-mem? name (tok-hide t)) + (buf-list-push! out (%pp-relocate t state)) + (loop rest)) + ((%pp-builtin? name) + (buf-list-push-many! out + (%pp-expand-builtin name (tok-loc t) state)) + (loop rest)) + (else + (let ((m (alist-ref name (pps-macros state)))) + (cond + ((not m) + (buf-list-push! out (%pp-relocate t state)) + (loop rest)) + (else + (%pp-apply-macro t m rest state out loop)))))))))))))) + +(define (%pp-apply-macro t m rest state out cont) + (let ((kind (macro-kind m)) (name (tok-value t))) + (cond + ((eq? kind 'obj) + (let ((bodies (%pp-prepare-body (macro-body m) + (cons name (tok-hide t))))) + (%pp-emit-expanded bodies state out) + (cont rest))) + (else + (let ((after (%pp-skip-ws rest))) + (cond + ((or (null? after) (not (%pp-punct? (car after) 'lparen))) + (buf-list-push! out (%pp-relocate t state)) + (cont rest)) + (else + (let* ((ar (%pp-collect-args (cdr after) (tok-loc t))) + (args (car ar)) (rest2 (cdr ar)) + (params (macro-params m)) + (variadic? (eq? kind 'fn-vararg)) + (env (%pp-bind-args params args variadic? (tok-loc t))) + (sub (%pp-substitute (macro-body m) env (tok-loc t))) + (bodies (%pp-prepare-body sub + (cons name (tok-hide t))))) + (%pp-emit-expanded bodies state out) + (cont rest2))))))))) + +(define (%pp-prepare-body body extra-hide) + (map (lambda (t) + (%pp-with-hide t (%pp-bv-union extra-hide (tok-hide t)))) + body)) + +;; Collect comma-separated args. `toks` starts AFTER `(`. Returns +;; (args . rest), where args is a list of token-lists. +(define (%pp-collect-args toks call-loc) + (let loop ((toks toks) (depth 0) (cur '()) (args '())) + (cond + ((null? toks) (die call-loc "macro call: unterminated argument list")) + ((%pp-eof? (car toks)) + (die call-loc "macro call: unterminated argument list")) + ((and (= depth 0) (%pp-punct? (car toks) 'rparen)) + (let ((args* + (cond + ((and (null? args) (null? cur)) '()) + (else (reverse (cons (reverse cur) args)))))) + (cons args* (cdr toks)))) + ((and (= depth 0) (%pp-punct? (car toks) 'comma)) + (loop (cdr toks) 0 '() (cons (reverse cur) args))) + ((%pp-punct? (car toks) 'lparen) + (loop (cdr toks) (+ depth 1) (cons (car toks) cur) args)) + ((%pp-punct? (car toks) 'rparen) + (loop (cdr toks) (- depth 1) (cons (car toks) cur) args)) + (else + (loop (cdr toks) depth (cons (car toks) cur) args))))) + +;; Bind formals → token-lists (alist by bv key). Variadic gathers +;; trailing actuals into __VA_ARGS__, joined with synthetic commas. +(define (%pp-bind-args params args variadic? call-loc) + (let* ((np (length params)) (na (length args))) + (cond + (variadic? + (cond + ((< na np) (die call-loc "macro call: too few arguments")) + (else + (let loop ((ps params) (as args) (acc '())) + (cond + ((null? ps) + (alist-set %pp-bv-VA-ARGS (%pp-join-comma as) acc)) + (else + (loop (cdr ps) (cdr as) + (alist-set (car ps) (car as) acc)))))))) + (else + (cond + ((and (= np 0) (= na 1) (null? (car args))) '()) + ((not (= np na)) (die call-loc "macro call: argument count mismatch")) + (else + (let loop ((ps params) (as args) (acc '())) + (cond + ((null? ps) acc) + (else (loop (cdr ps) (cdr as) + (alist-set (car ps) (car as) acc))))))))))) + +(define (%pp-join-comma argss) + (cond + ((null? argss) '()) + ((null? (cdr argss)) (car argss)) + (else + (append (car argss) + (cons (%pp-synth-comma) (%pp-join-comma (cdr argss))))))) + +(define (%pp-synth-comma) + (%tok 'PUNCT 'comma (%loc "<expand>" 0 0) '())) + +;; Body substitution: walk body; replace param IDENTs with arg toks, +;; handle `#param` (stringize) and `a##b` (paste). For v1 we do not +;; pre-expand args before substitution; the rescan after substitution +;; catches the same expansions in practice. +(define (%pp-substitute body env call-loc) + (let loop ((body body) (out '())) + (cond + ((null? body) (reverse out)) + (else + (let ((t (car body)) (rest (cdr body))) + (cond + ((%pp-punct? t 'hash) + (cond + ((or (null? rest) (not (%pp-ident? (car rest)))) + (die (tok-loc t) "stringize: '#' must precede a parameter name")) + (else + (let* ((id (car rest)) (pn (tok-value id)) + (pt (alist-ref pn env))) + (cond + ((not pt) + (die (tok-loc id) "stringize: '#' operand must be a parameter" pn)) + (else + (let ((s (%tok 'STR (%pp-toks->display pt) (tok-loc t) '()))) + (loop (cdr rest) (cons s out))))))))) + ((%pp-punct? t 'paste) + (cond + ((null? out) (die (tok-loc t) "paste: '##' cannot start a body")) + ((null? rest) (die (tok-loc t) "paste: '##' cannot end a body")) + (else + (let* ((lhs (car out)) + (rt (car rest)) + (rhs-list + (cond + ((and (%pp-ident? rt) (alist-ref (tok-value rt) env)) + (alist-ref (tok-value rt) env)) + (else (list rt))))) + (cond + ((null? rhs-list) (loop (cdr rest) out)) + (else + (let* ((p (%pp-paste-tokens lhs (car rhs-list))) + (after (append (cdr rhs-list) (cdr rest)))) + (loop after (cons p (cdr out)))))))))) + ((%pp-ident? t) + (let* ((pn (tok-value t)) (pt (alist-ref pn env))) + (cond + ((not pt) (loop rest (cons t out))) + ((and (not (null? rest)) (%pp-punct? (car rest) 'paste)) + (cond + ((null? pt) (loop (cdr rest) out)) + (else (loop rest (append (reverse pt) out))))) + (else (loop rest (append (reverse pt) out)))))) + (else (loop rest (cons t out))))))))) + +;; Paste two tokens textually; reparse the result. +(define (%pp-paste-tokens lhs rhs) + (let ((lk (tok-kind lhs)) (rk (tok-kind rhs))) + (cond + ((and (eq? lk 'IDENT) (eq? rk 'IDENT)) + (%tok 'IDENT (bytevector-append (tok-value lhs) (tok-value rhs)) + (tok-loc lhs) (%pp-bv-union (tok-hide lhs) (tok-hide rhs)))) + ((and (eq? lk 'IDENT) (eq? rk 'INT)) + (%tok 'IDENT (bytevector-append (tok-value lhs) (fixnum->bv (tok-value rhs) 10)) + (tok-loc lhs) (%pp-bv-union (tok-hide lhs) (tok-hide rhs)))) + ((and (eq? lk 'INT) (eq? rk 'INT)) + (let* ((s (bytevector-append (fixnum->bv (tok-value lhs) 10) + (fixnum->bv (tok-value rhs) 10))) + (pr (bv->fixnum s 10))) + (cond + ((not (car pr)) (die (tok-loc lhs) "paste: cannot reparse as integer" s)) + (else (%tok 'INT (cdr pr) (tok-loc lhs) + (%pp-bv-union (tok-hide lhs) (tok-hide rhs))))))) + (else (die (tok-loc lhs) "paste: unsupported token kinds" lk rk))))) + +(define (%pp-relocate t state) + (cond + ((and (= (pps-line-delta state) 0) (not (pps-cur-file state))) t) + (else + (let* ((l (tok-loc t)) + (f (or (pps-cur-file state) (loc-file l))) + (ln (+ (loc-line l) (pps-line-delta state))) + (c (loc-col l))) + (%pp-with-loc t (%loc f ln c)))))) + +;; --- pp-eval-cexpr: #if expression evaluator --- +;; Steps: resolve `defined NAME`, macro-expand the rest, treat any +;; remaining IDENT as 0, parse with recursive descent. +(define (pp-eval-cexpr toks macros) + (let* ((state (%pp-state macros '() #f 0)) + (s1 (%pp-resolve-defined toks state)) + (s2 (%pp-expand-line s1 state)) + (s3 (%pp-idents-as-zero s2))) + (let* ((p (%pp-cx-expr s3)) + (val (car p)) (rest (cdr p))) + (cond + ((null? rest) val) + (else (die (tok-loc (car rest)) "#if: garbage at end of expression" + (tok-kind (car rest)))))))) + +(define (%pp-expand-line toks state) + (let ((out (make-buf-list))) + (%pp-emit-expanded toks state out) + (buf-list-flush out))) + +(define (%pp-resolve-defined toks state) + (let loop ((toks toks) (acc '())) + (cond + ((null? toks) (reverse acc)) + ((%pp-ident-name? (car toks) %pp-bv-defined) + (let ((rest (cdr toks))) + (cond + ((null? rest) (die (tok-loc (car toks)) "defined: missing operand")) + ((%pp-ident? (car rest)) + (let ((v (if (%pp-defined? (tok-value (car rest)) state) 1 0))) + (loop (cdr rest) + (cons (%tok 'INT v (tok-loc (car toks)) '()) acc)))) + ((%pp-punct? (car rest) 'lparen) + (let ((after (cdr rest))) + (cond + ((or (null? after) (not (%pp-ident? (car after)))) + (die (tok-loc (car toks)) "defined: expected identifier")) + (else + (let ((aa (cdr after))) + (cond + ((or (null? aa) (not (%pp-punct? (car aa) 'rparen))) + (die (tok-loc (car toks)) "defined: expected ')'")) + (else + (let ((v (if (%pp-defined? (tok-value (car after)) state) 1 0))) + (loop (cdr aa) + (cons (%tok 'INT v (tok-loc (car toks)) '()) acc)))))))) )) + (else (die (tok-loc (car rest)) "defined: expected identifier or '('"))))) + (else (loop (cdr toks) (cons (car toks) acc)))))) + +(define (%pp-idents-as-zero toks) + (map (lambda (t) + (cond ((%pp-ident? t) (%tok 'INT 0 (tok-loc t) '())) + (else t))) + toks)) + +;; --- recursive-descent #if expression parser --- +;; Returns (value . rest). +(define (%pp-cx-expr toks) (%pp-cx-cond toks)) + +(define (%pp-cx-cond toks) + (let* ((p (%pp-cx-lor toks)) + (v (car p)) (rest (cdr p))) + (cond + ((and (not (null? rest)) (%pp-punct? (car rest) 'qmark)) + (let* ((p2 (%pp-cx-expr (cdr rest))) + (vt (car p2)) (after (cdr p2))) + (cond + ((or (null? after) (not (%pp-punct? (car after) 'colon))) + (die (if (null? after) #f (tok-loc (car after))) "?: missing ':'")) + (else + (let* ((p3 (%pp-cx-cond (cdr after))) + (vf (car p3)) (rest3 (cdr p3))) + (cons (if (not (= v 0)) vt vf) rest3)))))) + (else (cons v rest))))) + +(define (%pp-cx-binl next ops toks) + (let loop ((p (next toks))) + (let ((v (car p)) (rest (cdr p))) + (cond + ((null? rest) p) + (else + (let* ((tt (car rest)) + (hit (and (eq? (tok-kind tt) 'PUNCT) + (alist-ref/eq (tok-value tt) ops)))) + (cond + ((not hit) p) + (else + (let* ((p2 (next (cdr rest))) + (v2 (car p2)) (rest2 (cdr p2))) + (loop (cons (hit v v2) rest2))))))))))) + +(define (%pp-cx-lor toks) + (%pp-cx-binl %pp-cx-land + (list (cons 'lor (lambda (a b) (if (or (not (= a 0)) (not (= b 0))) 1 0)))) + toks)) +(define (%pp-cx-land toks) + (%pp-cx-binl %pp-cx-bor + (list (cons 'land (lambda (a b) (if (and (not (= a 0)) (not (= b 0))) 1 0)))) + toks)) +(define (%pp-cx-bor toks) (%pp-cx-binl %pp-cx-bxor (list (cons 'bar bit-or)) toks)) +(define (%pp-cx-bxor toks) (%pp-cx-binl %pp-cx-band (list (cons 'caret bit-xor)) toks)) +(define (%pp-cx-band toks) (%pp-cx-binl %pp-cx-eq (list (cons 'amp bit-and)) toks)) +(define (%pp-cx-eq toks) + (%pp-cx-binl %pp-cx-rel + (list (cons 'eq2 (lambda (a b) (if (= a b) 1 0))) + (cons 'ne (lambda (a b) (if (= a b) 0 1)))) + toks)) +(define (%pp-cx-rel toks) + (%pp-cx-binl %pp-cx-shift + (list (cons 'lt (lambda (a b) (if (< a b) 1 0))) + (cons 'le (lambda (a b) (if (<= a b) 1 0))) + (cons 'gt (lambda (a b) (if (> a b) 1 0))) + (cons 'ge (lambda (a b) (if (>= a b) 1 0)))) + toks)) +(define (%pp-cx-shift toks) + (%pp-cx-binl %pp-cx-add + (list (cons 'shl (lambda (a b) (arithmetic-shift a b))) + (cons 'shr (lambda (a b) (arithmetic-shift a (- 0 b))))) + toks)) +(define (%pp-cx-add toks) + (%pp-cx-binl %pp-cx-mul (list (cons 'plus +) (cons 'minus -)) toks)) +(define (%pp-cx-mul toks) + (%pp-cx-binl %pp-cx-unary + (list (cons 'star *) (cons 'slash quotient) (cons 'pct remainder)) + toks)) + +(define (%pp-cx-unary toks) + (cond + ((null? toks) (die #f "#if: unexpected end of expression")) + ((%pp-punct? (car toks) 'plus) (%pp-cx-unary (cdr toks))) + ((%pp-punct? (car toks) 'minus) + (let* ((p (%pp-cx-unary (cdr toks))) (v (car p)) (r (cdr p))) + (cons (- 0 v) r))) + ((%pp-punct? (car toks) 'bang) + (let* ((p (%pp-cx-unary (cdr toks))) (v (car p)) (r (cdr p))) + (cons (if (= v 0) 1 0) r))) + ((%pp-punct? (car toks) 'tilde) + (let* ((p (%pp-cx-unary (cdr toks))) (v (car p)) (r (cdr p))) + (cons (bit-not v) r))) + (else (%pp-cx-primary toks)))) + +(define (%pp-cx-primary toks) + (cond + ((null? toks) (die #f "#if: expected expression")) + ((%pp-int? (car toks)) (cons (tok-value (car toks)) (cdr toks))) + ((eq? (tok-kind (car toks)) 'CHAR) + (cons (tok-value (car toks)) (cdr toks))) + ((%pp-punct? (car toks) 'lparen) + (let* ((p (%pp-cx-expr (cdr toks))) (v (car p)) (r (cdr p))) + (cond + ((or (null? r) (not (%pp-punct? (car r) 'rparen))) + (die (if (null? r) #f (tok-loc (car r))) "#if: missing ')'")) + (else (cons v (cdr r)))))) + (else (die (tok-loc (car toks)) "#if: unexpected token" (tok-kind (car toks)))))) diff --git a/tests/cc-cg/00-fn-empty.expected b/tests/cc-cg/00-fn-empty.expected @@ -0,0 +1,14 @@ +%macro main__SO() +0 +%endm +%fn(cc__main, 16, { +%li(a0, 0) +%st(a0, sp, (+ %main__SO 0)) +%b(&::ret) +::ret +%ld(a0, sp, (+ %main__SO 0)) +}) +# entry stub +%fn(p1_main, 16, { +%call(&cc__main) +}) diff --git a/tests/cc-cg/01-return-imm.expected b/tests/cc-cg/01-return-imm.expected @@ -0,0 +1,14 @@ +%macro main__SO() +0 +%endm +%fn(cc__main, 16, { +%li(a0, 42) +%st(a0, sp, (+ %main__SO 0)) +%b(&::ret) +::ret +%ld(a0, sp, (+ %main__SO 0)) +}) +# entry stub +%fn(p1_main, 16, { +%call(&cc__main) +}) diff --git a/tests/cc-cg/01-return-imm.scm b/tests/cc-cg/01-return-imm.scm @@ -0,0 +1,9 @@ +;; tests/cc-cg/01-return-imm.scm — function returning a non-zero immediate. +;; Verifies cg-push-imm carries the literal value through to %li. + +(let ((cg (cg-init))) + (cg-fn-begin cg "main" '() %t-i32) + (cg-push-imm cg %t-i32 42) + (cg-return cg) + (cg-fn-end cg) + (write-bv-fd 1 (cg-finish cg))) diff --git a/tests/cc-cg/02-one-param.expected b/tests/cc-cg/02-one-param.expected @@ -0,0 +1,17 @@ +%macro main__SO() +0 +%endm +%fn(cc__main, 32, { +%st(a0, sp, (+ %main__SO 8)) +%ld(t0, sp, (+ %main__SO 8)) +%st(t0, sp, (+ %main__SO 16)) +%ld(a0, sp, (+ %main__SO 16)) +%st(a0, sp, (+ %main__SO 0)) +%b(&::ret) +::ret +%ld(a0, sp, (+ %main__SO 0)) +}) +# entry stub +%fn(p1_main, 16, { +%call(&cc__main) +}) diff --git a/tests/cc-cg/02-one-param.scm b/tests/cc-cg/02-one-param.scm @@ -0,0 +1,13 @@ +;; tests/cc-cg/02-one-param.scm — function with one int param, return it. +;; Mirrors the Phase 1 milestone shape (int main(int argc) { return argc; }). +;; Exercises cg-fn-begin's param spill path and cg-push-sym → cg-load. + +(let* ((cg (cg-init)) + (argc (%sym "argc" 'param #f %t-i32 #f)) + (params (cg-fn-begin cg "main" (list argc) %t-i32)) + (argc* (car params))) + (cg-push-sym cg argc*) ; lval frame + (cg-load cg) ; rval i32 + (cg-return cg) + (cg-fn-end cg) + (write-bv-fd 1 (cg-finish cg))) diff --git a/tests/cc-cg/03-two-params.expected b/tests/cc-cg/03-two-params.expected @@ -0,0 +1,18 @@ +%macro main__SO() +0 +%endm +%fn(cc__main, 32, { +%st(a0, sp, (+ %main__SO 8)) +%st(a1, sp, (+ %main__SO 16)) +%ld(t0, sp, (+ %main__SO 8)) +%st(t0, sp, (+ %main__SO 24)) +%ld(a0, sp, (+ %main__SO 24)) +%st(a0, sp, (+ %main__SO 0)) +%b(&::ret) +::ret +%ld(a0, sp, (+ %main__SO 0)) +}) +# entry stub +%fn(p1_main, 16, { +%call(&cc__main) +}) diff --git a/tests/cc-cg/03-two-params.scm b/tests/cc-cg/03-two-params.scm @@ -0,0 +1,15 @@ +;; tests/cc-cg/03-two-params.scm — Phase 1 milestone shape: +;; int main(int argc, char **argv) { return argc; } +;; Both params spill, but only argc is loaded for return. + +(let* ((cg (cg-init)) + (argc (%sym "argc" 'param #f %t-i32 #f)) + (cpp (%ctype 'ptr 8 8 (%ctype 'ptr 8 8 %t-i8))) + (argv (%sym "argv" 'param #f cpp #f)) + (params (cg-fn-begin cg "main" (list argc argv) %t-i32)) + (argc* (car params))) + (cg-push-sym cg argc*) + (cg-load cg) + (cg-return cg) + (cg-fn-end cg) + (write-bv-fd 1 (cg-finish cg))) diff --git a/tests/cc-cg/04-binop-add.expected b/tests/cc-cg/04-binop-add.expected @@ -0,0 +1,18 @@ +%macro main__SO() +0 +%endm +%fn(cc__main, 16, { +%li(a0, 3) +%li(a1, 4) +%add(t0, a0, a1) +%st(t0, sp, (+ %main__SO 8)) +%ld(a0, sp, (+ %main__SO 8)) +%st(a0, sp, (+ %main__SO 0)) +%b(&::ret) +::ret +%ld(a0, sp, (+ %main__SO 0)) +}) +# entry stub +%fn(p1_main, 16, { +%call(&cc__main) +}) diff --git a/tests/cc-cg/04-binop-add.scm b/tests/cc-cg/04-binop-add.scm @@ -0,0 +1,11 @@ +;; tests/cc-cg/04-binop-add.scm — return 3 + 4 as i32. +;; Exercises cg-binop on two imm rvals. + +(let ((cg (cg-init))) + (cg-fn-begin cg "main" '() %t-i32) + (cg-push-imm cg %t-i32 3) + (cg-push-imm cg %t-i32 4) + (cg-binop cg 'add) + (cg-return cg) + (cg-fn-end cg) + (write-bv-fd 1 (cg-finish cg))) diff --git a/tests/cc-cg/05-load-binop-store.expected b/tests/cc-cg/05-load-binop-store.expected @@ -0,0 +1,26 @@ +%macro main__SO() +0 +%endm +%fn(cc__main, 48, { +%st(a0, sp, (+ %main__SO 8)) +%ld(t0, sp, (+ %main__SO 8)) +%st(t0, sp, (+ %main__SO 16)) +%ld(a0, sp, (+ %main__SO 16)) +%li(a1, 5) +%add(t0, a0, a1) +%st(t0, sp, (+ %main__SO 24)) +%ld(a0, sp, (+ %main__SO 24)) +%st(a0, sp, (+ %main__SO 8)) +%st(a0, sp, (+ %main__SO 32)) +%ld(t0, sp, (+ %main__SO 8)) +%st(t0, sp, (+ %main__SO 40)) +%ld(a0, sp, (+ %main__SO 40)) +%st(a0, sp, (+ %main__SO 0)) +%b(&::ret) +::ret +%ld(a0, sp, (+ %main__SO 0)) +}) +# entry stub +%fn(p1_main, 16, { +%call(&cc__main) +}) diff --git a/tests/cc-cg/05-load-binop-store.scm b/tests/cc-cg/05-load-binop-store.scm @@ -0,0 +1,22 @@ +;; tests/cc-cg/05-load-binop-store.scm — store rhs+lhs into an int param. +;; Models: int main(int x) { x = x + 5; return x; } +;; Exercises load → binop → assign → load → return on the same lval. + +(let* ((cg (cg-init)) + (x (%sym "x" 'param #f %t-i32 #f)) + (params (cg-fn-begin cg "main" (list x) %t-i32)) + (x* (car params))) + ;; x = x + 5 + (cg-push-sym cg x*) ; lval (lhs) + (cg-push-sym cg x*) ; lval + (cg-load cg) ; rval + (cg-push-imm cg %t-i32 5) + (cg-binop cg 'add) + (cg-assign cg) ; rhs := x+5; pushes value onto stack + (cg-pop cg) ; discard expression-statement result + ;; return x + (cg-push-sym cg x*) + (cg-load cg) + (cg-return cg) + (cg-fn-end cg) + (write-bv-fd 1 (cg-finish cg))) diff --git a/tests/cc-cg/06-if.expected b/tests/cc-cg/06-if.expected @@ -0,0 +1,20 @@ +%macro main__SO() +0 +%endm +%fn(cc__main, 16, { +%li(t0, 1) +%if_nez(t0, { +%li(a0, 7) +%st(a0, sp, (+ %main__SO 0)) +%b(&::ret) +}) +%li(a0, 0) +%st(a0, sp, (+ %main__SO 0)) +%b(&::ret) +::ret +%ld(a0, sp, (+ %main__SO 0)) +}) +# entry stub +%fn(p1_main, 16, { +%call(&cc__main) +}) diff --git a/tests/cc-cg/06-if.scm b/tests/cc-cg/06-if.scm @@ -0,0 +1,14 @@ +;; tests/cc-cg/06-if.scm — `if (1) return 7;` shape. +;; Exercises cg-if + thunk-driven body emission. + +(let ((cg (cg-init))) + (cg-fn-begin cg "main" '() %t-i32) + (cg-push-imm cg %t-i32 1) + (cg-if cg + (lambda () + (cg-push-imm cg %t-i32 7) + (cg-return cg))) + (cg-push-imm cg %t-i32 0) + (cg-return cg) + (cg-fn-end cg) + (write-bv-fd 1 (cg-finish cg))) diff --git a/tests/cc-cg/07-ifelse.expected b/tests/cc-cg/07-ifelse.expected @@ -0,0 +1,21 @@ +%macro main__SO() +0 +%endm +%fn(cc__main, 16, { +%li(t0, 0) +%ifelse_nez(t0, { +%li(a0, 1) +%st(a0, sp, (+ %main__SO 0)) +%b(&::ret) +}, { +%li(a0, 2) +%st(a0, sp, (+ %main__SO 0)) +%b(&::ret) +}) +::ret +%ld(a0, sp, (+ %main__SO 0)) +}) +# entry stub +%fn(p1_main, 16, { +%call(&cc__main) +}) diff --git a/tests/cc-cg/07-ifelse.scm b/tests/cc-cg/07-ifelse.scm @@ -0,0 +1,14 @@ +;; tests/cc-cg/07-ifelse.scm — if/else returning different constants. + +(let ((cg (cg-init))) + (cg-fn-begin cg "main" '() %t-i32) + (cg-push-imm cg %t-i32 0) + (cg-ifelse cg + (lambda () + (cg-push-imm cg %t-i32 1) + (cg-return cg)) + (lambda () + (cg-push-imm cg %t-i32 2) + (cg-return cg))) + (cg-fn-end cg) + (write-bv-fd 1 (cg-finish cg))) diff --git a/tests/cc-cg/08-while-break-continue.expected b/tests/cc-cg/08-while-break-continue.expected @@ -0,0 +1,20 @@ +%macro main__SO() +0 +%endm +%fn(cc__main, 16, { +%loop_tag(L0, { +%li(t0, 1) +%if_eqz(t0, { %break(L0) }) +}) +%continue(L0) +%break(L0) +%li(a0, 0) +%st(a0, sp, (+ %main__SO 0)) +%b(&::ret) +::ret +%ld(a0, sp, (+ %main__SO 0)) +}) +# entry stub +%fn(p1_main, 16, { +%call(&cc__main) +}) diff --git a/tests/cc-cg/08-while-break-continue.scm b/tests/cc-cg/08-while-break-continue.scm @@ -0,0 +1,20 @@ +;; tests/cc-cg/08-while-break-continue.scm — loop with break and continue. +;; Models: while (n) { if (n) continue; break; } +;; Exercises cg-loop's tag return, cg-break, and cg-continue. + +(let ((cg (cg-init))) + (cg-fn-begin cg "main" '() %t-i32) + (let ((tag (cg-loop + cg + ;; head: condition is a literal 1 (infinite-ish loop). + (lambda () (cg-push-imm cg %t-i32 1)) + ;; body: continue and break referenced by tag below. + (lambda () 0)))) + ;; Demonstrate the helpers — emit one of each then loop-end. + (cg-continue cg tag) + (cg-break cg tag) + (cg-loop-end cg tag)) + (cg-push-imm cg %t-i32 0) + (cg-return cg) + (cg-fn-end cg) + (write-bv-fd 1 (cg-finish cg))) diff --git a/tests/cc-cg/09-call.expected b/tests/cc-cg/09-call.expected @@ -0,0 +1,17 @@ +%macro main__SO() +0 +%endm +%fn(cc__main, 16, { +%li(a0, 3) +%call(&cc__abs) +%st(a0, sp, (+ %main__SO 8)) +%ld(a0, sp, (+ %main__SO 8)) +%st(a0, sp, (+ %main__SO 0)) +%b(&::ret) +::ret +%ld(a0, sp, (+ %main__SO 0)) +}) +# entry stub +%fn(p1_main, 16, { +%call(&cc__main) +}) diff --git a/tests/cc-cg/09-call.scm b/tests/cc-cg/09-call.scm @@ -0,0 +1,13 @@ +;; tests/cc-cg/09-call.scm — call an external one-arg function. +;; Models: extern int abs(int); int main(void) { return abs(3); } + +(let* ((cg (cg-init)) + (fn-ty (%ctype 'fn 8 8 (cons %t-i32 (cons (list %t-i32) #f)))) + (abs-sym (%sym "abs" 'fn 'extern fn-ty #f))) + (cg-fn-begin cg "main" '() %t-i32) + (cg-push-sym cg abs-sym) + (cg-push-imm cg %t-i32 3) + (cg-call cg 1 #t) + (cg-return cg) + (cg-fn-end cg) + (write-bv-fd 1 (cg-finish cg))) diff --git a/tests/cc-cg/10-string.expected b/tests/cc-cg/10-string.expected @@ -0,0 +1,22 @@ +%macro main__SO() +0 +%endm +%fn(cc__main, 16, { +%li(a0, 0) +%st(a0, sp, (+ %main__SO 0)) +%b(&::ret) +::ret +%ld(a0, sp, (+ %main__SO 0)) +}) +# entry stub +%fn(p1_main, 16, { +%call(&cc__main) +}) + +:cc__str_0 +"hello" +!(0) + +:cc__str_1 +"world" +!(0) diff --git a/tests/cc-cg/10-string.scm b/tests/cc-cg/10-string.scm @@ -0,0 +1,15 @@ +;; tests/cc-cg/10-string.scm — string literal interning. +;; Pushes "hello" twice; the second push should reuse the same str label. + +(let ((cg (cg-init))) + (cg-fn-begin cg "main" '() %t-i32) + (cg-push-string cg "hello") + (cg-pop cg) ; discard + (cg-push-string cg "hello") ; should intern to same label + (cg-pop cg) + (cg-push-string cg "world") ; new label + (cg-pop cg) + (cg-push-imm cg %t-i32 0) + (cg-return cg) + (cg-fn-end cg) + (write-bv-fd 1 (cg-finish cg))) diff --git a/tests/cc-cg/11-global-var.expected b/tests/cc-cg/11-global-var.expected @@ -0,0 +1,23 @@ +%macro main__SO() +0 +%endm +%fn(cc__main, 16, { +%la(t0, &cc__g) +%ld(t0, t0, 0) +%st(t0, sp, (+ %main__SO 8)) +%ld(a0, sp, (+ %main__SO 8)) +%st(a0, sp, (+ %main__SO 0)) +%b(&::ret) +::ret +%ld(a0, sp, (+ %main__SO 0)) +}) +# entry stub +%fn(p1_main, 16, { +%call(&cc__main) +}) + +:cc__g +!(0) +!(0) +!(0) +!(0) diff --git a/tests/cc-cg/11-global-var.scm b/tests/cc-cg/11-global-var.scm @@ -0,0 +1,12 @@ +;; tests/cc-cg/11-global-var.scm — emit a global int with no initializer (.bss), +;; then a function that reads it. + +(let* ((cg (cg-init)) + (g (%sym "g" 'var 'static %t-i32 #f))) + (cg-emit-global cg g #f) + (cg-fn-begin cg "main" '() %t-i32) + (cg-push-sym cg g) + (cg-load cg) + (cg-return cg) + (cg-fn-end cg) + (write-bv-fd 1 (cg-finish cg))) diff --git a/tests/cc-cg/12-entry-stub.expected b/tests/cc-cg/12-entry-stub.expected @@ -0,0 +1,4 @@ +# entry stub +%fn(p1_main, 16, { +%call(&cc__main) +}) diff --git a/tests/cc-cg/12-entry-stub.scm b/tests/cc-cg/12-entry-stub.scm @@ -0,0 +1,5 @@ +;; tests/cc-cg/12-entry-stub.scm — bare cg-finish (no functions emitted) +;; should still produce just the entry stub block. + +(let ((cg (cg-init))) + (write-bv-fd 1 (cg-finish cg))) diff --git a/tests/cc-cg/13-call-5args.expected b/tests/cc-cg/13-call-5args.expected @@ -0,0 +1,22 @@ +%macro main__SO() +8 +%endm +%fn(cc__main, 32, { +%li(a0, 1) +%li(a1, 2) +%li(a2, 3) +%li(a3, 4) +%li(t0, 5) +%st(t0, sp, 0) +%call(&cc__foo) +%st(a0, sp, (+ %main__SO 8)) +%ld(a0, sp, (+ %main__SO 8)) +%st(a0, sp, (+ %main__SO 0)) +%b(&::ret) +::ret +%ld(a0, sp, (+ %main__SO 0)) +}) +# entry stub +%fn(p1_main, 16, { +%call(&cc__main) +}) diff --git a/tests/cc-cg/13-call-5args.scm b/tests/cc-cg/13-call-5args.scm @@ -0,0 +1,19 @@ +;; tests/cc-cg/13-call-5args.scm — 5-arg call exercises outgoing-arg +;; staging at [sp + 0*8] and frame size accounting for max-outgoing. + +(let* ((cg (cg-init)) + ;; fn type: 5-arg returning int. + (fn-ty (%ctype 'fn 8 8 + (cons %t-i32 (cons (list %t-i32 %t-i32 %t-i32 %t-i32 %t-i32) #f)))) + (foo (%sym "foo" 'fn 'extern fn-ty #f))) + (cg-fn-begin cg "main" '() %t-i32) + (cg-push-sym cg foo) + (cg-push-imm cg %t-i32 1) + (cg-push-imm cg %t-i32 2) + (cg-push-imm cg %t-i32 3) + (cg-push-imm cg %t-i32 4) + (cg-push-imm cg %t-i32 5) + (cg-call cg 5 #t) + (cg-return cg) + (cg-fn-end cg) + (write-bv-fd 1 (cg-finish cg))) diff --git a/tests/cc-cg/14-take-addr.expected b/tests/cc-cg/14-take-addr.expected @@ -0,0 +1,23 @@ +%macro main__SO() +0 +%endm +%fn(cc__main, 48, { +%st(a0, sp, (+ %main__SO 8)) +%mov(t0, sp) +%addi(t0, t0, (+ %main__SO 8)) +%st(t0, sp, (+ %main__SO 16)) +%ld(t0, sp, (+ %main__SO 16)) +%st(t0, sp, (+ %main__SO 24)) +%ld(t0, sp, (+ %main__SO 24)) +%ld(t0, t0, 0) +%st(t0, sp, (+ %main__SO 32)) +%ld(a0, sp, (+ %main__SO 32)) +%st(a0, sp, (+ %main__SO 0)) +%b(&::ret) +::ret +%ld(a0, sp, (+ %main__SO 0)) +}) +# entry stub +%fn(p1_main, 16, { +%call(&cc__main) +}) diff --git a/tests/cc-cg/14-take-addr.scm b/tests/cc-cg/14-take-addr.scm @@ -0,0 +1,16 @@ +;; tests/cc-cg/14-take-addr.scm — exercise cg-take-addr on an int param. +;; Models: int main(int x) { int *p = &x; return *p; } +;; (Without auto-emitted alloc for `p` since this is the cg API; we +;; just push x's lval, take-addr, push-deref, load, return.) + +(let* ((cg (cg-init)) + (x (%sym "x" 'param #f %t-i32 #f)) + (params (cg-fn-begin cg "main" (list x) %t-i32)) + (x* (car params))) + (cg-push-sym cg x*) ; lval frame + (cg-take-addr cg) ; rval ptr-to-int + (cg-push-deref cg) ; lval int (through pointer) + (cg-load cg) ; rval int + (cg-return cg) + (cg-fn-end cg) + (write-bv-fd 1 (cg-finish cg))) diff --git a/tests/cc-lex/01-keywords.expected-toks b/tests/cc-lex/01-keywords.expected-toks @@ -8,4 +8,5 @@ (INT 0 "01-keywords.c" 1 25) (PUNCT semi "01-keywords.c" 1 26) (PUNCT rbrace "01-keywords.c" 1 28) +(NL #f "01-keywords.c" 1 29) (EOF #f "01-keywords.c" 2 1) diff --git a/tests/cc-lex/02-integers.c b/tests/cc-lex/02-integers.c @@ -0,0 +1 @@ +42 0 7 100u 0L 0xff 0XFF 0755 1ULL diff --git a/tests/cc-lex/02-integers.expected-toks b/tests/cc-lex/02-integers.expected-toks @@ -0,0 +1,11 @@ +(INT 42 "02-integers.c" 1 1) +(INT 0 "02-integers.c" 1 4) +(INT 7 "02-integers.c" 1 6) +(INT 100 "02-integers.c" 1 8) +(INT 0 "02-integers.c" 1 13) +(INT 255 "02-integers.c" 1 16) +(INT 255 "02-integers.c" 1 21) +(INT 493 "02-integers.c" 1 26) +(INT 1 "02-integers.c" 1 31) +(NL #f "02-integers.c" 1 35) +(EOF #f "02-integers.c" 2 1) diff --git a/tests/cc-lex/03-strings.c b/tests/cc-lex/03-strings.c @@ -0,0 +1 @@ +"hello" "a\nb\tc" "\\\"" "\x41\x42" "\0end" diff --git a/tests/cc-lex/03-strings.expected-toks b/tests/cc-lex/03-strings.expected-toks @@ -0,0 +1,7 @@ +(STR "hello" "03-strings.c" 1 1) +(STR "a\nb\tc" "03-strings.c" 1 9) +(STR "\\\"" "03-strings.c" 1 19) +(STR "AB" "03-strings.c" 1 26) +(STR "\x00end" "03-strings.c" 1 37) +(NL #f "03-strings.c" 1 44) +(EOF #f "03-strings.c" 2 1) diff --git a/tests/cc-lex/04-chars.c b/tests/cc-lex/04-chars.c @@ -0,0 +1 @@ +'a' '\n' '\t' '\\' '\'' '\x41' '\0' '\077' diff --git a/tests/cc-lex/04-chars.expected-toks b/tests/cc-lex/04-chars.expected-toks @@ -0,0 +1,10 @@ +(CHAR 97 "04-chars.c" 1 1) +(CHAR 10 "04-chars.c" 1 5) +(CHAR 9 "04-chars.c" 1 10) +(CHAR 92 "04-chars.c" 1 15) +(CHAR 39 "04-chars.c" 1 20) +(CHAR 65 "04-chars.c" 1 25) +(CHAR 0 "04-chars.c" 1 32) +(CHAR 63 "04-chars.c" 1 37) +(NL #f "04-chars.c" 1 43) +(EOF #f "04-chars.c" 2 1) diff --git a/tests/cc-lex/05-comments.c b/tests/cc-lex/05-comments.c @@ -0,0 +1,4 @@ +int x; // line comment +int /* block */ y; +/* multi + line */ int z; diff --git a/tests/cc-lex/05-comments.expected-toks b/tests/cc-lex/05-comments.expected-toks @@ -0,0 +1,13 @@ +(KW int "05-comments.c" 1 1) +(IDENT "x" "05-comments.c" 1 5) +(PUNCT semi "05-comments.c" 1 6) +(NL #f "05-comments.c" 1 23) +(KW int "05-comments.c" 2 1) +(IDENT "y" "05-comments.c" 2 17) +(PUNCT semi "05-comments.c" 2 18) +(NL #f "05-comments.c" 2 19) +(KW int "05-comments.c" 4 12) +(IDENT "z" "05-comments.c" 4 16) +(PUNCT semi "05-comments.c" 4 17) +(NL #f "05-comments.c" 4 18) +(EOF #f "05-comments.c" 5 1) diff --git a/tests/cc-lex/06-line-splice.c b/tests/cc-lex/06-line-splice.c @@ -0,0 +1,2 @@ +int ab\ +cd = 1; diff --git a/tests/cc-lex/06-line-splice.expected-toks b/tests/cc-lex/06-line-splice.expected-toks @@ -0,0 +1,7 @@ +(KW int "06-line-splice.c" 1 1) +(IDENT "abcd" "06-line-splice.c" 1 5) +(PUNCT assign "06-line-splice.c" 2 4) +(INT 1 "06-line-splice.c" 2 6) +(PUNCT semi "06-line-splice.c" 2 7) +(NL #f "06-line-splice.c" 2 8) +(EOF #f "06-line-splice.c" 3 1) diff --git a/tests/cc-lex/07-punctuators.c b/tests/cc-lex/07-punctuators.c @@ -0,0 +1,3 @@ +[](){}.,;:?...-> +++ -- + - * / % & | ^ ~ ! < > = == != <= >= && || << >> += -= *= /= %= &= |= ^= <<= >>= +# ## diff --git a/tests/cc-lex/07-punctuators.expected-toks b/tests/cc-lex/07-punctuators.expected-toks @@ -0,0 +1,52 @@ +(PUNCT lbrack "07-punctuators.c" 1 1) +(PUNCT rbrack "07-punctuators.c" 1 2) +(PUNCT lparen "07-punctuators.c" 1 3) +(PUNCT rparen "07-punctuators.c" 1 4) +(PUNCT lbrace "07-punctuators.c" 1 5) +(PUNCT rbrace "07-punctuators.c" 1 6) +(PUNCT dot "07-punctuators.c" 1 7) +(PUNCT comma "07-punctuators.c" 1 8) +(PUNCT semi "07-punctuators.c" 1 9) +(PUNCT colon "07-punctuators.c" 1 10) +(PUNCT qmark "07-punctuators.c" 1 11) +(PUNCT ellipsis "07-punctuators.c" 1 12) +(PUNCT arrow "07-punctuators.c" 1 15) +(NL #f "07-punctuators.c" 1 17) +(PUNCT inc "07-punctuators.c" 2 1) +(PUNCT dec "07-punctuators.c" 2 4) +(PUNCT plus "07-punctuators.c" 2 7) +(PUNCT minus "07-punctuators.c" 2 9) +(PUNCT star "07-punctuators.c" 2 11) +(PUNCT slash "07-punctuators.c" 2 13) +(PUNCT pct "07-punctuators.c" 2 15) +(PUNCT amp "07-punctuators.c" 2 17) +(PUNCT bar "07-punctuators.c" 2 19) +(PUNCT caret "07-punctuators.c" 2 21) +(PUNCT tilde "07-punctuators.c" 2 23) +(PUNCT bang "07-punctuators.c" 2 25) +(PUNCT lt "07-punctuators.c" 2 27) +(PUNCT gt "07-punctuators.c" 2 29) +(PUNCT assign "07-punctuators.c" 2 31) +(PUNCT eq2 "07-punctuators.c" 2 33) +(PUNCT ne "07-punctuators.c" 2 36) +(PUNCT le "07-punctuators.c" 2 39) +(PUNCT ge "07-punctuators.c" 2 42) +(PUNCT land "07-punctuators.c" 2 45) +(PUNCT lor "07-punctuators.c" 2 48) +(PUNCT shl "07-punctuators.c" 2 51) +(PUNCT shr "07-punctuators.c" 2 54) +(PUNCT plus-eq "07-punctuators.c" 2 57) +(PUNCT minus-eq "07-punctuators.c" 2 60) +(PUNCT star-eq "07-punctuators.c" 2 63) +(PUNCT slash-eq "07-punctuators.c" 2 66) +(PUNCT pct-eq "07-punctuators.c" 2 69) +(PUNCT amp-eq "07-punctuators.c" 2 72) +(PUNCT bar-eq "07-punctuators.c" 2 75) +(PUNCT caret-eq "07-punctuators.c" 2 78) +(PUNCT shl-eq "07-punctuators.c" 2 81) +(PUNCT shr-eq "07-punctuators.c" 2 85) +(NL #f "07-punctuators.c" 2 88) +(PUNCT hash "07-punctuators.c" 3 1) +(PUNCT paste "07-punctuators.c" 3 3) +(NL #f "07-punctuators.c" 3 5) +(EOF #f "07-punctuators.c" 4 1) diff --git a/tests/cc-lex/08-digraphs.c b/tests/cc-lex/08-digraphs.c @@ -0,0 +1 @@ +int a<:5:> = <%1, 2%>; %: %:%: diff --git a/tests/cc-lex/08-digraphs.expected-toks b/tests/cc-lex/08-digraphs.expected-toks @@ -0,0 +1,16 @@ +(KW int "08-digraphs.c" 1 1) +(IDENT "a" "08-digraphs.c" 1 5) +(PUNCT lbrack "08-digraphs.c" 1 6) +(INT 5 "08-digraphs.c" 1 8) +(PUNCT rbrack "08-digraphs.c" 1 9) +(PUNCT assign "08-digraphs.c" 1 12) +(PUNCT lbrace "08-digraphs.c" 1 14) +(INT 1 "08-digraphs.c" 1 16) +(PUNCT comma "08-digraphs.c" 1 17) +(INT 2 "08-digraphs.c" 1 19) +(PUNCT rbrace "08-digraphs.c" 1 20) +(PUNCT semi "08-digraphs.c" 1 22) +(PUNCT hash "08-digraphs.c" 1 24) +(PUNCT paste "08-digraphs.c" 1 27) +(NL #f "08-digraphs.c" 1 31) +(EOF #f "08-digraphs.c" 2 1) diff --git a/tests/cc-lex/09-kw-vs-ident.c b/tests/cc-lex/09-kw-vs-ident.c @@ -0,0 +1 @@ +if ifx int integer void voidp _Bool _BoolX diff --git a/tests/cc-lex/09-kw-vs-ident.expected-toks b/tests/cc-lex/09-kw-vs-ident.expected-toks @@ -0,0 +1,10 @@ +(KW if "09-kw-vs-ident.c" 1 1) +(IDENT "ifx" "09-kw-vs-ident.c" 1 4) +(KW int "09-kw-vs-ident.c" 1 8) +(IDENT "integer" "09-kw-vs-ident.c" 1 12) +(KW void "09-kw-vs-ident.c" 1 20) +(IDENT "voidp" "09-kw-vs-ident.c" 1 25) +(KW _Bool "09-kw-vs-ident.c" 1 31) +(IDENT "_BoolX" "09-kw-vs-ident.c" 1 37) +(NL #f "09-kw-vs-ident.c" 1 43) +(EOF #f "09-kw-vs-ident.c" 2 1) diff --git a/tests/cc-lex/10-nl-tokens.c b/tests/cc-lex/10-nl-tokens.c @@ -0,0 +1,3 @@ +a + +b diff --git a/tests/cc-lex/10-nl-tokens.expected-toks b/tests/cc-lex/10-nl-tokens.expected-toks @@ -0,0 +1,6 @@ +(IDENT "a" "10-nl-tokens.c" 1 1) +(NL #f "10-nl-tokens.c" 1 2) +(NL #f "10-nl-tokens.c" 2 1) +(IDENT "b" "10-nl-tokens.c" 3 1) +(NL #f "10-nl-tokens.c" 3 2) +(EOF #f "10-nl-tokens.c" 4 1) diff --git a/tests/cc-lex/11-trigraphs.c b/tests/cc-lex/11-trigraphs.c @@ -0,0 +1 @@ +??=define x ??(0??) diff --git a/tests/cc-lex/11-trigraphs.expected-toks b/tests/cc-lex/11-trigraphs.expected-toks @@ -0,0 +1,8 @@ +(PUNCT hash "11-trigraphs.c" 1 1) +(IDENT "define" "11-trigraphs.c" 1 4) +(IDENT "x" "11-trigraphs.c" 1 11) +(PUNCT lbrack "11-trigraphs.c" 1 13) +(INT 0 "11-trigraphs.c" 1 16) +(PUNCT rbrack "11-trigraphs.c" 1 17) +(NL #f "11-trigraphs.c" 1 20) +(EOF #f "11-trigraphs.c" 2 1) diff --git a/tests/cc-lex/12-program.c b/tests/cc-lex/12-program.c @@ -0,0 +1,3 @@ +int main(int argc, char **argv) { + return argc; +} diff --git a/tests/cc-lex/12-program.expected-toks b/tests/cc-lex/12-program.expected-toks @@ -0,0 +1,20 @@ +(KW int "12-program.c" 1 1) +(IDENT "main" "12-program.c" 1 5) +(PUNCT lparen "12-program.c" 1 9) +(KW int "12-program.c" 1 10) +(IDENT "argc" "12-program.c" 1 14) +(PUNCT comma "12-program.c" 1 18) +(KW char "12-program.c" 1 20) +(PUNCT star "12-program.c" 1 25) +(PUNCT star "12-program.c" 1 26) +(IDENT "argv" "12-program.c" 1 27) +(PUNCT rparen "12-program.c" 1 31) +(PUNCT lbrace "12-program.c" 1 33) +(NL #f "12-program.c" 1 34) +(KW return "12-program.c" 2 5) +(IDENT "argc" "12-program.c" 2 12) +(PUNCT semi "12-program.c" 2 16) +(NL #f "12-program.c" 2 17) +(PUNCT rbrace "12-program.c" 3 1) +(NL #f "12-program.c" 3 2) +(EOF #f "12-program.c" 4 1) diff --git a/tests/cc-lex/13-keywords-all.c b/tests/cc-lex/13-keywords-all.c @@ -0,0 +1,6 @@ +auto register static extern typedef +const volatile restrict inline +short long signed unsigned +struct union enum +while do for switch case default break continue goto +sizeof diff --git a/tests/cc-lex/13-keywords-all.expected-toks b/tests/cc-lex/13-keywords-all.expected-toks @@ -0,0 +1,33 @@ +(KW auto "13-keywords-all.c" 1 1) +(KW register "13-keywords-all.c" 1 6) +(KW static "13-keywords-all.c" 1 15) +(KW extern "13-keywords-all.c" 1 22) +(KW typedef "13-keywords-all.c" 1 29) +(NL #f "13-keywords-all.c" 1 36) +(KW const "13-keywords-all.c" 2 1) +(KW volatile "13-keywords-all.c" 2 7) +(KW restrict "13-keywords-all.c" 2 16) +(KW inline "13-keywords-all.c" 2 25) +(NL #f "13-keywords-all.c" 2 31) +(KW short "13-keywords-all.c" 3 1) +(KW long "13-keywords-all.c" 3 7) +(KW signed "13-keywords-all.c" 3 12) +(KW unsigned "13-keywords-all.c" 3 19) +(NL #f "13-keywords-all.c" 3 27) +(KW struct "13-keywords-all.c" 4 1) +(KW union "13-keywords-all.c" 4 8) +(KW enum "13-keywords-all.c" 4 14) +(NL #f "13-keywords-all.c" 4 18) +(KW while "13-keywords-all.c" 5 1) +(KW do "13-keywords-all.c" 5 7) +(KW for "13-keywords-all.c" 5 10) +(KW switch "13-keywords-all.c" 5 14) +(KW case "13-keywords-all.c" 5 21) +(KW default "13-keywords-all.c" 5 26) +(KW break "13-keywords-all.c" 5 34) +(KW continue "13-keywords-all.c" 5 40) +(KW goto "13-keywords-all.c" 5 49) +(NL #f "13-keywords-all.c" 5 53) +(KW sizeof "13-keywords-all.c" 6 1) +(NL #f "13-keywords-all.c" 6 7) +(EOF #f "13-keywords-all.c" 7 1) diff --git a/tests/cc-lex/14-reject-float.c b/tests/cc-lex/14-reject-float.c @@ -0,0 +1 @@ +int x = 1.0; diff --git a/tests/cc-lex/14-reject-float.expected-exit b/tests/cc-lex/14-reject-float.expected-exit @@ -0,0 +1 @@ +1 diff --git a/tests/cc-lex/14-reject-float.expected-toks b/tests/cc-lex/14-reject-float.expected-toks @@ -0,0 +1,4 @@ +;; Negative test: lex must die on float literal "1.0". +;; Expected exit status: 1 (see 14-reject-float.expected-exit). +;; Expected stderr prefix: +;; 14-reject-float.c:1:9: error: floating-point literal not supported diff --git a/tests/cc-lex/15-reject-multichar.c b/tests/cc-lex/15-reject-multichar.c @@ -0,0 +1 @@ +int c = 'AB'; diff --git a/tests/cc-lex/15-reject-multichar.expected-exit b/tests/cc-lex/15-reject-multichar.expected-exit @@ -0,0 +1 @@ +1 diff --git a/tests/cc-lex/15-reject-multichar.expected-toks b/tests/cc-lex/15-reject-multichar.expected-toks @@ -0,0 +1,4 @@ +;; Negative test: lex must die on multi-character char constant 'AB'. +;; Expected exit status: 1 (see 15-reject-multichar.expected-exit). +;; Expected stderr prefix: +;; 15-reject-multichar.c:1:9: error: multi-character char constant not supported diff --git a/tests/cc-lex/run-lex.scm b/tests/cc-lex/run-lex.scm @@ -0,0 +1,117 @@ +;; tests/cc-lex/run-lex.scm — driver for cc-lex fixtures. +;; +;; Reads a fixture path from (argv) — argv[0] is scheme1's program +;; name; argv[1] is the combined source file (the test runner catm's +;; prelude+lex+driver into one); argv[2] is the .c fixture path. +;; +;; Calls lex-tokenize and prints one line per tok in CC-CONTRACTS §2.1 +;; serialization. Run: +;; scheme1 /tmp/c.scm <FIXTURE.c> + +(define (%hex-nibble n) + ;; n in 0..15 -> ASCII byte for the lowercase hex digit. + (if (< n 10) (+ n 48) (+ n 87))) + +(define (%bv-escape bv) + ;; Escape a bv per §2.1: surround with "...", and inside replace + ;; \n \t \r \\ \" plus non-ASCII bytes with \xNN. Returns a fresh bv. + (let* ((n (bytevector-length bv)) + (buf (make-buf))) + (buf-push! buf "\"") + (let loop ((i 0)) + (cond + ((= i n) + (buf-push! buf "\"") + (buf-flush buf)) + (else + (let ((b (bytevector-u8-ref bv i))) + (cond + ((= b 10) (buf-push! buf "\\n")) + ((= b 9) (buf-push! buf "\\t")) + ((= b 13) (buf-push! buf "\\r")) + ((= b 92) (buf-push! buf "\\\\")) + ((= b 34) (buf-push! buf "\\\"")) + ;; Printable ASCII: 32..126 inclusive (except 34/92 above) + ((and (>= b 32) (<= b 126)) + (buf-push! buf (bv-of-byte b))) + (else + (let* ((hi (%hex-nibble (bit-and (arithmetic-shift b -4) 15))) + (lo (%hex-nibble (bit-and b 15)))) + (buf-push! buf "\\x") + (buf-push! buf (bv-of-byte hi)) + (buf-push! buf (bv-of-byte lo))))) + (loop (+ i 1)))))))) + +(define (%fmt-value kind val) + ;; Stringify the tok value per §2.1. Returns a bv. + (cond + ((eq? kind 'IDENT) (%bv-escape val)) + ((eq? kind 'STR) (%bv-escape val)) + ((eq? kind 'INT) (format "~d" val)) + ((eq? kind 'CHAR) (format "~d" val)) + ((eq? kind 'KW) (format "~a" val)) + ((eq? kind 'PUNCT) (format "~a" val)) + ((eq? kind 'HASH) "#f") + ((eq? kind 'NL) "#f") + ((eq? kind 'EOF) "#f") + (else (format "~a" val)))) + +(define (%fmt-tok t) + ;; Returns a bv ending in '\n'. + (let* ((kind (tok-kind t)) + (val (tok-value t)) + (loc (tok-loc t)) + (file (loc-file loc)) + (line (loc-line loc)) + (col (loc-col loc))) + (bv-cat (list "(" (format "~a" kind) " " + (%fmt-value kind val) " " + (%bv-escape file) " " + (format "~d" line) " " + (format "~d" col) ")" + NL-BV)))) + +(define (%emit-toks toks) + (cond + ((null? toks) #t) + (else + (write-bv-fd 1 (%fmt-tok (car toks))) + (%emit-toks (cdr toks))))) + +(define (%basename path) + ;; Strip directory components from a bv path. Returns a fresh bv + ;; containing the trailing path component (after the last '/'). + (let* ((n (bytevector-length path))) + (let loop ((i (- n 1))) + (cond + ((< i 0) path) + ((= (bytevector-u8-ref path i) 47) + (bv-slice path (+ i 1) n)) + (else (loop (- i 1))))))) + +(define (%run-lex path) + (let ((op (open-input path))) + (if (not (car op)) + (begin + (write-bv-fd 2 "run-lex: cannot open ") + (write-bv-fd 2 path) + (write-bv-fd 2 NL-BV) + (sys-exit 2)) + (let* ((src (slurp-fd (port-fd (cdr op)))) + (file (%basename path)) + (toks (lex-tokenize src file))) + (sys-close (port-fd (cdr op))) + (%emit-toks toks) + (sys-exit 0))))) + +(let ((args (argv))) + ;; argv[0] = scheme1 program path + ;; argv[1] = combined source (the harness arranges this) + ;; argv[2] = .c fixture path + (cond + ((null? args) (sys-exit 2)) + ((null? (cdr args)) (sys-exit 2)) + ((null? (cdr (cdr args))) + (write-bv-fd 2 "run-lex: missing fixture path\n") + (sys-exit 2)) + (else (%run-lex (car (cdr (cdr args))))))) diff --git a/tests/cc-lex/run.sh b/tests/cc-lex/run.sh @@ -0,0 +1,104 @@ +#!/bin/sh +## tests/cc-lex/run.sh — fixture-by-fixture lex test loop. +## +## Usage: +## tests/cc-lex/run.sh # run all fixtures +## tests/cc-lex/run.sh 02 05 # run only matching prefixes +## +## Per-fixture protocol: +## - <NN>-<name>.c is the input. +## - <NN>-<name>.expected-toks is the golden token serialization +## (one tok per line, per CC-CONTRACTS §2.1). Lines beginning +## with `;;` are ignored, allowing negative tests to carry a +## human-readable note in their .expected-toks file. +## - <NN>-<name>.expected-exit overrides the expected exit status +## (default: 0). +## +## Each fixture is run inside boot2-busybox:aarch64 by catm'ing +## prelude+lex+data+util+driver, then invoking scheme1 on the combined +## file with the .c fixture as argv[2]. Stdout is diffed against the +## (filtered) .expected-toks; exit status is diffed against +## .expected-exit. stderr is suppressed (negative tests only check +## exit status). + +set -eu +REPO=$(cd "$(dirname "$0")/../.." && pwd) +cd "$REPO" + +ARCH=${ARCH:-aarch64} +case "$ARCH" in + aarch64) PLATFORM=linux/arm64 ;; + amd64) PLATFORM=linux/amd64 ;; + riscv64) PLATFORM=linux/riscv64 ;; + *) echo "unknown ARCH=$ARCH" >&2; exit 2 ;; +esac + +PASS=0 +FAIL=0 + +discover() { + ls tests/cc-lex \ + | sed -n 's/^\([^_][^.]*\)\.c$/\1/p' \ + | sort -u +} + +NAMES="$*" +[ -n "$NAMES" ] || NAMES=$(discover) + +# Strip ;; comment lines from a file; emit the rest. Used to filter +# expected-toks so negative tests can carry a human note. +strip_comments() { + sed -n '/^;;/d; p' "$1" +} + +for prefix in $NAMES; do + for c in tests/cc-lex/${prefix}*.c; do + [ -e "$c" ] || continue + name=$(basename "$c" .c) + expected_toks=tests/cc-lex/$name.expected-toks + expected_exit_file=tests/cc-lex/$name.expected-exit + + if [ ! -e "$expected_toks" ]; then + echo " SKIP $name (no .expected-toks)" + continue + fi + expected=$(strip_comments "$expected_toks") + if [ -e "$expected_exit_file" ]; then + expected_exit=$(cat "$expected_exit_file" | tr -d '\n') + else + expected_exit=0 + fi + + # Build the combined source and run. + actual=$(podman run --rm --pull=never --platform "$PLATFORM" \ + --tmpfs /tmp:size=512M -v "$REPO":/work -w /work \ + "boot2-busybox:$ARCH" \ + sh -c "build/$ARCH/tools/catm /tmp/c.scm \ + scheme1/prelude.scm cc/util.scm cc/data.scm \ + cc/lex.scm tests/cc-lex/run-lex.scm; \ + build/$ARCH/scheme1 /tmp/c.scm tests/cc-lex/$name.c" \ + 2>/dev/null) || actual_exit=$? + actual_exit=${actual_exit:-0} + + if [ "$actual" = "$expected" ] && [ "$actual_exit" = "$expected_exit" ]; then + PASS=$((PASS + 1)) + echo " PASS $name" + else + FAIL=$((FAIL + 1)) + echo " FAIL $name" + if [ "$actual" != "$expected" ]; then + echo " --- expected ---" + printf '%s\n' "$expected" | sed 's/^/ /' + echo " --- actual ---" + printf '%s\n' "$actual" | sed 's/^/ /' + fi + if [ "$actual_exit" != "$expected_exit" ]; then + echo " exit: expected $expected_exit, got $actual_exit" + fi + fi + unset actual_exit + done +done + +echo "$PASS passed, $FAIL failed" +[ "$FAIL" -eq 0 ] diff --git a/tests/cc-parse/00-empty-main.expected-trace b/tests/cc-parse/00-empty-main.expected-trace @@ -1,4 +1,5 @@ (fn-begin "main" () i32) (push-imm i32 0) +(cast i32) (return) (fn-end) diff --git a/tests/cc-parse/00-empty-main.scm b/tests/cc-parse/00-empty-main.scm @@ -0,0 +1,27 @@ +;; tests/cc-parse/00-empty-main.scm — driver for 00-empty-main.c. +;; +;; Hand-builds the tokens for: int main(void) { return 0; } +;; Runs parse-translation-unit against the cg-trace mock and prints +;; the trace. The expected-trace golden is diffed by the runner. + +(define %loc0 (%loc "00-empty-main.c" 1 1)) +(define (mk k v) (make-tok k v %loc0)) + +(define %toks + (list + (mk 'KW 'int) + (mk 'IDENT "main") + (mk 'PUNCT 'lparen) + (mk 'KW 'void) + (mk 'PUNCT 'rparen) + (mk 'PUNCT 'lbrace) + (mk 'KW 'return) + (mk 'INT 0) + (mk 'PUNCT 'semi) + (mk 'PUNCT 'rbrace) + (mk 'EOF #f))) + +(let ((cg (cg-init))) + (let ((ps (make-pstate %toks cg))) + (parse-translation-unit ps) + (cg-trace-print))) diff --git a/tests/cc-parse/01-return-argc.c b/tests/cc-parse/01-return-argc.c @@ -0,0 +1 @@ +int main(int argc, char **argv) { return argc; } diff --git a/tests/cc-parse/01-return-argc.expected-trace b/tests/cc-parse/01-return-argc.expected-trace @@ -0,0 +1,6 @@ +(fn-begin "main" (("argc" i32) ("argv" (ptr (ptr i8)))) i32) +(push-sym ("argc" param)) +(load) +(cast i32) +(return) +(fn-end) diff --git a/tests/cc-parse/01-return-argc.scm b/tests/cc-parse/01-return-argc.scm @@ -0,0 +1,17 @@ +;; int main(int argc, char **argv) { return argc; } +(define %L (%loc "x" 1 1)) +(define (mk k v) (make-tok k v %L)) + +(let ((cg (cg-init))) + (let ((ps (make-pstate + (list (mk 'KW 'int) (mk 'IDENT "main") + (mk 'PUNCT 'lparen) + (mk 'KW 'int) (mk 'IDENT "argc") (mk 'PUNCT 'comma) + (mk 'KW 'char) (mk 'PUNCT 'star) (mk 'PUNCT 'star) + (mk 'IDENT "argv") (mk 'PUNCT 'rparen) + (mk 'PUNCT 'lbrace) + (mk 'KW 'return) (mk 'IDENT "argc") (mk 'PUNCT 'semi) + (mk 'PUNCT 'rbrace) (mk 'EOF #f)) + cg))) + (parse-translation-unit ps) + (cg-trace-print))) diff --git a/tests/cc-parse/02-add-const.c b/tests/cc-parse/02-add-const.c @@ -0,0 +1 @@ +int main(void) { return 1 + 2; } diff --git a/tests/cc-parse/02-add-const.expected-trace b/tests/cc-parse/02-add-const.expected-trace @@ -0,0 +1,10 @@ +(fn-begin "main" () i32) +(push-imm i32 1) +(promote) +(push-imm i32 2) +(promote) +(arith-conv) +(binop add) +(cast i32) +(return) +(fn-end) diff --git a/tests/cc-parse/02-add-const.scm b/tests/cc-parse/02-add-const.scm @@ -0,0 +1,15 @@ +(define %L (%loc "x" 1 1)) +(define (mk k v) (make-tok k v %L)) + +(let ((cg (cg-init))) + (let ((ps (make-pstate + (list (mk 'KW 'int) (mk 'IDENT "main") + (mk 'PUNCT 'lparen) (mk 'KW 'void) (mk 'PUNCT 'rparen) + (mk 'PUNCT 'lbrace) + (mk 'KW 'return) + (mk 'INT 1) (mk 'PUNCT 'plus) (mk 'INT 2) + (mk 'PUNCT 'semi) + (mk 'PUNCT 'rbrace) (mk 'EOF #f)) + cg))) + (parse-translation-unit ps) + (cg-trace-print))) diff --git a/tests/cc-parse/03-local-assign.c b/tests/cc-parse/03-local-assign.c @@ -0,0 +1 @@ +int main(void) { int x; x = 5; return x; } diff --git a/tests/cc-parse/03-local-assign.expected-trace b/tests/cc-parse/03-local-assign.expected-trace @@ -0,0 +1,10 @@ +(fn-begin "main" () i32) +(alloc-slot 4 4 0) +(push-sym ("x" var)) +(push-imm i32 5) +(assign) +(push-sym ("x" var)) +(load) +(cast i32) +(return) +(fn-end) diff --git a/tests/cc-parse/03-local-assign.scm b/tests/cc-parse/03-local-assign.scm @@ -0,0 +1,16 @@ +(define %L (%loc "x" 1 1)) +(define (mk k v) (make-tok k v %L)) + +(let ((cg (cg-init))) + (let ((ps (make-pstate + (list (mk 'KW 'int) (mk 'IDENT "main") + (mk 'PUNCT 'lparen) (mk 'KW 'void) (mk 'PUNCT 'rparen) + (mk 'PUNCT 'lbrace) + (mk 'KW 'int) (mk 'IDENT "x") (mk 'PUNCT 'semi) + (mk 'IDENT "x") (mk 'PUNCT 'assign) (mk 'INT 5) + (mk 'PUNCT 'semi) + (mk 'KW 'return) (mk 'IDENT "x") (mk 'PUNCT 'semi) + (mk 'PUNCT 'rbrace) (mk 'EOF #f)) + cg))) + (parse-translation-unit ps) + (cg-trace-print))) diff --git a/tests/cc-parse/04-if-else.c b/tests/cc-parse/04-if-else.c @@ -0,0 +1 @@ +int f(int x) { if (x) return 1; else return 0; } diff --git a/tests/cc-parse/04-if-else.expected-trace b/tests/cc-parse/04-if-else.expected-trace @@ -0,0 +1,13 @@ +(fn-begin "f" (("x" i32)) i32) +(push-sym ("x" param)) +(load) +(ifelse-begin) +(push-imm i32 1) +(cast i32) +(return) +(ifelse-mid) +(push-imm i32 0) +(cast i32) +(return) +(ifelse-end) +(fn-end) diff --git a/tests/cc-parse/04-if-else.scm b/tests/cc-parse/04-if-else.scm @@ -0,0 +1,18 @@ +(define %L (%loc "x" 1 1)) +(define (mk k v) (make-tok k v %L)) + +(let ((cg (cg-init))) + (let ((ps (make-pstate + (list (mk 'KW 'int) (mk 'IDENT "f") + (mk 'PUNCT 'lparen) + (mk 'KW 'int) (mk 'IDENT "x") (mk 'PUNCT 'rparen) + (mk 'PUNCT 'lbrace) + (mk 'KW 'if) (mk 'PUNCT 'lparen) (mk 'IDENT "x") + (mk 'PUNCT 'rparen) + (mk 'KW 'return) (mk 'INT 1) (mk 'PUNCT 'semi) + (mk 'KW 'else) + (mk 'KW 'return) (mk 'INT 0) (mk 'PUNCT 'semi) + (mk 'PUNCT 'rbrace) (mk 'EOF #f)) + cg))) + (parse-translation-unit ps) + (cg-trace-print))) diff --git a/tests/cc-parse/05-while-break.c b/tests/cc-parse/05-while-break.c @@ -0,0 +1 @@ +int f(int x) { while (x) { break; } return x; } diff --git a/tests/cc-parse/05-while-break.expected-trace b/tests/cc-parse/05-while-break.expected-trace @@ -0,0 +1,11 @@ +(fn-begin "f" (("x" i32)) i32) +(loop-begin) +(push-sym ("x" param)) +(load) +(break "Lp0") +(loop-end) +(push-sym ("x" param)) +(load) +(cast i32) +(return) +(fn-end) diff --git a/tests/cc-parse/05-while-break.scm b/tests/cc-parse/05-while-break.scm @@ -0,0 +1,19 @@ +(define %L (%loc "x" 1 1)) +(define (mk k v) (make-tok k v %L)) + +(let ((cg (cg-init))) + (let ((ps (make-pstate + (list (mk 'KW 'int) (mk 'IDENT "f") + (mk 'PUNCT 'lparen) + (mk 'KW 'int) (mk 'IDENT "x") (mk 'PUNCT 'rparen) + (mk 'PUNCT 'lbrace) + (mk 'KW 'while) (mk 'PUNCT 'lparen) (mk 'IDENT "x") + (mk 'PUNCT 'rparen) + (mk 'PUNCT 'lbrace) + (mk 'KW 'break) (mk 'PUNCT 'semi) + (mk 'PUNCT 'rbrace) + (mk 'KW 'return) (mk 'IDENT "x") (mk 'PUNCT 'semi) + (mk 'PUNCT 'rbrace) (mk 'EOF #f)) + cg))) + (parse-translation-unit ps) + (cg-trace-print))) diff --git a/tests/cc-parse/06-call-no-args.c b/tests/cc-parse/06-call-no-args.c @@ -0,0 +1,2 @@ +int g(void); +int main(void) { return g(); } diff --git a/tests/cc-parse/06-call-no-args.expected-trace b/tests/cc-parse/06-call-no-args.expected-trace @@ -0,0 +1,6 @@ +(fn-begin "main" () i32) +(push-sym ("g" fn)) +(call 0 #t) +(cast i32) +(return) +(fn-end) diff --git a/tests/cc-parse/06-call-no-args.scm b/tests/cc-parse/06-call-no-args.scm @@ -0,0 +1,18 @@ +(define %L (%loc "x" 1 1)) +(define (mk k v) (make-tok k v %L)) + +(let ((cg (cg-init))) + (let ((ps (make-pstate + (list (mk 'KW 'int) (mk 'IDENT "g") + (mk 'PUNCT 'lparen) (mk 'KW 'void) (mk 'PUNCT 'rparen) + (mk 'PUNCT 'semi) + (mk 'KW 'int) (mk 'IDENT "main") + (mk 'PUNCT 'lparen) (mk 'KW 'void) (mk 'PUNCT 'rparen) + (mk 'PUNCT 'lbrace) + (mk 'KW 'return) (mk 'IDENT "g") + (mk 'PUNCT 'lparen) (mk 'PUNCT 'rparen) + (mk 'PUNCT 'semi) + (mk 'PUNCT 'rbrace) (mk 'EOF #f)) + cg))) + (parse-translation-unit ps) + (cg-trace-print))) diff --git a/tests/cc-parse/07-call-with-args.c b/tests/cc-parse/07-call-with-args.c @@ -0,0 +1,2 @@ +int g(int, int); +int main(void) { return g(1, 2); } diff --git a/tests/cc-parse/07-call-with-args.expected-trace b/tests/cc-parse/07-call-with-args.expected-trace @@ -0,0 +1,8 @@ +(fn-begin "main" () i32) +(push-sym ("g" fn)) +(push-imm i32 1) +(push-imm i32 2) +(call 2 #t) +(cast i32) +(return) +(fn-end) diff --git a/tests/cc-parse/07-call-with-args.scm b/tests/cc-parse/07-call-with-args.scm @@ -0,0 +1,19 @@ +(define %L (%loc "x" 1 1)) +(define (mk k v) (make-tok k v %L)) + +(let ((cg (cg-init))) + (let ((ps (make-pstate + (list (mk 'KW 'int) (mk 'IDENT "g") + (mk 'PUNCT 'lparen) (mk 'KW 'int) (mk 'PUNCT 'comma) + (mk 'KW 'int) (mk 'PUNCT 'rparen) (mk 'PUNCT 'semi) + (mk 'KW 'int) (mk 'IDENT "main") + (mk 'PUNCT 'lparen) (mk 'KW 'void) (mk 'PUNCT 'rparen) + (mk 'PUNCT 'lbrace) + (mk 'KW 'return) (mk 'IDENT "g") + (mk 'PUNCT 'lparen) + (mk 'INT 1) (mk 'PUNCT 'comma) (mk 'INT 2) + (mk 'PUNCT 'rparen) (mk 'PUNCT 'semi) + (mk 'PUNCT 'rbrace) (mk 'EOF #f)) + cg))) + (parse-translation-unit ps) + (cg-trace-print))) diff --git a/tests/cc-parse/08-pointer-deref.c b/tests/cc-parse/08-pointer-deref.c @@ -0,0 +1 @@ +int f(int *p) { return *p; } diff --git a/tests/cc-parse/08-pointer-deref.expected-trace b/tests/cc-parse/08-pointer-deref.expected-trace @@ -0,0 +1,8 @@ +(fn-begin "f" (("p" (ptr i32))) i32) +(push-sym ("p" param)) +(load) +(push-deref) +(load) +(cast i32) +(return) +(fn-end) diff --git a/tests/cc-parse/08-pointer-deref.scm b/tests/cc-parse/08-pointer-deref.scm @@ -0,0 +1,16 @@ +(define %L (%loc "x" 1 1)) +(define (mk k v) (make-tok k v %L)) + +(let ((cg (cg-init))) + (let ((ps (make-pstate + (list (mk 'KW 'int) (mk 'IDENT "f") + (mk 'PUNCT 'lparen) + (mk 'KW 'int) (mk 'PUNCT 'star) (mk 'IDENT "p") + (mk 'PUNCT 'rparen) + (mk 'PUNCT 'lbrace) + (mk 'KW 'return) + (mk 'PUNCT 'star) (mk 'IDENT "p") (mk 'PUNCT 'semi) + (mk 'PUNCT 'rbrace) (mk 'EOF #f)) + cg))) + (parse-translation-unit ps) + (cg-trace-print))) diff --git a/tests/cc-parse/09-address-of.c b/tests/cc-parse/09-address-of.c @@ -0,0 +1 @@ +int *f(int x) { return &x; } diff --git a/tests/cc-parse/09-address-of.expected-trace b/tests/cc-parse/09-address-of.expected-trace @@ -0,0 +1,6 @@ +(fn-begin "f" (("x" i32)) (ptr i32)) +(push-sym ("x" param)) +(take-addr) +(cast (ptr i32)) +(return) +(fn-end) diff --git a/tests/cc-parse/09-address-of.scm b/tests/cc-parse/09-address-of.scm @@ -0,0 +1,15 @@ +(define %L (%loc "x" 1 1)) +(define (mk k v) (make-tok k v %L)) + +(let ((cg (cg-init))) + (let ((ps (make-pstate + (list (mk 'KW 'int) (mk 'PUNCT 'star) (mk 'IDENT "f") + (mk 'PUNCT 'lparen) + (mk 'KW 'int) (mk 'IDENT "x") (mk 'PUNCT 'rparen) + (mk 'PUNCT 'lbrace) + (mk 'KW 'return) + (mk 'PUNCT 'amp) (mk 'IDENT "x") (mk 'PUNCT 'semi) + (mk 'PUNCT 'rbrace) (mk 'EOF #f)) + cg))) + (parse-translation-unit ps) + (cg-trace-print))) diff --git a/tests/cc-parse/10-typedef.c b/tests/cc-parse/10-typedef.c @@ -0,0 +1,2 @@ +typedef int myint; +myint f(myint x) { return x; } diff --git a/tests/cc-parse/10-typedef.expected-trace b/tests/cc-parse/10-typedef.expected-trace @@ -0,0 +1,6 @@ +(fn-begin "f" (("x" i32)) i32) +(push-sym ("x" param)) +(load) +(cast i32) +(return) +(fn-end) diff --git a/tests/cc-parse/10-typedef.scm b/tests/cc-parse/10-typedef.scm @@ -0,0 +1,16 @@ +(define %L (%loc "x" 1 1)) +(define (mk k v) (make-tok k v %L)) + +(let ((cg (cg-init))) + (let ((ps (make-pstate + (list (mk 'KW 'typedef) (mk 'KW 'int) (mk 'IDENT "myint") + (mk 'PUNCT 'semi) + (mk 'IDENT "myint") (mk 'IDENT "f") + (mk 'PUNCT 'lparen) + (mk 'IDENT "myint") (mk 'IDENT "x") (mk 'PUNCT 'rparen) + (mk 'PUNCT 'lbrace) + (mk 'KW 'return) (mk 'IDENT "x") (mk 'PUNCT 'semi) + (mk 'PUNCT 'rbrace) (mk 'EOF #f)) + cg))) + (parse-translation-unit ps) + (cg-trace-print))) diff --git a/tests/cc-parse/11-two-params.c b/tests/cc-parse/11-two-params.c @@ -0,0 +1 @@ +int add(int a, int b) { return a + b; } diff --git a/tests/cc-parse/11-two-params.expected-trace b/tests/cc-parse/11-two-params.expected-trace @@ -0,0 +1,12 @@ +(fn-begin "add" (("a" i32) ("b" i32)) i32) +(push-sym ("a" param)) +(load) +(promote) +(push-sym ("b" param)) +(load) +(promote) +(arith-conv) +(binop add) +(cast i32) +(return) +(fn-end) diff --git a/tests/cc-parse/11-two-params.scm b/tests/cc-parse/11-two-params.scm @@ -0,0 +1,17 @@ +(define %L (%loc "x" 1 1)) +(define (mk k v) (make-tok k v %L)) + +(let ((cg (cg-init))) + (let ((ps (make-pstate + (list (mk 'KW 'int) (mk 'IDENT "add") + (mk 'PUNCT 'lparen) + (mk 'KW 'int) (mk 'IDENT "a") (mk 'PUNCT 'comma) + (mk 'KW 'int) (mk 'IDENT "b") (mk 'PUNCT 'rparen) + (mk 'PUNCT 'lbrace) + (mk 'KW 'return) + (mk 'IDENT "a") (mk 'PUNCT 'plus) (mk 'IDENT "b") + (mk 'PUNCT 'semi) + (mk 'PUNCT 'rbrace) (mk 'EOF #f)) + cg))) + (parse-translation-unit ps) + (cg-trace-print))) diff --git a/tests/cc-parse/12-comparison.c b/tests/cc-parse/12-comparison.c @@ -0,0 +1 @@ +int f(int a, int b) { return a < b; } diff --git a/tests/cc-parse/12-comparison.expected-trace b/tests/cc-parse/12-comparison.expected-trace @@ -0,0 +1,12 @@ +(fn-begin "f" (("a" i32) ("b" i32)) i32) +(push-sym ("a" param)) +(load) +(promote) +(push-sym ("b" param)) +(load) +(promote) +(arith-conv) +(binop lt) +(cast i32) +(return) +(fn-end) diff --git a/tests/cc-parse/12-comparison.scm b/tests/cc-parse/12-comparison.scm @@ -0,0 +1,17 @@ +(define %L (%loc "x" 1 1)) +(define (mk k v) (make-tok k v %L)) + +(let ((cg (cg-init))) + (let ((ps (make-pstate + (list (mk 'KW 'int) (mk 'IDENT "f") + (mk 'PUNCT 'lparen) + (mk 'KW 'int) (mk 'IDENT "a") (mk 'PUNCT 'comma) + (mk 'KW 'int) (mk 'IDENT "b") (mk 'PUNCT 'rparen) + (mk 'PUNCT 'lbrace) + (mk 'KW 'return) + (mk 'IDENT "a") (mk 'PUNCT 'lt) (mk 'IDENT "b") + (mk 'PUNCT 'semi) + (mk 'PUNCT 'rbrace) (mk 'EOF #f)) + cg))) + (parse-translation-unit ps) + (cg-trace-print))) diff --git a/tests/cc-parse/13-while-continue.c b/tests/cc-parse/13-while-continue.c @@ -0,0 +1 @@ +int f(int x) { while (x) { continue; } return 0; } diff --git a/tests/cc-parse/13-while-continue.expected-trace b/tests/cc-parse/13-while-continue.expected-trace @@ -0,0 +1,10 @@ +(fn-begin "f" (("x" i32)) i32) +(loop-begin) +(push-sym ("x" param)) +(load) +(continue "Lp0") +(loop-end) +(push-imm i32 0) +(cast i32) +(return) +(fn-end) diff --git a/tests/cc-parse/13-while-continue.scm b/tests/cc-parse/13-while-continue.scm @@ -0,0 +1,19 @@ +(define %L (%loc "x" 1 1)) +(define (mk k v) (make-tok k v %L)) + +(let ((cg (cg-init))) + (let ((ps (make-pstate + (list (mk 'KW 'int) (mk 'IDENT "f") + (mk 'PUNCT 'lparen) + (mk 'KW 'int) (mk 'IDENT "x") (mk 'PUNCT 'rparen) + (mk 'PUNCT 'lbrace) + (mk 'KW 'while) (mk 'PUNCT 'lparen) (mk 'IDENT "x") + (mk 'PUNCT 'rparen) + (mk 'PUNCT 'lbrace) + (mk 'KW 'continue) (mk 'PUNCT 'semi) + (mk 'PUNCT 'rbrace) + (mk 'KW 'return) (mk 'INT 0) (mk 'PUNCT 'semi) + (mk 'PUNCT 'rbrace) (mk 'EOF #f)) + cg))) + (parse-translation-unit ps) + (cg-trace-print))) diff --git a/tests/cc-parse/14-mul-paren.c b/tests/cc-parse/14-mul-paren.c @@ -0,0 +1 @@ +int f(int a, int b, int c) { return a * (b + c); } diff --git a/tests/cc-parse/14-mul-paren.expected-trace b/tests/cc-parse/14-mul-paren.expected-trace @@ -0,0 +1,18 @@ +(fn-begin "f" (("a" i32) ("b" i32) ("c" i32)) i32) +(push-sym ("a" param)) +(load) +(promote) +(push-sym ("b" param)) +(load) +(promote) +(push-sym ("c" param)) +(load) +(promote) +(arith-conv) +(binop add) +(promote) +(arith-conv) +(binop mul) +(cast i32) +(return) +(fn-end) diff --git a/tests/cc-parse/14-mul-paren.scm b/tests/cc-parse/14-mul-paren.scm @@ -0,0 +1,21 @@ +(define %L (%loc "x" 1 1)) +(define (mk k v) (make-tok k v %L)) + +(let ((cg (cg-init))) + (let ((ps (make-pstate + (list (mk 'KW 'int) (mk 'IDENT "f") + (mk 'PUNCT 'lparen) + (mk 'KW 'int) (mk 'IDENT "a") (mk 'PUNCT 'comma) + (mk 'KW 'int) (mk 'IDENT "b") (mk 'PUNCT 'comma) + (mk 'KW 'int) (mk 'IDENT "c") (mk 'PUNCT 'rparen) + (mk 'PUNCT 'lbrace) + (mk 'KW 'return) + (mk 'IDENT "a") (mk 'PUNCT 'star) + (mk 'PUNCT 'lparen) + (mk 'IDENT "b") (mk 'PUNCT 'plus) (mk 'IDENT "c") + (mk 'PUNCT 'rparen) + (mk 'PUNCT 'semi) + (mk 'PUNCT 'rbrace) (mk 'EOF #f)) + cg))) + (parse-translation-unit ps) + (cg-trace-print))) diff --git a/tests/cc-parse/cg-trace.scm b/tests/cc-parse/cg-trace.scm @@ -0,0 +1,153 @@ +;; tests/cc-parse/cg-trace.scm — swap-in mock for cc/cg.scm. +;; Provides every public cg-* entry point parse.scm calls; each call +;; appends a record to %cg-trace. Format follows CC-CONTRACTS §2.2. +;; cg-trace-print writes one Scheme list per line to fd 1. + +(define %cg-trace '()) +(define (%cg-emit! e) (set! %cg-trace (cons e %cg-trace))) +(define (cg-trace-get) (reverse %cg-trace)) + +(define (%render-ctype t) + (if (not t) 'NIL + (let ((k (ctype-kind t))) + (cond + ((eq? k 'ptr) (list 'ptr (%render-ctype (ctype-ext t)))) + ((eq? k 'arr) + (let ((e (ctype-ext t))) + (list 'arr (%render-ctype (car e)) + (if (< (cdr e) 0) '* (cdr e))))) + ((eq? k 'fn) + (let ((e (ctype-ext t))) + (list 'fn (%render-ctype (car e)) + (map %render-ctype (cadr e)) + (car (cddr e))))) + ((eq? k 'struct) (list 'struct (car (ctype-ext t)))) + ((eq? k 'union) (list 'union (car (ctype-ext t)))) + ((eq? k 'enum) (list 'enum (car (ctype-ext t)))) + (else k))))) + +(define (%render-sym s) (list (sym-name s) (sym-kind s))) + +(define (%fake-opnd t lv?) (%opnd 'frame (if t t %t-i32) 0 lv?)) + +(define %cg-stack '()) +(define (%push o) (set! %cg-stack (cons o %cg-stack)) o) +(define (%pop) + (cond ((null? %cg-stack) (%fake-opnd %t-i32 #f)) + (else (let ((o (car %cg-stack))) + (set! %cg-stack (cdr %cg-stack)) o)))) +(define (%top) + (cond ((null? %cg-stack) #f) (else (car %cg-stack)))) + +(define (cg-init) (set! %cg-trace '()) (set! %cg-stack '()) 'mock-cg) +(define (cg-finish cg) (%cg-emit! '(finish)) "") + +(define (cg-fn-begin cg name params return-type) + (let ((rp (map (lambda (p) (list (car p) (%render-ctype (cdr p)))) + params))) + (%cg-emit! (list 'fn-begin name rp (%render-ctype return-type)))) + (let loop ((ps params) (i 0) (acc '())) + (if (null? ps) (reverse acc) + (loop (cdr ps) (+ i 1) + (cons (cons (car (car ps)) + (%sym (car (car ps)) 'param 'auto + (cdr (car ps)) (* i 8))) + acc))))) + +(define (cg-fn-end cg) (%cg-emit! '(fn-end)) #t) + +(define (cg-push cg op) (%push op)) +(define (cg-pop cg) (%pop)) +(define (cg-top cg) (%top)) +(define (cg-depth cg) (length %cg-stack)) + +(define (cg-push-imm cg ct v) + (%cg-emit! (list 'push-imm (%render-ctype ct) v)) + (%push (%fake-opnd ct #f))) +(define (cg-push-string cg b) + (%cg-emit! (list 'push-string b)) + (%push (%fake-opnd %t-i64 #f))) +(define (cg-push-sym cg s) + (%cg-emit! (list 'push-sym (%render-sym s))) + (%push (%fake-opnd (sym-type s) + (if (or (eq? (sym-kind s) 'fn) + (eq? (sym-kind s) 'enum-const)) #f #t)))) +(define (cg-push-deref cg) + (%cg-emit! '(push-deref)) + (%pop) (%push (%fake-opnd %t-i32 #t))) + +(define (cg-take-addr cg) + (%cg-emit! '(take-addr)) + (%pop) (%push (%fake-opnd %t-i64 #f))) +(define (cg-load cg) + (%cg-emit! '(load)) + (let ((o (%pop))) + (%push (%fake-opnd (opnd-type o) #f)))) + +(define (cg-cast cg ty) + (%cg-emit! (list 'cast (%render-ctype ty))) + (%pop) (%push (%fake-opnd ty #f))) +(define (cg-promote cg) + (%cg-emit! '(promote)) + (%pop) (%push (%fake-opnd %t-i32 #f))) +(define (cg-arith-conv cg) (%cg-emit! '(arith-conv)) #t) + +(define (cg-binop cg op) + (%cg-emit! (list 'binop op)) + (%pop) (%pop) (%push (%fake-opnd %t-i32 #f))) +(define (cg-unop cg op) + (%cg-emit! (list 'unop op)) + (%pop) (%push (%fake-opnd %t-i32 #f))) +(define (cg-assign cg) + (%cg-emit! '(assign)) + (%pop) (%pop) (%push (%fake-opnd %t-i32 #f))) + +(define (cg-call cg arity has-result?) + (%cg-emit! (list 'call arity has-result?)) + (let lp ((n (+ arity 1))) + (cond ((zero? n) #t) (else (%pop) (lp (- n 1))))) + (cond (has-result? (%push (%fake-opnd %t-i32 #f))) + (else #f))) +(define (cg-return cg) (%cg-emit! '(return)) (%pop) #t) + +(define (cg-if cg th) + (%cg-emit! '(if-begin)) (th) (%cg-emit! '(if-end)) #t) +(define (cg-ifelse cg th eh) + (%cg-emit! '(ifelse-begin)) (th) + (%cg-emit! '(ifelse-mid)) (eh) + (%cg-emit! '(ifelse-end)) #t) + +(define (cg-loop cg head body) + (%cg-emit! '(loop-begin)) + (head) (body) + (%cg-emit! '(loop-end)) + "L0") +(define (cg-loop-end cg tag) (%cg-emit! (list 'loop-close tag)) #t) +(define (cg-break cg tag) (%cg-emit! (list 'break tag)) #t) +(define (cg-continue cg tag) (%cg-emit! (list 'continue tag)) #t) + +(define (cg-switch-begin cg) (%cg-emit! '(switch-begin)) 'mock-sw) +(define (cg-switch-case cg sw v) (%cg-emit! (list 'switch-case v)) #t) +(define (cg-switch-default cg sw) (%cg-emit! '(switch-default)) #t) +(define (cg-switch-end cg sw) (%cg-emit! '(switch-end)) #t) + +(define (cg-emit-global cg s init) + (%cg-emit! (list 'emit-global (%render-sym s) init)) #t) +(define (cg-emit-extern cg s) + (%cg-emit! (list 'emit-extern (%render-sym s))) #t) +(define (cg-intern-string cg b) + (%cg-emit! (list 'intern-string b)) b) + +(define %frame-hi 0) +(define (cg-alloc-slot cg sz al) + (let* ((a (align-up %frame-hi al)) (n (+ a sz))) + (set! %frame-hi n) + (%cg-emit! (list 'alloc-slot sz al a)) + a)) + +(define (cg-trace-print) + (let loop ((xs (cg-trace-get))) + (cond ((null? xs) #t) + (else (write (car xs)) + (write-bv-fd 1 NL-BV) + (loop (cdr xs)))))) diff --git a/tests/cc-parse/mini-prelude.scm b/tests/cc-parse/mini-prelude.scm @@ -0,0 +1,66 @@ +;; tests/cc-parse/mini-prelude.scm — minimal prelude for cc-parse fixtures. +;; Replaces scheme1/prelude.scm in the catm so the combined file fits in +;; scheme1's 64 KiB readbuf. Defines only what util.scm + parse.scm + +;; cg-trace.scm + the fixture driver actually use. + +(define (<= a b) (if (< b a) #f #t)) +(define (>= a b) (if (< a b) #f #t)) +(define (negative? n) (< n 0)) +(define (positive? n) (> n 0)) +(define number? integer?) +(define bytevector? string?) +(define (min a b) (if (< a b) a b)) +(define (max a b) (if (< a b) b a)) +(define (modulo a b) + (let ((r (remainder a b))) + (if (zero? r) 0 + (if (eq? (negative? r) (negative? b)) r (+ r b))))) + +(define (caar x) (car (car x))) +(define (cadr x) (car (cdr x))) +(define (cdar x) (cdr (car x))) +(define (cddr x) (cdr (cdr x))) +(define (caddr x) (car (cdr (cdr x)))) + +(define (list . xs) xs) + +(define (reverse xs) + (let loop ((xs xs) (a '())) + (if (null? xs) a (loop (cdr xs) (cons (car xs) a))))) + +(define (append-pair a b) + (if (null? a) b (cons (car a) (append-pair (cdr a) b)))) +(define (append . ls) + (cond ((null? ls) '()) + ((null? (cdr ls)) (car ls)) + (else (append-pair (car ls) (apply append (cdr ls)))))) + +(define (assoc k al) + (if (null? al) #f + (if (eq? (car (car al)) k) (car al) (assoc k (cdr al))))) + +(define (member x xs) + (if (null? xs) #f + (if (eq? (car xs) x) xs (member x (cdr xs))))) + +(define (map f xs) + (if (null? xs) '() (cons (f (car xs)) (map f (cdr xs))))) +(define (for-each f xs) + (if (null? xs) '() (begin (f (car xs)) (for-each f (cdr xs))))) +(define (fold f acc xs) + (if (null? xs) acc (fold f (f acc (car xs)) (cdr xs)))) + +(define BUFSIZE 4096) +(define NL-BV (make-bytevector 1 10)) + +(define (bv-concat-reverse cs) + (let* ((xs (reverse cs)) + (n (let s ((ys xs) (n 0)) + (if (null? ys) n + (s (cdr ys) (+ n (bytevector-length (car ys))))))) + (out (make-bytevector n))) + (let loop ((ys xs) (i 0)) + (if (null? ys) out + (let ((len (bytevector-length (car ys)))) + (bytevector-copy! out i (car ys) 0 len) + (loop (cdr ys) (+ i len))))))) diff --git a/tests/cc-pp/01-obj-macro.expected-exit b/tests/cc-pp/01-obj-macro.expected-exit @@ -0,0 +1 @@ +0 diff --git a/tests/cc-pp/01-obj-macro.scm b/tests/cc-pp/01-obj-macro.scm @@ -0,0 +1,18 @@ +;; Object-like macro: `#define X 42` then `X` -> INT 42. +;; Hand-built tokens; does not depend on lex.scm. +(define l (%loc "t.c" 1 1)) +(define input + (list (%tok 'HASH #f l '()) + (%tok 'IDENT "define" l '()) + (%tok 'IDENT "X" l '()) + (%tok 'INT 42 l '()) + (%tok 'NL #f l '()) + (%tok 'IDENT "X" l '()) + (%tok 'NL #f l '()) + (%tok 'EOF #f l '()))) +(define out (pp-expand input '())) +(if (= (length out) 2) 0 (sys-exit 11)) +(if (eq? (tok-kind (car out)) 'INT) 0 (sys-exit 12)) +(if (= (tok-value (car out)) 42) 0 (sys-exit 13)) +(if (eq? (tok-kind (car (cdr out))) 'EOF) 0 (sys-exit 14)) +(sys-exit 0) diff --git a/tests/cc-pp/02-obj-macro-multi-tok.expected-exit b/tests/cc-pp/02-obj-macro-multi-tok.expected-exit @@ -0,0 +1 @@ +0 diff --git a/tests/cc-pp/02-obj-macro-multi-tok.scm b/tests/cc-pp/02-obj-macro-multi-tok.scm @@ -0,0 +1,24 @@ +;; #define ADD1(no-paren-adjacent) - actually a multi-tok object macro: +;; #define X 1 + 2 +;; then `X` should produce three tokens: INT 1, PUNCT plus, INT 2. +(define l (%loc "t.c" 1 1)) +(define input + (list (%tok 'HASH #f l '()) + (%tok 'IDENT "define" l '()) + (%tok 'IDENT "X" l '()) + (%tok 'INT 1 l '()) + (%tok 'PUNCT 'plus l '()) + (%tok 'INT 2 l '()) + (%tok 'NL #f l '()) + (%tok 'IDENT "X" l '()) + (%tok 'NL #f l '()) + (%tok 'EOF #f l '()))) +(define out (pp-expand input '())) +(if (= (length out) 4) 0 (sys-exit 11)) +(if (eq? (tok-kind (car out)) 'INT) 0 (sys-exit 12)) +(if (= (tok-value (car out)) 1) 0 (sys-exit 13)) +(if (eq? (tok-kind (car (cdr out))) 'PUNCT) 0 (sys-exit 14)) +(if (eq? (tok-value (car (cdr out))) 'plus) 0 (sys-exit 15)) +(if (eq? (tok-kind (car (cdr (cdr out)))) 'INT) 0 (sys-exit 16)) +(if (= (tok-value (car (cdr (cdr out)))) 2) 0 (sys-exit 17)) +(sys-exit 0) diff --git a/tests/cc-pp/03-fn-macro.expected-exit b/tests/cc-pp/03-fn-macro.expected-exit @@ -0,0 +1 @@ +0 diff --git a/tests/cc-pp/03-fn-macro.scm b/tests/cc-pp/03-fn-macro.scm @@ -0,0 +1,26 @@ +;; Function-like macro: +;; #define ID(x) x +;; then ID(42) -> INT 42. +;; The `(` after `ID` must be column-adjacent to ID; we synthesize cols +;; so that holds. +(define (LL c) (%loc "t.c" 1 c)) +(define input + (list (%tok 'HASH #f (LL 1) '()) ; col 1 + (%tok 'IDENT "define" (LL 2) '()) + (%tok 'IDENT "ID" (LL 9) '()) ; ID at col 9 (length 2) + (%tok 'PUNCT 'lparen (LL 11) '()) ; ( at col 11 = 9 + 2 + (%tok 'IDENT "x" (LL 12) '()) + (%tok 'PUNCT 'rparen (LL 13) '()) + (%tok 'IDENT "x" (LL 15) '()) + (%tok 'NL #f (LL 16) '()) + (%tok 'IDENT "ID" (LL 1) '()) + (%tok 'PUNCT 'lparen (LL 3) '()) + (%tok 'INT 42 (LL 4) '()) + (%tok 'PUNCT 'rparen (LL 6) '()) + (%tok 'NL #f (LL 7) '()) + (%tok 'EOF #f (LL 7) '()))) +(define out (pp-expand input '())) +(if (= (length out) 2) 0 (sys-exit 11)) +(if (eq? (tok-kind (car out)) 'INT) 0 (sys-exit 12)) +(if (= (tok-value (car out)) 42) 0 (sys-exit 13)) +(sys-exit 0) diff --git a/tests/cc-pp/04-fn-macro-2args.expected-exit b/tests/cc-pp/04-fn-macro-2args.expected-exit @@ -0,0 +1 @@ +0 diff --git a/tests/cc-pp/04-fn-macro-2args.scm b/tests/cc-pp/04-fn-macro-2args.scm @@ -0,0 +1,29 @@ +;; #define ADD(a,b) a+b then ADD(3,4) -> 3 + 4 +(define (LL c) (%loc "t.c" 1 c)) +(define input + (list (%tok 'HASH #f (LL 1) '()) + (%tok 'IDENT "define" (LL 2) '()) + (%tok 'IDENT "ADD" (LL 9) '()) + (%tok 'PUNCT 'lparen (LL 12) '()) ; col 9+3=12 + (%tok 'IDENT "a" (LL 13) '()) + (%tok 'PUNCT 'comma (LL 14) '()) + (%tok 'IDENT "b" (LL 15) '()) + (%tok 'PUNCT 'rparen (LL 16) '()) + (%tok 'IDENT "a" (LL 18) '()) + (%tok 'PUNCT 'plus (LL 19) '()) + (%tok 'IDENT "b" (LL 20) '()) + (%tok 'NL #f (LL 21) '()) + (%tok 'IDENT "ADD" (LL 1) '()) + (%tok 'PUNCT 'lparen (LL 4) '()) ; col 1+3=4 + (%tok 'INT 3 (LL 5) '()) + (%tok 'PUNCT 'comma (LL 6) '()) + (%tok 'INT 4 (LL 7) '()) + (%tok 'PUNCT 'rparen (LL 8) '()) + (%tok 'NL #f (LL 9) '()) + (%tok 'EOF #f (LL 9) '()))) +(define out (pp-expand input '())) +(if (= (length out) 4) 0 (sys-exit 11)) +(if (= (tok-value (car out)) 3) 0 (sys-exit 12)) +(if (eq? (tok-value (car (cdr out))) 'plus) 0 (sys-exit 13)) +(if (= (tok-value (car (cdr (cdr out)))) 4) 0 (sys-exit 14)) +(sys-exit 0) diff --git a/tests/cc-pp/05-variadic.expected-exit b/tests/cc-pp/05-variadic.expected-exit @@ -0,0 +1 @@ +0 diff --git a/tests/cc-pp/05-variadic.scm b/tests/cc-pp/05-variadic.scm @@ -0,0 +1,35 @@ +;; Variadic macro: +;; #define LOG(fmt, ...) fmt __VA_ARGS__ +;; LOG("x", 1, 2) -> STR "x" INT 1 PUNCT comma INT 2 +(define (LL c) (%loc "t.c" 1 c)) +(define input + (list (%tok 'HASH #f (LL 1) '()) + (%tok 'IDENT "define" (LL 2) '()) + (%tok 'IDENT "LOG" (LL 9) '()) + (%tok 'PUNCT 'lparen (LL 12) '()) ; col 9+3=12 + (%tok 'IDENT "fmt" (LL 13) '()) + (%tok 'PUNCT 'comma (LL 16) '()) + (%tok 'PUNCT 'ellipsis (LL 17) '()) + (%tok 'PUNCT 'rparen (LL 20) '()) + (%tok 'IDENT "fmt" (LL 22) '()) + (%tok 'IDENT "__VA_ARGS__" (LL 26) '()) + (%tok 'NL #f (LL 36) '()) + (%tok 'IDENT "LOG" (LL 1) '()) + (%tok 'PUNCT 'lparen (LL 4) '()) ; col 1+3=4 + (%tok 'STR "x" (LL 5) '()) + (%tok 'PUNCT 'comma (LL 8) '()) + (%tok 'INT 1 (LL 9) '()) + (%tok 'PUNCT 'comma (LL 10) '()) + (%tok 'INT 2 (LL 11) '()) + (%tok 'PUNCT 'rparen (LL 12) '()) + (%tok 'NL #f (LL 13) '()) + (%tok 'EOF #f (LL 13) '()))) +(define out (pp-expand input '())) +;; Expected: STR "x", INT 1, PUNCT comma, INT 2, EOF. +(if (= (length out) 5) 0 (sys-exit 11)) +(if (eq? (tok-kind (car out)) 'STR) 0 (sys-exit 12)) +(if (bytevector=? (tok-value (car out)) "x") 0 (sys-exit 13)) +(if (= (tok-value (car (cdr out))) 1) 0 (sys-exit 14)) +(if (eq? (tok-value (car (cdr (cdr out)))) 'comma) 0 (sys-exit 15)) +(if (= (tok-value (car (cdr (cdr (cdr out))))) 2) 0 (sys-exit 16)) +(sys-exit 0) diff --git a/tests/cc-pp/06-stringize.expected-exit b/tests/cc-pp/06-stringize.expected-exit @@ -0,0 +1 @@ +0 diff --git a/tests/cc-pp/06-stringize.scm b/tests/cc-pp/06-stringize.scm @@ -0,0 +1,25 @@ +;; Stringize: +;; #define STR(x) #x +;; STR(hello) -> STR "hello" +(define (LL c) (%loc "t.c" 1 c)) +(define input + (list (%tok 'HASH #f (LL 1) '()) + (%tok 'IDENT "define" (LL 2) '()) + (%tok 'IDENT "STR" (LL 9) '()) + (%tok 'PUNCT 'lparen (LL 12) '()) + (%tok 'IDENT "x" (LL 13) '()) + (%tok 'PUNCT 'rparen (LL 14) '()) + (%tok 'PUNCT 'hash (LL 16) '()) + (%tok 'IDENT "x" (LL 17) '()) + (%tok 'NL #f (LL 18) '()) + (%tok 'IDENT "STR" (LL 1) '()) + (%tok 'PUNCT 'lparen (LL 4) '()) + (%tok 'IDENT "hello" (LL 5) '()) + (%tok 'PUNCT 'rparen (LL 10) '()) + (%tok 'NL #f (LL 11) '()) + (%tok 'EOF #f (LL 11) '()))) +(define out (pp-expand input '())) +(if (= (length out) 2) 0 (sys-exit 11)) +(if (eq? (tok-kind (car out)) 'STR) 0 (sys-exit 12)) +(if (bytevector=? (tok-value (car out)) "hello") 0 (sys-exit 13)) +(sys-exit 0) diff --git a/tests/cc-pp/07-paste.expected-exit b/tests/cc-pp/07-paste.expected-exit @@ -0,0 +1 @@ +0 diff --git a/tests/cc-pp/07-paste.scm b/tests/cc-pp/07-paste.scm @@ -0,0 +1,30 @@ +;; Token paste: +;; #define CAT(a,b) a##b +;; CAT(foo, bar) -> IDENT "foobar" +(define (LL c) (%loc "t.c" 1 c)) +(define input + (list (%tok 'HASH #f (LL 1) '()) + (%tok 'IDENT "define" (LL 2) '()) + (%tok 'IDENT "CAT" (LL 9) '()) + (%tok 'PUNCT 'lparen (LL 12) '()) + (%tok 'IDENT "a" (LL 13) '()) + (%tok 'PUNCT 'comma (LL 14) '()) + (%tok 'IDENT "b" (LL 15) '()) + (%tok 'PUNCT 'rparen (LL 16) '()) + (%tok 'IDENT "a" (LL 18) '()) + (%tok 'PUNCT 'paste (LL 19) '()) + (%tok 'IDENT "b" (LL 21) '()) + (%tok 'NL #f (LL 22) '()) + (%tok 'IDENT "CAT" (LL 1) '()) + (%tok 'PUNCT 'lparen (LL 4) '()) + (%tok 'IDENT "foo" (LL 5) '()) + (%tok 'PUNCT 'comma (LL 8) '()) + (%tok 'IDENT "bar" (LL 9) '()) + (%tok 'PUNCT 'rparen (LL 12) '()) + (%tok 'NL #f (LL 13) '()) + (%tok 'EOF #f (LL 13) '()))) +(define out (pp-expand input '())) +(if (= (length out) 2) 0 (sys-exit 11)) +(if (eq? (tok-kind (car out)) 'IDENT) 0 (sys-exit 12)) +(if (bytevector=? (tok-value (car out)) "foobar") 0 (sys-exit 13)) +(sys-exit 0) diff --git a/tests/cc-pp/08-nested-expansion.expected-exit b/tests/cc-pp/08-nested-expansion.expected-exit @@ -0,0 +1 @@ +0 diff --git a/tests/cc-pp/08-nested-expansion.scm b/tests/cc-pp/08-nested-expansion.scm @@ -0,0 +1,24 @@ +;; Nested expansion: +;; #define A 1 +;; #define B A +;; B -> 1 (B expands to A, A expands to 1) +(define (LL c) (%loc "t.c" 1 c)) +(define input + (list (%tok 'HASH #f (LL 1) '()) + (%tok 'IDENT "define" (LL 2) '()) + (%tok 'IDENT "A" (LL 9) '()) + (%tok 'INT 1 (LL 11) '()) + (%tok 'NL #f (LL 12) '()) + (%tok 'HASH #f (LL 1) '()) + (%tok 'IDENT "define" (LL 2) '()) + (%tok 'IDENT "B" (LL 9) '()) + (%tok 'IDENT "A" (LL 11) '()) + (%tok 'NL #f (LL 12) '()) + (%tok 'IDENT "B" (LL 1) '()) + (%tok 'NL #f (LL 2) '()) + (%tok 'EOF #f (LL 2) '()))) +(define out (pp-expand input '())) +(if (= (length out) 2) 0 (sys-exit 11)) +(if (eq? (tok-kind (car out)) 'INT) 0 (sys-exit 12)) +(if (= (tok-value (car out)) 1) 0 (sys-exit 13)) +(sys-exit 0) diff --git a/tests/cc-pp/09-hideset-self.expected-exit b/tests/cc-pp/09-hideset-self.expected-exit @@ -0,0 +1 @@ +0 diff --git a/tests/cc-pp/09-hideset-self.scm b/tests/cc-pp/09-hideset-self.scm @@ -0,0 +1,26 @@ +;; Hide-set self-reference: +;; #define X X +;; X -> X (single IDENT, NOT recursively expanded) +(define (LL c) (%loc "t.c" 1 c)) +(define input + (list (%tok 'HASH #f (LL 1) '()) + (%tok 'IDENT "define" (LL 2) '()) + (%tok 'IDENT "X" (LL 9) '()) + (%tok 'IDENT "X" (LL 11) '()) + (%tok 'NL #f (LL 12) '()) + (%tok 'IDENT "X" (LL 1) '()) + (%tok 'NL #f (LL 2) '()) + (%tok 'EOF #f (LL 2) '()))) +(define out (pp-expand input '())) +(if (= (length out) 2) 0 (sys-exit 11)) +(if (eq? (tok-kind (car out)) 'IDENT) 0 (sys-exit 12)) +(if (bytevector=? (tok-value (car out)) "X") 0 (sys-exit 13)) +;; Verify the emitted token has X in its hide-set (it must, otherwise +;; rescan would loop). Just assert the hide-set contains "X". +(define h (tok-hide (car out))) +(define (mem? x xs) + (cond ((null? xs) #f) + ((bytevector=? x (car xs)) #t) + (else (mem? x (cdr xs))))) +(if (mem? "X" h) 0 (sys-exit 14)) +(sys-exit 0) diff --git a/tests/cc-pp/10-if-defined.expected-exit b/tests/cc-pp/10-if-defined.expected-exit @@ -0,0 +1 @@ +0 diff --git a/tests/cc-pp/10-if-defined.scm b/tests/cc-pp/10-if-defined.scm @@ -0,0 +1,30 @@ +;; #if defined(X) — emit body when X defined. +;; #define X 1 +;; #if defined(X) +;; INT 7 +;; #endif +(define (LL c) (%loc "t.c" 1 c)) +(define input + (list (%tok 'HASH #f (LL 1) '()) + (%tok 'IDENT "define" (LL 2) '()) + (%tok 'IDENT "X" (LL 9) '()) + (%tok 'INT 1 (LL 11) '()) + (%tok 'NL #f (LL 12) '()) + (%tok 'HASH #f (LL 1) '()) + (%tok 'IDENT "if" (LL 2) '()) + (%tok 'IDENT "defined" (LL 5) '()) + (%tok 'PUNCT 'lparen (LL 12) '()) + (%tok 'IDENT "X" (LL 13) '()) + (%tok 'PUNCT 'rparen (LL 14) '()) + (%tok 'NL #f (LL 15) '()) + (%tok 'INT 7 (LL 1) '()) + (%tok 'NL #f (LL 2) '()) + (%tok 'HASH #f (LL 1) '()) + (%tok 'IDENT "endif" (LL 2) '()) + (%tok 'NL #f (LL 7) '()) + (%tok 'EOF #f (LL 7) '()))) +(define out (pp-expand input '())) +(if (= (length out) 2) 0 (sys-exit 11)) +(if (eq? (tok-kind (car out)) 'INT) 0 (sys-exit 12)) +(if (= (tok-value (car out)) 7) 0 (sys-exit 13)) +(sys-exit 0) diff --git a/tests/cc-pp/11-if-arith.expected-exit b/tests/cc-pp/11-if-arith.expected-exit @@ -0,0 +1 @@ +0 diff --git a/tests/cc-pp/11-if-arith.scm b/tests/cc-pp/11-if-arith.scm @@ -0,0 +1,21 @@ +;; #if 1+2 == 3 -> body emitted +(define (LL c) (%loc "t.c" 1 c)) +(define input + (list (%tok 'HASH #f (LL 1) '()) + (%tok 'IDENT "if" (LL 2) '()) + (%tok 'INT 1 (LL 5) '()) + (%tok 'PUNCT 'plus (LL 6) '()) + (%tok 'INT 2 (LL 7) '()) + (%tok 'PUNCT 'eq2 (LL 9) '()) + (%tok 'INT 3 (LL 12) '()) + (%tok 'NL #f (LL 13) '()) + (%tok 'INT 42 (LL 1) '()) + (%tok 'NL #f (LL 2) '()) + (%tok 'HASH #f (LL 1) '()) + (%tok 'IDENT "endif" (LL 2) '()) + (%tok 'NL #f (LL 7) '()) + (%tok 'EOF #f (LL 7) '()))) +(define out (pp-expand input '())) +(if (= (length out) 2) 0 (sys-exit 11)) +(if (= (tok-value (car out)) 42) 0 (sys-exit 12)) +(sys-exit 0) diff --git a/tests/cc-pp/12-ifdef-ifndef.expected-exit b/tests/cc-pp/12-ifdef-ifndef.expected-exit @@ -0,0 +1 @@ +0 diff --git a/tests/cc-pp/12-ifdef-ifndef.scm b/tests/cc-pp/12-ifdef-ifndef.scm @@ -0,0 +1,46 @@ +;; #define A +;; #ifdef A -> emit 1 +;; #endif +;; #ifndef B -> emit 2 +;; #endif +;; #ifdef B -> NOT emitted +;; #endif +(define (LL c) (%loc "t.c" 1 c)) +(define input + (list (%tok 'HASH #f (LL 1) '()) + (%tok 'IDENT "define" (LL 2) '()) + (%tok 'IDENT "A" (LL 9) '()) + (%tok 'NL #f (LL 10) '()) + (%tok 'HASH #f (LL 1) '()) + (%tok 'IDENT "ifdef" (LL 2) '()) + (%tok 'IDENT "A" (LL 8) '()) + (%tok 'NL #f (LL 9) '()) + (%tok 'INT 1 (LL 1) '()) + (%tok 'NL #f (LL 2) '()) + (%tok 'HASH #f (LL 1) '()) + (%tok 'IDENT "endif" (LL 2) '()) + (%tok 'NL #f (LL 7) '()) + (%tok 'HASH #f (LL 1) '()) + (%tok 'IDENT "ifndef" (LL 2) '()) + (%tok 'IDENT "B" (LL 9) '()) + (%tok 'NL #f (LL 10) '()) + (%tok 'INT 2 (LL 1) '()) + (%tok 'NL #f (LL 2) '()) + (%tok 'HASH #f (LL 1) '()) + (%tok 'IDENT "endif" (LL 2) '()) + (%tok 'NL #f (LL 7) '()) + (%tok 'HASH #f (LL 1) '()) + (%tok 'IDENT "ifdef" (LL 2) '()) + (%tok 'IDENT "B" (LL 8) '()) + (%tok 'NL #f (LL 9) '()) + (%tok 'INT 99 (LL 1) '()) + (%tok 'NL #f (LL 2) '()) + (%tok 'HASH #f (LL 1) '()) + (%tok 'IDENT "endif" (LL 2) '()) + (%tok 'NL #f (LL 7) '()) + (%tok 'EOF #f (LL 7) '()))) +(define out (pp-expand input '())) +(if (= (length out) 3) 0 (sys-exit 11)) ; INT 1, INT 2, EOF +(if (= (tok-value (car out)) 1) 0 (sys-exit 12)) +(if (= (tok-value (car (cdr out))) 2) 0 (sys-exit 13)) +(sys-exit 0) diff --git a/tests/cc-pp/13-elif-chain.expected-exit b/tests/cc-pp/13-elif-chain.expected-exit @@ -0,0 +1 @@ +0 diff --git a/tests/cc-pp/13-elif-chain.scm b/tests/cc-pp/13-elif-chain.scm @@ -0,0 +1,38 @@ +;; #if 0 -> skipped +;; #elif 1 -> taken (emit 5) +;; #elif 1 -> NOT taken (already taken) +;; #else -> NOT taken +;; #endif +(define (LL c) (%loc "t.c" 1 c)) +(define input + (list (%tok 'HASH #f (LL 1) '()) + (%tok 'IDENT "if" (LL 2) '()) + (%tok 'INT 0 (LL 5) '()) + (%tok 'NL #f (LL 6) '()) + (%tok 'INT 1 (LL 1) '()) + (%tok 'NL #f (LL 2) '()) + (%tok 'HASH #f (LL 1) '()) + (%tok 'IDENT "elif" (LL 2) '()) + (%tok 'INT 1 (LL 7) '()) + (%tok 'NL #f (LL 8) '()) + (%tok 'INT 5 (LL 1) '()) + (%tok 'NL #f (LL 2) '()) + (%tok 'HASH #f (LL 1) '()) + (%tok 'IDENT "elif" (LL 2) '()) + (%tok 'INT 1 (LL 7) '()) + (%tok 'NL #f (LL 8) '()) + (%tok 'INT 3 (LL 1) '()) + (%tok 'NL #f (LL 2) '()) + (%tok 'HASH #f (LL 1) '()) + (%tok 'IDENT "else" (LL 2) '()) + (%tok 'NL #f (LL 6) '()) + (%tok 'INT 4 (LL 1) '()) + (%tok 'NL #f (LL 2) '()) + (%tok 'HASH #f (LL 1) '()) + (%tok 'IDENT "endif" (LL 2) '()) + (%tok 'NL #f (LL 7) '()) + (%tok 'EOF #f (LL 7) '()))) +(define out (pp-expand input '())) +(if (= (length out) 2) 0 (sys-exit 11)) ; INT 5, EOF +(if (= (tok-value (car out)) 5) 0 (sys-exit 12)) +(sys-exit 0) diff --git a/tests/cc-pp/14-nested-if.expected-exit b/tests/cc-pp/14-nested-if.expected-exit @@ -0,0 +1 @@ +0 diff --git a/tests/cc-pp/14-nested-if.scm b/tests/cc-pp/14-nested-if.scm @@ -0,0 +1,43 @@ +;; Nested #if: +;; #if 1 +;; #if 0 -> skip +;; 3 +;; #else -> emit 4 +;; 4 +;; #endif +;; #else -> outer skip +;; 5 +;; #endif +(define (LL c) (%loc "t.c" 1 c)) +(define input + (list (%tok 'HASH #f (LL 1) '()) + (%tok 'IDENT "if" (LL 2) '()) + (%tok 'INT 1 (LL 5) '()) + (%tok 'NL #f (LL 6) '()) + (%tok 'HASH #f (LL 1) '()) + (%tok 'IDENT "if" (LL 2) '()) + (%tok 'INT 0 (LL 5) '()) + (%tok 'NL #f (LL 6) '()) + (%tok 'INT 3 (LL 1) '()) + (%tok 'NL #f (LL 2) '()) + (%tok 'HASH #f (LL 1) '()) + (%tok 'IDENT "else" (LL 2) '()) + (%tok 'NL #f (LL 6) '()) + (%tok 'INT 4 (LL 1) '()) + (%tok 'NL #f (LL 2) '()) + (%tok 'HASH #f (LL 1) '()) + (%tok 'IDENT "endif" (LL 2) '()) + (%tok 'NL #f (LL 7) '()) + (%tok 'HASH #f (LL 1) '()) + (%tok 'IDENT "else" (LL 2) '()) + (%tok 'NL #f (LL 6) '()) + (%tok 'INT 5 (LL 1) '()) + (%tok 'NL #f (LL 2) '()) + (%tok 'HASH #f (LL 1) '()) + (%tok 'IDENT "endif" (LL 2) '()) + (%tok 'NL #f (LL 7) '()) + (%tok 'EOF #f (LL 7) '()))) +(define out (pp-expand input '())) +(if (= (length out) 2) 0 (sys-exit 11)) ; INT 4, EOF +(if (= (tok-value (car out)) 4) 0 (sys-exit 12)) +(sys-exit 0) diff --git a/tests/cc-pp/15-undef.expected-exit b/tests/cc-pp/15-undef.expected-exit @@ -0,0 +1 @@ +0 diff --git a/tests/cc-pp/15-undef.scm b/tests/cc-pp/15-undef.scm @@ -0,0 +1,28 @@ +;; #define X 1 +;; X -> INT 1 +;; #undef X +;; X -> IDENT X (no longer expanded) +(define (LL c) (%loc "t.c" 1 c)) +(define input + (list (%tok 'HASH #f (LL 1) '()) + (%tok 'IDENT "define" (LL 2) '()) + (%tok 'IDENT "X" (LL 9) '()) + (%tok 'INT 1 (LL 11) '()) + (%tok 'NL #f (LL 12) '()) + (%tok 'IDENT "X" (LL 1) '()) + (%tok 'NL #f (LL 2) '()) + (%tok 'HASH #f (LL 1) '()) + (%tok 'IDENT "undef" (LL 2) '()) + (%tok 'IDENT "X" (LL 8) '()) + (%tok 'NL #f (LL 9) '()) + (%tok 'IDENT "X" (LL 1) '()) + (%tok 'NL #f (LL 2) '()) + (%tok 'EOF #f (LL 2) '()))) +(define out (pp-expand input '())) +;; Expected: INT 1, IDENT X, EOF +(if (= (length out) 3) 0 (sys-exit 11)) +(if (eq? (tok-kind (car out)) 'INT) 0 (sys-exit 12)) +(if (= (tok-value (car out)) 1) 0 (sys-exit 13)) +(if (eq? (tok-kind (car (cdr out))) 'IDENT) 0 (sys-exit 14)) +(if (bytevector=? (tok-value (car (cdr out))) "X") 0 (sys-exit 15)) +(sys-exit 0) diff --git a/tests/cc-pp/16-error.expected-exit b/tests/cc-pp/16-error.expected-exit @@ -0,0 +1 @@ +1 diff --git a/tests/cc-pp/16-error.scm b/tests/cc-pp/16-error.scm @@ -0,0 +1,10 @@ +;; #error msg -> die with exit 1. +(define (LL c) (%loc "t.c" 1 c)) +(define input + (list (%tok 'HASH #f (LL 1) '()) + (%tok 'IDENT "error" (LL 2) '()) + (%tok 'IDENT "boom" (LL 8) '()) + (%tok 'NL #f (LL 12) '()) + (%tok 'EOF #f (LL 12) '()))) +(pp-expand input '()) ; should die before this returns +(sys-exit 0) diff --git a/tests/cc-pp/17-include-rejected.expected-exit b/tests/cc-pp/17-include-rejected.expected-exit @@ -0,0 +1 @@ +1 diff --git a/tests/cc-pp/17-include-rejected.scm b/tests/cc-pp/17-include-rejected.scm @@ -0,0 +1,10 @@ +;; #include "x" must die. +(define (LL c) (%loc "t.c" 1 c)) +(define input + (list (%tok 'HASH #f (LL 1) '()) + (%tok 'IDENT "include" (LL 2) '()) + (%tok 'STR "x" (LL 10) '()) + (%tok 'NL #f (LL 14) '()) + (%tok 'EOF #f (LL 14) '()))) +(pp-expand input '()) +(sys-exit 0) diff --git a/tests/cc-pp/18-builtin-stdc.expected-exit b/tests/cc-pp/18-builtin-stdc.expected-exit @@ -0,0 +1 @@ +0 diff --git a/tests/cc-pp/18-builtin-stdc.scm b/tests/cc-pp/18-builtin-stdc.scm @@ -0,0 +1,12 @@ +;; Built-in macros __STDC__ -> 1, __LISPCC__ -> 1. +(define (LL c) (%loc "t.c" 1 c)) +(define input + (list (%tok 'IDENT "__STDC__" (LL 1) '()) + (%tok 'IDENT "__LISPCC__" (LL 11) '()) + (%tok 'NL #f (LL 22) '()) + (%tok 'EOF #f (LL 22) '()))) +(define out (pp-expand input '())) +(if (= (length out) 3) 0 (sys-exit 11)) ; INT 1, INT 1, EOF +(if (= (tok-value (car out)) 1) 0 (sys-exit 12)) +(if (= (tok-value (car (cdr out))) 1) 0 (sys-exit 13)) +(sys-exit 0) diff --git a/tests/cc-pp/19-pragma-dropped.expected-exit b/tests/cc-pp/19-pragma-dropped.expected-exit @@ -0,0 +1 @@ +0 diff --git a/tests/cc-pp/19-pragma-dropped.scm b/tests/cc-pp/19-pragma-dropped.scm @@ -0,0 +1,14 @@ +;; #pragma is accepted and dropped. +(define (LL c) (%loc "t.c" 1 c)) +(define input + (list (%tok 'HASH #f (LL 1) '()) + (%tok 'IDENT "pragma" (LL 2) '()) + (%tok 'IDENT "once" (LL 9) '()) + (%tok 'NL #f (LL 13) '()) + (%tok 'INT 7 (LL 1) '()) + (%tok 'NL #f (LL 2) '()) + (%tok 'EOF #f (LL 2) '()))) +(define out (pp-expand input '())) +(if (= (length out) 2) 0 (sys-exit 11)) ; INT 7, EOF +(if (= (tok-value (car out)) 7) 0 (sys-exit 12)) +(sys-exit 0) diff --git a/tests/cc-pp/20-cexpr-ops.expected-exit b/tests/cc-pp/20-cexpr-ops.expected-exit @@ -0,0 +1 @@ +0 diff --git a/tests/cc-pp/20-cexpr-ops.scm b/tests/cc-pp/20-cexpr-ops.scm @@ -0,0 +1,26 @@ +;; #if (1+2)*3 == 9 -> body emitted +;; Exercises parens, mul, ==. +(define (LL c) (%loc "t.c" 1 c)) +(define input + (list (%tok 'HASH #f (LL 1) '()) + (%tok 'IDENT "if" (LL 2) '()) + (%tok 'PUNCT 'lparen (LL 5) '()) + (%tok 'INT 1 (LL 6) '()) + (%tok 'PUNCT 'plus (LL 7) '()) + (%tok 'INT 2 (LL 8) '()) + (%tok 'PUNCT 'rparen (LL 9) '()) + (%tok 'PUNCT 'star (LL 10) '()) + (%tok 'INT 3 (LL 11) '()) + (%tok 'PUNCT 'eq2 (LL 13) '()) + (%tok 'INT 9 (LL 16) '()) + (%tok 'NL #f (LL 17) '()) + (%tok 'INT 77 (LL 1) '()) + (%tok 'NL #f (LL 2) '()) + (%tok 'HASH #f (LL 1) '()) + (%tok 'IDENT "endif" (LL 2) '()) + (%tok 'NL #f (LL 7) '()) + (%tok 'EOF #f (LL 7) '()))) +(define out (pp-expand input '())) +(if (= (length out) 2) 0 (sys-exit 11)) +(if (= (tok-value (car out)) 77) 0 (sys-exit 12)) +(sys-exit 0) diff --git a/tests/cc-pp/21-undefined-id-zero.expected-exit b/tests/cc-pp/21-undefined-id-zero.expected-exit @@ -0,0 +1 @@ +0 diff --git a/tests/cc-pp/21-undefined-id-zero.scm b/tests/cc-pp/21-undefined-id-zero.scm @@ -0,0 +1,20 @@ +;; In #if, an undefined identifier counts as 0: +;; #if UNKNOWN == 0 -> emit body. +(define (LL c) (%loc "t.c" 1 c)) +(define input + (list (%tok 'HASH #f (LL 1) '()) + (%tok 'IDENT "if" (LL 2) '()) + (%tok 'IDENT "UNKNOWN" (LL 5) '()) + (%tok 'PUNCT 'eq2 (LL 13) '()) + (%tok 'INT 0 (LL 16) '()) + (%tok 'NL #f (LL 17) '()) + (%tok 'INT 42 (LL 1) '()) + (%tok 'NL #f (LL 2) '()) + (%tok 'HASH #f (LL 1) '()) + (%tok 'IDENT "endif" (LL 2) '()) + (%tok 'NL #f (LL 7) '()) + (%tok 'EOF #f (LL 7) '()))) +(define out (pp-expand input '())) +(if (= (length out) 2) 0 (sys-exit 11)) +(if (= (tok-value (car out)) 42) 0 (sys-exit 12)) +(sys-exit 0) diff --git a/tests/cc-pp/22-initial-defines.expected-exit b/tests/cc-pp/22-initial-defines.expected-exit @@ -0,0 +1 @@ +0 diff --git a/tests/cc-pp/22-initial-defines.scm b/tests/cc-pp/22-initial-defines.scm @@ -0,0 +1,15 @@ +;; Initial macros from -D flags: pass an alist as initial-defines. +;; #define-equivalent: BAR maps to INT 99. +(define (LL c) (%loc "t.c" 1 c)) +(define bar-macro (%macro 'obj '() (list (%tok 'INT 99 (LL 0) '())))) +(define defs (list (cons "BAR" bar-macro))) + +(define input + (list (%tok 'IDENT "BAR" (LL 1) '()) + (%tok 'NL #f (LL 4) '()) + (%tok 'EOF #f (LL 4) '()))) +(define out (pp-expand input defs)) +(if (= (length out) 2) 0 (sys-exit 11)) +(if (eq? (tok-kind (car out)) 'INT) 0 (sys-exit 12)) +(if (= (tok-value (car out)) 99) 0 (sys-exit 13)) +(sys-exit 0) diff --git a/tests/cc-pp/30-define-end-to-end.c b/tests/cc-pp/30-define-end-to-end.c @@ -0,0 +1,2 @@ +#define X 42 +X diff --git a/tests/cc-pp/30-define-end-to-end.expected-toks b/tests/cc-pp/30-define-end-to-end.expected-toks @@ -0,0 +1,5 @@ +;; END-TO-END fixture (requires cc/lex.scm). Aspirational: the actual +;; column/line numbers depend on the lexer's `__LINE__`-replacement +;; semantics; refine this golden once lex.scm lands. +(INT 42 "30-define-end-to-end.c" 2 1) +(EOF #f "30-define-end-to-end.c" 3 1) diff --git a/tests/cc-pp/run-pp.scm b/tests/cc-pp/run-pp.scm @@ -0,0 +1,59 @@ +;; tests/cc-pp/run-pp.scm — end-to-end driver. +;; +;; REQUIRES: cc/lex.scm. Build with: +;; catm prelude.scm cc/util.scm cc/data.scm cc/lex.scm cc/pp.scm \ +;; tests/cc-pp/run-pp.scm > prog +;; scheme1 prog INPUT.c > toks-actual +;; +;; Reads the .c path from argv[1], lexes, expands, prints one tok per +;; line in CC-CONTRACTS §2.1 format: +;; (KIND VALUE FILE LINE COL) + +(define (main argv) + (cond + ((or (null? argv) (null? (cdr argv))) + (write-bv-fd 2 "usage: run-pp.scm INPUT.c\n") + (sys-exit 2)) + (else + (let* ((path (car (cdr argv))) + (r (open-input path))) + (cond + ((not (car r)) + (write-bv-fd 2 "run-pp: open failed\n") + (sys-exit 1)) + (else + (let* ((p (cdr r)) + (src-r (read-all p)) + (src (cdr src-r)) + (toks (lex-tokenize src path)) + (out (pp-expand toks '()))) + (close p) + (for-each %print-tok out) + (sys-exit 0)))))))) + +(define (%print-tok t) + (let ((k (tok-kind t)) (v (tok-value t)) (l (tok-loc t))) + ;; (KIND VALUE FILE LINE COL)\n + (write-bv-fd 1 + (bytevector-append "(" + (bytevector-append (symbol->string k) + (bytevector-append " " + (bytevector-append (%val->bv k v) + (bytevector-append " \"" + (bytevector-append (loc-file l) + (bytevector-append "\" " + (bytevector-append (fixnum->bv (loc-line l) 10) + (bytevector-append " " + (bytevector-append (fixnum->bv (loc-col l) 10) + ")\n")))))))))))))) + +(define (%val->bv k v) + (cond + ((or (eq? k 'HASH) (eq? k 'NL) (eq? k 'EOF)) "#f") + ((or (eq? k 'IDENT) (eq? k 'STR)) + (bytevector-append "\"" (bytevector-append v "\""))) + ((or (eq? k 'INT) (eq? k 'CHAR)) (fixnum->bv v 10)) + ((or (eq? k 'KW) (eq? k 'PUNCT)) (symbol->string v)) + (else "?"))) + +(main (sys-argv))