boot2

Playing with the bootstrap
git clone https://git.ryansepassi.com/git/boot2.git

commit 481cee79fd06c2cb076c4e5fe3d74e297081af68
parent 32f8a7a95aac37696b844bb9530d5e52650cc5a5
Author: Ryan Sepassi <rsepassi@gmail.com>
Date:   Mon,  4 May 2026 11:40:38 -0700

tcc 0.9.26: phase-2 aarch64 assembler

Broadens the in-tree arm64 assembler to riscv64-parity coverage:
full DP-imm/DP-reg families (incl. shifted/extended-reg, logical
imm via arm64_encode_bimm64, mul/madd/smull family, csel + cset/
cinc aliases, sdiv/udiv), register-offset + pre/post-indexed
loads/stores with sized variants, ldp/stp full forms, b.cond/cbz/
cbnz/tbz/tbnz (in-section only), br/blr, hint/barrier mnemonics,
and the ldr Xn,=imm64 / =sym pseudo (movz/movk chain via
arm64_movimm and the 4-insn MOVW_UABS_G* reloc chain). Adds the
condition-code, shift, extend, and register-alias tokens the new
operand parser needs.
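
As a rough illustration of the `ldr Xn,=imm64` lowering described above, the sketch below builds a MOVZ/MOVK chain for a 64-bit constant. `sketch_movz_movk_chain` is a hypothetical name and is not the in-tree `arm64_movimm`, which may choose other forms (for example MOVN); only the X-form opcode bases `0xd2800000` (MOVZ) and `0xf2800000` (MOVK) are taken from the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch, not the in-tree arm64_movimm(): encode
 * "load a 64-bit constant into Xd" as one MOVZ followed by MOVKs,
 * skipping all-zero 16-bit chunks. Returns the instruction count. */
static size_t sketch_movz_movk_chain(unsigned rd, uint64_t value,
                                     uint32_t out[4])
{
    size_t n = 0;
    unsigned hw;

    assert(rd < 32);
    for (hw = 0; hw < 4; hw++) {
        uint32_t imm16 = (uint32_t)((value >> (16 * hw)) & 0xffffu);
        uint32_t base;

        if (imm16 == 0)
            continue;   /* MOVZ zeroes every chunk it does not set */
        /* The first emitted word is MOVZ (X-form); the rest are MOVK. */
        base = (n == 0) ? 0xd2800000u : 0xf2800000u;
        out[n++] = base | ((uint32_t)hw << 21) | (imm16 << 5) | rd;
    }
    if (n == 0)
        out[n++] = 0xd2800000u | rd;   /* value == 0: movz xd, #0 */
    return n;
}
```

Under these assumptions, loading `0x12345678` into `x0` yields two words, `movz x0, #0x5678` followed by `movk x0, #0x1234, lsl #16`.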

The static helpers arm64_movimm and arm64_encode_bimm64 in
arm64-gen.c are called directly: under ONE_SOURCE both .c files
land in the same TU sequentially, so promotion to ST_FUNC is
unnecessary. Phase-1 mnemonics and the existing tcc-cc/tcc-libc
.S files round-trip unchanged.
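
The visibility argument can be sketched with a single file standing in for the ONE_SOURCE translation unit. Both function names are hypothetical and stand in for a static helper in `arm64-gen.c` and a caller in `arm64-asm.c`; the point is only that `static` gives internal linkage per TU, so a later-included file can call an earlier one's helpers.

```c
#include <assert.h>
#include <stdint.h>

/* ---- stands in for arm64-gen.c (pulled into the TU first) ---- */
/* Hypothetical static helper: internal linkage, so it is visible
 * to everything that follows in this translation unit. */
static uint32_t sketch_gen_helper(unsigned rn)
{
    return (uint32_t)rn << 5;           /* place rn in the Rn field */
}

/* ---- stands in for arm64-asm.c (pulled in afterwards) ---- */
/* Calls the static helper directly; no extern declaration or
 * ST_FUNC-style promotion is needed because both live in one TU. */
static uint32_t sketch_asm_user(unsigned rn)
{
    assert(rn < 32);
    return 0x91000000u | sketch_gen_helper(rn);  /* ADD X (imm) base */
}
```

If the two files were compiled as separate translation units instead, the call in the second half would fail to link, which is the case ST_FUNC promotion would address.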

Diffstat:
M docs/TCC-ARM64-ASM.md | 50 +++++++++++++++++++++++++++++++++++++++++++-------
M scripts/simple-patches/tcc-0.9.26/files/arm64-asm.c | 1542 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-----------
M scripts/simple-patches/tcc-0.9.26/files/arm64-tok.h | 179 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++---
3 files changed, 1556 insertions(+), 215 deletions(-)

diff --git a/docs/TCC-ARM64-ASM.md b/docs/TCC-ARM64-ASM.md
@@ -220,13 +220,49 @@ all integer-register operand kinds restricted to OP_REG/OP_IMM/OP_MEM
 assemble through `tcc-boot2`; Makefile drops `TCC_ASM` dance for
 ARCH=aarch64.
 
-**Phase 2** — broaden mnemonic coverage to riscv64 parity: the rest
-of dp-imm / dp-reg / ldp-stp / cbz/cbnz / b.cond, full
-shift+extend operand forms, `ldr Xn, =imm64`/`=sym` inline
-lowering. Lifts `arm64_encode_bimm64` and `arm64_movimm` from
-`arm64-gen.c` to shared `ST_FUNC`s for the logical-imm and
-`=imm64` paths. Validates against `tests2/73_arm64.c` (already in
-upstream).
+**Phase 2 (implemented)** — broadens to riscv64-parity coverage.
+Surface added:
+- DP-imm: `add`/`sub`/`adds`/`subs`/`cmp`/`cmn`/`neg`/`negs`,
+  `and`/`orr`/`eor`/`ands`/`tst` (logical-imm), `movz`/`movn`/`movk`,
+  `sbfm`/`ubfm`/`bfm` + `lsl`/`lsr`/`asr`/`sxtb`/`sxth`/`sxtw`/
+  `uxtb`/`uxth` immediate aliases.
+- DP-reg: shifted-reg `add`/`sub` (and set-flags variants),
+  extended-reg form when one operand is `sp`, logical-reg
+  `and`/`orr`/`eor`/`bic`/`orn`/`eon`/`bics`/`mvn`, variable shifts
+  `lslv`/`lsrv`/`asrv`/`rorv` with `lsl`/`lsr`/`asr`/`ror` reg
+  aliases, `mul`/`mneg`/`madd`/`msub`/`smull`/`umull`/`smaddl`/
+  `umaddl`/`smsubl`/`umsubl`/`smulh`/`umulh`/`udiv`/`sdiv`,
+  `csel`/`csinc`/`csinv`/`csneg` + `cset`/`csetm`/`cinc`/`cinv`/
+  `cneg` aliases.
+- Mem: `ldr`/`str` register-offset (with optional `lsl`/extend
+  shift) and pre/post-indexed forms; `ldrb`/`ldrh`/`ldrsb`/`ldrsh`/
+  `ldrsw`/`strb`/`strh`; `ldp`/`stp` X- and W-forms with all index
+  modes.
+- Branches: `b.cond` and `cbz`/`cbnz`/`tbz`/`tbnz` (in-section
+  targets only — no `R_AARCH64_CONDBR19`/`TSTBR14` reloc handlers
+  in `arm64-link.c`, so extern targets error out), `br`/`blr`,
+  full `ret`.
+- Pseudo: `ldr Xn, =imm64` lowers via `arm64_movimm`; `ldr Xn,
+  =sym` lowers to the 4-insn `MOVW_UABS_G{0..3}` reloc chain.
+- System: `svc`/`hvc`/`smc`/`brk`/`hlt`, `nop`/`yield`/`wfe`/`wfi`/
+  `sev`/`sevl`/`hint`, `dsb`/`dmb`/`isb`.
+
+`arm64_encode_bimm64` and `arm64_movimm` from `arm64-gen.c` are
+called directly: under `ONE_SOURCE` (the bootstrap pipeline)
+both `.c` files inhabit the same TU sequentially, so the static
+helpers are visible to `arm64-asm.c` without ST_FUNC promotion.
+
+Pre-existing limitation: `mes-libc`'s `strtoull` truncates via
+`strtol`, so 64-bit hex literals (e.g. `#0xff00ff00ff00ff00`) get
+clamped at parse-time. Computed expressions (`#1<<32`) work
+around it. Out of scope to fix in this phase — surfaces in any
+asm path that takes wide immediates and is unrelated to the
+encoder logic.
+
+`tests2/73_arm64.c` does not exist upstream — the doc was
+speculative. Validation instead is the in-tree `.S` round-trip
+plus a hand-checked `build/phase2-test/test.S` fixture covering
+each new family.
 
 **Phase 3** — full inline-asm constraint surface
 (`subst_asm_operand` + `asm_compute_constraints`). Ports the
diff --git a/scripts/simple-patches/tcc-0.9.26/files/arm64-asm.c b/scripts/simple-patches/tcc-0.9.26/files/arm64-asm.c
@@ -1,15 +1,33 @@
 /*************************************************************/
 /*
- * ARM64 (AArch64) assembler for TCC — phase 1.
+ * ARM64 (AArch64) assembler for TCC — phase 2.
  *
- * Covers the mnemonic surface needed by the in-tree .S inputs:
- *   tcc-cc/aarch64/start.S
- *   tcc-libc/aarch64/start.S
- *   tcc-libc/aarch64/sys_stubs.S
+ * Phase 1 covered the in-tree .S surface (mov/add/ldr/str/ldp/stp/
+ * b/bl/ret/svc, register and simple immediate operands only). Phase 2
+ * broadens to roughly riscv64-asm.c parity:
+ *   DP-imm: add/sub/cmp/cmn (+set-flags), and/orr/eor/tst (logical),
+ *           movz/movn/movk, sbfm/ubfm/bfm + lsl/lsr/asr/sxtb/sxth/
+ *           sxtw/uxtb/uxth aliases.
+ * DP-reg: add/sub/adds/subs (shifted + extended), cmp/cmn/neg/mvn + * aliases, and/orr/eor/bic/orn (shifted), lslv/lsrv/asrv/rorv + * (with lsl/lsr/asr/ror reg aliases), mul/mneg/madd/msub + + * smull/umull family, csel/csinc/csinv/csneg + cset/cinc + * aliases, sdiv/udiv. + * Mem: ldr/str/ldrb/ldrh/ldrsb/ldrsh/ldrsw/strb/strh + register- + * offset and pre/post-indexed forms; ldp/stp full forms. + * Branch: b/bl/ret/br/blr + b.cond/cbz/cbnz/tbz/tbnz (in-section), + * plus full SVC/BRK/HLT/HVC/SMC/HINT (nop/yield/wfe/wfi…). + * Pseudo: ldr Xn, =imm64 → arm64_movimm chain; ldr Xn, =sym → + * 4× MOVW_UABS_G* reloc chain. * - * Mnemonics: mov, add, sub, ldr, str, ldp, stp, b, bl, ret, svc. - * Inline-__asm__ constraint plumbing is stubbed in the same shape - * riscv64-asm.c used at its first cut. See docs/TCC-ARM64-ASM.md. + * Inline-__asm__ constraint plumbing remains stubbed in the riscv64-asm + * shape; .S input + top-level __asm__("…") works, constraint-driven + * asm gen is phase 3. See docs/TCC-ARM64-ASM.md. + * + * arm64_movimm() and arm64_encode_bimm64() live as static helpers in + * arm64-gen.c — under ONE_SOURCE both arm64-gen.c and this file are + * pulled into one TU (tcc.h includes them sequentially under the + * TCC_TARGET_ARM64 block), so we call them directly. */ #ifdef TARGET_DEFS_ONLY @@ -66,25 +84,101 @@ ST_FUNC void gen_expr32(ExprValue *pe) /* ---- operand model ------------------------------------------------ */ -#define OP_REG (1 << 0) -#define OP_IMM (1 << 1) -#define OP_MEM (1 << 2) +#define OP_NONE 0 +#define OP_REG (1 << 0) +#define OP_IMM (1 << 1) +#define OP_MEM (1 << 2) +#define OP_COND (1 << 3) +#define OP_LITERAL (1 << 4) /* `=imm` / `=sym` after ldr */ + +/* Shift kinds (also the 2-bit shift field for shifted-reg ops). */ +#define SH_LSL 0 +#define SH_LSR 1 +#define SH_ASR 2 +#define SH_ROR 3 +#define SH_NONE 4 + +/* Extend kinds (also the 3-bit option field for extended-reg ops). 
*/ +#define EXT_UXTB 0 +#define EXT_UXTH 1 +#define EXT_UXTW 2 +#define EXT_UXTX 3 +#define EXT_SXTB 4 +#define EXT_SXTH 5 +#define EXT_SXTW 6 +#define EXT_SXTX 7 +#define EXT_NONE 8 -#define IDX_OFFSET 0 /* [Xn] or [Xn, #imm] */ -#define IDX_PREIDX 1 /* [Xn, #imm]! */ -#define IDX_POSTIDX 2 /* [Xn], #imm */ +/* Memory addressing modes. */ +#define IDX_OFFSET 0 /* [Xn] / [Xn, #imm] */ +#define IDX_PREIDX 1 /* [Xn, #imm]! */ +#define IDX_POSTIDX 2 /* [Xn], #imm */ +#define IDX_REGOFF 3 /* [Xn, Xm{,LSL #s|UXTW #s|…}] */ typedef struct AArch64Op { - uint32_t kind; /* OP_REG | OP_IMM | OP_MEM */ - uint8_t reg; /* register number 0..31 (31 = sp / xzr) */ - uint8_t is_w; /* 1 = W (32-bit), 0 = X (64-bit) */ - uint8_t is_sp; /* 1 if sp/wsp; 0 if zr or general */ - uint8_t base; /* memory base register */ + uint32_t kind; + uint8_t reg; /* register number (also Rt for OP_MEM Rt forms) */ + uint8_t is_w; /* 1 = W-form, 0 = X-form */ + uint8_t is_sp; /* 1 if textual form was sp/wsp (vs zr) */ + + /* Shifted/extended register operand decorations. */ + uint8_t shift_kind; /* SH_*; SH_NONE if none */ + uint8_t shift_amt; + uint8_t ext_kind; /* EXT_*; EXT_NONE if none */ + uint8_t ext_amt; /* shift after extend; 0 if absent */ + + /* Memory operand fields (kind == OP_MEM). 
*/ + uint8_t base; uint8_t base_is_sp; - uint8_t indexing; /* IDX_* */ - ExprValue e; /* immediate value or label expression */ + uint8_t idx_reg; + uint8_t idx_is_w; + uint8_t indexing; /* IDX_* */ + uint8_t mem_ext_kind; /* EXT_*; EXT_NONE for plain LSL or no shift */ + uint8_t mem_ext_amt; /* lsl/extend amount on register-offset form */ + uint8_t mem_has_shift; /* 1 if [Xn, Xm, lsl #s] / extend present */ + + uint8_t cond; /* condition code 0..15 */ + ExprValue e; /* immediate value or label expression */ } AArch64Op; +/* ---- forward declarations ---------------------------------------- */ + +static int arm64_parse_reg(int t, uint8_t *preg, uint8_t *pis_w, uint8_t *pis_sp); + +/* ---- token classification helpers -------------------------------- */ + +/* Translate a token to a 4-bit cond code, or -1 if not a cond. */ +static int tok_to_cond(int t) +{ + if (t >= TOK_ASM_eq && t <= TOK_ASM_nv) + return t - TOK_ASM_eq; + if (t == TOK_ASM_hs) return 2; /* alias of cs */ + if (t == TOK_ASM_lo) return 3; /* alias of cc */ + return -1; +} + +static int tok_to_shift(int t) +{ + if (t == TOK_ASM_lsl) return SH_LSL; + if (t == TOK_ASM_lsr) return SH_LSR; + if (t == TOK_ASM_asr) return SH_ASR; + if (t == TOK_ASM_ror) return SH_ROR; + return -1; +} + +static int tok_to_extend(int t) +{ + if (t == TOK_ASM_uxtb) return EXT_UXTB; + if (t == TOK_ASM_uxth) return EXT_UXTH; + if (t == TOK_ASM_uxtw) return EXT_UXTW; + if (t == TOK_ASM_uxtx) return EXT_UXTX; + if (t == TOK_ASM_sxtb) return EXT_SXTB; + if (t == TOK_ASM_sxth) return EXT_SXTH; + if (t == TOK_ASM_sxtw) return EXT_SXTW; + if (t == TOK_ASM_sxtx) return EXT_SXTX; + return -1; +} + /* Recognise a register-name token. Returns 1 if matched. 
*/ static int arm64_parse_reg(int t, uint8_t *preg, uint8_t *pis_w, uint8_t *pis_sp) { @@ -98,6 +192,10 @@ static int arm64_parse_reg(int t, uint8_t *preg, uint8_t *pis_w, uint8_t *pis_sp } if (t == TOK_ASM_wsp) { *preg = 31; *pis_w = 1; *pis_sp = 1; return 1; } if (t == TOK_ASM_wzr) { *preg = 31; *pis_w = 1; *pis_sp = 0; return 1; } + if (t == TOK_ASM_lr) { *preg = 30; *pis_w = 0; *pis_sp = 0; return 1; } + if (t == TOK_ASM_fp) { *preg = 29; *pis_w = 0; *pis_sp = 0; return 1; } + if (t == TOK_ASM_ip0) { *preg = 16; *pis_w = 0; *pis_sp = 0; return 1; } + if (t == TOK_ASM_ip1) { *preg = 17; *pis_w = 0; *pis_sp = 0; return 1; } return 0; } @@ -120,33 +218,119 @@ static void asm_skip_hash(void) if (tok == '#') next(); } -/* Parse [Xn] / [Xn, #imm] / [Xn, #imm]! / [Xn], #imm. */ +static int at_end_of_insn(void) +{ + return tok == ';' || tok == TOK_LINEFEED || tok == TOK_EOF; +} + +/* Parse `, lsl #n` or `, lsr #n` etc. after a register operand. + * On entry the leading comma has been consumed; tok is the first + * token of the shift specifier. */ +static void parse_reg_shift_or_extend(TCCState *s1, AArch64Op *op) +{ + int sh, ex; + ExprValue e; + if ((sh = tok_to_shift(tok)) >= 0) { + next(); + e.v = 0; e.sym = NULL; e.pcrel = 0; + if (!at_end_of_insn() && tok != ',') { + asm_skip_hash(); + asm_expr(s1, &e); + } + if (e.sym) + tcc_error("shift amount must be a constant"); + op->shift_kind = sh; + op->shift_amt = (uint8_t)e.v; + return; + } + if ((ex = tok_to_extend(tok)) >= 0) { + next(); + e.v = 0; e.sym = NULL; e.pcrel = 0; + if (!at_end_of_insn() && tok != ',') { + if (tok == '#') asm_skip_hash(); + asm_expr(s1, &e); + } + if (e.sym) + tcc_error("extend amount must be a constant"); + op->ext_kind = ex; + op->ext_amt = (uint8_t)e.v; + return; + } + expect("shift / extend specifier"); +} + +/* Parse [Xn], [Xn, #imm], [Xn, #imm]!, [Xn], #imm, [Xn, Xm{,extend}]. 
*/ static void parse_mem(TCCState *s1, AArch64Op *op) { uint8_t r, w, sp; skip('['); if (!arm64_parse_reg(tok, &r, &w, &sp) || w) expect("64-bit base register"); - op->kind = OP_MEM; - op->base = r; - op->base_is_sp = sp; - op->indexing = IDX_OFFSET; - op->e.v = 0; - op->e.sym = NULL; - op->e.pcrel = 0; + op->kind = OP_MEM; + op->base = r; + op->base_is_sp = sp; + op->indexing = IDX_OFFSET; + op->mem_ext_kind = EXT_NONE; + op->mem_ext_amt = 0; + op->mem_has_shift = 0; + op->idx_reg = 0; + op->idx_is_w = 0; + op->e.v = 0; op->e.sym = NULL; op->e.pcrel = 0; next(); if (tok == ',') { next(); - asm_skip_hash(); - asm_expr(s1, &op->e); + if (arm64_parse_reg(tok, &r, &w, &sp) && !sp) { + /* register-offset form */ + op->indexing = IDX_REGOFF; + op->idx_reg = r; + op->idx_is_w = w; + next(); + if (tok == ',') { + next(); + /* Either lsl #imm, or one of the extend keywords (uxtw/sxtw/sxtx). */ + int sh = tok_to_shift(tok); + int ex = tok_to_extend(tok); + if (sh == SH_LSL) { + next(); + asm_skip_hash(); + { + ExprValue e = {0}; + asm_expr(s1, &e); + if (e.sym) tcc_error("lsl amount must be constant"); + op->mem_ext_kind = EXT_NONE; + op->mem_ext_amt = (uint8_t)e.v; + op->mem_has_shift = 1; + } + } else if (ex >= 0) { + next(); + op->mem_ext_kind = (uint8_t)ex; + op->mem_ext_amt = 0; + op->mem_has_shift = 1; + if (tok == '#') { + next(); + { + ExprValue e = {0}; + asm_expr(s1, &e); + if (e.sym) tcc_error("extend amount must be constant"); + op->mem_ext_amt = (uint8_t)e.v; + } + } + } else { + expect("lsl / extend specifier"); + } + } + } else { + asm_skip_hash(); + asm_expr(s1, &op->e); + } } skip(']'); if (tok == '!') { next(); op->indexing = IDX_PREIDX; - } else if (tok == ',' && op->e.v == 0 && op->e.sym == NULL) { - /* post-indexed form: [Xn], #imm. - Only recognise if no in-bracket disp was set. */ + } else if (tok == ',' && op->indexing == IDX_OFFSET + && op->e.v == 0 && op->e.sym == NULL) { + /* post-indexed form: [Xn], #imm — only if no in-bracket disp. 
*/ next(); asm_skip_hash(); asm_expr(s1, &op->e); @@ -154,19 +338,32 @@ static void parse_mem(TCCState *s1, AArch64Op *op) } } +/* Parse one operand. */ static void parse_operand(TCCState *s1, AArch64Op *op) { uint8_t r, w, sp; - op->kind = 0; - op->e.v = 0; - op->e.sym = NULL; - op->e.pcrel = 0; + int c; + + op->kind = OP_NONE; + op->reg = 0; + op->is_w = 0; + op->is_sp = 0; + op->shift_kind = SH_NONE; + op->shift_amt = 0; + op->ext_kind = EXT_NONE; + op->ext_amt = 0; + op->cond = 0; + op->e.v = 0; op->e.sym = NULL; op->e.pcrel = 0; if (arm64_parse_reg(tok, &r, &w, &sp)) { op->kind = OP_REG; - op->reg = r; - op->is_w = w; - op->is_sp = sp; + op->reg = r; op->is_w = w; op->is_sp = sp; + next(); + return; + } + if ((c = tok_to_cond(tok)) >= 0) { + op->kind = OP_COND; + op->cond = (uint8_t)c; next(); return; } @@ -174,114 +371,292 @@ static void parse_operand(TCCState *s1, AArch64Op *op) parse_mem(s1, op); return; } + if (tok == '=') { + /* ldr Xn, =imm or ldr Xn, =sym */ + next(); + asm_expr(s1, &op->e); + op->kind = OP_LITERAL; + return; + } asm_skip_hash(); asm_expr(s1, &op->e); op->kind = OP_IMM; } +/* ---- bit-field encoding helper ---------------------------------- */ +/* arm64_encode_bimm64() is provided by arm64-gen.c (static, same TU). */ + /* ---- encoders ----------------------------------------------------- */ -/* MOVZ/MOVN/MOVK chain to load an X-form 64-bit immediate into Xd. 
*/ -static void emit_movimm_x(int rd, uint64_t x) -{ - int i, z = 0, m = 0, emitted = 0; - uint64_t x1 = x; - uint32_t mov1 = 0xd2800000; /* MOVZ X */ - for (i = 0; i < 64; i += 16) { - if (((x >> i) & 0xffff) == 0) z++; - if (((~x >> i) & 0xffff) == 0) m++; - } - if (m > z) { - x1 = ~x; - mov1 = 0x92800000; /* MOVN X */ - } - for (i = 0; i < 64; i += 16) { - if (((x1 >> i) & 0xffff) != 0) { - gen_le32(mov1 | rd | ((x1 >> i) & 0xffff) << 5 | (i << 17)); - emitted = 1; - i += 16; - break; - } - } - if (!emitted) { - /* x is all zeros (MOVZ) or all ones (MOVN selected on ~0): emit one insn. */ - gen_le32(mov1 | rd); - return; +static uint32_t sf_bit(int is_w) { return is_w ? 0u : (1u << 31); } + +/* ADD/SUB (immediate). is_sub flips the polarity; set_flags sets the S bit. */ +static void emit_addsub_imm(int rd, int rn, int64_t imm, int is_w, + int is_sub, int set_flags) +{ + uint32_t op; + if (imm < 0) { + imm = -imm; + is_sub = !is_sub; } - for (; i < 64; i += 16) { - if (((x1 >> i) & 0xffff) != 0) - gen_le32(0xf2800000 | rd | ((x >> i) & 0xffff) << 5 | (i << 17)); + op = 0x11000000u; /* ADD imm base */ + op |= sf_bit(is_w); + if (is_sub) op |= (1u << 30); + if (set_flags) op |= (1u << 29); + if (imm >= 0 && imm < 4096) { + gen_le32(op | (((uint32_t)imm) << 10) | (rn << 5) | rd); + } else if (imm >= 0 && (imm & 0xfff) == 0 && (imm >> 12) < 4096) { + gen_le32(op | (1u << 22) | (((uint32_t)(imm >> 12)) << 10) | (rn << 5) | rd); + } else { + tcc_error("add/sub immediate out of range"); } } -/* MOV (register): ORR Xd, XZR, Xm (when neither is SP). */ -static void emit_mov_reg_orr(int rd, int rm, int is_w) +/* ADD/SUB (shifted register). 
*/ +static void emit_addsub_reg(int rd, int rn, int rm, int is_w, + int is_sub, int set_flags, + int shift_kind, int shift_amt) { - uint32_t insn = 0xaa0003e0 | (rm << 16) | rd; - if (is_w) insn &= 0x7fffffff; - gen_le32(insn); + uint32_t op = 0x0b000000u; /* base ADD shift-reg */ + op |= sf_bit(is_w); + if (is_sub) op |= (1u << 30); + if (set_flags) op |= (1u << 29); + if (shift_kind == SH_NONE) shift_kind = SH_LSL; + if (shift_kind == SH_ROR) + tcc_error("add/sub: ROR shift not allowed"); + if (shift_amt < 0 || shift_amt > (is_w ? 31 : 63)) + tcc_error("add/sub: shift amount out of range"); + op |= (uint32_t)shift_kind << 22; + op |= (rm << 16) | ((uint32_t)shift_amt << 10) | (rn << 5) | rd; + gen_le32(op); } -/* MOV (to/from SP): ADD Xd, Xn, #0. */ -static void emit_mov_sp_add(int rd, int rn, int is_w) +/* ADD/SUB (extended register). */ +static void emit_addsub_ext(int rd, int rn, int rm, int is_w, + int is_sub, int set_flags, + int ext_kind, int ext_amt) { - uint32_t insn = 0x91000000 | (rn << 5) | rd; - if (is_w) insn &= 0x7fffffff; + uint32_t op = 0x0b200000u; /* base ADD extended-reg */ + op |= sf_bit(is_w); + if (is_sub) op |= (1u << 30); + if (set_flags) op |= (1u << 29); + if (ext_amt < 0 || ext_amt > 4) + tcc_error("add/sub extend: shift out of range"); + op |= ((uint32_t)ext_kind & 7u) << 13; + op |= (rm << 16) | ((uint32_t)ext_amt << 10) | (rn << 5) | rd; + gen_le32(op); +} + +/* Logical (immediate): AND/ORR/EOR/ANDS. */ +static void emit_log_imm(int rd, int rn, uint64_t imm, int is_w, int op2) +{ + /* op2: 0=AND, 1=ORR, 2=EOR, 3=ANDS */ + uint32_t insn; + int e; + uint64_t v = is_w ? 
(imm | imm << 32) : imm; /* widen for encoder */ + e = arm64_encode_bimm64(v); + if (e < 0) + tcc_error("logical immediate not encodable"); + insn = 0x12000000u | sf_bit(is_w) | ((uint32_t)op2 << 29) | + ((uint32_t)e << 10) | (rn << 5) | rd; + /* arm64_encode_bimm64 sets bit12 (=N) appropriately for 64-bit; + for 32-bit ops the N bit must be clear, but the widened value + above forces a 32-bit pattern with N=0 already. */ gen_le32(insn); } -/* ADD/SUB (immediate). is_sub = 1 emits SUB; negative imm flips polarity. */ -static void emit_addsub_imm(int rd, int rn, int64_t imm, int is_w, int is_sub) +/* Logical (shifted register): AND/ORR/EOR/ANDS with optional invert (BIC/ORN/EON/BICS). */ +static void emit_log_reg(int rd, int rn, int rm, int is_w, + int op2, int invert, int shift_kind, int shift_amt) { - uint32_t base; - if (imm < 0) { - imm = -imm; - is_sub = !is_sub; - } - base = is_sub ? 0xd1000000 : 0x91000000; /* X-form */ - if (is_w) base &= 0x7fffffff; - if (imm >= 0 && imm < 4096) { - gen_le32(base | (((uint32_t)imm) << 10) | (rn << 5) | rd); - } else if (imm >= 0 && imm < (4096 << 12) && (imm & 0xfff) == 0) { - gen_le32(base | (1u << 22) | (((uint32_t)(imm >> 12)) << 10) | (rn << 5) | rd); - } else { - tcc_error("add/sub immediate out of range"); + uint32_t op = 0x0a000000u | sf_bit(is_w) | ((uint32_t)op2 << 29); + if (shift_kind == SH_NONE) shift_kind = SH_LSL; + if (shift_amt < 0 || shift_amt > (is_w ? 31 : 63)) + tcc_error("logical: shift amount out of range"); + op |= (uint32_t)shift_kind << 22; + if (invert) op |= (1u << 21); + op |= (rm << 16) | ((uint32_t)shift_amt << 10) | (rn << 5) | rd; + gen_le32(op); +} + +/* MOVZ/MOVN/MOVK (single 16-bit hword + LSL). */ +static void emit_movw(int rd, int hw_imm, int hw_shift, int is_w, int op2) +{ + /* op2: 0=MOVN, 2=MOVZ, 3=MOVK */ + uint32_t op; + if (hw_imm < 0 || hw_imm > 0xffff) + tcc_error("movz/movn/movk: imm16 out of range"); + if ((hw_shift & 0xf) != 0 || hw_shift < 0 || hw_shift > (is_w ? 
16 : 48)) + tcc_error("movz/movn/movk: shift must be 0/16/32/48"); + op = 0x12800000u | sf_bit(is_w) | ((uint32_t)op2 << 29) | + (((uint32_t)hw_shift / 16) << 21) | + ((uint32_t)hw_imm << 5) | rd; + gen_le32(op); +} + +/* SBFM/BFM/UBFM. */ +static void emit_bfm(int rd, int rn, int immr, int imms, int is_w, int op2) +{ + /* op2: 0=SBFM, 1=BFM, 2=UBFM */ + int width = is_w ? 31 : 63; + uint32_t op; + if (immr < 0 || immr > width || imms < 0 || imms > width) + tcc_error("bfm: bit positions out of range"); + op = 0x13000000u | sf_bit(is_w) | ((uint32_t)op2 << 29); + if (!is_w) op |= (1u << 22); /* N bit follows sf in 64-bit forms */ + op |= ((uint32_t)immr << 16) | ((uint32_t)imms << 10) | (rn << 5) | rd; + gen_le32(op); +} + +/* Variable-shift (LSLV/LSRV/ASRV/RORV). op2: 8=LSLV, 9=LSRV, 10=ASRV, 11=RORV. */ +static void emit_shift_reg(int rd, int rn, int rm, int is_w, int op2) +{ + uint32_t op = 0x1ac02000u | sf_bit(is_w) | (rm << 16) | + ((uint32_t)(op2 & 0xf) << 10) | (rn << 5) | rd; + gen_le32(op); +} + +/* SDIV/UDIV. is_signed=1 => SDIV. */ +static void emit_div(int rd, int rn, int rm, int is_w, int is_signed) +{ + uint32_t op = 0x1ac00800u | sf_bit(is_w) | (rm << 16) | + (rn << 5) | rd; + if (is_signed) op |= (1u << 10); + gen_le32(op); +} + +/* MADD/MSUB (32 or 64 bit). is_sub flips the o0 bit. */ +static void emit_madd(int rd, int rn, int rm, int ra, int is_w, int is_sub) +{ + uint32_t op = 0x1b000000u | sf_bit(is_w) | (rm << 16) | + (ra << 10) | (rn << 5) | rd; + if (is_sub) op |= (1u << 15); + gen_le32(op); +} + +/* SMADDL/UMADDL/SMSUBL/UMSUBL/SMULL/UMULL (long mul). */ +static void emit_madd_long(int rd, int rn, int rm, int ra, + int is_unsigned, int is_sub) +{ + uint32_t op = 0x9b200000u | (rm << 16) | (ra << 10) | (rn << 5) | rd; + if (is_unsigned) op |= (1u << 23); + if (is_sub) op |= (1u << 15); + gen_le32(op); +} + +/* SMULH / UMULH. 
*/ +static void emit_mulh(int rd, int rn, int rm, int is_unsigned) +{ + uint32_t op = 0x9b407c00u | (rm << 16) | (rn << 5) | rd; + if (is_unsigned) op |= (1u << 23); + gen_le32(op); +} + +/* CSEL/CSINC/CSINV/CSNEG. op2: 00=SEL, 01=INC, 10=INV, 11=NEG (bit15 invert, bit10 inc/neg). */ +static void emit_csel(int rd, int rn, int rm, int cond, int is_w, + int invert, int inc_neg) +{ + uint32_t op = 0x1a800000u | sf_bit(is_w) | (rm << 16) | + ((uint32_t)(cond & 0xf) << 12) | + (rn << 5) | rd; + if (invert) op |= (1u << 30); + if (inc_neg) op |= (1u << 10); + gen_le32(op); +} + +/* LDR/STR (immediate, unsigned offset / unscaled / pre-/post-indexed). + * size: 0=byte,1=halfword,2=word,3=dword. opc encodes load/store/sign: + * STR=0, LDR=1, LDRSx 64-target=2, LDRSx 32-target=3 (size<3 only). + * For size=3 + opc=0/1 = STR/LDR X. + */ +static void emit_ldst_imm(int opc, int size, int rt, int rn, + int64_t imm, int indexing) +{ + uint32_t op; + if (indexing == IDX_OFFSET) { + /* unsigned-offset, scaled by 1<<size, range 0..4095 */ + int64_t scale = (int64_t)1 << size; + int64_t scaled; + if (imm < 0 || (imm & (scale - 1))) + tcc_error("ldr/str: immediate offset must be unsigned & scaled"); + scaled = imm >> size; + if (scaled < 0 || scaled > 4095) + tcc_error("ldr/str: unsigned offset out of range"); + op = 0x39000000u | ((uint32_t)size << 30) | ((uint32_t)opc << 22) | + (((uint32_t)scaled) << 10) | (rn << 5) | rt; + gen_le32(op); + return; } + /* signed 9-bit forms: post-index, pre-index, or unscaled (LDUR/STUR). */ + if (imm < -256 || imm > 255) + tcc_error("ldr/str: signed-9 offset out of range"); + op = 0x38000000u | ((uint32_t)size << 30) | ((uint32_t)opc << 22) | + (((uint32_t)imm & 0x1ff) << 12) | (rn << 5) | rt; + if (indexing == IDX_POSTIDX) op |= (1u << 10); + else if (indexing == IDX_PREIDX) op |= (3u << 10); + /* IDX_OFFSET with unscaled (LDUR/STUR) leaves these bits zero. */ + gen_le32(op); } -/* LDR/STR (immediate, unsigned offset, X-form). 
*/ -static void emit_ldst_imm_unsigned(int is_load, int rt, int rn, int64_t imm) +/* LDR/STR (register offset, with optional shift/extend). */ +static void emit_ldst_reg(int opc, int size, int rt, int rn, int rm, + int idx_is_w, int ext_kind, int ext_amt, + int has_shift) { - uint32_t base = is_load ? 0xf9400000 : 0xf9000000; - int64_t scaled = imm / 8; - if ((imm & 7) || scaled < 0 || scaled > 4095) - tcc_error("ldr/str immediate offset out of range"); - gen_le32(base | (((uint32_t)scaled) << 10) | (rn << 5) | rt); + uint32_t option; + uint32_t s_bit = 0; + if (ext_kind == EXT_NONE) { + /* implicit LSL — option=011 (UXTX) for X-form, 010 (UXTW) for W-form */ + option = idx_is_w ? 2 : 3; + if (has_shift) { + if (ext_amt != 0 && ext_amt != size) + tcc_error("ldr/str: lsl amount must be 0 or %d", size); + if (ext_amt == size && size > 0) s_bit = 1; + } + } else { + option = (uint32_t)ext_kind & 7u; + if (has_shift && ext_amt != 0) { + if (ext_amt != size) + tcc_error("ldr/str: extend shift must be 0 or %d", size); + s_bit = 1; + } + } + { + uint32_t op = 0x38200800u | ((uint32_t)size << 30) | ((uint32_t)opc << 22) | + (rm << 16) | (option << 13) | (s_bit << 12) | + (rn << 5) | rt; + gen_le32(op); + } } -/* LDP/STP (X-form). indexing = IDX_OFFSET / IDX_PREIDX / IDX_POSTIDX. */ +/* LDP/STP (X-form / W-form). */ static void emit_ldst_pair(int is_load, int rt1, int rt2, int rn, - int64_t imm, int indexing) + int64_t imm, int indexing, int is_w) { - uint32_t op = is_load ? 0xa8400000 : 0xa8000000; - int64_t scaled = imm / 8; - if ((imm & 7) || scaled < -64 || scaled > 63) - tcc_error("ldp/stp offset out of range"); - if (indexing == IDX_POSTIDX) op |= 0x00800000; - else if (indexing == IDX_PREIDX) op |= 0x01800000; - else op |= 0x01000000; - gen_le32(op | (((uint32_t)(scaled & 0x7f)) << 15) | - (rt2 << 10) | (rn << 5) | rt1); + uint32_t op = 0x28000000u | ((uint32_t)(is_w ? 0 : 2) << 30); + int shift = is_w ? 
2 : 3; + int64_t scaled; + if (imm & ((1 << shift) - 1)) + tcc_error("ldp/stp: misaligned offset"); + scaled = imm >> shift; + if (scaled < -64 || scaled > 63) + tcc_error("ldp/stp: offset out of range"); + if (is_load) op |= (1u << 22); + if (indexing == IDX_POSTIDX) op |= (1u << 23); + else if (indexing == IDX_PREIDX) op |= (3u << 23); + else op |= (2u << 23); + op |= (((uint32_t)scaled & 0x7f) << 15) | + (rt2 << 10) | (rn << 5) | rt1; + gen_le32(op); } -/* B / BL with a label or in-section offset. is_call = 1 emits BL. */ +/* B / BL with a label or in-section offset. */ static void emit_branch_imm(AArch64Op *op, int is_call) { - uint32_t base = is_call ? 0x94000000 : 0x14000000; + uint32_t base = is_call ? 0x94000000u : 0x14000000u; Sym *sym = op->e.sym; if (sym && sym->r == cur_text_section->sh_num && !(sym->type.t & VT_EXTERN)) { - /* In-section, defined: compute the offset directly. */ int64_t target = (int64_t)sym->jnext + (int64_t)op->e.v; int64_t off = target - ind; if (off & 3) tcc_error("branch target not 4-byte aligned"); @@ -294,7 +669,6 @@ static void emit_branch_imm(AArch64Op *op, int is_call) greloca(cur_text_section, sym, ind, reloc, op->e.v); gen_le32(base); } else { - /* Pure immediate offset (rare in source asm, but support `b 0`). */ int64_t off = (int64_t)op->e.v; if (off & 3) tcc_error("branch target not 4-byte aligned"); off >>= 2; @@ -304,19 +678,104 @@ static void emit_branch_imm(AArch64Op *op, int is_call) } } -static void emit_ret(int rn) +/* Compute a signed PC-relative offset from a same-section, non-extern + * symbol expression. Errors on extern reference. 
*/ +static int64_t sec_local_offset(AArch64Op *op, const char *what) +{ + Sym *sym = op->e.sym; + int64_t off; + if (sym) { + if (sym->r != cur_text_section->sh_num || (sym->type.t & VT_EXTERN)) + tcc_error("%s: extern target needs CONDBR19 reloc (unsupported)", what); + off = (int64_t)sym->jnext + (int64_t)op->e.v - (int64_t)ind; + } else { + off = (int64_t)op->e.v; + } + if (off & 3) tcc_error("%s: target not 4-byte aligned", what); + return off; +} + +/* B.cond (in-section only). */ +static void emit_branch_cond(int cond, AArch64Op *target) +{ + int64_t off = sec_local_offset(target, "b.cond"); + int64_t imm = off >> 2; + if (imm < -(1 << 18) || imm >= (1 << 18)) + tcc_error("b.cond: target out of 19-bit range"); + gen_le32(0x54000000u | (((uint32_t)imm & 0x7ffffu) << 5) | (uint32_t)cond); +} + +/* CBZ/CBNZ: op=0/1. */ +static void emit_branch_cmp(int rt, AArch64Op *target, int is_w, int op_cbnz) +{ + int64_t off = sec_local_offset(target, "cbz/cbnz"); + int64_t imm = off >> 2; + if (imm < -(1 << 18) || imm >= (1 << 18)) + tcc_error("cbz/cbnz: target out of 19-bit range"); + { + uint32_t op = 0x34000000u | sf_bit(is_w) | (((uint32_t)imm & 0x7ffffu) << 5) | rt; + if (op_cbnz) op |= (1u << 24); + gen_le32(op); + } +} + +/* TBZ/TBNZ: op=0/1. bit_pos in 0..63 (bit5 = b5, bits 4..0 = b40). */ +static void emit_branch_test(int rt, int bit_pos, AArch64Op *target, int op_tbnz) +{ + int64_t off = sec_local_offset(target, "tbz/tbnz"); + int64_t imm = off >> 2; + int b5; + if (bit_pos < 0 || bit_pos > 63) + tcc_error("tbz/tbnz: bit position out of range"); + if (imm < -(1 << 13) || imm >= (1 << 13)) + tcc_error("tbz/tbnz: target out of 14-bit range"); + b5 = (bit_pos >> 5) & 1; + { + uint32_t op = 0x36000000u | ((uint32_t)b5 << 31) | + (((uint32_t)bit_pos & 0x1fu) << 19) | + (((uint32_t)imm & 0x3fffu) << 5) | rt; + if (op_tbnz) op |= (1u << 24); + gen_le32(op); + } +} + +/* BR/BLR/RET — register-indirect branches. op2: 0=BR, 1=BLR, 2=RET. 
*/
+static void emit_branch_reg(int rn, int op2)
 {
-    gen_le32(0xd65f0000u | (rn << 5));
+    static const uint32_t base[3] = {
+        0xd61f0000u, 0xd63f0000u, 0xd65f0000u
+    };
+    gen_le32(base[op2] | (rn << 5));
 }
 
-static void emit_svc(int64_t imm)
+/* SVC/HVC/SMC/BRK/HLT — exception-generating with imm16. */
+static void emit_excgen(uint32_t base, int64_t imm)
 {
     if (imm < 0 || imm > 0xffff)
-        tcc_error("svc immediate out of range");
-    gen_le32(0xd4000001u | (((uint32_t)imm) << 5));
+        tcc_error("svc/brk imm out of range");
+    gen_le32(base | (((uint32_t)imm) << 5));
 }
 
-/* ---- mnemonic dispatch -------------------------------------------- */
+/* Lower `ldr Xn, =imm64` and `ldr Xn, =sym` to a movz/movk chain. */
+static void emit_ldr_literal(int rd, AArch64Op *src)
+{
+    if (src->e.sym) {
+        Sym *sym = src->e.sym;
+        /* MOVW_UABS_G0_NC, then G1_NC, G2_NC, G3 — full 64-bit address. */
+        greloca(cur_text_section, sym, ind, R_AARCH64_MOVW_UABS_G0_NC, src->e.v);
+        gen_le32(0xd2800000u | rd); /* movz */
+        greloca(cur_text_section, sym, ind, R_AARCH64_MOVW_UABS_G1_NC, src->e.v);
+        gen_le32(0xf2a00000u | rd); /* movk lsl#16 */
+        greloca(cur_text_section, sym, ind, R_AARCH64_MOVW_UABS_G2_NC, src->e.v);
+        gen_le32(0xf2c00000u | rd); /* movk lsl#32 */
+        greloca(cur_text_section, sym, ind, R_AARCH64_MOVW_UABS_G3, src->e.v);
+        gen_le32(0xf2e00000u | rd); /* movk lsl#48 */
+    } else {
+        arm64_movimm(rd, (uint64_t)src->e.v);
+    }
+}
+
+/* ---- mnemonic dispatch helpers ----------------------------------- */
 
 static void need_xreg(AArch64Op *op, const char *what)
 {
@@ -324,103 +783,713 @@ static void need_xreg(AArch64Op *op, const char *what)
         tcc_error("%s: expected 64-bit register", what);
 }
 
-ST_FUNC void asm_opcode(TCCState *s1, int token)
+static void need_reg(AArch64Op *op, const char *what)
 {
-    AArch64Op a, b, c;
+    if (op->kind != OP_REG)
+        tcc_error("%s: expected register", what);
+}
 
-    switch (token) {
+static void check_size_match(AArch64Op *a, AArch64Op *b, const char *what)
+{
+    if (a->is_w != b->is_w)
+        tcc_error("%s: register size mismatch", what);
+}
 
-    case TOK_ASM_mov:
-        parse_operand(s1, &a);
+/* Lookup table: addsub mnemonics → (is_sub, set_flags). */
+static int is_sub_token(int t)
+{
+    return t == TOK_ASM_sub || t == TOK_ASM_subs ||
+           t == TOK_ASM_cmp || t == TOK_ASM_neg || t == TOK_ASM_negs;
+}
+
+static int sets_flags_token(int t)
+{
+    return t == TOK_ASM_adds || t == TOK_ASM_subs ||
+           t == TOK_ASM_cmp || t == TOK_ASM_cmn || t == TOK_ASM_negs;
+}
+
+/* Logical op2 from token (for plain logical-imm/reg). */
+static int log_op2(int t)
+{
+    switch (t) {
+    case TOK_ASM_and: return 0;
+    case TOK_ASM_orr: case TOK_ASM_mvn: return 1;
+    case TOK_ASM_eor: return 2;
+    case TOK_ASM_ands: case TOK_ASM_tst: return 3;
+    case TOK_ASM_bic: return 0;
+    case TOK_ASM_orn: return 1;
+    case TOK_ASM_eon: return 2;
+    case TOK_ASM_bics: return 3;
+    }
+    return -1;
+}
+
+static int log_inverts(int t)
+{
+    return t == TOK_ASM_bic || t == TOK_ASM_orn ||
+           t == TOK_ASM_eon || t == TOK_ASM_bics ||
+           t == TOK_ASM_mvn;
+}
+
+/* ---- per-mnemonic handlers --------------------------------------- */
+
+static void do_addsub(TCCState *s1, int token)
+{
+    AArch64Op a, b, c;
+    int is_sub = is_sub_token(token);
+    int set_flg = sets_flags_token(token);
+    int has_dst = !(token == TOK_ASM_cmp || token == TOK_ASM_cmn);
+    int is_neg = (token == TOK_ASM_neg || token == TOK_ASM_negs);
+
+    parse_operand(s1, &a);
+    asm_skip_comma();
+    parse_operand(s1, &b);
+    if (!has_dst) {
+        /* cmp/cmn: a=Rn, b=Rm/imm; encode subs/adds with rd=31 */
+        c = b;
+        b = a;
+        a.kind = OP_REG; a.reg = 31; a.is_w = b.is_w; a.is_sp = 0;
+    } else if (is_neg) {
+        /* neg/negs Rd, Rm[, shift]: sub Rd, xzr, Rm ...
*/
+        c = b;
+        b.kind = OP_REG; b.reg = 31; b.is_w = a.is_w; b.is_sp = 0;
+    } else {
         asm_skip_comma();
-        parse_operand(s1, &b);
-        if (a.kind != OP_REG)
-            tcc_error("mov: destination must be a register");
-        if (b.kind == OP_REG) {
-            if (a.is_w != b.is_w)
-                tcc_error("mov: register size mismatch");
-            if (a.is_sp || b.is_sp)
-                emit_mov_sp_add(a.reg, b.reg, a.is_w);
-            else
-                emit_mov_reg_orr(a.reg, b.reg, a.is_w);
-        } else if (b.kind == OP_IMM) {
-            if (b.e.sym)
-                tcc_error("mov: symbol immediate not supported (phase 1)");
-            if (a.is_sp)
-                tcc_error("mov sp, #imm: use add");
-            if (a.is_w) {
-                /* W-form: emit MOVZ/MOVN/MOVK with sf=0. Kept narrow:
-                   only single-hword positive values seen in-tree. */
-                uint32_t v = (uint32_t)b.e.v;
-                if (v <= 0xffff) {
-                    gen_le32(0x52800000u | a.reg | (v << 5));
-                } else if ((~v) <= 0xffff) {
-                    gen_le32(0x12800000u | a.reg | (((~v) & 0xffff) << 5));
-                } else {
-                    tcc_error("mov W#imm: only 16-bit values supported (phase 1)");
+        parse_operand(s1, &c);
+    }
+
+    need_reg(&a, "add/sub");
+    need_reg(&b, "add/sub");
+    check_size_match(&a, &b, "add/sub");
+
+    if (c.kind == OP_REG) {
+        check_size_match(&a, &c, "add/sub");
+        if (tok == ',') {
+            next();
+            parse_reg_shift_or_extend(s1, &c);
+        }
+        if (c.ext_kind != EXT_NONE) {
+            emit_addsub_ext(a.reg, b.reg, c.reg, a.is_w, is_sub, set_flg,
+                            c.ext_kind, c.ext_amt);
+        } else if (a.is_sp || b.is_sp) {
+            /* sp uses the extended-reg encoding with default UXTX/UXTW. */
+            int ext = a.is_w ? EXT_UXTW : EXT_UXTX;
+            emit_addsub_ext(a.reg, b.reg, c.reg, a.is_w, is_sub, set_flg,
+                            ext, c.shift_amt);
+        } else {
+            emit_addsub_reg(a.reg, b.reg, c.reg, a.is_w, is_sub, set_flg,
+                            c.shift_kind, c.shift_amt);
+        }
+    } else if (c.kind == OP_IMM && !c.e.sym) {
+        if (tok == ',') {
+            /* allow `, lsl #12` after the immediate */
+            next();
+            if (tok == TOK_ASM_lsl) {
+                next();
+                asm_skip_hash();
+                {
+                    ExprValue se = {0};
+                    asm_expr(s1, &se);
+                    if (se.v == 12) c.e.v <<= 12;
+                    else if (se.v != 0) tcc_error("add/sub: lsl must be 0 or 12");
                 }
             } else {
-                emit_movimm_x(a.reg, b.e.v);
+                expect("lsl");
             }
-        } else {
-            tcc_error("mov: unsupported source operand");
         }
-        return;
+        emit_addsub_imm(a.reg, b.reg, (int64_t)c.e.v, a.is_w, is_sub, set_flg);
+    } else {
+        tcc_error("add/sub: unsupported operand");
+    }
+}
 
-    case TOK_ASM_add:
-    case TOK_ASM_sub:
-        parse_operand(s1, &a);
-        asm_skip_comma();
-        parse_operand(s1, &b);
+static void do_logical(TCCState *s1, int token)
+{
+    AArch64Op a, b, c;
+    int op2 = log_op2(token);
+    int invt = log_inverts(token);
+    int has_dst = !(token == TOK_ASM_tst);
+    int is_mvn = (token == TOK_ASM_mvn);
+
+    parse_operand(s1, &a);
+    asm_skip_comma();
+    parse_operand(s1, &b);
+    if (!has_dst) {
+        /* tst Rn, Op2 => ands xzr, Rn, Op2 */
+        c = b;
+        b = a;
+        a.kind = OP_REG; a.reg = 31; a.is_w = b.is_w; a.is_sp = 0;
+    } else if (is_mvn) {
+        /* mvn Rd, Rm => orn Rd, xzr, Rm */
+        c = b;
+        b.kind = OP_REG; b.reg = 31; b.is_w = a.is_w; b.is_sp = 0;
+    } else {
         asm_skip_comma();
         parse_operand(s1, &c);
-        if (a.kind != OP_REG || b.kind != OP_REG)
-            tcc_error("add/sub: expected register operands");
-        if (c.kind == OP_IMM) {
-            if (c.e.sym)
-                tcc_error("add/sub: symbol immediate not supported");
-            if (a.is_w != b.is_w)
-                tcc_error("add/sub: register size mismatch");
-            emit_addsub_imm(a.reg, b.reg, (int64_t)c.e.v, a.is_w,
-                            token == TOK_ASM_sub);
+    }
+
+    need_reg(&a, "logical");
+    need_reg(&b, "logical");
+    check_size_match(&a, &b, "logical");
+
+    if (c.kind == OP_REG) {
+        check_size_match(&a, &c, "logical");
+        if (tok == ',') {
+            next();
+            parse_reg_shift_or_extend(s1, &c);
+            if (c.ext_kind != EXT_NONE)
+                tcc_error("logical: extend not supported");
+        }
+        emit_log_reg(a.reg, b.reg, c.reg, a.is_w, op2, invt,
+                     c.shift_kind, c.shift_amt);
+    } else if (c.kind == OP_IMM && !c.e.sym) {
+        if (invt)
+            tcc_error("logical: invert form requires a register");
+        emit_log_imm(a.reg, b.reg, (uint64_t)c.e.v, a.is_w, op2);
+    } else {
+        tcc_error("logical: unsupported operand");
+    }
+}
+
+static void do_movw(TCCState *s1, int token)
+{
+    AArch64Op a, b;
+    int op2 = (token == TOK_ASM_movn) ? 0 :
+              (token == TOK_ASM_movz) ? 2 : 3; /* movk */
+    int hw_shift = 0;
+    parse_operand(s1, &a);
+    asm_skip_comma();
+    parse_operand(s1, &b);
+    need_reg(&a, "movz/movn/movk");
+    if (b.kind != OP_IMM)
+        tcc_error("movz/movn/movk: expected immediate");
+    if (b.e.sym)
+        tcc_error("movz/movn/movk: symbol immediate not supported");
+    if (tok == ',') {
+        next();
+        if (tok == TOK_ASM_lsl) {
+            next();
+            asm_skip_hash();
+            {
+                ExprValue se = {0};
+                asm_expr(s1, &se);
+                hw_shift = (int)se.v;
+            }
         } else {
-            tcc_error("add/sub: only immediate form supported in phase 1");
+            expect("lsl");
         }
-        return;
+    }
+    emit_movw(a.reg, (int)b.e.v, hw_shift, a.is_w, op2);
+}
 
-    case TOK_ASM_ldr:
-    case TOK_ASM_str:
-        parse_operand(s1, &a);
-        asm_skip_comma();
-        parse_operand(s1, &b);
-        need_xreg(&a, "ldr/str data register");
-        if (b.kind != OP_MEM)
-            tcc_error("ldr/str: expected memory operand");
-        if (b.indexing != IDX_OFFSET)
-            tcc_error("ldr/str: pre/post-indexed form not supported in phase 1");
+/* mov — phase 2 expanded handler.
*/
+static void do_mov(TCCState *s1)
+{
+    AArch64Op a, b;
+    parse_operand(s1, &a);
+    asm_skip_comma();
+    parse_operand(s1, &b);
+    if (a.kind != OP_REG)
+        tcc_error("mov: destination must be a register");
+    if (b.kind == OP_REG) {
+        check_size_match(&a, &b, "mov");
+        if (a.is_sp || b.is_sp) {
+            /* mov sp/Rn, sp/Rn => add Rd, Rn, #0 */
+            emit_addsub_imm(a.reg, b.reg, 0, a.is_w, 0, 0);
+        } else {
+            /* mov Rd, Rm => orr Rd, xzr, Rm */
+            emit_log_reg(a.reg, 31, b.reg, a.is_w, 1 /*ORR*/, 0,
+                         SH_LSL, 0);
+        }
+    } else if (b.kind == OP_IMM) {
         if (b.e.sym)
-            tcc_error("ldr/str: symbolic offset not supported");
-        emit_ldst_imm_unsigned(token == TOK_ASM_ldr, a.reg, b.base,
-                               (int64_t)b.e.v);
+            tcc_error("mov: symbol immediate not supported (use ldr =sym)");
+        if (a.is_sp)
+            tcc_error("mov sp, #imm: use add");
+        if (a.is_w) {
+            uint64_t v = (uint32_t)b.e.v;
+            uint32_t insn;
+            int e;
+            /* try movz w(r),#x; movn w(r),#~x; orr w(r), wzr, #imm */
+            if (!(v & ~0xffffull)) {
+                gen_le32(0x52800000u | a.reg | ((uint32_t)v << 5));
+                return;
+            }
+            if (!((~v & 0xffffffffu) & ~0xffffull)) {
+                gen_le32(0x12800000u | a.reg | ((uint32_t)(~v & 0xffff) << 5));
+                return;
+            }
+            e = arm64_encode_bimm64((v & 0xffffffffu) | (v << 32));
+            if (e >= 0) {
+                insn = 0x320003e0u | a.reg | ((uint32_t)e << 10);
+                gen_le32(insn);
+                return;
+            }
+            tcc_error("mov w#imm: value not encodable");
+        } else {
+            arm64_movimm(a.reg, (uint64_t)b.e.v);
+        }
+    } else {
+        tcc_error("mov: unsupported source operand");
+    }
+}
+
+/* shift-imm aliases (lsl/lsr/asr/ror imm) and shift-reg aliases.
+ * Detected when the third operand is OP_IMM (alias to bfm) or OP_REG
+ * (alias to lslv/lsrv/asrv/rorv). */
+static void do_shift(TCCState *s1, int token)
+{
+    AArch64Op a, b, c;
+    int sh = (token == TOK_ASM_lsl) ? SH_LSL :
+             (token == TOK_ASM_lsr) ? SH_LSR :
+             (token == TOK_ASM_asr) ? SH_ASR : SH_ROR;
+    parse_operand(s1, &a);
+    asm_skip_comma();
+    parse_operand(s1, &b);
+    asm_skip_comma();
+    parse_operand(s1, &c);
+    need_reg(&a, "shift");
+    need_reg(&b, "shift");
+    check_size_match(&a, &b, "shift");
+    if (c.kind == OP_REG) {
+        check_size_match(&a, &c, "shift");
+        emit_shift_reg(a.reg, b.reg, c.reg, a.is_w,
+                       sh == SH_LSL ? 8 :
+                       sh == SH_LSR ? 9 :
+                       sh == SH_ASR ? 10 : 11);
         return;
+    }
+    if (c.kind != OP_IMM || c.e.sym)
+        tcc_error("shift: expected reg or immediate");
+    {
+        int width = a.is_w ? 32 : 64;
+        int shamt = (int)c.e.v;
+        int immr, imms, op2;
+        if (sh == SH_ROR) {
+            /* extr Rd, Rn, Rn, #imm — encode via EXTR (we don't have a
+               separate emitter, so error out for now) */
+            tcc_error("ror imm: not supported (use rorv)");
+        }
+        if (shamt < 0 || shamt >= width)
+            tcc_error("shift amount out of range");
+        if (sh == SH_LSL) {
+            immr = (-shamt) & (width - 1);
+            imms = (width - 1) - shamt;
+            op2 = 2; /* UBFM */
+        } else if (sh == SH_LSR) {
+            immr = shamt;
+            imms = width - 1;
+            op2 = 2; /* UBFM */
+        } else /* SH_ASR */ {
+            immr = shamt;
+            imms = width - 1;
+            op2 = 0; /* SBFM */
+        }
+        emit_bfm(a.reg, b.reg, immr, imms, a.is_w, op2);
+    }
+}
 
-    case TOK_ASM_ldp:
-    case TOK_ASM_stp:
-        parse_operand(s1, &a);
+/* Bitfield mnemonic handler (sbfm/ubfm/bfm). */
+static void do_bfm(TCCState *s1, int token)
+{
+    AArch64Op a, b, c, d;
+    int op2 = (token == TOK_ASM_sbfm) ? 0 :
+              (token == TOK_ASM_bfm) ? 1 : 2;
+    parse_operand(s1, &a);
+    asm_skip_comma();
+    parse_operand(s1, &b);
+    asm_skip_comma();
+    parse_operand(s1, &c);
+    asm_skip_comma();
+    parse_operand(s1, &d);
+    need_reg(&a, "bfm");
+    need_reg(&b, "bfm");
+    check_size_match(&a, &b, "bfm");
+    if (c.kind != OP_IMM || d.kind != OP_IMM || c.e.sym || d.e.sym)
+        tcc_error("bfm: immr/imms must be constants");
+    emit_bfm(a.reg, b.reg, (int)c.e.v, (int)d.e.v, a.is_w, op2);
+}
+
+/* Sign/zero-extend aliases (sxtb/sxth/sxtw/uxtb/uxth) when used as mnemonics.
*/
+static void do_extend_alias(TCCState *s1, int token)
+{
+    AArch64Op a, b;
+    int immr = 0, imms, op2, is_w;
+    parse_operand(s1, &a);
+    asm_skip_comma();
+    parse_operand(s1, &b);
+    need_reg(&a, "ext alias");
+    need_reg(&b, "ext alias");
+    /* sxtw is only valid for X-form. uxtb/uxth/sxtb/sxth follow Rd. */
+    is_w = a.is_w;
+    switch (token) {
+    case TOK_ASM_sxtb: imms = 7;  op2 = 0; break;
+    case TOK_ASM_sxth: imms = 15; op2 = 0; break;
+    case TOK_ASM_sxtw: imms = 31; op2 = 0; is_w = 0; break;
+    case TOK_ASM_uxtb: imms = 7;  op2 = 2; break;
+    case TOK_ASM_uxth: imms = 15; op2 = 2; break;
+    default: tcc_error("internal: bad extend alias"); return;
+    }
+    emit_bfm(a.reg, b.reg, immr, imms, is_w, op2);
+}
+
+/* mul/mneg/madd/msub family. */
+static void do_mul(TCCState *s1, int token)
+{
+    AArch64Op a, b, c, d;
+    int is_sub = (token == TOK_ASM_msub || token == TOK_ASM_mneg);
+    int has_ra = (token == TOK_ASM_madd || token == TOK_ASM_msub);
+    parse_operand(s1, &a);
+    asm_skip_comma();
+    parse_operand(s1, &b);
+    asm_skip_comma();
+    parse_operand(s1, &c);
+    if (has_ra) {
         asm_skip_comma();
-        parse_operand(s1, &b);
+        parse_operand(s1, &d);
+        need_reg(&d, "madd/msub");
+    } else {
+        d.kind = OP_REG; d.reg = 31; d.is_w = a.is_w; d.is_sp = 0;
+    }
+    need_reg(&a, "mul"); need_reg(&b, "mul"); need_reg(&c, "mul");
+    check_size_match(&a, &b, "mul");
+    check_size_match(&a, &c, "mul");
+    emit_madd(a.reg, b.reg, c.reg, d.reg, a.is_w, is_sub);
+}
+
+/* smull/umull/smnegl/umnegl/smaddl/umaddl/smsubl/umsubl.
*/
+static void do_mul_long(TCCState *s1, int token)
+{
+    AArch64Op a, b, c, d;
+    int is_unsigned, is_sub, has_ra;
+    is_unsigned = (token == TOK_ASM_umull || token == TOK_ASM_umnegl ||
+                   token == TOK_ASM_umaddl || token == TOK_ASM_umsubl);
+    is_sub = (token == TOK_ASM_smnegl || token == TOK_ASM_umnegl ||
+              token == TOK_ASM_smsubl || token == TOK_ASM_umsubl);
+    has_ra = (token == TOK_ASM_smaddl || token == TOK_ASM_umaddl ||
+              token == TOK_ASM_smsubl || token == TOK_ASM_umsubl);
+    parse_operand(s1, &a);
+    asm_skip_comma();
+    parse_operand(s1, &b);
+    asm_skip_comma();
+    parse_operand(s1, &c);
+    if (has_ra) {
         asm_skip_comma();
-        parse_operand(s1, &c);
-        need_xreg(&a, "ldp/stp first register");
-        need_xreg(&b, "ldp/stp second register");
-        if (c.kind != OP_MEM)
-            tcc_error("ldp/stp: expected memory operand");
-        if (c.e.sym)
-            tcc_error("ldp/stp: symbolic offset not supported");
-        emit_ldst_pair(token == TOK_ASM_ldp, a.reg, b.reg, c.base,
-                       (int64_t)c.e.v, c.indexing);
+        parse_operand(s1, &d);
+        need_xreg(&d, "smaddl/umaddl");
+    } else {
+        d.kind = OP_REG; d.reg = 31; d.is_w = 0; d.is_sp = 0;
+    }
+    need_xreg(&a, "smull/umull");
+    if (b.kind != OP_REG || !b.is_w)
+        tcc_error("smull/umull: source must be W");
+    if (c.kind != OP_REG || !c.is_w)
+        tcc_error("smull/umull: source must be W");
+    emit_madd_long(a.reg, b.reg, c.reg, d.reg, is_unsigned, is_sub);
+}
+
+/* CSEL/CSINC/CSINV/CSNEG and aliases (cset/cinc/cinv/cneg/csetm).
*/
+static void do_csel(TCCState *s1, int token)
+{
+    AArch64Op a, b, c, d;
+    int invert = (token == TOK_ASM_csinv || token == TOK_ASM_csneg ||
+                  token == TOK_ASM_csetm || token == TOK_ASM_cinv || token == TOK_ASM_cneg);
+    int inc_neg = (token == TOK_ASM_csinc || token == TOK_ASM_csneg ||
+                   token == TOK_ASM_cset || token == TOK_ASM_cinc ||
+                   token == TOK_ASM_cneg);
+    int alias_dst_only = (token == TOK_ASM_cset || token == TOK_ASM_csetm);
+    int alias_two_src = (token == TOK_ASM_cinc || token == TOK_ASM_cinv ||
+                         token == TOK_ASM_cneg);
+
+    parse_operand(s1, &a);
+    asm_skip_comma();
+    if (alias_dst_only) {
+        /* cset Rd, cond => csinc Rd, xzr, xzr, !cond (csetm => csinv) */
+        parse_operand(s1, &d); /* d = cond */
+        need_reg(&a, "cset/csetm");
+        if (d.kind != OP_COND) tcc_error("cset: expected cond");
+        b.kind = OP_REG; b.reg = 31; b.is_w = a.is_w; b.is_sp = 0;
+        c = b;
+        emit_csel(a.reg, b.reg, c.reg, d.cond ^ 1, a.is_w, invert, inc_neg);
         return;
+    }
+    parse_operand(s1, &b);
+    asm_skip_comma();
+    if (alias_two_src) {
+        /* cinc Rd, Rn, cond => csinc Rd, Rn, Rn, !cond (cinv => csinv, cneg => csneg) */
+        parse_operand(s1, &d);
+        if (d.kind != OP_COND) tcc_error("cinc/cinv/cneg: expected cond");
+        need_reg(&a, "cinc"); need_reg(&b, "cinc");
+        check_size_match(&a, &b, "cinc");
+        emit_csel(a.reg, b.reg, b.reg, d.cond ^ 1, a.is_w, invert, inc_neg);
+        return;
+    }
+    parse_operand(s1, &c);
+    asm_skip_comma();
+    parse_operand(s1, &d);
+    need_reg(&a, "csel"); need_reg(&b, "csel"); need_reg(&c, "csel");
+    if (d.kind != OP_COND) tcc_error("csel: expected cond");
+    check_size_match(&a, &b, "csel"); check_size_match(&a, &c, "csel");
+    emit_csel(a.reg, b.reg, c.reg, d.cond, a.is_w, invert, inc_neg);
+}
+
+/* sdiv/udiv.
*/
+static void do_div(TCCState *s1, int token)
+{
+    AArch64Op a, b, c;
+    parse_operand(s1, &a); asm_skip_comma();
+    parse_operand(s1, &b); asm_skip_comma();
+    parse_operand(s1, &c);
+    need_reg(&a, "div"); need_reg(&b, "div"); need_reg(&c, "div");
+    check_size_match(&a, &b, "div"); check_size_match(&a, &c, "div");
+    emit_div(a.reg, b.reg, c.reg, a.is_w, token == TOK_ASM_sdiv);
+}
 
+/* Generic ldr/str dispatcher. opc/size encode the variant. */
+static void do_ldst(TCCState *s1, int token, int opc, int size)
+{
+    AArch64Op a, b;
+    parse_operand(s1, &a);
+    asm_skip_comma();
+    parse_operand(s1, &b);
+    need_reg(&a, "ldr/str");
+
+    /* `ldr Xn, =imm` / `ldr Xn, =sym` */
+    if (b.kind == OP_LITERAL) {
+        if (token != TOK_ASM_ldr)
+            tcc_error("=literal only valid with ldr");
+        emit_ldr_literal(a.reg, &b);
+        return;
+    }
+    if (b.kind != OP_MEM)
+        tcc_error("ldr/str: expected memory operand");
+    if (b.indexing == IDX_REGOFF) {
+        emit_ldst_reg(opc, size, a.reg, b.base, b.idx_reg,
+                      b.idx_is_w, b.mem_ext_kind, b.mem_ext_amt,
+                      b.mem_has_shift);
+        return;
+    }
+    if (b.e.sym)
+        tcc_error("ldr/str: symbolic offset not supported");
+    emit_ldst_imm(opc, size, a.reg, b.base, (int64_t)b.e.v, b.indexing);
+}
+
+static void do_ldp_stp(TCCState *s1, int is_load)
+{
+    AArch64Op a, b, c;
+    parse_operand(s1, &a);
+    asm_skip_comma();
+    parse_operand(s1, &b);
+    asm_skip_comma();
+    parse_operand(s1, &c);
+    need_reg(&a, "ldp/stp"); need_reg(&b, "ldp/stp");
+    check_size_match(&a, &b, "ldp/stp");
+    if (c.kind != OP_MEM)
+        tcc_error("ldp/stp: expected memory operand");
+    if (c.indexing == IDX_REGOFF)
+        tcc_error("ldp/stp: register-offset not supported");
+    if (c.e.sym)
+        tcc_error("ldp/stp: symbolic offset not supported");
+    emit_ldst_pair(is_load, a.reg, b.reg, c.base,
+                   (int64_t)c.e.v, c.indexing, a.is_w);
+}
+
+/* CBZ/CBNZ.
*/
+static void do_cbz(TCCState *s1, int token)
+{
+    AArch64Op a, b;
+    parse_operand(s1, &a); asm_skip_comma();
+    parse_operand(s1, &b);
+    need_reg(&a, "cbz/cbnz");
+    if (b.kind != OP_IMM)
+        tcc_error("cbz/cbnz: expected label");
+    emit_branch_cmp(a.reg, &b, a.is_w, token == TOK_ASM_cbnz);
+}
+
+/* TBZ/TBNZ. */
+static void do_tbz(TCCState *s1, int token)
+{
+    AArch64Op a, b, c;
+    parse_operand(s1, &a); asm_skip_comma();
+    parse_operand(s1, &b); asm_skip_comma();
+    parse_operand(s1, &c);
+    need_reg(&a, "tbz/tbnz");
+    if (b.kind != OP_IMM || b.e.sym)
+        tcc_error("tbz/tbnz: expected bit constant");
+    if (c.kind != OP_IMM)
+        tcc_error("tbz/tbnz: expected label");
+    emit_branch_test(a.reg, (int)b.e.v, &c, token == TOK_ASM_tbnz);
+}
+
+/* HINT-encoded mnemonics. */
+static void do_hint(int token, int hint_arg)
+{
+    static const uint32_t base = 0xd503201fu;
+    int crm = (hint_arg >> 3) & 0xf;
+    int op2 = hint_arg & 7;
+    (void)token;
+    gen_le32(base | ((uint32_t)crm << 8) | ((uint32_t)op2 << 5));
+}
+
+/* DSB/DMB/ISB.
*/
+static void do_barrier(TCCState *s1, int token)
+{
+    int crm = 0xf; /* default sy */
+    if (!at_end_of_insn()) {
+        AArch64Op a;
+        parse_operand(s1, &a);
+        if (a.kind == OP_IMM && !a.e.sym) crm = (int)a.e.v & 0xf;
+        else tcc_error("dsb/dmb/isb: expected #imm option");
+    }
+    {
+        uint32_t base = 0xd503309fu; /* DSB sy */
+        if (token == TOK_ASM_dmb) base = 0xd50330bfu;
+        if (token == TOK_ASM_isb) base = 0xd50330dfu;
+        gen_le32((base & 0xfffff0ffu) | ((uint32_t)crm << 8));
+    }
+}
+
+/* ---- top-level dispatch ----------------------------------------- */
+
+ST_FUNC void asm_opcode(TCCState *s1, int token)
+{
+    AArch64Op a, b;
+
+    switch (token) {
+
+    case TOK_ASM_mov:
+        do_mov(s1);
+        return;
+
+    case TOK_ASM_movz:
+    case TOK_ASM_movn:
+    case TOK_ASM_movk:
+        do_movw(s1, token);
+        return;
+
+    case TOK_ASM_add:
+    case TOK_ASM_adds:
+    case TOK_ASM_sub:
+    case TOK_ASM_subs:
+    case TOK_ASM_cmp:
+    case TOK_ASM_cmn:
+    case TOK_ASM_neg:
+    case TOK_ASM_negs:
+        do_addsub(s1, token);
+        return;
+
+    case TOK_ASM_and:
+    case TOK_ASM_orr:
+    case TOK_ASM_eor:
+    case TOK_ASM_ands:
+    case TOK_ASM_tst:
+    case TOK_ASM_bic:
+    case TOK_ASM_orn:
+    case TOK_ASM_eon:
+    case TOK_ASM_bics:
+    case TOK_ASM_mvn:
+        do_logical(s1, token);
+        return;
+
+    case TOK_ASM_sbfm:
+    case TOK_ASM_ubfm:
+    case TOK_ASM_bfm:
+        do_bfm(s1, token);
+        return;
+
+    case TOK_ASM_lsl:
+    case TOK_ASM_lsr:
+    case TOK_ASM_asr:
+    case TOK_ASM_ror:
+        do_shift(s1, token);
+        return;
+
+    case TOK_ASM_sxtb:
+    case TOK_ASM_sxth:
+    case TOK_ASM_sxtw:
+    case TOK_ASM_uxtb:
+    case TOK_ASM_uxth:
+        do_extend_alias(s1, token);
+        return;
+
+    case TOK_ASM_lslv:
+    case TOK_ASM_lsrv:
+    case TOK_ASM_asrv:
+    case TOK_ASM_rorv: {
+        AArch64Op p, q, r;
+        parse_operand(s1, &p); asm_skip_comma();
+        parse_operand(s1, &q); asm_skip_comma();
+        parse_operand(s1, &r);
+        need_reg(&p, "lslv"); need_reg(&q, "lslv"); need_reg(&r, "lslv");
+        check_size_match(&p, &q, "lslv"); check_size_match(&p, &r, "lslv");
+        emit_shift_reg(p.reg, q.reg, r.reg, p.is_w,
+                       token == TOK_ASM_lslv ? 8 :
+                       token == TOK_ASM_lsrv ? 9 :
+                       token == TOK_ASM_asrv ? 10 : 11);
+        return;
+    }
+
+    case TOK_ASM_mul:
+    case TOK_ASM_mneg:
+    case TOK_ASM_madd:
+    case TOK_ASM_msub:
+        do_mul(s1, token);
+        return;
+
+    case TOK_ASM_smull:
+    case TOK_ASM_umull:
+    case TOK_ASM_smnegl:
+    case TOK_ASM_umnegl:
+    case TOK_ASM_smaddl:
+    case TOK_ASM_umaddl:
+    case TOK_ASM_smsubl:
+    case TOK_ASM_umsubl:
+        do_mul_long(s1, token);
+        return;
+
+    case TOK_ASM_smulh:
+    case TOK_ASM_umulh: {
+        AArch64Op p, q, r;
+        parse_operand(s1, &p); asm_skip_comma();
+        parse_operand(s1, &q); asm_skip_comma();
+        parse_operand(s1, &r);
+        need_xreg(&p, "smulh"); need_xreg(&q, "smulh"); need_xreg(&r, "smulh");
+        emit_mulh(p.reg, q.reg, r.reg, token == TOK_ASM_umulh);
+        return;
+    }
+
+    case TOK_ASM_sdiv:
+    case TOK_ASM_udiv:
+        do_div(s1, token);
+        return;
+
+    case TOK_ASM_csel:
+    case TOK_ASM_csinc:
+    case TOK_ASM_csinv:
+    case TOK_ASM_csneg:
+    case TOK_ASM_cset:
+    case TOK_ASM_csetm:
+    case TOK_ASM_cinc:
+    case TOK_ASM_cinv:
+    case TOK_ASM_cneg:
+        do_csel(s1, token);
+        return;
+
+    /* ----- loads / stores ----- */
+    case TOK_ASM_ldr:   do_ldst(s1, token, 1, 3); return;
+    case TOK_ASM_str:   do_ldst(s1, token, 0, 3); return;
+    case TOK_ASM_ldrb:  do_ldst(s1, token, 1, 0); return;
+    case TOK_ASM_strb:  do_ldst(s1, token, 0, 0); return;
+    case TOK_ASM_ldrh:  do_ldst(s1, token, 1, 1); return;
+    case TOK_ASM_strh:  do_ldst(s1, token, 0, 1); return;
+    case TOK_ASM_ldrsb: do_ldst(s1, token, 2, 0); return; /* opc=2 (X-target) */
+    case TOK_ASM_ldrsh: do_ldst(s1, token, 2, 1); return;
+    case TOK_ASM_ldrsw: do_ldst(s1, token, 2, 2); return;
+
+    case TOK_ASM_ldp: do_ldp_stp(s1, 1); return;
+    case TOK_ASM_stp: do_ldp_stp(s1, 0); return;
+
+    /* ----- branches ----- */
     case TOK_ASM_b:
     case TOK_ASM_bl:
         parse_operand(s1, &a);
@@ -429,24 +1498,93 @@ ST_FUNC void asm_opcode(TCCState *s1, int token)
         emit_branch_imm(&a, token == TOK_ASM_bl);
         return;
 
+    case TOK_ASM_br:
+    case TOK_ASM_blr:
+        parse_operand(s1, &a);
+        need_xreg(&a, "br/blr");
+        emit_branch_reg(a.reg, token == TOK_ASM_br ? 0 : 1);
+        return;
+
     case TOK_ASM_ret:
-        if (tok != ';' && tok != TOK_LINEFEED && tok != TOK_EOF) {
+        if (!at_end_of_insn()) {
             parse_operand(s1, &a);
             need_xreg(&a, "ret");
-            emit_ret(a.reg);
+            emit_branch_reg(a.reg, 2);
         } else {
-            emit_ret(30);
+            emit_branch_reg(30, 2);
         }
         return;
 
+    case TOK_ASM_cbz:
+    case TOK_ASM_cbnz:
+        do_cbz(s1, token);
+        return;
+
+    case TOK_ASM_tbz:
+    case TOK_ASM_tbnz:
+        do_tbz(s1, token);
+        return;
+
+    /* ----- system ----- */
     case TOK_ASM_svc:
         parse_operand(s1, &a);
         if (a.kind != OP_IMM || a.e.sym)
             tcc_error("svc: expected immediate");
-        emit_svc((int64_t)a.e.v);
+        emit_excgen(0xd4000001u, (int64_t)a.e.v);
+        return;
+    case TOK_ASM_hvc:
+        parse_operand(s1, &a);
+        if (a.kind != OP_IMM || a.e.sym) tcc_error("hvc imm");
+        emit_excgen(0xd4000002u, (int64_t)a.e.v); return;
+    case TOK_ASM_smc:
+        parse_operand(s1, &a);
+        if (a.kind != OP_IMM || a.e.sym) tcc_error("smc imm");
+        emit_excgen(0xd4000003u, (int64_t)a.e.v); return;
+    case TOK_ASM_brk:
+        parse_operand(s1, &a);
+        if (a.kind != OP_IMM || a.e.sym) tcc_error("brk imm");
+        emit_excgen(0xd4200000u, (int64_t)a.e.v); return;
+    case TOK_ASM_hlt:
+        parse_operand(s1, &a);
+        if (a.kind != OP_IMM || a.e.sym) tcc_error("hlt imm");
+        emit_excgen(0xd4400000u, (int64_t)a.e.v); return;
+
+    case TOK_ASM_nop:   do_hint(token, 0); return;
+    case TOK_ASM_yield: do_hint(token, 1); return;
+    case TOK_ASM_wfe:   do_hint(token, 2); return;
+    case TOK_ASM_wfi:   do_hint(token, 3); return;
+    case TOK_ASM_sev:   do_hint(token, 4); return;
+    case TOK_ASM_sevl:  do_hint(token, 5); return;
+    case TOK_ASM_hint:
+        parse_operand(s1, &a);
+        if (a.kind != OP_IMM || a.e.sym) tcc_error("hint imm");
+        do_hint(token, (int)a.e.v);
+        return;
+
+    case TOK_ASM_dmb:
+    case TOK_ASM_dsb:
+    case TOK_ASM_isb:
+        do_barrier(s1, token);
         return;
 
+    /* ----- conditional-branch family (b.eq..b.nv + aliases) ----- */
     default:
+        if (token >= TOK_ASM_b_eq && token <= TOK_ASM_b_nv) {
+            int cond = token - TOK_ASM_b_eq;
+            parse_operand(s1, &a);
+            if (a.kind != OP_IMM)
+                tcc_error("b.cond: expected label");
+            emit_branch_cond(cond, &a);
+            return;
+        }
+        if (token == TOK_ASM_b_hs || token == TOK_ASM_b_lo) {
+            int cond = (token == TOK_ASM_b_hs) ? 2 : 3;
+            parse_operand(s1, &a);
+            if (a.kind != OP_IMM)
+                tcc_error("b.cond: expected label");
+            emit_branch_cond(cond, &a);
+            return;
+        }
         expect("known instruction");
     }
 }
diff --git a/scripts/simple-patches/tcc-0.9.26/files/arm64-tok.h b/scripts/simple-patches/tcc-0.9.26/files/arm64-tok.h
@@ -1,9 +1,9 @@
-/* ARM64 assembler tokens.
+/* ARM64 assembler tokens
  *
- * Phase 1 surface — registers and the mnemonic set required by the
- * .S inputs in tcc-cc/aarch64/ and tcc-libc/aarch64/. Order matters
- * for the contiguous-range checks in arm64-asm.c (TOK_ASM_x0..xzr,
- * TOK_ASM_w0..wzr).
+ * Order matters for the contiguous-range checks in arm64-asm.c
+ * (TOK_ASM_x0..xzr, TOK_ASM_w0..wzr, TOK_ASM_eq..nv, TOK_ASM_b_eq..b_nv).
+ * Aliases (lr/fp/ip0/ip1/wsp) and shift/extend keywords are matched
+ * by exact token, not by range.
  */
 
 /* X (64-bit) integer registers. Must be contiguous, x0 first. */
@@ -76,7 +76,77 @@
  DEF_ASM(wsp)
  DEF_ASM(wzr)
 
-/* Mnemonics — phase 1 set. */
+/* Register aliases. */
+ DEF_ASM(lr) /* x30 */
+ DEF_ASM(fp) /* x29 */
+ DEF_ASM(ip0) /* x16 */
+ DEF_ASM(ip1) /* x17 */
+
+/* Condition codes. Order must match the 4-bit cond field
+ * (eq=0..nv=15) for the contiguous-range check in arm64-asm.c. */
+ DEF_ASM(eq)
+ DEF_ASM(ne)
+ DEF_ASM(cs)
+ DEF_ASM(cc)
+ DEF_ASM(mi)
+ DEF_ASM(pl)
+ DEF_ASM(vs)
+ DEF_ASM(vc)
+ DEF_ASM(hi)
+ DEF_ASM(ls)
+ DEF_ASM(ge)
+ DEF_ASM(lt)
+ DEF_ASM(gt)
+ DEF_ASM(le)
+ DEF_ASM(al)
+ DEF_ASM(nv)
+/* Aliases for cond codes — matched by exact token, mapped to numeric values. */
+ DEF_ASM(hs) /* alias of cs (=2) */
+ DEF_ASM(lo) /* alias of cc (=3) */
+
+/* Shift specifiers. Used in operand position after a comma.
*/
+ DEF_ASM(lsl)
+ DEF_ASM(lsr)
+ DEF_ASM(asr)
+ DEF_ASM(ror)
+
+/* Extend specifiers. Used in operand position after a comma. */
+ DEF_ASM(uxtb)
+ DEF_ASM(uxth)
+ DEF_ASM(uxtw)
+ DEF_ASM(uxtx)
+ DEF_ASM(sxtb)
+ DEF_ASM(sxth)
+ DEF_ASM(sxtw)
+ DEF_ASM(sxtx)
+
+/* Helper for "b.cond"-style mnemonics: produce a token whose name
+ * is the C identifier ## but whose source text is "b.<cond>". */
+#define DEF_ASM_DOT(x, y) DEF(TOK_ASM_ ## x ## _ ## y, #x "." #y)
+
+/* Conditional branches — one token per cond. Order must match
+ * the cond numeric (eq=0..nv=15) so b_<cond> - b_eq == cond. */
+ DEF_ASM_DOT(b, eq)
+ DEF_ASM_DOT(b, ne)
+ DEF_ASM_DOT(b, cs)
+ DEF_ASM_DOT(b, cc)
+ DEF_ASM_DOT(b, mi)
+ DEF_ASM_DOT(b, pl)
+ DEF_ASM_DOT(b, vs)
+ DEF_ASM_DOT(b, vc)
+ DEF_ASM_DOT(b, hi)
+ DEF_ASM_DOT(b, ls)
+ DEF_ASM_DOT(b, ge)
+ DEF_ASM_DOT(b, lt)
+ DEF_ASM_DOT(b, gt)
+ DEF_ASM_DOT(b, le)
+ DEF_ASM_DOT(b, al)
+ DEF_ASM_DOT(b, nv)
+/* Aliases (b.hs = b.cs, b.lo = b.cc). */
+ DEF_ASM_DOT(b, hs)
+ DEF_ASM_DOT(b, lo)
+
+/* Mnemonics */
  DEF_ASM(mov)
  DEF_ASM(add)
  DEF_ASM(sub)
@@ -88,3 +158,100 @@
  DEF_ASM(bl)
  DEF_ASM(ret)
  DEF_ASM(svc)
+
+/* DP-immediate. */
+ DEF_ASM(adds)
+ DEF_ASM(subs)
+ DEF_ASM(cmp)
+ DEF_ASM(cmn)
+ DEF_ASM(neg)
+ DEF_ASM(negs)
+ DEF_ASM(and)
+ DEF_ASM(orr)
+ DEF_ASM(eor)
+ DEF_ASM(ands)
+ DEF_ASM(tst)
+ DEF_ASM(movz)
+ DEF_ASM(movn)
+ DEF_ASM(movk)
+ DEF_ASM(sbfm)
+ DEF_ASM(ubfm)
+ DEF_ASM(bfm)
+ DEF_ASM(sbfiz)
+ DEF_ASM(sbfx)
+ DEF_ASM(ubfiz)
+ DEF_ASM(ubfx)
+ DEF_ASM(bfi)
+ DEF_ASM(bfxil)
+/* lsl/lsr/asr/ror tokens are declared once above as shift specifiers;
+   they double-duty as mnemonics for the bitfield aliases. Same for
+   sxtb/sxth/sxtw/uxtb/uxth — single token, dual context. */
+
+/* DP-register.
*/
+ DEF_ASM(bic)
+ DEF_ASM(orn)
+ DEF_ASM(eon)
+ DEF_ASM(bics)
+ DEF_ASM(mvn)
+ DEF_ASM(lslv)
+ DEF_ASM(lsrv)
+ DEF_ASM(asrv)
+ DEF_ASM(rorv)
+ DEF_ASM(mul)
+ DEF_ASM(mneg)
+ DEF_ASM(madd)
+ DEF_ASM(msub)
+ DEF_ASM(smull)
+ DEF_ASM(umull)
+ DEF_ASM(smnegl)
+ DEF_ASM(umnegl)
+ DEF_ASM(smaddl)
+ DEF_ASM(umaddl)
+ DEF_ASM(smsubl)
+ DEF_ASM(umsubl)
+ DEF_ASM(smulh)
+ DEF_ASM(umulh)
+ DEF_ASM(udiv)
+ DEF_ASM(sdiv)
+ DEF_ASM(csel)
+ DEF_ASM(csinc)
+ DEF_ASM(csinv)
+ DEF_ASM(csneg)
+ DEF_ASM(cset)
+ DEF_ASM(csetm)
+ DEF_ASM(cinc)
+ DEF_ASM(cinv)
+ DEF_ASM(cneg)
+
+/* load/store extras. */
+ DEF_ASM(ldrb)
+ DEF_ASM(ldrh)
+ DEF_ASM(ldrsb)
+ DEF_ASM(ldrsh)
+ DEF_ASM(ldrsw)
+ DEF_ASM(strb)
+ DEF_ASM(strh)
+ DEF_ASM(ldur)
+ DEF_ASM(stur)
+
+/* branches & system. */
+ DEF_ASM(cbz)
+ DEF_ASM(cbnz)
+ DEF_ASM(tbz)
+ DEF_ASM(tbnz)
+ DEF_ASM(br)
+ DEF_ASM(blr)
+ DEF_ASM(brk)
+ DEF_ASM(hlt)
+ DEF_ASM(hvc)
+ DEF_ASM(smc)
+ DEF_ASM(nop)
+ DEF_ASM(yield)
+ DEF_ASM(wfe)
+ DEF_ASM(wfi)
+ DEF_ASM(sev)
+ DEF_ASM(sevl)
+ DEF_ASM(hint)
+ DEF_ASM(isb)
+ DEF_ASM(dsb)
+ DEF_ASM(dmb)
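
The branch and shift encoders in this patch can be cross-checked outside of tcc with a small standalone sketch. The block below reproduces the bit layouts used by `emit_branch_cond`, `emit_branch_cmp`, `emit_branch_test`, and the `lsl #imm` to UBFM alias in `do_shift`. The `enc_*` names are hypothetical and not part of the tree; offsets are assumed to be already resolved to in-section byte distances, and `sf_bit` is assumed to set bit 31 for X registers. It is an illustration of the encodings, not the in-tree implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers mirroring the patch's bit layouts; not part of tcc. */

/* B.cond: imm19 (word offset) in bits 23..5, cond in bits 3..0. */
uint32_t enc_b_cond(int cond, int64_t byte_off)
{
    int64_t imm;
    assert(byte_off % 4 == 0);                 /* targets are 4-byte aligned */
    imm = byte_off / 4;
    assert(imm >= -(1 << 18) && imm < (1 << 18));
    return 0x54000000u | (((uint32_t)imm & 0x7ffffu) << 5) | (uint32_t)cond;
}

/* CBZ/CBNZ: sf in bit 31 (assumed 1 for X registers), op in bit 24,
 * imm19 in bits 23..5, Rt in bits 4..0. */
uint32_t enc_cbz(int rt, int is_w, int is_cbnz, int64_t byte_off)
{
    int64_t imm;
    assert(byte_off % 4 == 0);
    imm = byte_off / 4;
    assert(imm >= -(1 << 18) && imm < (1 << 18));
    return 0x34000000u | (is_w ? 0u : 0x80000000u) |
           (is_cbnz ? (1u << 24) : 0u) |
           (((uint32_t)imm & 0x7ffffu) << 5) | (uint32_t)rt;
}

/* TBZ/TBNZ: b5 in bit 31, op in bit 24, b40 in bits 23..19,
 * imm14 in bits 18..5, Rt in bits 4..0. */
uint32_t enc_tbz(int rt, int bit_pos, int is_tbnz, int64_t byte_off)
{
    int64_t imm;
    assert(bit_pos >= 0 && bit_pos <= 63);
    assert(byte_off % 4 == 0);
    imm = byte_off / 4;
    assert(imm >= -(1 << 13) && imm < (1 << 13));
    return 0x36000000u | ((uint32_t)((bit_pos >> 5) & 1) << 31) |
           (is_tbnz ? (1u << 24) : 0u) |
           (((uint32_t)bit_pos & 0x1fu) << 19) |
           (((uint32_t)imm & 0x3fffu) << 5) | (uint32_t)rt;
}

/* LSL Rd, Rn, #shamt is UBFM Rd, Rn, #(-shamt mod width), #(width-1-shamt). */
uint32_t enc_lsl_imm(int rd, int rn, int shamt, int is_w)
{
    int width = is_w ? 32 : 64;
    uint32_t immr, imms, base;
    assert(shamt >= 0 && shamt < width);
    immr = (uint32_t)(-shamt) & (uint32_t)(width - 1);
    imms = (uint32_t)(width - 1 - shamt);
    base = is_w ? 0x53000000u : 0xd3400000u;   /* UBFM; X form sets sf and N */
    return base | (immr << 16) | (imms << 10) | ((uint32_t)rn << 5) | (uint32_t)rd;
}
```

For example, `enc_b_cond(0, 8)` yields `0x54000040` (`b.eq` two instructions ahead) and `enc_lsl_imm(0, 1, 4, 0)` yields `0xd37cec20` (`lsl x0, x1, #4`), which can be compared against the output of any other AArch64 assembler.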