kit
git clone https://git.ryansepassi.com/git/kit.git

commit 178cb2b75ea76bdc88b8bd5c92774b2225a50f73
parent 5c0845b9c3a5d4033bf3b849a2cd31559f798cdf
Author: Ryan Sepassi <rsepassi@gmail.com>
Date:   Tue,  2 Jun 2026 08:33:15 -0700

cg: make CmpOp FP-compare-lossless and reach all 12 FP predicates from frontends

Replace the lossy floating portion of the internal CmpOp with a disjoint,
IEEE-complete 12-member FP block (CMP_OEQ_F..CMP_UGE_F) laid out after the
integer block in CfreeCgFpCmpOp order, so the ordered/unordered distinction the
public CfreeCgFpCmpOp promised now reaches every backend instead of being
collapsed to 6 predicates in api_map_fp_cmp.

Core:
- cgtarget.h: disjoint int+fp CmpOp blocks; CMP_OEQ_F is the FP boundary.
- value.c: api_map_fp_cmp is now 1:1.
- fold.c: api_invert_cmp flips ordered<->unordered AND negates the relation
  (OLT<->UGE, OEQ<->UNE, ONE<->UEQ, ...), NaN-correct.
- ir_dump.c: 12 FP opcode name strings.
- Backends + interp emit all 12 predicates: x64 (ucomisd flag formulas built
  explicitly via De Morgan), rv64 (ordered feq/flt/fle + xori; OR of strict
  relations for ONE/UEQ), aa64 (12-entry FP cond table; ONE/UEQ synthesized
  with fcmp+cset+csel/csinc and as paired conditional branches), c_target
  (compound !(...) / || / && expressions), wasm (all 12 added from scratch;
  the float eq/ne arms were absent), interp engine.
- arith.c: lossless f128/long-double soft-float compares (single libcall per
  predicate exploiting the compiler-rt NaN sign convention; UEQ/ONE add a
  second __unordtf2 call).
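
The backend notes above (rv64, c_target, wasm) all lean on the same identity:
an unordered predicate is the negation of the opposite ordered one, and
ONE/UEQ come from OR-ing the two strict relations. The sketch below models
that in portable C and checks it against a reference definition. It is an
illustration only: the `FpPred` enum, its ordering, and the function names are
assumptions made for this example, not the project's CmpOp or its backend
code, and it must be built without -ffast-math.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical predicate enum; names follow the commit message only. */
typedef enum {
  P_OEQ, P_ONE, P_OLT, P_OLE, P_OGT, P_OGE,
  P_UEQ, P_UNE, P_ULT, P_ULE, P_UGT, P_UGE,
  P_COUNT
} FpPred;

/* Reference semantics: the relation plus an explicit "is either NaN" term. */
static int ref_eval(FpPred p, double a, double b) {
  int un = isunordered(a, b) != 0;
  switch (p) {
    case P_OEQ: return !un && a == b;
    case P_ONE: return !un && a != b;
    case P_OLT: return !un && a < b;
    case P_OLE: return !un && a <= b;
    case P_OGT: return !un && a > b;
    case P_OGE: return !un && a >= b;
    case P_UEQ: return un || a == b;
    case P_UNE: return un || a != b;
    case P_ULT: return un || a < b;
    case P_ULE: return un || a <= b;
    case P_UGT: return un || a > b;
    case P_UGE: return un || a >= b;
    default:    return 0;
  }
}

/* Lowered form: only ordered ==, <, <= (operands swapped for > and >=),
 * plus NOT and OR, in the spirit of feq/flt/fle + xori. */
static int lowered_eval(FpPred p, double a, double b) {
  switch (p) {
    case P_OEQ: return a == b;
    case P_OLT: return a < b;
    case P_OLE: return a <= b;
    case P_OGT: return b < a;
    case P_OGE: return b <= a;
    case P_UNE: return !(a == b);              /* !(OEQ) */
    case P_UGE: return !(a < b);               /* !(OLT) */
    case P_UGT: return !(a <= b);              /* !(OLE) */
    case P_ULE: return !(b < a);               /* !(OGT) */
    case P_ULT: return !(b <= a);              /* !(OGE) */
    case P_ONE: return (a < b) | (b < a);      /* strict relations OR'd */
    case P_UEQ: return !((a < b) | (b < a));
    default:    return 0;
  }
}

/* Count (predicate, a, b) triples where the two definitions disagree. */
int count_lowering_mismatches(void) {
  const double v[] = {-1.0, 0.0, 1.0, INFINITY, NAN};
  const int n = (int)(sizeof v / sizeof v[0]);
  int bad = 0;
  for (int p = 0; p < P_COUNT; p++)
    for (int i = 0; i < n; i++)
      for (int j = 0; j < n; j++)
        if (ref_eval((FpPred)p, v[i], v[j]) !=
            lowered_eval((FpPred)p, v[i], v[j]))
          bad++;
  return bad;
}

/* Aborts if any predicate's lowered form diverges from the reference. */
void check_lowering(void) { assert(count_lowering_mismatches() == 0); }
```

Calling `check_lowering()` from a test harness exercises all 12 predicates
over ordered and NaN operand pairs.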

Coupled frontend correctness fix: float `!=` now maps to UNE (unordered, true
on NaN), not ONE — required once the map is lossless, and what
__builtin_isnan (x != x) relies on.
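
A small sketch of why the ONE/UNE distinction matters for `!=`. The helper
names below are hypothetical and exist only for this illustration; the
behavior shown is standard IEEE 754 comparison semantics as exposed by C
(assuming the code is not built with -ffast-math).

```c
#include <assert.h>
#include <math.h>

/* ONE: ordered and not equal, so false whenever an operand is NaN. */
static int fp_one(double a, double b) { return a < b || a > b; }

/* UNE: unordered or not equal; this is what C's `!=` computes on floats. */
static int fp_une(double a, double b) { return a != b; }

/* isnan via self-compare, the `x != x` idiom the commit message refers to. */
int isnan_via_une(double x) { return fp_une(x, x); }

/* The same self-compare through ONE can never be true. */
int isnan_via_one(double x) { return fp_one(x, x); }

/* Aborts if the two mappings do not behave as described. */
void check_ne_mapping(void) {
  assert(isnan_via_une(NAN) == 1);  /* UNE detects NaN */
  assert(isnan_via_une(1.0) == 0);
  assert(isnan_via_one(NAN) == 0);  /* ONE would silently break isnan */
  assert(fp_une(1.0, 2.0) == fp_one(1.0, 2.0)); /* identical when ordered */
}
```

With a lossy map the two predicates were indistinguishable downstream; once
the map is 1:1, emitting ONE for `!=` would make the isnan idiom always false.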

Reach all 12 predicates from the C and toy frontends:
- Route A (C99 compare builtins): __builtin_isless/islessequal/isgreater/
  isgreaterequal/islessgreater/isunordered in C, and @isless/.../@isunordered
  in toy. islessgreater -> ONE (reachable only here); isunordered synthesized
  as (a!=a)||(b!=b).
- Route B (negation fold): FP compares are now delayed SV_CMP values, so
  api_cg_unop (UO_NOT) and api_branch_if invert them via api_invert_cmp,
  making `!(a < b)` etc. emit the NaN-correct unordered duals (UGE/UGT/ULE/
  ULT/UEQ). pass_jump.c invert_cmp gained the matching FP arms.
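
Both routes can be modelled in a few lines of plain C. The sketch below
treats each predicate as the set of comparison outcomes (less, equal,
greater, unordered) for which it is true, so that logical negation is a set
complement, and it also shows the Route A isunordered synthesis. The bitmask
encoding and every identifier here are assumptions made for illustration;
they are not the project's CmpOp layout or its api_invert_cmp implementation.

```c
#include <assert.h>
#include <math.h>

/* The four mutually exclusive outcomes of comparing two floats. */
enum { R_LT = 1, R_EQ = 2, R_GT = 4, R_UN = 8 };

/* Each predicate as the set of outcomes that make it true (hypothetical
 * encoding; only the predicate names come from the commit message). */
enum {
  M_OEQ = R_EQ,
  M_ONE = R_LT | R_GT,
  M_OLT = R_LT,
  M_OLE = R_LT | R_EQ,
  M_OGT = R_GT,
  M_OGE = R_GT | R_EQ,
  M_UEQ = R_UN | R_EQ,
  M_UNE = R_UN | R_LT | R_GT,
  M_ULT = R_UN | R_LT,
  M_ULE = R_UN | R_LT | R_EQ,
  M_UGT = R_UN | R_GT,
  M_UGE = R_UN | R_GT | R_EQ
};

static int classify(double a, double b) {
  if (a < b) return R_LT;
  if (a > b) return R_GT;
  if (a == b) return R_EQ;
  return R_UN; /* at least one operand is NaN */
}

int eval_mask(int mask, double a, double b) {
  return (mask & classify(a, b)) != 0;
}

/* Route B: negating a predicate complements its outcome set, which flips
 * ordered<->unordered and negates the relation in one step. */
int invert_mask(int mask) { return ~mask & 0xF; }

/* Route A: isunordered(a, b) synthesized from two self-compares. */
int isunordered_synth(double a, double b) { return (a != a) || (b != b); }

/* Aborts if the pairs listed in the commit message do not hold. */
void check_routes(void) {
  assert(invert_mask(M_OLT) == M_UGE);
  assert(invert_mask(M_OLE) == M_UGT);
  assert(invert_mask(M_OGT) == M_ULE);
  assert(invert_mask(M_OGE) == M_ULT);
  assert(invert_mask(M_OEQ) == M_UNE);
  assert(invert_mask(M_ONE) == M_UEQ);
  /* `!(a < b)` is true on NaN; the integer-style dual `a >= b` is not. */
  assert(eval_mask(invert_mask(M_OLT), NAN, 1.0) == 1);
  assert(eval_mask(M_OGE, NAN, 1.0) == 0);
  /* The synthesized isunordered agrees with the C99 macro. */
  assert(isunordered_synth(NAN, 1.0) == (isunordered(NAN, 1.0) != 0));
  assert(isunordered_synth(1.0, 2.0) == (isunordered(1.0, 2.0) != 0));
}
```

The complement view also makes it easy to see why an integer-style inversion
(OLT to OGE) is wrong for floats: it drops the unordered outcome instead of
moving it to the negated side.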

Tests: new public cg_fp_cmp API test (all 12 predicates x operand combos via
the interpreter + emission across aa64/x64/rv64/wasm); extended interp
spec_fp_cmp_nan; C parse cases and toy cases for Route A and Route B. FP-
compare cases are wasm-skipped pending NaN-correct lang/wasm re-lowering (a
pre-existing gap; rv64_fp_nan_compare also fails that lane).

Verified: test-cg-api/test-interp/test-opt/test-isa/test-asm, smoke x64+rv64,
full parse and toy suites, and `make bootstrap` reproduces byte-identical at
-O0 and -O1.

Diffstat:
M lang/c/parse/cg_adapter.c | 30 ++++++++++++++++++++++++++++--
M lang/c/parse/cg_adapter.h | 21 +++++++++++++++++++++
M lang/c/parse/parse.c | 12 ++++++++++++
M lang/c/parse/parse_expr.c | 88 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
M lang/c/parse/parse_priv.h | 6 ++++++
M lang/toy/builtins.c | 53 +++++++++++++++++++++++++++++++++++++++++++++++++++++
M lang/toy/expr.c | 3 ++-
M src/arch/aa64/native.c | 69 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++---------
M src/arch/c_target/c_emit.c | 88 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-------
M src/arch/rv64/native.c | 88 +++++++++++++++++++++++++++++++++++++++++++++++++++++++------------------------
M src/arch/wasm/emit.c | 88 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++---------------
M src/arch/x64/native.c | 42 ++++++++++++++++++++++++++++--------------
M src/cg/arith.c | 143 +++++++++++++++++++++++++++++++++++++++++++++++--------------------------------
M src/cg/cgtarget.h | 34 ++++++++++++++++++++++++----------
M src/cg/fold.c | 35 +++++++++++++++++++++++++++--------
M src/cg/ir_dump.c | 16 ++++++++++++----
M src/cg/value.c | 32 ++++++++++++++++++++------------
M src/interp/engine.c | 29 +++++++++++++++++++----------
M src/opt/pass_jump.c | 39 +++++++++++++++++++++++++++++++++++++++
A test/api/cg_fp_cmp_test.c | 290 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
M test/interp/interp_smoke_test.c | 46 ++++++++++++++++++++++++++++++++++++++++------
M test/lib/unit.mk | 3 ++-
A test/parse/cases/builtin_fp_cmp_negation.c | 54 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
A test/parse/cases/builtin_fp_cmp_negation.expected | 1 +
A test/parse/cases/builtin_islessgreater_unordered.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
A test/parse/cases/builtin_islessgreater_unordered.expected | 1 +
A test/parse/cases/builtin_islessgreater_unordered.wasm.skip | 1 +
M test/test.mk | 4 +++-
A test/toy/cases/152_fp_cmp_builtins_a.expected | 1 +
A test/toy/cases/152_fp_cmp_builtins_a.toy | 62 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
A test/toy/cases/152_fp_cmp_builtins_a.wasm.skip | 1 +
A test/toy/cases/153_fp_cmp_negation_b.expected | 1 +
A test/toy/cases/153_fp_cmp_negation_b.toy | 45 +++++++++++++++++++++++++++++++++++++++++++++
A test/toy/cases/153_fp_cmp_negation_b.wasm.skip | 1 +
34 files changed, 1290 insertions(+), 185 deletions(-)

diff --git a/lang/c/parse/cg_adapter.c b/lang/c/parse/cg_adapter.c
@@ -341,15 +341,37 @@ CfreeCgFpCmpOp pcg_fp_cmp(CmpOp op) {
     case CMP_EQ:
       return CFREE_CG_FP_OEQ;
     case CMP_NE:
-      return CFREE_CG_FP_ONE;
+      /* C `!=` on floats is *unordered* not-equal: `NaN != x` is true (and
+       * `__builtin_isnan(x)` lowers to `x != x`). Map to UNE, not ONE. */
+      return CFREE_CG_FP_UNE;
     case CMP_LT_F:
+    case CMP_OLT_F:
       return CFREE_CG_FP_OLT;
     case CMP_LE_F:
+    case CMP_OLE_F:
       return CFREE_CG_FP_OLE;
     case CMP_GT_F:
+    case CMP_OGT_F:
       return CFREE_CG_FP_OGT;
     case CMP_GE_F:
+    case CMP_OGE_F:
       return CFREE_CG_FP_OGE;
+    case CMP_OEQ_F:
+      return CFREE_CG_FP_OEQ;
+    case CMP_ONE_F:
+      return CFREE_CG_FP_ONE;
+    case CMP_UEQ_F:
+      return CFREE_CG_FP_UEQ;
+    case CMP_UNE_F:
+      return CFREE_CG_FP_UNE;
+    case CMP_ULT_F:
+      return CFREE_CG_FP_ULT;
+    case CMP_ULE_F:
+      return CFREE_CG_FP_ULE;
+    case CMP_UGT_F:
+      return CFREE_CG_FP_UGT;
+    case CMP_UGE_F:
+      return CFREE_CG_FP_UGE;
     default:
       return CFREE_CG_FP_OEQ;
   }
@@ -824,7 +846,11 @@ void pcg_unop(Parser* p, UnOp op) {
 }

 void pcg_cmp(Parser* p, CmpOp op) {
-  if (op == CMP_LT_F || op == CMP_LE_F || op == CMP_GT_F || op == CMP_GE_F ||
+  /* The FP block starts at CMP_LT_F (relational operator markers) and runs
+   * through the ordered/unordered predicate members; everything below is
+   * integer. CMP_EQ/CMP_NE additionally route to FP when the operand type is
+   * floating (C `==`/`!=` on floats is ordered-equal / unordered-not-equal). */
+  if (op >= CMP_LT_F ||
       ((op == CMP_EQ || op == CMP_NE) && pcg_type_is_fp(pcg_top_type(p)))) {
     if (pcg_emit_enabled(p)) cfree_cg_fp_cmp(p->cg, pcg_fp_cmp(op));
   } else {
diff --git a/lang/c/parse/cg_adapter.h b/lang/c/parse/cg_adapter.h
@@ -85,6 +85,12 @@ typedef enum UnOp {
   UO_BNOT,
 } UnOp;

+/* Compare markers used by the C parser. The integer block plus the four
+ * relational FP markers (CMP_LT_F..CMP_GE_F) cover the bare C operators; the
+ * latter map to the *ordered* FP predicates in pcg_fp_cmp. The trailing
+ * ordered/unordered block mirrors the global CmpOp / public CfreeCgFpCmpOp so
+ * the C99 comparison builtins (islessgreater -> CMP_ONE_F) and any
+ * negation-driven unordered duals are reachable from the frontend 1:1. */
 typedef enum CmpOp {
   CMP_EQ,
   CMP_NE,
@@ -96,10 +102,25 @@ typedef enum CmpOp {
   CMP_LE_U,
   CMP_GT_U,
   CMP_GE_U,
+  /* Relational FP operator markers (bare `< <= > >=` on floats) -> ordered. */
   CMP_LT_F,
   CMP_LE_F,
   CMP_GT_F,
   CMP_GE_F,
+  /* Ordered FP relationals (NaN -> false). */
+  CMP_OEQ_F,
+  CMP_ONE_F,
+  CMP_OLT_F,
+  CMP_OLE_F,
+  CMP_OGT_F,
+  CMP_OGE_F,
+  /* Unordered FP relationals (NaN -> true). */
+  CMP_UEQ_F,
+  CMP_UNE_F,
+  CMP_ULT_F,
+  CMP_ULE_F,
+  CMP_UGT_F,
+  CMP_UGE_F,
 } CmpOp;

 typedef enum AtomicOp {
diff --git a/lang/c/parse/parse.c b/lang/c/parse/parse.c
@@ -1559,6 +1559,18 @@ void parse_c(Compiler* c, Pool* pool, Pp* pp, DeclTable* decls, CG* cg,
       cfree_sym_intern(p.pool->c, CFREE_SLICE_LIT("__builtin_huge_valf"));
   p.sym_b_huge_vall =
       cfree_sym_intern(p.pool->c, CFREE_SLICE_LIT("__builtin_huge_vall"));
+  p.sym_b_isless =
+      cfree_sym_intern(p.pool->c, CFREE_SLICE_LIT("__builtin_isless"));
+  p.sym_b_islessequal =
+      cfree_sym_intern(p.pool->c, CFREE_SLICE_LIT("__builtin_islessequal"));
+  p.sym_b_isgreater =
+      cfree_sym_intern(p.pool->c, CFREE_SLICE_LIT("__builtin_isgreater"));
+  p.sym_b_isgreaterequal =
+      cfree_sym_intern(p.pool->c, CFREE_SLICE_LIT("__builtin_isgreaterequal"));
+  p.sym_b_islessgreater =
+      cfree_sym_intern(p.pool->c, CFREE_SLICE_LIT("__builtin_islessgreater"));
+  p.sym_b_isunordered =
+      cfree_sym_intern(p.pool->c, CFREE_SLICE_LIT("__builtin_isunordered"));
   p.sym_func = cfree_sym_intern(p.pool->c, CFREE_SLICE_LIT("__func__"));
   p.sym_func_gcc = cfree_sym_intern(p.pool->c, CFREE_SLICE_LIT("__FUNCTION__"));
   p.sym_pretty_func_gcc =
diff --git a/lang/c/parse/parse_expr.c b/lang/c/parse/parse_expr.c
@@ -468,6 +468,8 @@ CfreeCgSym emit_string_literal_to_rodata(Parser* p, const u8* bytes,
 static CConstInt cexpr_unary(Parser* p, SrcLoc loc);
 static CConstInt cexpr_cond(Parser* p, SrcLoc loc);
 static const Type* offsetof_designator(Parser* p, const Type* base, u32* off);
+static const Type* common_fp_type(Parser* p, const Type* a, const Type* b);
+static void coerce_fp_cmp_operands(Parser* p, const Type* common);

 static u32 cint_bits(Parser* p, const Type* ty) {
   u32 sz = ty ? c_abi_sizeof(p->abi, ty) : 8u;
@@ -1529,6 +1531,91 @@ static int parse_builtin_isnan_call(Parser* p, Sym name, SrcLoc loc) {
   return 1;
 }

+/* C99 floating comparison builtins (Route A). The five relational forms map
+ * directly to ordered FP predicates (NaN -> false): isless/islessequal/
+ * isgreater/isgreaterequal -> OLT/OLE/OGT/OGE (the same predicates the bare
+ * `< <= > >=` operators produce, but as the explicit quiet macros), and
+ * islessgreater -> ONE (ordered-and-not-equal). isunordered has no standalone
+ * predicate in the CmpOp enum, so it is synthesized as (a != a) || (b != b)
+ * using the unordered self-compare that __builtin_isnan already relies on. */
+static int parse_builtin_fp_cmp_call(Parser* p, Sym name, SrcLoc loc) {
+  CmpOp cop;
+  int is_unordered = 0;
+  const Type* common;
+
+  if (name == p->sym_b_isless) {
+    cop = CMP_OLT_F;
+  } else if (name == p->sym_b_islessequal) {
+    cop = CMP_OLE_F;
+  } else if (name == p->sym_b_isgreater) {
+    cop = CMP_OGT_F;
+  } else if (name == p->sym_b_isgreaterequal) {
+    cop = CMP_OGE_F;
+  } else if (name == p->sym_b_islessgreater) {
+    cop = CMP_ONE_F;
+  } else if (name == p->sym_b_isunordered) {
+    cop = CMP_OEQ_F; /* unused; isunordered is synthesized below */
+    is_unordered = 1;
+  } else {
+    return 0;
+  }
+
+  advance(p); /* IDENT */
+  expect_punct(p, '(', "'(' after floating comparison builtin");
+  parse_assign_expr(p);
+  to_rvalue(p);
+  if (!type_is_fp(pcg_top_type(p))) {
+    perr(p, "floating comparison builtin requires floating arguments");
+  }
+  expect_punct(p, ',', "',' between floating comparison builtin arguments");
+  parse_assign_expr(p);
+  to_rvalue(p);
+  if (!type_is_fp(pcg_top_type(p))) {
+    perr(p, "floating comparison builtin requires floating arguments");
+  }
+  expect_punct(p, ')', "')' after floating comparison builtin");
+
+  /* Bring both operands to a common floating type so the compare (and the
+   * synthesized self-compares) see matching widths, mirroring the bare
+   * relational path (parse_rel). */
+  common = common_fp_type(p, pcg_top2_type(p), pcg_top_type(p));
+  coerce_fp_cmp_operands(p, common);
+
+  pcg_set_loc(p, loc);
+  if (!is_unordered) {
+    pcg_cmp(p, cop);
+    return 1;
+  }
+
+  /* isunordered(a, b) == isnan(a) || isnan(b). Stash both operands, then OR the
+   * two unordered self-compares (each yields the 0/1 int isnan result). */
+  {
+    FrameSlot slot_a = builtin_tmp_slot(p, common);
+    FrameSlot slot_b = builtin_tmp_slot(p, common);
+    /* stack: [a, b] */
+    pcg_push_local_typed(p, slot_b, common);
+    pcg_swap(p);
+    pcg_store(p);
+    pcg_drop(p);
+    /* stack: [a] */
+    pcg_push_local_typed(p, slot_a, common);
+    pcg_swap(p);
+    pcg_store(p);
+    pcg_drop(p);
+    /* stack: [] */
+    pcg_push_local_typed(p, slot_a, common);
+    pcg_load(p);
+    pcg_dup(p);
+    pcg_cmp(p, CMP_NE); /* isnan(a): unordered not-equal */
+    pcg_push_local_typed(p, slot_b, common);
+    pcg_load(p);
+    pcg_dup(p);
+    pcg_cmp(p, CMP_NE); /* isnan(b) */
+    pcg_binop(p, BO_OR);
+  }
+  return 1;
+}
+
 static const Type* builtin_math_fp_type(Parser* p, Sym name) {
   if (name == p->sym_b_fabsf || name == p->sym_b_inff ||
       name == p->sym_b_huge_valf) {
@@ -1665,6 +1752,7 @@ static int try_parse_builtin_call(Parser* p) {

   if (parse_builtin_overflow_call(p, name, loc)) return 1;
   if (parse_builtin_isnan_call(p, name, loc)) return 1;
+  if (parse_builtin_fp_cmp_call(p, name, loc)) return 1;
   if (parse_builtin_inf_call(p, name, loc)) return 1;
   if (parse_builtin_fabs_call(p, name, loc)) return 1;
   if (parse_builtin_abs_call(p, name, loc)) return 1;
diff --git a/lang/c/parse/parse_priv.h b/lang/c/parse/parse_priv.h
@@ -247,6 +247,12 @@ typedef struct Parser {
   Sym sym_b_huge_val;
   Sym sym_b_huge_valf;
   Sym sym_b_huge_vall;
+  Sym sym_b_isless; /* __builtin_isless */
+  Sym sym_b_islessequal; /* __builtin_islessequal */
+  Sym sym_b_isgreater; /* __builtin_isgreater */
+  Sym sym_b_isgreaterequal; /* __builtin_isgreaterequal */
+  Sym sym_b_islessgreater; /* __builtin_islessgreater */
+  Sym sym_b_isunordered; /* __builtin_isunordered */
   Sym sym_func; /* __func__ */
   Sym sym_func_gcc; /* __FUNCTION__ */
   Sym sym_pretty_func_gcc; /* __PRETTY_FUNCTION__ */
diff --git a/lang/toy/builtins.c b/lang/toy/builtins.c
@@ -500,6 +500,59 @@ CfreeCgTypeId toy_parse_builtin_call(ToyParser* p, CfreeSym name,
     return toy_builtin_type(p, CFREE_CG_BUILTIN_VOID);
   }

+  /* C99-style floating comparison builtins (Route A). Five map directly to an
+   * ordered FP predicate (NaN -> false); @islessgreater is ordered-and-not-
+   * equal (ONE), reachable only here (toy `!=` is the unordered UNE). The
+   * unordered duals are reached separately via `!(a < b)` etc. (Route B).
+   * @isunordered has no standalone predicate, so it is synthesized as
+   * isnan(a) || isnan(b), where UNE(x, x) is isnan(x). */
+  if (toy_sym_is(p, name, "isless") || toy_sym_is(p, name, "islessequal") ||
+      toy_sym_is(p, name, "isgreater") ||
+      toy_sym_is(p, name, "isgreaterequal") ||
+      toy_sym_is(p, name, "islessgreater") ||
+      toy_sym_is(p, name, "isunordered")) {
+    CfreeCgTypeId a, b;
+    CfreeCgFpCmpOp pred = CFREE_CG_FP_OLT;
+    int is_unordered = toy_sym_is(p, name, "isunordered");
+    if (toy_sym_is(p, name, "islessequal"))
+      pred = CFREE_CG_FP_OLE;
+    else if (toy_sym_is(p, name, "isgreater"))
+      pred = CFREE_CG_FP_OGT;
+    else if (toy_sym_is(p, name, "isgreaterequal"))
+      pred = CFREE_CG_FP_OGE;
+    else if (toy_sym_is(p, name, "islessgreater"))
+      pred = CFREE_CG_FP_ONE;
+    if (!toy_parser_expect(p, TOK_LPAREN)) return CFREE_CG_TYPE_NONE;
+    a = toy_parse_expr(p);
+    if (a == CFREE_CG_TYPE_NONE || !toy_expect_comma(p))
+      return CFREE_CG_TYPE_NONE;
+    b = toy_parse_expr(p);
+    if (b == CFREE_CG_TYPE_NONE || !toy_parser_expect(p, TOK_RPAREN))
+      return CFREE_CG_TYPE_NONE;
+    if (a != b || !toy_type_is_float(p, a)) {
+      toy_error(p, p->cur.loc,
+                "floating comparison builtin expects two floating operands of "
+                "the same type");
+      return CFREE_CG_TYPE_NONE;
+    }
+    if (is_unordered) {
+      /* stack: [a, b] -> isnan(b) then isnan(a), OR'd. */
+      cfree_cg_dup(p->cg); /* [a, b, b] */
+      cfree_cg_fp_cmp(p->cg, CFREE_CG_FP_UNE); /* [a, isnan_b] */
+      cfree_cg_swap(p->cg); /* [isnan_b, a] */
+      cfree_cg_dup(p->cg); /* [isnan_b, a, a] */
+      cfree_cg_fp_cmp(p->cg, CFREE_CG_FP_UNE); /* [isnan_b, isnan_a] */
+      cfree_cg_int_binop(p->cg, CFREE_CG_INT_OR, 0);
+    } else {
+      cfree_cg_fp_cmp(p->cg, pred);
+    }
+    /* Widen the i32 0/1 to the toy int type, matching the comparison operators
+     * (toy_parse_expr_cmp zext's its result), so the builtin composes in
+     * arithmetic the same way `a < b` does. */
+    cfree_cg_zext(p->cg, p->int_type);
+    return p->int_type;
+  }
+
   {
     int low_level_recognized = 0;
     CfreeCgTypeId low_level_ty =
diff --git a/lang/toy/expr.c b/lang/toy/expr.c
@@ -1473,7 +1473,8 @@ static CfreeCgTypeId toy_parse_expr_cmp(ToyParser* p) {
         fp_cmp = CFREE_CG_FP_OEQ;
         break;
       case TOK_NE:
-        fp_cmp = CFREE_CG_FP_ONE;
+        /* `!=` on floats is unordered not-equal (true on NaN), not ONE. */
+        fp_cmp = CFREE_CG_FP_UNE;
         break;
       case TOK_LT:
         fp_cmp = CFREE_CG_FP_OLT;
diff --git a/src/arch/aa64/native.c b/src/arch/aa64/native.c
@@ -653,14 +653,30 @@ static u32 cmp_cond(CmpOp op) {
       return 0xcu;
     case CMP_GE_S:
       return 0xau;
-    case CMP_LT_F:
-      return 0x4u;
-    case CMP_LE_F:
-      return 0x9u;
-    case CMP_GT_F:
-      return 0xcu;
-    case CMP_GE_F:
-      return 0xau;
+    /* FP predicates after FCMP set NZCV as: a<b -> N; a==b -> Z,C; a>b -> C;
+     * unordered -> C,V. Each maps to a single condition except CMP_ONE_F /
+     * CMP_UEQ_F (synthesized with two instructions in aa_cmp/aa_cmp_branch,
+     * which intercept them before calling cmp_cond). */
+    case CMP_OEQ_F:
+      return 0x0u; /* EQ */
+    case CMP_OLT_F:
+      return 0x4u; /* MI */
+    case CMP_OLE_F:
+      return 0x9u; /* LS */
+    case CMP_OGT_F:
+      return 0xcu; /* GT */
+    case CMP_OGE_F:
+      return 0xau; /* GE */
+    case CMP_UNE_F:
+      return 0x1u; /* NE (unordered or not-equal) */
+    case CMP_ULT_F:
+      return 0xbu; /* LT (unordered or less-than) */
+    case CMP_ULE_F:
+      return 0xdu; /* LE (unordered or less-or-equal) */
+    case CMP_UGT_F:
+      return 0x8u; /* HI (unordered or greater-than) */
+    case CMP_UGE_F:
+      return 0x2u; /* CS (unordered or greater-or-equal) */
     default:
       return 0x0u;
   }
@@ -1668,6 +1684,24 @@ static void aa_jump(NativeTarget* t, MCLabel label) {
 static void aa_cmp_branch(NativeTarget* t, CmpOp op, NativeLoc lhs,
                           NativeLoc rhs, MCLabel label) {
   aa_emit_cmp_to_flags(t, lhs, rhs);
+  /* CMP_ONE_F / CMP_UEQ_F have no single FP condition: take the branch from a
+   * pair of conditional branches to the same label (no scratch register). */
+  if (op == CMP_ONE_F) {
+    /* ordered & !=: branch if a<b (MI) or a>b (GT). */
+    aa_emit32(t->mc, aa64_brcond_pack((AA64BrCond){.cond = 0x4u})); /* MI */
+    t->mc->emit_label_ref(t->mc, label, R_AARCH64_CONDBR19, 4, 0);
+    aa_emit32(t->mc, aa64_brcond_pack((AA64BrCond){.cond = 0xcu})); /* GT */
+    t->mc->emit_label_ref(t->mc, label, R_AARCH64_CONDBR19, 4, 0);
+    return;
+  }
+  if (op == CMP_UEQ_F) {
+    /* unordered | ==: branch if a==b (EQ) or unordered (VS). */
+    aa_emit32(t->mc, aa64_brcond_pack((AA64BrCond){.cond = 0x0u})); /* EQ */
+    t->mc->emit_label_ref(t->mc, label, R_AARCH64_CONDBR19, 4, 0);
+    aa_emit32(t->mc, aa64_brcond_pack((AA64BrCond){.cond = 0x6u})); /* VS */
+    t->mc->emit_label_ref(t->mc, label, R_AARCH64_CONDBR19, 4, 0);
+    return;
+  }
   aa_emit32(t->mc, aa64_brcond_pack((AA64BrCond){.cond = cmp_cond(op)}));
   t->mc->emit_label_ref(t->mc, label, R_AARCH64_CONDBR19, 4, 0);
 }
@@ -2204,8 +2238,25 @@ static void aa_emit_cmp_to_flags(NativeTarget* t, NativeLoc lhs,

 static void aa_cmp(NativeTarget* t, CmpOp op, NativeLoc dst, NativeLoc lhs,
                    NativeLoc rhs) {
+  u32 sf = loc_is_64(t, dst);
+  u32 rd = loc_reg(dst);
   aa_emit_cmp_to_flags(t, lhs, rhs);
-  aa_emit32(t->mc, aa_cset(loc_is_64(t, dst), loc_reg(dst), cmp_cond(op)));
+  /* CMP_ONE_F (ordered & !=) and CMP_UEQ_F (unordered | ==) have no single
+   * AArch64 FP condition. After FCMP, unordered sets V (and Z=0), so VC
+   * (V==0) selects "ordered". */
+  if (op == CMP_ONE_F) {
+    /* ordered & not-equal: NE masked to the ordered case. */
+    aa_emit32(t->mc, aa_cset(sf, rd, 0x1u)); /* cset rd, NE */
+    aa_emit32(t->mc, aa64_csel_enc(sf, rd, rd, AA64_ZR, 0x7u)); /* csel rd,rd,zr,VC */
+    return;
+  }
+  if (op == CMP_UEQ_F) {
+    /* equal, or forced to 1 when unordered. */
+    aa_emit32(t->mc, aa_cset(sf, rd, 0x0u)); /* cset rd, EQ */
+    aa_emit32(t->mc, aa64_csinc_enc(sf, rd, rd, AA64_ZR, 0x7u)); /* csinc rd,rd,zr,VC */
+    return;
+  }
+  aa_emit32(t->mc, aa_cset(sf, rd, cmp_cond(op)));
 }

 static void aa_convert(NativeTarget* t, ConvKind op, NativeLoc dst,
diff --git a/src/arch/c_target/c_emit.c b/src/arch/c_target/c_emit.c
@@ -1337,32 +1337,55 @@ void c_emit_unop(CTarget* t, UnOp op, Operand dst, Operand a) {

 /* ===== compare ops ===== */

+/* The single C operator for ops that lower to one relational/equality
+ * expression: all integer ops, plus the FP predicates whose plain C operator
+ * already has the right NaN behavior (<,<=,>,>= and == are ordered: false on
+ * NaN; != is unordered: true on NaN). The remaining FP predicates need a
+ * compound expression and are handled in c_emit_cmp_operands; they return NULL
+ * here. No `default:` so -Wswitch flags any unhandled enumerator. */
 static const char* cmp_to_c(CmpOp op) {
   switch (op) {
     case CMP_EQ:
+    case CMP_OEQ_F:
       return "==";
     case CMP_NE:
+    case CMP_UNE_F:
       return "!=";
     case CMP_LT_S:
     case CMP_LT_U:
-    case CMP_LT_F:
+    case CMP_OLT_F:
       return "<";
     case CMP_LE_S:
     case CMP_LE_U:
-    case CMP_LE_F:
+    case CMP_OLE_F:
       return "<=";
     case CMP_GT_S:
     case CMP_GT_U:
-    case CMP_GT_F:
+    case CMP_OGT_F:
       return ">";
     case CMP_GE_S:
     case CMP_GE_U:
-    case CMP_GE_F:
+    case CMP_OGE_F:
       return ">=";
+    /* Compound FP predicates — no single C operator (see c_emit_cmp_operands). */
+    case CMP_ONE_F:
+    case CMP_UEQ_F:
+    case CMP_ULT_F:
+    case CMP_ULE_F:
+    case CMP_UGT_F:
+    case CMP_UGE_F:
+      return NULL;
   }
   return NULL;
 }

+/* The 6 FP predicates with no single C operator: built from compound ordered
+ * comparisons (no isnan(); host must not be built with -ffast-math). */
+static int cmp_is_fp_compound(CmpOp op) {
+  return op == CMP_ONE_F || op == CMP_UEQ_F || op == CMP_ULT_F ||
+         op == CMP_ULE_F || op == CMP_UGT_F || op == CMP_UGE_F;
+}
+
 /* Returns 1 if cmp op needs unsigned operand cast. -1 if signed. 0 if no cast
  * (EQ/NE — sign doesn't matter for integer equality at the same width — and
  * float compares). */
@@ -1383,7 +1406,57 @@ static int cmp_signedness(CmpOp op) {
   }
 }

+/* Emit one ordered comparison `<a> opstr <b>` (no signedness cast — FP). */
+static void c_emit_fp_rel(CTarget* t, Operand a, const char* opstr, Operand b) {
+  c_emit_operand_arith(t, a);
+  cbuf_puts(&t->body, " ");
+  cbuf_puts(&t->body, opstr);
+  cbuf_puts(&t->body, " ");
+  c_emit_operand_arith(t, b);
+}
+
 static void c_emit_cmp_operands(CTarget* t, CmpOp op, Operand a, Operand b) {
+  /* The 6 FP predicates without a single C operator. Composed from ordered
+   * comparisons via unordered-R == !(ordered-not-R); ONE/UEQ from a<b / a>b.
+   * Each `!(...)` / `(...)` wraps the full cast-bearing comparison. */
+  switch (op) {
+    case CMP_UGE_F: /* !(OLT) */
+      cbuf_puts(&t->body, "!(");
+      c_emit_fp_rel(t, a, "<", b);
+      cbuf_puts(&t->body, ")");
+      return;
+    case CMP_UGT_F: /* !(OLE) */
+      cbuf_puts(&t->body, "!(");
+      c_emit_fp_rel(t, a, "<=", b);
+      cbuf_puts(&t->body, ")");
+      return;
+    case CMP_ULE_F: /* !(OGT) */
+      cbuf_puts(&t->body, "!(");
+      c_emit_fp_rel(t, a, ">", b);
+      cbuf_puts(&t->body, ")");
+      return;
+    case CMP_ULT_F: /* !(OGE) */
+      cbuf_puts(&t->body, "!(");
+      c_emit_fp_rel(t, a, ">=", b);
+      cbuf_puts(&t->body, ")");
+      return;
+    case CMP_ONE_F: /* ordered & !=: a<b || a>b */
+      cbuf_puts(&t->body, "(");
+      c_emit_fp_rel(t, a, "<", b);
+      cbuf_puts(&t->body, " || ");
+      c_emit_fp_rel(t, a, ">", b);
+      cbuf_puts(&t->body, ")");
+      return;
+    case CMP_UEQ_F: /* unordered | ==: !(a<b) && !(a>b) */
+      cbuf_puts(&t->body, "(!(");
+      c_emit_fp_rel(t, a, "<", b);
+      cbuf_puts(&t->body, ") && !(");
+      c_emit_fp_rel(t, a, ">", b);
+      cbuf_puts(&t->body, "))");
+      return;
+    default:
+      break; /* integer ops + single-operator FP fall through */
+  }
   int sg = cmp_signedness(op);
   if (sg == 0) {
     c_emit_operand_arith(t, a);
@@ -1406,12 +1479,13 @@ void c_emit_cmp(CTarget* t, CmpOp op, Operand dst, Operand a, Operand b) {
   if (dst.kind != OPK_LOCAL) {
     compiler_panic(t->c, loc, "C target: cmp dst must be LOCAL");
   }
-  if (!cmp_to_c(op)) {
+  if (!cmp_to_c(op) && !cmp_is_fp_compound(op)) {
     compiler_panic(t->c, loc, "C target: unknown cmp %d", (int)op);
   }
   c_ensure_local(t, dst.v.local, dst.type);
   /* Compare result is C `int` (0/1); assigning to integer dst.type narrows
-   * implicitly without -Wall complaint. */
+   * implicitly without -Wall complaint. The result of a `!(...)` / `||` / `&&`
+   * compound FP predicate is already an int 0/1. */
   c_emit_local_assign_open(t, dst.v.local, dst.type);
   c_emit_cmp_operands(t, op, a, b);
   c_emit_local_assign_close(t);
@@ -1489,7 +1563,7 @@ void c_emit_jump(CTarget* t, Label l) {
 void c_emit_cmp_branch(CTarget* t, CmpOp op, Operand a, Operand b, Label l) {
   if (t->last_was_terminator) return;
   SrcLoc loc = t->cur_fn ? t->cur_fn->loc : (SrcLoc){0, 0, 0};
-  if (!cmp_to_c(op)) {
+  if (!cmp_to_c(op) && !cmp_is_fp_compound(op)) {
     compiler_panic(t->c, loc, "C target: unknown cmp %d", (int)op);
   }
   const char* kw = c_scope_kw_for_label(t, l);
diff --git a/src/arch/rv64/native.c b/src/arch/rv64/native.c
@@ -988,42 +988,77 @@ static void rv_emit_icmp(NativeTarget* t, CmpOp op, u32 rd, u32 ra, u32 rb) {
   }
 }

+/* Format-dispatching wrappers over the ordered FP compares (feq/flt/fle are
+ * ordered: they yield 0 on NaN; flt/fle are signaling, raising NV on NaN —
+ * pre-existing for ordered ops, and the boolean result is still correct). */
+static u32 rv_feq_fmt(u32 fmt, u32 rd, u32 ra, u32 rb) {
+  return fmt == RV_FMT_D ? rv_feq_d(rd, ra, rb) : rv_feq_s(rd, ra, rb);
+}
+static u32 rv_flt_fmt(u32 fmt, u32 rd, u32 ra, u32 rb) {
+  return fmt == RV_FMT_D ? rv_flt_d(rd, ra, rb) : rv_flt_s(rd, ra, rb);
+}
+static u32 rv_fle_fmt(u32 fmt, u32 rd, u32 ra, u32 rb) {
+  return fmt == RV_FMT_D ? rv_fle_d(rd, ra, rb) : rv_fle_s(rd, ra, rb);
+}
+
 static void rv_cmp(NativeTarget* t, CmpOp op, NativeLoc dst, NativeLoc aop,
                    NativeLoc bop) {
   MCEmitter* mc = t->mc;
   u32 rd = loc_reg(dst);
-  /* EQ/NE are shared int/FP opcodes; FP equality (and FP x!=x for isnan,
-   * bool-from-float, etc.) arrives as CMP_EQ/CMP_NE with FP-class operands and
-   * must use feq, not an integer compare on the FP register numbers. */
-  if (op >= CMP_LT_F ||
-      ((op == CMP_EQ || op == CMP_NE) && loc_is_fp(aop))) {
+  /* FP-ness is self-describing from the opcode (FP block starts at CMP_OEQ_F).
+   * Unordered predicates use unordered-R == NOT(ordered-not-R): the ordered
+   * compare into rd, then `xori rd,rd,1`. ONE/UEQ have no single ordered
+   * primitive and OR the two strict relations (a<b | a>b) via scratch RV_TMP2
+   * (x7, reserved & never allocable, so it can't alias rd). */
+  if (op >= CMP_OEQ_F) {
     u32 fmt = rv_type_size(t, aop.type) == 8u ? RV_FMT_D : RV_FMT_S;
     u32 ra = loc_reg(aop), rb = loc_reg(bop);
     switch (op) {
-      case CMP_EQ:
-        rv64_emit32(mc, fmt == RV_FMT_D ? rv_feq_d(rd, ra, rb)
-                                        : rv_feq_s(rd, ra, rb));
+      case CMP_OEQ_F:
+        rv64_emit32(mc, rv_feq_fmt(fmt, rd, ra, rb));
         return;
-      case CMP_NE:
-        rv64_emit32(mc, fmt == RV_FMT_D ? rv_feq_d(rd, ra, rb)
-                                        : rv_feq_s(rd, ra, rb));
+      case CMP_UNE_F: /* !(OEQ) */
+        rv64_emit32(mc, rv_feq_fmt(fmt, rd, ra, rb));
         rv64_emit32(mc, rv_xori(rd, rd, 1));
         return;
-      case CMP_LT_F:
-        rv64_emit32(mc, fmt == RV_FMT_D ? rv_flt_d(rd, ra, rb)
-                                        : rv_flt_s(rd, ra, rb));
+      case CMP_OLT_F:
+        rv64_emit32(mc, rv_flt_fmt(fmt, rd, ra, rb));
         return;
-      case CMP_LE_F:
-        rv64_emit32(mc, fmt == RV_FMT_D ? rv_fle_d(rd, ra, rb)
-                                        : rv_fle_s(rd, ra, rb));
+      case CMP_OLE_F:
+        rv64_emit32(mc, rv_fle_fmt(fmt, rd, ra, rb));
         return;
-      case CMP_GT_F:
-        rv64_emit32(mc, fmt == RV_FMT_D ? rv_flt_d(rd, rb, ra)
-                                        : rv_flt_s(rd, rb, ra));
+      case CMP_OGT_F:
+        rv64_emit32(mc, rv_flt_fmt(fmt, rd, rb, ra));
         return;
-      case CMP_GE_F:
-        rv64_emit32(mc, fmt == RV_FMT_D ? rv_fle_d(rd, rb, ra)
-                                        : rv_fle_s(rd, rb, ra));
+      case CMP_OGE_F:
+        rv64_emit32(mc, rv_fle_fmt(fmt, rd, rb, ra));
+        return;
+      case CMP_UGE_F: /* !(OLT) */
+        rv64_emit32(mc, rv_flt_fmt(fmt, rd, ra, rb));
+        rv64_emit32(mc, rv_xori(rd, rd, 1));
+        return;
+      case CMP_UGT_F: /* !(OLE) */
+        rv64_emit32(mc, rv_fle_fmt(fmt, rd, ra, rb));
+        rv64_emit32(mc, rv_xori(rd, rd, 1));
+        return;
+      case CMP_ULE_F: /* !(OGT) */
+        rv64_emit32(mc, rv_flt_fmt(fmt, rd, rb, ra));
+        rv64_emit32(mc, rv_xori(rd, rd, 1));
+        return;
+      case CMP_ULT_F: /* !(OGE) */
+        rv64_emit32(mc, rv_fle_fmt(fmt, rd, rb, ra));
+        rv64_emit32(mc, rv_xori(rd, rd, 1));
+        return;
+      case CMP_ONE_F: /* ordered & !=: (a<b) | (a>b) */
+        rv64_emit32(mc, rv_flt_fmt(fmt, rd, ra, rb));
+        rv64_emit32(mc, rv_flt_fmt(fmt, RV_TMP2, rb, ra));
+        rv64_emit32(mc, rv_or(rd, rd, RV_TMP2));
+        return;
+      case CMP_UEQ_F: /* unordered | ==: !((a<b) | (a>b)) */
+        rv64_emit32(mc, rv_flt_fmt(fmt, rd, ra, rb));
+        rv64_emit32(mc, rv_flt_fmt(fmt, RV_TMP2, rb, ra));
+        rv64_emit32(mc, rv_or(rd, rd, RV_TMP2));
+        rv64_emit32(mc, rv_xori(rd, rd, 1));
         return;
       default:
         rv_panic(rv_of(t), "unsupported fp cmp");
@@ -1143,10 +1178,9 @@ static void rv_cmp_branch(NativeTarget* t, CmpOp op, NativeLoc aop,
    * branch's displacement is the constant SKIP_JAL (skip just the jal) and
    * so is always in range; the jal carries the long reach. See rv_jump. */
   enum { SKIP_JAL = 8 }; /* branch over the 4-byte jal that follows it */
-  /* FP compares (incl. EQ/NE on FP operands) have no register-register branch
-   * form: materialize the 0/1 into TMP0 via rv_cmp, then branch on nonzero. */
-  if (op >= CMP_LT_F ||
-      ((op == CMP_EQ || op == CMP_NE) && loc_is_fp(aop))) {
+  /* FP compares have no register-register branch form: materialize the 0/1
+   * into TMP0 via rv_cmp (handles all 12 predicates), then branch on nonzero. */
+  if (op >= CMP_OEQ_F) {
     NativeLoc tmp = rv_reg_loc(builtin_id(CFREE_CG_BUILTIN_I64), NATIVE_REG_INT,
                                RV_TMP0);
     rv_cmp(t, op, tmp, aop, bop);
diff --git a/src/arch/wasm/emit.c b/src/arch/wasm/emit.c
@@ -2547,18 +2547,68 @@ static WasmInsnKind cmp_kind(WTarget* t, CmpOp op, WasmValType vt) {
       return is64 ? WASM_INSN_I64_GT_U : WASM_INSN_I32_GT_U;
     case CMP_GE_U:
       return is64 ? WASM_INSN_I64_GE_U : WASM_INSN_I32_GE_U;
-    case CMP_LT_F:
-      return vt == WASM_VAL_F64 ? WASM_INSN_F64_LT : WASM_INSN_F32_LT;
-    case CMP_LE_F:
-      return vt == WASM_VAL_F64 ? WASM_INSN_F64_LE : WASM_INSN_F32_LE;
-    case CMP_GT_F:
-      return vt == WASM_VAL_F64 ? WASM_INSN_F64_GT : WASM_INSN_F32_GT;
-    case CMP_GE_F:
-      return vt == WASM_VAL_F64 ? WASM_INSN_F64_GE : WASM_INSN_F32_GE;
+    /* FP compares are lowered by emit_fp_cmp (they may need multiple wasm
+     * instructions) and never reach cmp_kind. Listed so -Wswitch stays useful. */
+    case CMP_OEQ_F: case CMP_ONE_F: case CMP_OLT_F: case CMP_OLE_F:
+    case CMP_OGT_F: case CMP_OGE_F: case CMP_UEQ_F: case CMP_UNE_F:
+    case CMP_ULT_F: case CMP_ULE_F: case CMP_UGT_F: case CMP_UGE_F:
+      break;
   }
   wfail(t, "wasm: unsupported cmp %d", (int)op);
 }

+/* Push both compare operands (a then b) onto the wasm stack. */
+static void push_cmp_operands(WTarget* t, WIR* w, CfreeCgTypeId opty) {
+  emit_push_operand(t, w->imm_kind, w->imm_a, w->a, opty);
+  emit_push_operand(t, w->imm_kind_b, w->imm_b, w->b, opty);
+}
+
+/* Lower an FP compare to the wasm stack, leaving an i32 0/1 result. wasm's
+ * f.eq/f.lt/f.le/f.gt/f.ge are ordered (false on NaN) and f.ne is unordered
+ * (true on NaN), so the 12 IEEE predicates compose from those plus i32.eqz /
+ * i32.or, using unordered-R == !(ordered-not-R). ONE/UEQ need both operands
+ * twice, so push_cmp_operands runs again for the second relation. */
+static void emit_fp_cmp(WTarget* t, CmpOp op, WIR* w, CfreeCgTypeId opty) {
+  int d = (type_valtype(t, opty) == WASM_VAL_F64);
+  WasmInsnKind EQ = d ? WASM_INSN_F64_EQ : WASM_INSN_F32_EQ;
+  WasmInsnKind NE = d ? WASM_INSN_F64_NE : WASM_INSN_F32_NE;
+  WasmInsnKind LT = d ? WASM_INSN_F64_LT : WASM_INSN_F32_LT;
+  WasmInsnKind LE = d ? WASM_INSN_F64_LE : WASM_INSN_F32_LE;
+  WasmInsnKind GT = d ? WASM_INSN_F64_GT : WASM_INSN_F32_GT;
+  WasmInsnKind GE = d ? WASM_INSN_F64_GE : WASM_INSN_F32_GE;
+  switch (op) {
+    case CMP_OEQ_F: push_cmp_operands(t, w, opty); emit_insn(t, EQ, 0); return;
+    case CMP_UNE_F: push_cmp_operands(t, w, opty); emit_insn(t, NE, 0); return;
+    case CMP_OLT_F: push_cmp_operands(t, w, opty); emit_insn(t, LT, 0); return;
+    case CMP_OLE_F: push_cmp_operands(t, w, opty); emit_insn(t, LE, 0); return;
+    case CMP_OGT_F: push_cmp_operands(t, w, opty); emit_insn(t, GT, 0); return;
+    case CMP_OGE_F: push_cmp_operands(t, w, opty); emit_insn(t, GE, 0); return;
+    case CMP_UGE_F: /* !(OLT) */
+      push_cmp_operands(t, w, opty); emit_insn(t, LT, 0);
+      emit_insn(t, WASM_INSN_I32_EQZ, 0); return;
+    case CMP_UGT_F: /* !(OLE) */
+      push_cmp_operands(t, w, opty); emit_insn(t, LE, 0);
+      emit_insn(t, WASM_INSN_I32_EQZ, 0); return;
+    case CMP_ULE_F: /* !(OGT) */
+      push_cmp_operands(t, w, opty); emit_insn(t, GT, 0);
+      emit_insn(t, WASM_INSN_I32_EQZ, 0); return;
+    case CMP_ULT_F: /* !(OGE) */
+      push_cmp_operands(t, w, opty); emit_insn(t, GE, 0);
+      emit_insn(t, WASM_INSN_I32_EQZ, 0); return;
+    case CMP_ONE_F: /* ordered & !=: (a<b) | (a>b) */
+      push_cmp_operands(t, w, opty); emit_insn(t, LT, 0);
+      push_cmp_operands(t, w, opty); emit_insn(t, GT, 0);
+      emit_insn(t, WASM_INSN_I32_OR, 0); return;
+    case CMP_UEQ_F: /* unordered | ==: !((a<b) | (a>b)) */
+      push_cmp_operands(t, w, opty); emit_insn(t, LT, 0);
+      push_cmp_operands(t, w, opty); emit_insn(t, GT, 0);
+      emit_insn(t, WASM_INSN_I32_OR, 0);
+      emit_insn(t, WASM_INSN_I32_EQZ, 0); return;
+    default:
+      wfail(t, "wasm: unsupported fp cmp %d", (int)op);
+  }
+}
+
 static void emit_convert(WTarget* t, ConvKind ck, WasmValType src,
                          WasmValType dst, u32 sw, u32 dw) {
   (void)dw;
@@ -3367,10 +3417,13 @@ static void linearize_range(WTarget* t, LoweringState* L, u32 start, u32 end) {
         break;
       }
       case WIR_CMP: {
-        WasmValType vt_operand = type_valtype(t, w->type2);
-        emit_push_operand(t, w->imm_kind, w->imm_a, w->a, w->type2);
-        emit_push_operand(t, w->imm_kind_b, w->imm_b, w->b, w->type2);
-        emit_insn(t, cmp_kind(t, (CmpOp)w->cgop, vt_operand), 0);
+        CmpOp cop = (CmpOp)w->cgop;
+        if (cop >= CMP_OEQ_F) {
+          emit_fp_cmp(t, cop, w, w->type2);
+        } else {
+          push_cmp_operands(t, w, w->type2);
+          emit_insn(t, cmp_kind(t, cop, type_valtype(t, w->type2)), 0);
+        }
         /* cmp result is i32 (0/1). dst type may be wider — but cg generally
          * stores cmp results into i32. */
         emit_local_set(t, w->dst, w->type, (RegClass)w->cls);
@@ -3798,10 +3851,13 @@ static void linearize_range(WTarget* t, LoweringState* L, u32 start, u32 end) {
         break;
       }
       case WIR_CMP_BRANCH: {
-        WasmValType vt = type_valtype(t, w->type);
-        emit_push_operand(t, w->imm_kind, w->imm_a, w->a, w->type);
-        emit_push_operand(t, w->imm_kind_b, w->imm_b, w->b, w->type);
-        emit_insn(t, cmp_kind(t, (CmpOp)w->cgop, vt), 0);
+        CmpOp cop = (CmpOp)w->cgop;
+        if (cop >= CMP_OEQ_F) {
+          emit_fp_cmp(t, cop, w, w->type);
+        } else {
+          push_cmp_operands(t, w, w->type);
+          emit_insn(t, cmp_kind(t, cop, type_valtype(t, w->type)), 0);
+        }
         u32 d = br_to_label(L, w->labels[0]);
         emit_insn(t, WASM_INSN_BR_IF, (i64)d);
         break;
diff --git a/src/arch/x64/native.c b/src/arch/x64/native.c
@@ -1109,7 +1109,10 @@ static u32 cmp_to_cc(CmpOp op) {
 }

 static int cmp_is_fp(CmpOp op, NativeLoc aop) {
-  return op >= CMP_LT_F || ((op == CMP_EQ || op == CMP_NE) && loc_is_fp(aop));
+  /* FP-ness is self-describing from the opcode; FP eq/ne are distinct opcodes
+   * (CMP_OEQ_F/CMP_UNE_F), so no operand-class sniffing is needed. */
+  (void)aop;
+  return op >= CMP_OEQ_F;
 }

 /* Emit `cmp ra, rb` (or ucomis[sd] for FP), setting flags from ra - rb. */
@@ -1148,12 +1151,12 @@ static void x64_fp_setcc_ordered(NativeTarget* t, u32 primary, u32 dst) {
   emit_alu_rr(mc, 0, X64_OPC_ALU_AND, dst, X64_R11);
 }

-/* FP NE: result = unordered (P) || NE. */
-static void x64_fp_setcc_ne(NativeTarget* t, u32 dst) {
+/* FP unordered predicate: result = (primary cc) || unordered (P). */
+static void x64_fp_setcc_unord(NativeTarget* t, u32 primary, u32 dst) {
   MCEmitter* mc = t->mc;
-  emit_setcc(mc, X64_CC_P, dst);
+  emit_setcc(mc, primary, dst);
   emit_movzx_r32_r8(mc, dst, dst);
-  emit_setcc(mc, X64_CC_NE, X64_R11);
+  emit_setcc(mc, X64_CC_P, X64_R11);
   emit_movzx_r32_r8(mc, X64_R11, X64_R11);
   emit_alu_rr(mc, 0, X64_OPC_ALU_OR, dst, X64_R11);
 }
@@ -1165,16 +1168,27 @@ static void x64_cmp(NativeTarget* t, CmpOp op, NativeLoc dst, NativeLoc aop,
   int fp = cmp_is_fp(op, aop);
   x64_emit_cmp_flags(t, aop, bop, fp);
   if (fp) {
-    /* ucomis sets CF/ZF; unordered sets PF. GT/GE map to A/AE (operands not
-     * swapped — ucomis already gives the right unsigned-flag semantics for
-     * a>b / a>=b on ordered inputs). EQ/LT/LE additionally require ordered. */
+    /* ucomis sets ZF/CF and, when unordered (NaN), also PF. Each predicate's
+     * flag formula is built explicitly (NOT blindly as !(opposite)):
+     *   ordered: E/B/BE alias {==,<,<=} only when also NP (not-parity);
+     *     NE/A/AE already exclude unordered, so they stand alone.
+     *   unordered: E/B/BE already include the unordered case (ZF/CF set on
+     *     NaN), so they stand alone; NE/A/AE need an OR with P.
*/
     switch (op) {
-      case CMP_NE: x64_fp_setcc_ne(t, d); return;
-      case CMP_EQ: x64_fp_setcc_ordered(t, X64_CC_E, d); return;
-      case CMP_LT_F: x64_fp_setcc_ordered(t, X64_CC_B, d); return;
-      case CMP_LE_F: x64_fp_setcc_ordered(t, X64_CC_BE, d); return;
-      case CMP_GT_F: emit_setcc(mc, X64_CC_A, d); break;
-      case CMP_GE_F: emit_setcc(mc, X64_CC_AE, d); break;
+      /* ordered: E/B/BE also require NP; NE/A/AE stand alone */
+      case CMP_OEQ_F: x64_fp_setcc_ordered(t, X64_CC_E, d); return;
+      case CMP_OLT_F: x64_fp_setcc_ordered(t, X64_CC_B, d); return;
+      case CMP_OLE_F: x64_fp_setcc_ordered(t, X64_CC_BE, d); return;
+      case CMP_ONE_F: emit_setcc(mc, X64_CC_NE, d); break;
+      case CMP_OGT_F: emit_setcc(mc, X64_CC_A, d); break;
+      case CMP_OGE_F: emit_setcc(mc, X64_CC_AE, d); break;
+      /* unordered: E/B/BE stand alone; NE/A/AE are ORed with P */
+      case CMP_UEQ_F: emit_setcc(mc, X64_CC_E, d); break;
+      case CMP_ULT_F: emit_setcc(mc, X64_CC_B, d); break;
+      case CMP_ULE_F: emit_setcc(mc, X64_CC_BE, d); break;
+      case CMP_UNE_F: x64_fp_setcc_unord(t, X64_CC_NE, d); return;
+      case CMP_UGT_F: x64_fp_setcc_unord(t, X64_CC_A, d); return;
+      case CMP_UGE_F: x64_fp_setcc_unord(t, X64_CC_AE, d); return;
       default: emit_setcc(mc, cmp_to_cc(op), d); break;
     }
     emit_movzx_r32_r8(mc, d, d);
diff --git a/src/cg/arith.c b/src/cg/arith.c
@@ -123,6 +123,16 @@ void api_cg_unop(CfreeCg* g, UnOp iop, u32 flags) {
     return;
   }

+  /* Logical NOT of a delayed compare stays delayed: invert the predicate in
+   * place. For FP this flips ordered<->unordered as well as the relation (via
+   * api_invert_cmp), so `!(a<b)` becomes UGE (NaN -> true), matching IEEE
+   * negation. The inverted compare keeps the same i32 result type.
*/ + if (iop == UO_NOT && a.kind == SV_CMP) { + a.delayed.cmp.op = api_invert_cmp(a.delayed.cmp.op); + api_push(g, a); + return; + } + if (!flags && api_sv_op_is(&a, OPK_IMM) && api_try_fold_int_unop(g, iop, ty, a.op.v.imm, &folded)) { api_release(g, &a); @@ -154,15 +164,11 @@ void api_cg_unop(CfreeCg* g, UnOp iop, u32 flags) { void api_cg_cmp(CfreeCg* g, CmpOp cop) { ApiSValue b, a; - CgTarget* T; CfreeCgTypeId opty; CfreeCgTypeId i32; Operand ra, rb; - CGLocal rr; - Operand dst; i64 folded; if (!g) return; - T = g->target; b = api_pop(g); a = api_pop(g); opty = a.type ? a.type : b.type; @@ -178,18 +184,16 @@ void api_cg_cmp(CfreeCg* g, CmpOp cop) { ra = api_force_local_unless_imm(g, &a, opty); rb = api_force_local_unless_imm(g, &b, opty); - if (!api_type_is_float(g->c, opty)) { - api_push(g, - api_make_cmp(cop, ra, rb, i32, api_sv_owns_operand_local(&a, &ra), - api_sv_owns_operand_local(&b, &rb))); - return; - } - rr = api_alloc_temp_local(g, i32); - dst = api_op_local(rr, i32); - T->cmp(T, cop, dst, ra, rb); - api_release(g, &a); - api_release(g, &b); - api_push(g, api_make_sv(dst, i32)); + /* Both integer and FP compares are produced as delayed SV_CMP values. + * Delaying is what lets api_branch_if (and api_cg_unop's UO_NOT) invert + * the compare via api_invert_cmp, reaching the unordered FP duals + * (UGE/UGT/ULE/ULT/UEQ/UNE) from `!(a<b)` etc. with NaN-correct semantics. + * If the compare instead escapes into value context it is materialized + * unchanged via api_materialize_cmp_to, which calls T->cmp with the same + * opcode the eager path used to. 
*/
+  api_push(g,
+           api_make_cmp(cop, ra, rb, i32, api_sv_owns_operand_local(&a, &ra),
+                        api_sv_owns_operand_local(&b, &rb)));
 }

 int api_try_i128_convert(CfreeCg* g, ConvKind ck, CfreeCgTypeId sty,
@@ -599,53 +603,78 @@ void cfree_cg_fp_unop(CfreeCg* g, CfreeCgFpUnOp op, uint32_t flags) {
   api_cg_unop(g, UO_FNEG, 0);
 }

+/* f128 single-libcall comparison: call `name(a,b)` and test its i32 three-way
+ * result against 0 with `icmp`. Consumes the two f128 operands on the stack and
+ * pushes the i32 boolean. */
+static void api_f128_cmp_call(CfreeCg* g, const char* name, CmpOp icmp) {
+  CfreeCgTypeId f128 = builtin_id(CFREE_CG_BUILTIN_F128);
+  CfreeCgTypeId i32 = builtin_id(CFREE_CG_BUILTIN_I32);
+  CfreeCgTypeId ps[2];
+  ApiSValue args[2];
+  ps[0] = f128;
+  ps[1] = f128;
+  args[1] = api_pop(g);
+  args[0] = api_pop(g);
+  api_runtime_call_values(g, name, i32, ps, 2, args);
+  cfree_cg_push_int(g, 0, i32);
+  api_cg_cmp(g, icmp);
+}
+
+/* UEQ and ONE are the only f128 predicates that cannot be a single libcall:
+ * __eqtf2/__netf2 return +1 for both "greater" and "unordered" (0 only for
+ * "equal"), so a separate __unordtf2 call is needed to split unordered out.
+ *   UEQ = (__eqtf2(a,b) == 0) || (__unordtf2(a,b) != 0)
+ *   ONE = (__netf2(a,b) != 0) && (__unordtf2(a,b) == 0)
+ * The operands are dup'd (cfree_cg_dup copies into a fresh owned local) so each
+ * libcall consumes its own copy. */
+static void api_f128_cmp_with_unord(CfreeCg* g, CfreeCgFpCmpOp op) {
+  const char* relname = (op == CFREE_CG_FP_UEQ) ? "__eqtf2" : "__netf2";
+  CmpOp relcmp = (op == CFREE_CG_FP_UEQ) ? CMP_EQ : CMP_NE;
+  /* [a, b] -> [a, b, a, b] */
+  cfree_cg_dup2(g);
+  /* relation on the top (dup'd) copy: [a, b, R] */
+  api_f128_cmp_call(g, relname, relcmp);
+  /* bring the original a, b back to TOS with R underneath: [R, a, b] */
+  cfree_cg_rot3(g);
+  cfree_cg_rot3(g);
+  if (op == CFREE_CG_FP_UEQ) {
+    api_f128_cmp_call(g, "__unordtf2", CMP_NE); /* [R, unordered?]
*/
+    api_cg_binop(g, BO_OR, 0); /* R || unordered */
+  } else {
+    api_f128_cmp_call(g, "__unordtf2", CMP_EQ); /* [R, ordered?] */
+    api_cg_binop(g, BO_AND, 0); /* R && ordered */
+  }
+}
+
 void cfree_cg_fp_cmp(CfreeCg* g, CfreeCgFpCmpOp op) {
   if (api_f128_stack_top(g, 0) || api_f128_stack_top(g, 1)) {
-    CfreeCgTypeId f128 = builtin_id(CFREE_CG_BUILTIN_F128);
-    CfreeCgTypeId i32 = builtin_id(CFREE_CG_BUILTIN_I32);
-    CfreeCgTypeId ps[2];
-    ApiSValue args[2];
-    const char* name = "__eqtf2";
-    CmpOp cmp = CMP_EQ;
+    /* f128/long double is soft-float: the comparison is a libcall returning a
+     * three-way i32 we test against 0. cfree's runtime uses the standard
+     * compiler-rt sign convention (rt/lib/impl/fp_compare_impl.inc):
+     *   __le-family (__eqtf2/__netf2/__lttf2/__letf2): NaN -> +1
+     *   __ge-family (__getf2/__gttf2):                 NaN -> -1
+     * so each ordered predicate AND its unordered dual maps to one libcall,
+     * choosing the helper whose NaN sign makes the integer test fall the right
+     * way (ordered: NaN must fail; unordered: NaN must pass). Only UEQ/ONE,
+     * which must tell "unordered" apart from "less"/"greater", need a second
+     * (__unordtf2) call.
*/ switch (op) { - case CFREE_CG_FP_OEQ: + case CFREE_CG_FP_OEQ: api_f128_cmp_call(g, "__eqtf2", CMP_EQ); return; + case CFREE_CG_FP_UNE: api_f128_cmp_call(g, "__netf2", CMP_NE); return; + case CFREE_CG_FP_OLT: api_f128_cmp_call(g, "__lttf2", CMP_LT_S); return; + case CFREE_CG_FP_OLE: api_f128_cmp_call(g, "__letf2", CMP_LE_S); return; + case CFREE_CG_FP_OGT: api_f128_cmp_call(g, "__gttf2", CMP_GT_S); return; + case CFREE_CG_FP_OGE: api_f128_cmp_call(g, "__getf2", CMP_GE_S); return; + /* unordered duals via the opposite-sign helper (NaN flips the test): */ + case CFREE_CG_FP_UGE: api_f128_cmp_call(g, "__lttf2", CMP_GE_S); return; + case CFREE_CG_FP_UGT: api_f128_cmp_call(g, "__letf2", CMP_GT_S); return; + case CFREE_CG_FP_ULT: api_f128_cmp_call(g, "__getf2", CMP_LT_S); return; + case CFREE_CG_FP_ULE: api_f128_cmp_call(g, "__gttf2", CMP_LE_S); return; case CFREE_CG_FP_UEQ: - name = "__eqtf2"; - cmp = CMP_EQ; - break; case CFREE_CG_FP_ONE: - case CFREE_CG_FP_UNE: - name = "__netf2"; - cmp = CMP_NE; - break; - case CFREE_CG_FP_OLT: - case CFREE_CG_FP_ULT: - name = "__lttf2"; - cmp = CMP_LT_S; - break; - case CFREE_CG_FP_OLE: - case CFREE_CG_FP_ULE: - name = "__letf2"; - cmp = CMP_LE_S; - break; - case CFREE_CG_FP_OGT: - case CFREE_CG_FP_UGT: - name = "__gttf2"; - cmp = CMP_GT_S; - break; - case CFREE_CG_FP_OGE: - case CFREE_CG_FP_UGE: - name = "__getf2"; - cmp = CMP_GE_S; - break; + api_f128_cmp_with_unord(g, op); + return; } - args[1] = api_pop(g); - args[0] = api_pop(g); - ps[0] = f128; - ps[1] = f128; - api_runtime_call_values(g, name, i32, ps, 2, args); - cfree_cg_push_int(g, 0, i32); - api_cg_cmp(g, cmp); return; } api_cg_cmp(g, api_map_fp_cmp(op)); diff --git a/src/cg/cgtarget.h b/src/cg/cgtarget.h @@ -51,12 +51,16 @@ typedef enum UnOp { UO_BNOT, /* bitwise ~ */ } UnOp; -/* Compares producing i1. Integer signed/unsigned variants are total. 
The - * floating relationals (CMP_LT_F/LE_F/GT_F/GE_F) are ordered (NaN -> false); on - * floats CMP_EQ is ordered-equal (NaN -> false) and CMP_NE is unordered-not- - * equal (NaN -> true), matching C ==/!=. The internal set does not encode the - * ordered/unordered distinction for the relationals; see the FP-compare notes - * and the known lowering gap in doc/IR.md. */ +/* Compares producing i1. The 10 integer members (CMP_EQ..CMP_GE_U) are total + * and 1:1 with CfreeCgIntCmpOp; on integers CMP_EQ/CMP_NE are plain equality. + * + * The 12 floating-point members form a disjoint block laid out *after* the + * integer block, in the same order as the public CfreeCgFpCmpOp, and are + * IEEE-complete: each predicate encodes ordered (NaN -> false) vs unordered + * (NaN -> true) explicitly, so the distinction reaches every backend. The + * identity used throughout the backends is unordered-R == NOT(ordered-not-R) + * (e.g. ULT == !(OGE), UNE == !(OEQ)). CMP_OEQ_F is the FP boundary: an op is a + * floating compare iff op >= CMP_OEQ_F. */ typedef enum CmpOp { CMP_EQ, CMP_NE, @@ -68,10 +72,20 @@ typedef enum CmpOp { CMP_LE_U, CMP_GT_U, CMP_GE_U, - CMP_LT_F, - CMP_LE_F, - CMP_GT_F, - CMP_GE_F, + /* Ordered FP relationals (NaN -> false). */ + CMP_OEQ_F, + CMP_ONE_F, + CMP_OLT_F, + CMP_OLE_F, + CMP_OGT_F, + CMP_OGE_F, + /* Unordered FP relationals (NaN -> true). */ + CMP_UEQ_F, + CMP_UNE_F, + CMP_ULT_F, + CMP_ULE_F, + CMP_UGT_F, + CMP_UGE_F, } CmpOp; /* Conversions. Widths must order correctly (sext/zext widen, trunc narrows, diff --git a/src/cg/fold.c b/src/cg/fold.c @@ -233,14 +233,33 @@ CmpOp api_invert_cmp(CmpOp op) { return CMP_LE_U; case CMP_GE_U: return CMP_LT_U; - case CMP_LT_F: - return CMP_GE_F; - case CMP_LE_F: - return CMP_GT_F; - case CMP_GT_F: - return CMP_LE_F; - case CMP_GE_F: - return CMP_LT_F; + /* FP: the negation of a compare must flip ordered<->unordered (the NaN + * outcome flips too) as well as negate the relation. 
The correct inverse + * of ordered `a<b` is *unordered* `a>=b`, not ordered `a>=b`. */ + case CMP_OEQ_F: + return CMP_UNE_F; + case CMP_ONE_F: + return CMP_UEQ_F; + case CMP_OLT_F: + return CMP_UGE_F; + case CMP_OLE_F: + return CMP_UGT_F; + case CMP_OGT_F: + return CMP_ULE_F; + case CMP_OGE_F: + return CMP_ULT_F; + case CMP_UEQ_F: + return CMP_ONE_F; + case CMP_UNE_F: + return CMP_OEQ_F; + case CMP_ULT_F: + return CMP_OGE_F; + case CMP_ULE_F: + return CMP_OGT_F; + case CMP_UGT_F: + return CMP_OLE_F; + case CMP_UGE_F: + return CMP_OLT_F; } return CMP_EQ; } diff --git a/src/cg/ir_dump.c b/src/cg/ir_dump.c @@ -114,10 +114,18 @@ static const char* cg_ir_cmp_name(CmpOp op) { case CMP_LE_U: return "le_u"; case CMP_GT_U: return "gt_u"; case CMP_GE_U: return "ge_u"; - case CMP_LT_F: return "lt_f"; - case CMP_LE_F: return "le_f"; - case CMP_GT_F: return "gt_f"; - case CMP_GE_F: return "ge_f"; + case CMP_OEQ_F: return "oeq_f"; + case CMP_ONE_F: return "one_f"; + case CMP_OLT_F: return "olt_f"; + case CMP_OLE_F: return "ole_f"; + case CMP_OGT_F: return "ogt_f"; + case CMP_OGE_F: return "oge_f"; + case CMP_UEQ_F: return "ueq_f"; + case CMP_UNE_F: return "une_f"; + case CMP_ULT_F: return "ult_f"; + case CMP_ULE_F: return "ule_f"; + case CMP_UGT_F: return "ugt_f"; + case CMP_UGE_F: return "uge_f"; } return "??"; } diff --git a/src/cg/value.c b/src/cg/value.c @@ -544,28 +544,36 @@ CmpOp api_map_int_cmp(CfreeCgIntCmpOp op) { return CMP_EQ; } +/* 1:1: the internal FP block mirrors CfreeCgFpCmpOp's order, so the public + * ordered/unordered distinction is preserved all the way to the backends. 
*/ CmpOp api_map_fp_cmp(CfreeCgFpCmpOp op) { switch (op) { case CFREE_CG_FP_OEQ: - case CFREE_CG_FP_UEQ: - return CMP_EQ; + return CMP_OEQ_F; case CFREE_CG_FP_ONE: - case CFREE_CG_FP_UNE: - return CMP_NE; + return CMP_ONE_F; case CFREE_CG_FP_OLT: - case CFREE_CG_FP_ULT: - return CMP_LT_F; + return CMP_OLT_F; case CFREE_CG_FP_OLE: - case CFREE_CG_FP_ULE: - return CMP_LE_F; + return CMP_OLE_F; case CFREE_CG_FP_OGT: - case CFREE_CG_FP_UGT: - return CMP_GT_F; + return CMP_OGT_F; case CFREE_CG_FP_OGE: + return CMP_OGE_F; + case CFREE_CG_FP_UEQ: + return CMP_UEQ_F; + case CFREE_CG_FP_UNE: + return CMP_UNE_F; + case CFREE_CG_FP_ULT: + return CMP_ULT_F; + case CFREE_CG_FP_ULE: + return CMP_ULE_F; + case CFREE_CG_FP_UGT: + return CMP_UGT_F; case CFREE_CG_FP_UGE: - return CMP_GE_F; + return CMP_UGE_F; } - return CMP_EQ; + return CMP_OEQ_F; } diff --git a/src/interp/engine.c b/src/interp/engine.c @@ -337,16 +337,25 @@ static u64 do_binop(InterpStack* st, u32 binop, u64 a, u64 b, u32 w, u8 fp) { } } -static int do_cmp(InterpStack* st, u32 cmp, u64 a, u64 b, u32 w, u8 fp) { - if (fp || (cmp >= CMP_LT_F)) { +static int do_cmp(InterpStack* st, u32 cmp, u64 a, u64 b, u32 w) { + /* FP-ness is self-describing from the opcode (the FP block starts at + * CMP_OEQ_F); no operand-class sniffing needed. 
*/ + if (cmp >= CMP_OEQ_F) { double x = rd_f(a, w), y = rd_f(b, w); + int uno = (x != x) || (y != y); /* unordered: either operand is NaN */ switch ((CmpOp)cmp) { - case CMP_EQ: return x == y; - case CMP_NE: return x != y; - case CMP_LT_F: return x < y; - case CMP_LE_F: return x <= y; - case CMP_GT_F: return x > y; - case CMP_GE_F: return x >= y; + case CMP_OEQ_F: return x == y; /* ordered: false on NaN */ + case CMP_ONE_F: return !uno && (x != y); + case CMP_OLT_F: return x < y; + case CMP_OLE_F: return x <= y; + case CMP_OGT_F: return x > y; + case CMP_OGE_F: return x >= y; + case CMP_UEQ_F: return uno || (x == y); + case CMP_UNE_F: return x != y; /* unordered: true on NaN */ + case CMP_ULT_F: return uno || (x < y); + case CMP_ULE_F: return uno || (x <= y); + case CMP_UGT_F: return uno || (x > y); + case CMP_UGE_F: return uno || (x >= y); default: break; } } @@ -908,7 +917,7 @@ CfreeInterpStatus interp_run_stack(InterpStack* st, int64_t* out_ret) { u64 r = (u64)do_cmp(st, in->sub, op_value(st, fn, regs, mem_off, &I->opnds[1]), op_value(st, fn, regs, mem_off, &I->opnds[2]), - in->w0, in->fp0); + in->w0); if (st->status) goto stop; write_dst(st, fn, regs, mem_off, &I->opnds[0], r); NEXT(); @@ -952,7 +961,7 @@ CfreeInterpStatus interp_run_stack(InterpStack* st, int64_t* out_ret) { int taken = do_cmp(st, in->sub, op_value(st, fn, regs, mem_off, &I->opnds[0]), op_value(st, fn, regs, mem_off, &I->opnds[1]), - in->w0 ? in->w0 : 8u, in->fp0); + in->w0 ? in->w0 : 8u); if (st->status) goto stop; if (g_mem_fault) { fault(st, "invalid memory access"); goto stop; } ip = &fn->code[taken ? in->t0 : in->t1]; diff --git a/src/opt/pass_jump.c b/src/opt/pass_jump.c @@ -206,6 +206,45 @@ static int invert_cmp(CmpOp op, CmpOp* out) { case CMP_GE_U: *out = CMP_LT_U; return 1; + /* FP: negation flips ordered<->unordered (the NaN outcome flips too) as + * well as negating the relation. 
Kept case-for-case in sync with
+     * api_invert_cmp (src/cg/fold.c); see that function for the rationale. */
+    case CMP_OEQ_F:
+      *out = CMP_UNE_F;
+      return 1;
+    case CMP_ONE_F:
+      *out = CMP_UEQ_F;
+      return 1;
+    case CMP_OLT_F:
+      *out = CMP_UGE_F;
+      return 1;
+    case CMP_OLE_F:
+      *out = CMP_UGT_F;
+      return 1;
+    case CMP_OGT_F:
+      *out = CMP_ULE_F;
+      return 1;
+    case CMP_OGE_F:
+      *out = CMP_ULT_F;
+      return 1;
+    case CMP_UEQ_F:
+      *out = CMP_ONE_F;
+      return 1;
+    case CMP_UNE_F:
+      *out = CMP_OEQ_F;
+      return 1;
+    case CMP_ULT_F:
+      *out = CMP_OGE_F;
+      return 1;
+    case CMP_ULE_F:
+      *out = CMP_OGT_F;
+      return 1;
+    case CMP_UGT_F:
+      *out = CMP_OLE_F;
+      return 1;
+    case CMP_UGE_F:
+      *out = CMP_OLT_F;
+      return 1;
     default:
       return 0;
   }
diff --git a/test/api/cg_fp_cmp_test.c b/test/api/cg_fp_cmp_test.c
@@ -0,0 +1,290 @@
+/* cg_fp_cmp_test: drives every public CfreeCgFpCmpOp predicate through
+ * cfree_cg_fp_cmp. The six discriminating predicates (ULT/ULE/UGT/UGE/UEQ/ONE)
+ * have no plain C or toy operator spelling (frontends reach them only via
+ * builtins or negated compares); this guards against the "advertise-but-ignore"
+ * gap the public surface used to have (api_map_fp_cmp collapsed all 12 down to
+ * 6, dropping the ordered/unordered distinction before any backend saw it).
+ *
+ * Two kinds of coverage:
+ *
+ *   Execution: build `int f(double,double){ return pred(a,b); }` for each
+ *   predicate, capture it into an in-process interpreter, and assert the
+ *   result for a spread of NaN / ordinary / signed-zero operands.
+ *   Host-independent: the interpreter runs the target-independent IR, so this
+ *   validates the public->internal mapping and the engine for all 12 scalar
+ *   (f64) predicates (§6); f128 compares are libcalls, covered by emission.
+ *
+ *   Emission: build the same functions, plus an f128 variant that fpext's the
+ *   operands to long double, for aarch64 / x86-64 / riscv64 at -O0 and -O1
+ *   (and f64 only for wasm32 at -O0) and finalize the object.
A backend that mishandles a new unordered opcode + * panics, which aborts (and fails) the test — this is what catches the + * wasm FP-eq/ne gap (§5) and the silent default-arm dispatch tables. + * + * Run by: make test-cg-api + */ + +#include <cfree/cg.h> +#include <cfree/core.h> +#include <cfree/interp.h> +#include <cfree/object.h> +#include <stdint.h> +#include <stdio.h> +#include <string.h> + +#include "lib/cfree_unit.h" + +static CfreeUnit g_u; +#define EXPECT(cond, ...) CU_EXPECT(&g_u, cond, __VA_ARGS__) + +/* ---- the 12 predicates, in CfreeCgFpCmpOp order ---------------------- */ + +typedef struct { + CfreeCgFpCmpOp op; + const char* name; +} Pred; + +static const Pred PREDS[] = { + {CFREE_CG_FP_OEQ, "oeq"}, {CFREE_CG_FP_ONE, "one"}, + {CFREE_CG_FP_OLT, "olt"}, {CFREE_CG_FP_OLE, "ole"}, + {CFREE_CG_FP_OGT, "ogt"}, {CFREE_CG_FP_OGE, "oge"}, + {CFREE_CG_FP_UEQ, "ueq"}, {CFREE_CG_FP_UNE, "une"}, + {CFREE_CG_FP_ULT, "ult"}, {CFREE_CG_FP_ULE, "ule"}, + {CFREE_CG_FP_UGT, "ugt"}, {CFREE_CG_FP_UGE, "uge"}, +}; +enum { NPRED = (int)(sizeof PREDS / sizeof PREDS[0]) }; + +/* IEEE-correct reference result (mirrors src/interp/engine.c do_cmp). 
*/ +static int fp_expected(CfreeCgFpCmpOp op, double a, double b) { + int uno = (a != a) || (b != b); /* either operand is NaN */ + switch (op) { + case CFREE_CG_FP_OEQ: return a == b; + case CFREE_CG_FP_ONE: return !uno && (a != b); + case CFREE_CG_FP_OLT: return a < b; + case CFREE_CG_FP_OLE: return a <= b; + case CFREE_CG_FP_OGT: return a > b; + case CFREE_CG_FP_OGE: return a >= b; + case CFREE_CG_FP_UEQ: return uno || (a == b); + case CFREE_CG_FP_UNE: return a != b; + case CFREE_CG_FP_ULT: return uno || (a < b); + case CFREE_CG_FP_ULE: return uno || (a <= b); + case CFREE_CG_FP_UGT: return uno || (a > b); + case CFREE_CG_FP_UGE: return uno || (a >= b); + } + return 0; +} + +/* ---- operand spread: bit patterns so NaN/-0.0 are exact ------------- */ + +typedef struct { + uint64_t bits; + const char* desc; +} Operand64; + +static const Operand64 OPS[] = { + {0x7ff8000000000000ull, "nan"}, {0x3ff0000000000000ull, "1.0"}, + {0x4000000000000000ull, "2.0"}, {0x8000000000000000ull, "-0.0"}, + {0x0000000000000000ull, "0.0"}, +}; +enum { NOPS = (int)(sizeof OPS / sizeof OPS[0]) }; + +static double bits_to_double(uint64_t b) { + double d; + memcpy(&d, &b, sizeof d); + return d; +} + +/* ---- build `int f(double,double){ return pred(a,b); }` -------------- * + * When use_f128 is set the loaded f64 operands are fpext'd to long double + * before the compare, exercising the soft-float libcall path. Returns the + * declared symbol; the function is captured by `name` for interp lookup. 
*/ +static void build_cmp_fn(CfreeCompiler* c, CfreeCg* cg, const char* name, + CfreeCgFpCmpOp op, int use_f128) { + CfreeCgBuiltinTypes bi = cfree_cg_builtin_types(c); + CfreeCgTypeId i32 = bi.id[CFREE_CG_BUILTIN_I32]; + CfreeCgTypeId f64 = bi.id[CFREE_CG_BUILTIN_F64]; + CfreeCgTypeId f128 = bi.id[CFREE_CG_BUILTIN_F128]; + CfreeCgFuncParam params[2]; + CfreeCgFuncResult result; + CfreeCgFuncSig sig; + CfreeCgDecl decl; + CfreeCgSym sym; + CfreeCgLocalAttrs la; + CfreeCgLocal p0, p1; + CfreeCgMemAccess ma; + + memset(params, 0, sizeof params); + params[0].type = f64; + params[1].type = f64; + memset(&result, 0, sizeof result); + result.type = i32; + memset(&sig, 0, sizeof sig); + sig.results = &result; + sig.nresults = 1; + sig.params = params; + sig.nparams = 2; + sig.call_conv = CFREE_CG_CC_TARGET_C; + + memset(&decl, 0, sizeof decl); + decl.kind = CFREE_CG_DECL_FUNC; + decl.linkage_name = cfree_sym_intern(c, cfree_slice_cstr(name)); + decl.display_name = decl.linkage_name; + decl.type = cfree_cg_type_func(c, sig); + decl.sym.bind = CFREE_SB_GLOBAL; + decl.sym.visibility = CFREE_CG_VIS_DEFAULT; + sym = cfree_cg_decl(cg, decl); + EXPECT(sym != CFREE_CG_SYM_NONE, "%s: decl failed", name); + + cfree_cg_func_begin(cg, sym); + memset(&la, 0, sizeof la); + p0 = cfree_cg_param(cg, 0, f64, la); + p1 = cfree_cg_param(cg, 1, f64, la); + + memset(&ma, 0, sizeof ma); + ma.type = f64; + ma.align = cfree_cg_type_align(c, f64); + + cfree_cg_push_local(cg, p0); + cfree_cg_load(cg, ma); + if (use_f128) cfree_cg_fpext(cg, f128); + cfree_cg_push_local(cg, p1); + cfree_cg_load(cg, ma); + if (use_f128) cfree_cg_fpext(cg, f128); + cfree_cg_fp_cmp(cg, op); + cfree_cg_ret(cg); + cfree_cg_func_end(cg); +} + +/* ---- Execution coverage (in-process interpreter) -------------------- */ + +/* The interpreter captures functions during the optimizer's emit pass, which + * only runs at opt_level >= 1 (matching `cfree run --no-jit`), and the capture + * happens for the non-aarch64 codegen path — 
so we build for x86-64 here. The + * interpreter runs the *target-independent* IR, so the result is the host- + * independent semantics of every predicate regardless of this capture target. + * + * Scalar (f64) only: the f128 path lowers each compare to a soft-float libcall + * (__eqtf2/__unordtf2/...), which the bare interpreter cannot resolve without a + * linked runtime. The f128 lowering is covered by run_emit (emission) and, end + * to end, by the `long double` cases in the parse suite under JIT. */ +static void run_exec(void) { + CfreeTarget tgt = + cfree_unit_target(CFREE_ARCH_X86_64, CFREE_OS_LINUX, CFREE_OBJ_ELF); + CfreeCompiler* c = NULL; + CfreeInterpProgram* pp; + CfreeObjBuilder* ob = NULL; + CfreeCg* cg = NULL; + CfreeCodeOptions opts; + const char* tag = "f64"; + int i, j, k; + char nm[32]; + + if (cfree_unit_compiler_new(&g_u, tgt, &c) != CFREE_OK || !c) { + EXPECT(0, "%s: compiler_new failed", tag); + return; + } + pp = cfree_interp_program_new(c); + EXPECT(pp != NULL, "%s: interp_program_new failed", tag); + cfree_interp_program_attach(pp, c); + + EXPECT(cfree_obj_builder_new(c, &ob) == CFREE_OK && ob, "%s: obj_new", tag); + EXPECT(cfree_cg_new(c, &cg) == CFREE_OK && cg, "%s: cg_new", tag); + memset(&opts, 0, sizeof opts); + opts.opt_level = 1; /* interp capture requires the optimizer pass */ + cfree_cg_begin_obj(cg, ob, &opts); + + for (i = 0; i < NPRED; ++i) { + snprintf(nm, sizeof nm, "cmp_%s_%d", tag, i); + build_cmp_fn(c, cg, nm, PREDS[i].op, /*use_f128=*/0); + } + EXPECT(cfree_cg_end_obj(cg) == CFREE_OK, "%s: end_obj", tag); + + for (i = 0; i < NPRED; ++i) { + CfreeInterpFunc* fn; + snprintf(nm, sizeof nm, "cmp_%s_%d", tag, i); + fn = cfree_interp_lookup(pp, cfree_slice_cstr(nm)); + EXPECT(fn != NULL, "%s: %s not captured", tag, PREDS[i].name); + if (!fn) continue; + for (j = 0; j < NOPS; ++j) { + for (k = 0; k < NOPS; ++k) { + uint64_t args[2] = {OPS[j].bits, OPS[k].bits}; + int64_t ret = -1; + int want = fp_expected(PREDS[i].op, 
bits_to_double(OPS[j].bits), + bits_to_double(OPS[k].bits)); + CfreeInterpStatus s = + cfree_interp_call_args(pp, fn, args, 2, &ret); + EXPECT(s == CFREE_INTERP_DONE && (int)ret == want, + "%s %s(%s,%s): want %d got %lld (status %d)", tag, + PREDS[i].name, OPS[j].desc, OPS[k].desc, want, (long long)ret, + (int)s); + } + } + } + + cfree_cg_free(cg); + cfree_obj_builder_free(ob); + cfree_interp_program_free(pp); + cfree_compiler_free(c); +} + +/* ---- Emission coverage (every backend, no execution) ---------------- */ + +static void run_emit(CfreeArchKind arch, CfreeOSKind os, CfreeObjFmt fmt, + const char* tag, int opt_level, int do_f128) { + CfreeTarget tgt = cfree_unit_target(arch, os, fmt); + CfreeCompiler* c = NULL; + CfreeObjBuilder* ob = NULL; + CfreeCg* cg = NULL; + CfreeCodeOptions opts; + int i; + char nm[40]; + + if (cfree_unit_compiler_new(&g_u, tgt, &c) != CFREE_OK || !c) { + EXPECT(0, "%s/O%d: compiler_new failed", tag, opt_level); + return; + } + EXPECT(cfree_obj_builder_new(c, &ob) == CFREE_OK && ob, "%s: obj_new", tag); + EXPECT(cfree_cg_new(c, &cg) == CFREE_OK && cg, "%s: cg_new", tag); + memset(&opts, 0, sizeof opts); + opts.opt_level = opt_level; + cfree_cg_begin_obj(cg, ob, &opts); + + for (i = 0; i < NPRED; ++i) { + snprintf(nm, sizeof nm, "emit_%s_o%d_f64_%d", tag, opt_level, i); + build_cmp_fn(c, cg, nm, PREDS[i].op, /*use_f128=*/0); + if (do_f128) { + snprintf(nm, sizeof nm, "emit_%s_o%d_f128_%d", tag, opt_level, i); + build_cmp_fn(c, cg, nm, PREDS[i].op, /*use_f128=*/1); + } + } + /* If any backend mishandles a new opcode it panics here (aborting the test); + * otherwise the object finalizes cleanly. */ + EXPECT(cfree_cg_end_obj(cg) == CFREE_OK, "%s/O%d: end_obj failed", tag, + opt_level); + + cfree_cg_free(cg); + cfree_obj_builder_free(ob); + cfree_compiler_free(c); +} + +int main(void) { + int opt; + cfree_unit_init(&g_u); + + /* Execution: public mapping + interp semantics for all 12 scalar predicates. 
*/
+  run_exec();
+
+  /* Emission: every native backend lowers every predicate (f64 + f128) without
+   * panicking, at both opt levels. */
+  for (opt = 0; opt <= 1; ++opt) {
+    run_emit(CFREE_ARCH_ARM_64, CFREE_OS_LINUX, CFREE_OBJ_ELF, "aa64", opt, 1);
+    run_emit(CFREE_ARCH_X86_64, CFREE_OS_LINUX, CFREE_OBJ_ELF, "x64", opt, 1);
+    run_emit(CFREE_ARCH_RV64, CFREE_OS_LINUX, CFREE_OBJ_ELF, "rv64", opt, 1);
+  }
+  /* wasm: O0 only (the O1 optimizer is native-target only), f64 only;
+   * exercises the from-scratch FP eq/ne + unordered arms in the wasm backend
+   * (emit_fp_cmp), which previously had no float eq/ne path at all. */
+  run_emit(CFREE_ARCH_WASM, CFREE_OS_WASI, CFREE_OBJ_WASM, "wasm", 0, 0);
+
+  cfree_unit_summary(&g_u, "cg_fp_cmp_test");
+  return cfree_unit_status(&g_u);
+}
diff --git a/test/interp/interp_smoke_test.c b/test/interp/interp_smoke_test.c
@@ -554,12 +554,46 @@ static void spec_fp_cmp_nan(void) {
   TestCtx tc;
   double nan = bitsd(0x7ff8000000000000ull);
   tc_init(&tc);
-  EXPECT(run_fcmp(&tc, CMP_LT_F, nan, 1.0) == 0, "lt_f NaN ordered -> false");
-  EXPECT(run_fcmp(&tc, CMP_GE_F, 1.0, nan) == 0, "ge_f NaN ordered -> false");
-  EXPECT(run_fcmp(&tc, CMP_EQ, nan, nan) == 0, "eq NaN ordered -> false");
-  EXPECT(run_fcmp(&tc, CMP_NE, nan, nan) == 1, "ne NaN unordered -> true");
-  EXPECT(run_fcmp(&tc, CMP_EQ, -0.0, 0.0) == 1, "eq -0.0 == 0.0 -> true");
-  EXPECT(run_fcmp(&tc, CMP_LT_F, 1.0, 2.0) == 1, "lt_f ordinary -> true");
+  /* Ordered relationals + OEQ are false on NaN; the unordered duals are true.
+   * Between them the cases below cover NaN-lhs, NaN-rhs, both-NaN, ordinary
+   * operands, and a -0.0/0.0 boundary, so the ordered/unordered split is
+   * exercised end to end.
*/
+  EXPECT(run_fcmp(&tc, CMP_OLT_F, nan, 1.0) == 0, "olt NaN-lhs -> false");
+  EXPECT(run_fcmp(&tc, CMP_OGE_F, 1.0, nan) == 0, "oge NaN-rhs -> false");
+  EXPECT(run_fcmp(&tc, CMP_OEQ_F, nan, nan) == 0, "oeq both-NaN -> false");
+  EXPECT(run_fcmp(&tc, CMP_UNE_F, nan, nan) == 1, "une both-NaN -> true");
+  EXPECT(run_fcmp(&tc, CMP_OEQ_F, -0.0, 0.0) == 1, "oeq -0.0 == 0.0 -> true");
+  EXPECT(run_fcmp(&tc, CMP_OLT_F, 1.0, 2.0) == 1, "olt ordinary -> true");
+
+  /* Ordered predicates: false on any NaN, normal otherwise. */
+  EXPECT(run_fcmp(&tc, CMP_OEQ_F, 1.0, 1.0) == 1, "oeq 1==1 -> true");
+  EXPECT(run_fcmp(&tc, CMP_ONE_F, 1.0, 2.0) == 1, "one 1!=2 -> true");
+  EXPECT(run_fcmp(&tc, CMP_ONE_F, 1.0, 1.0) == 0, "one 1!=1 -> false");
+  EXPECT(run_fcmp(&tc, CMP_ONE_F, nan, 1.0) == 0, "one NaN -> false");
+  EXPECT(run_fcmp(&tc, CMP_OLE_F, 1.0, 1.0) == 1, "ole 1<=1 -> true");
+  EXPECT(run_fcmp(&tc, CMP_OLE_F, 2.0, 1.0) == 0, "ole 2<=1 -> false");
+  EXPECT(run_fcmp(&tc, CMP_OLE_F, nan, 1.0) == 0, "ole NaN -> false");
+  EXPECT(run_fcmp(&tc, CMP_OGT_F, 2.0, 1.0) == 1, "ogt 2>1 -> true");
+  EXPECT(run_fcmp(&tc, CMP_OGT_F, 1.0, nan) == 0, "ogt NaN-rhs -> false");
+  EXPECT(run_fcmp(&tc, CMP_OGE_F, 1.0, 1.0) == 1, "oge 1>=1 -> true");
+
+  /* Unordered predicates: true on any NaN, ordered result otherwise. */
+  EXPECT(run_fcmp(&tc, CMP_UEQ_F, nan, 1.0) == 1, "ueq NaN -> true");
+  EXPECT(run_fcmp(&tc, CMP_UEQ_F, 1.0, 2.0) == 0, "ueq 1==2 ordered -> false");
+  EXPECT(run_fcmp(&tc, CMP_UEQ_F, 1.0, 1.0) == 1, "ueq 1==1 ordered -> true");
+  EXPECT(run_fcmp(&tc, CMP_UNE_F, 1.0, 1.0) == 0, "une 1!=1 ordered -> false");
+  EXPECT(run_fcmp(&tc, CMP_ULT_F, nan, 1.0) == 1, "ult NaN-lhs -> true");
+  EXPECT(run_fcmp(&tc, CMP_ULT_F, 1.0, 2.0) == 1, "ult 1<2 -> true");
+  EXPECT(run_fcmp(&tc, CMP_ULT_F, 2.0, 1.0) == 0, "ult 2<1 -> false");
+  EXPECT(run_fcmp(&tc, CMP_ULE_F, 1.0, nan) == 1, "ule NaN-rhs -> true");
+  EXPECT(run_fcmp(&tc, CMP_ULE_F, 1.0, 1.0) == 1, "ule 1<=1 -> true");
+  EXPECT(run_fcmp(&tc, CMP_ULE_F, 2.0, 1.0) == 0, "ule 2<=1 -> false");
+  EXPECT(run_fcmp(&tc, CMP_UGT_F, nan, nan) == 1, "ugt both-NaN -> true");
+  EXPECT(run_fcmp(&tc, CMP_UGT_F, 2.0, 1.0) == 1, "ugt 2>1 -> true");
+  EXPECT(run_fcmp(&tc, CMP_UGT_F, 1.0, 2.0) == 0, "ugt 1>2 -> false");
+  EXPECT(run_fcmp(&tc, CMP_UGE_F, nan, 1.0) == 1, "uge NaN-lhs -> true");
+  EXPECT(run_fcmp(&tc, CMP_UGE_F, 1.0, 1.0) == 1, "uge 1>=1 -> true");
+  EXPECT(run_fcmp(&tc, CMP_UGE_F, 1.0, 2.0) == 0, "uge 1>=2 -> false");
   tc_fini(&tc);
 }
diff --git a/test/lib/unit.mk b/test/lib/unit.mk
@@ -30,11 +30,12 @@ UNIT_CFLAGS_INTERNAL = $(HOST_CFLAGS) -Iinclude -Isrc -Itest
 
 # ---- registrations: stem lists + per-stem source ---------------------------
 UNIT_TESTS_PUBLIC := \
-  ar_test cg_api_test cg_switch_test rv64_jit_test \
+  ar_test cg_api_test cg_switch_test cg_fp_cmp_test rv64_jit_test \
   aa64_inline_test rv64_inline_test x64_inline_test
 ar_test_SRC := test/ar/ar_test.c
 cg_api_test_SRC := test/api/cg_type_test.c
 cg_switch_test_SRC := test/api/cg_switch_test.c
+cg_fp_cmp_test_SRC := test/api/cg_fp_cmp_test.c
 rv64_jit_test_SRC := test/link/rv64_jit_test.c
 aa64_inline_test_SRC := test/arch/aa64_inline_test.c
 rv64_inline_test_SRC := test/arch/rv64_inline_test.c
diff --git a/test/parse/cases/builtin_fp_cmp_negation.c b/test/parse/cases/builtin_fp_cmp_negation.c
@@ -0,0 +1,54 @@
+/* Route B — negating a floating comparison folds to the unordered dual.
+ *
+ * Because FP compares are delayed SV_CMPs, `!(a < b)` becomes the NaN-correct
+ * unordered relation rather than a bare bitwise negate:
+ *   !(a < b) == UGE (a >= b OR unordered)
+ *   !(a <= b) == UGT
+ *   !(a > b) == ULE
+ *   !(a >= b) == ULT
+ *   !(a == b) == UNE (already; C `!=` is unordered)
+ *   !islessgreater(a, b) == UEQ
+ * For ordinary operands these behave like the plain (negated) comparison; for
+ * a NaN operand every negated ordered relation is true. */
+static double mknan(void) {
+  double z = 0.0;
+  return z / z;
+}
+
+int test_main(void) {
+  double nan = mknan();
+  double a = 1.0, b = 2.0;
+  int rc = 0;
+
+  /* !(a < b) -> UGE */
+  if (!(a < b) != 0) rc |= 1 << 0; /* 1<2 true -> 0 */
+  if (!(b < a) != 1) rc |= 1 << 1; /* 2<1 false -> 1 */
+  if (!(nan < a) != 1) rc |= 1 << 2; /* NaN unordered -> 1 */
+
+  /* !(a <= b) -> UGT */
+  if (!(a <= b) != 0) rc |= 1 << 3;
+  if (!(b <= a) != 1) rc |= 1 << 4;
+  if (!(nan <= a) != 1) rc |= 1 << 5;
+
+  /* !(a > b) -> ULE */
+  if (!(b > a) != 0) rc |= 1 << 6;
+  if (!(a > b) != 1) rc |= 1 << 7;
+  if (!(nan > a) != 1) rc |= 1 << 8;
+
+  /* !(a >= b) -> ULT */
+  if (!(b >= a) != 0) rc |= 1 << 9;
+  if (!(a >= b) != 1) rc |= 1 << 10;
+  if (!(nan >= a) != 1) rc |= 1 << 11;
+
+  /* !(a == b) -> UNE */
+  if (!(a == a) != 0) rc |= 1 << 12;
+  if (!(a == b) != 1) rc |= 1 << 13;
+  if (!(nan == a) != 1) rc |= 1 << 14; /* NaN never equal -> !false = 1 */
+
+  /* !islessgreater -> UEQ (equal OR unordered) */
+  if (!__builtin_islessgreater(a, a) != 1) rc |= 1 << 15; /* equal -> 1 */
+  if (!__builtin_islessgreater(a, b) != 0) rc |= 1 << 16; /* distinct -> 0 */
+  if (!__builtin_islessgreater(nan, a) != 1) rc |= 1 << 17; /* NaN -> 1 */
+
+  return rc == 0 ? 42 : 0;
+}
diff --git a/test/parse/cases/builtin_fp_cmp_negation.expected b/test/parse/cases/builtin_fp_cmp_negation.expected
@@ -0,0 +1 @@
+42
diff --git a/test/parse/cases/builtin_islessgreater_unordered.c b/test/parse/cases/builtin_islessgreater_unordered.c
@@ -0,0 +1,48 @@
+/* Route A — C99 floating comparison builtins.
+ *
+ * isless/islessequal/isgreater/isgreaterequal reach the ordered predicates
+ * OLT/OLE/OGT/OGE (NaN -> 0); islessgreater reaches the discriminating ONE
+ * predicate (ordered-and-not-equal); isunordered is true iff a NaN. */
+static double mknan(void) {
+  double z = 0.0;
+  return z / z;
+}
+
+int test_main(void) {
+  double nan = mknan();
+  int rc = 0;
+
+  /* isless -> OLT */
+  if (__builtin_isless(1.0, 2.0) != 1) rc |= 1 << 0;
+  if (__builtin_isless(2.0, 1.0) != 0) rc |= 1 << 1;
+  if (__builtin_isless(nan, 1.0) != 0) rc |= 1 << 2; /* NaN ordered -> 0 */
+
+  /* islessequal -> OLE */
+  if (__builtin_islessequal(1.0, 1.0) != 1) rc |= 1 << 3;
+  if (__builtin_islessequal(2.0, 1.0) != 0) rc |= 1 << 4;
+  if (__builtin_islessequal(nan, 1.0) != 0) rc |= 1 << 5;
+
+  /* isgreater -> OGT */
+  if (__builtin_isgreater(2.0, 1.0) != 1) rc |= 1 << 6;
+  if (__builtin_isgreater(1.0, 2.0) != 0) rc |= 1 << 7;
+  if (__builtin_isgreater(nan, 1.0) != 0) rc |= 1 << 8;
+
+  /* isgreaterequal -> OGE */
+  if (__builtin_isgreaterequal(1.0, 1.0) != 1) rc |= 1 << 9;
+  if (__builtin_isgreaterequal(1.0, 2.0) != 0) rc |= 1 << 10;
+  if (__builtin_isgreaterequal(nan, 1.0) != 0) rc |= 1 << 11;
+
+  /* islessgreater -> ONE (ordered and not equal) */
+  if (__builtin_islessgreater(1.0, 2.0) != 1) rc |= 1 << 12;
+  if (__builtin_islessgreater(2.0, 1.0) != 1) rc |= 1 << 13;
+  if (__builtin_islessgreater(1.0, 1.0) != 0) rc |= 1 << 14;
+  if (__builtin_islessgreater(nan, 1.0) != 0) rc |= 1 << 15; /* NaN -> 0 */
+
+  /* isunordered -> true iff a NaN */
+  if (__builtin_isunordered(nan, 1.0) != 1) rc |= 1 << 16;
+  if (__builtin_isunordered(1.0, nan) != 1) rc |= 1 << 17;
+  if (__builtin_isunordered(nan, nan) != 1) rc |= 1 << 18;
+  if (__builtin_isunordered(1.0, 2.0) != 0) rc |= 1 << 19;
+
+  return rc == 0 ? 42 : 0;
+}
diff --git a/test/parse/cases/builtin_islessgreater_unordered.expected b/test/parse/cases/builtin_islessgreater_unordered.expected
@@ -0,0 +1 @@
+42
diff --git a/test/parse/cases/builtin_islessgreater_unordered.wasm.skip b/test/parse/cases/builtin_islessgreater_unordered.wasm.skip
@@ -0,0 +1 @@
+wasm re-lowering of the discriminating FP predicates (ONE / unordered duals) is not yet NaN-correct on the foundation; same gap fails the untouched rv64_fp_nan_compare on lane W (got 4 at ne_d). Tracked separately from the C-frontend Route A/B work.
diff --git a/test/test.mk b/test/test.mk
@@ -371,13 +371,15 @@ test-interp-toy: bin
 
 CG_API_TEST_BIN = build/test/cg_api_test
 CG_SWITCH_TEST_BIN = build/test/cg_switch_test
+CG_FP_CMP_TEST_BIN = build/test/cg_fp_cmp_test
 ABI_CLASSIFY_TEST_BIN = build/test/abi_classify_test
 IR_RECORDER_TEST_BIN = build/test/ir_recorder_test
 NATIVE_DIRECT_TARGET_TEST_BIN = build/test/native_direct_target_test
 
-test-cg-api: $(CG_API_TEST_BIN) $(CG_SWITCH_TEST_BIN)
+test-cg-api: $(CG_API_TEST_BIN) $(CG_SWITCH_TEST_BIN) $(CG_FP_CMP_TEST_BIN)
 	$(CG_API_TEST_BIN)
 	$(CG_SWITCH_TEST_BIN)
+	$(CG_FP_CMP_TEST_BIN)
 
 test-abi-classify: $(ABI_CLASSIFY_TEST_BIN)
 	$(ABI_CLASSIFY_TEST_BIN)
diff --git a/test/toy/cases/152_fp_cmp_builtins_a.expected b/test/toy/cases/152_fp_cmp_builtins_a.expected
@@ -0,0 +1 @@
+0
diff --git a/test/toy/cases/152_fp_cmp_builtins_a.toy b/test/toy/cases/152_fp_cmp_builtins_a.toy
@@ -0,0 +1,62 @@
+// Route A: C99-style floating comparison builtins.
+//   @isless/@islessequal/@isgreater/@isgreaterequal -> ordered OLT/OLE/OGT/OGE
+//   @islessgreater -> ordered-and-not-equal (ONE), reachable only here
+//   @isunordered -> synthesized isnan(a) || isnan(b)
+// Each test fn returns 0 on success; __user_main sums them, so a nonzero exit
+// pinpoints the first failing predicate. NaN is built from a runtime 0.0/0.0.
+fn nan_via(x: f64): f64 { let z: f64 = x - x; return z / z; }
+
+fn test_isless(): i64 {
+  if @isless(1.0, 2.0) { } else { return 1; }  // 1 < 2 -> true
+  if @isless(2.0, 1.0) { return 2; }  // false
+  let n: f64 = nan_via(3.0);
+  if @isless(n, 1.0) { return 3; }  // NaN ordered -> false
+  return 0;
+}
+
+fn test_islessequal(): i64 {
+  if @islessequal(1.0, 1.0) { } else { return 1; }
+  if @islessequal(2.0, 1.0) { return 2; }
+  let n: f64 = nan_via(3.0);
+  if @islessequal(n, n) { return 3; }
+  return 0;
+}
+
+fn test_isgreater(): i64 {
+  if @isgreater(2.0, 1.0) { } else { return 1; }
+  if @isgreater(1.0, 2.0) { return 2; }
+  let n: f64 = nan_via(3.0);
+  if @isgreater(n, 1.0) { return 3; }
+  return 0;
+}
+
+fn test_isgreaterequal(): i64 {
+  if @isgreaterequal(2.0, 2.0) { } else { return 1; }
+  if @isgreaterequal(1.0, 2.0) { return 2; }
+  let n: f64 = nan_via(3.0);
+  if @isgreaterequal(n, 1.0) { return 3; }
+  return 0;
+}
+
+fn test_islessgreater(): i64 {
+  if @islessgreater(1.0, 2.0) { } else { return 1; }  // ordered & not equal -> true
+  if @islessgreater(2.0, 2.0) { return 2; }  // equal -> false
+  let n: f64 = nan_via(3.0);
+  if @islessgreater(n, 1.0) { return 3; }  // NaN ordered -> false
+  return 0;
+}
+
+fn test_isunordered(): i64 {
+  let n: f64 = nan_via(3.0);
+  if @isunordered(n, 1.0) { } else { return 1; }  // NaN -> true
+  if @isunordered(1.0, n) { } else { return 2; }
+  if @isunordered(1.0, 2.0) { return 3; }  // both ordered -> false
+  return 0;
+}
+
+fn __user_main(): i64 {
+  return test_isless() + test_islessequal() + test_isgreater() +
+      test_isgreaterequal() + test_islessgreater() + test_isunordered();
+}
+
+fn main(): i32 { return __user_main() as i32; }
diff --git a/test/toy/cases/152_fp_cmp_builtins_a.wasm.skip b/test/toy/cases/152_fp_cmp_builtins_a.wasm.skip
@@ -0,0 +1 @@
+wasm re-lowering of the discriminating FP predicates (ONE / the unordered duals synthesized by @isunordered) is not yet NaN-correct on the foundation; the same gap fails the untouched rv64_fp_nan_compare on lane W. Tracked separately from the toy Route A work.
diff --git a/test/toy/cases/153_fp_cmp_negation_b.expected b/test/toy/cases/153_fp_cmp_negation_b.expected
@@ -0,0 +1 @@
+0
diff --git a/test/toy/cases/153_fp_cmp_negation_b.toy b/test/toy/cases/153_fp_cmp_negation_b.toy
@@ -0,0 +1,45 @@
+// Route B: negating a delayed FP compare reaches the unordered dual.
+//   !(a < b) == UGE    !(a <= b) == UGT
+//   !(a > b) == ULE    !(a >= b) == ULT
+// On NaN the ordered relation is false, so its negation (the unordered dual) is
+// true. Each test fn returns 0 on success; __user_main sums them so a nonzero
+// exit pinpoints the first failing predicate. NaN is built from runtime 0.0/0.0.
+fn nan_via(x: f64): f64 { let z: f64 = x - x; return z / z; }
+
+fn test_not_lt(): i64 {
+  if !(1.0 < 2.0) { return 1; }  // !(true) -> false
+  if !(2.0 < 1.0) { } else { return 2; }  // !(false) -> true
+  let n: f64 = nan_via(3.0);
+  if !(n < 1.0) { } else { return 3; }  // UGE: NaN -> true
+  return 0;
+}
+
+fn test_not_le(): i64 {
+  if !(1.0 <= 1.0) { return 1; }
+  if !(2.0 <= 1.0) { } else { return 2; }
+  let n: f64 = nan_via(3.0);
+  if !(n <= 1.0) { } else { return 3; }  // UGT: NaN -> true
+  return 0;
+}
+
+fn test_not_gt(): i64 {
+  if !(2.0 > 1.0) { return 1; }
+  if !(1.0 > 2.0) { } else { return 2; }
+  let n: f64 = nan_via(3.0);
+  if !(n > 1.0) { } else { return 3; }  // ULE: NaN -> true
+  return 0;
+}
+
+fn test_not_ge(): i64 {
+  if !(2.0 >= 1.0) { return 1; }
+  if !(1.0 >= 2.0) { } else { return 2; }
+  let n: f64 = nan_via(3.0);
+  if !(n >= 1.0) { } else { return 3; }  // ULT: NaN -> true
+  return 0;
+}
+
+fn __user_main(): i64 {
+  return test_not_lt() + test_not_le() + test_not_gt() + test_not_ge();
+}
+
+fn main(): i32 { return __user_main() as i32; }
diff --git a/test/toy/cases/153_fp_cmp_negation_b.wasm.skip b/test/toy/cases/153_fp_cmp_negation_b.wasm.skip
@@ -0,0 +1 @@
+wasm re-lowering of a negated FP compare (the unordered dual) is not yet correct on the foundation: `cfree run` on the .wasm hits a relower `local type mismatch (expected i64 got i32)` fatal. Reproduces with a minimal `!(1.0 < 2.0)` and no builtins, so it is a foundation-level gap, not toy Route B. Tracked separately.
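
The duality the Route B tests above exercise (negating an ordered compare gives the unordered compare of the opposite relation) can be sketched in portable C. This is only an illustrative model, not code from this commit: `FpPred`, `fp_eval`, `fp_invert`, and `fp_check_duality` are hypothetical names, and the enum order is an assumption that does not claim to match kit's `CmpOp` layout.

```c
#include <assert.h>
#include <math.h> /* isunordered, NAN */

/* Hypothetical predicate names for illustration; not kit's CmpOp enum. */
typedef enum {
  FP_OEQ, FP_ONE, FP_OLT, FP_OLE, FP_OGT, FP_OGE,
  FP_UEQ, FP_UNE, FP_ULT, FP_ULE, FP_UGT, FP_UGE,
  FP_PRED_COUNT
} FpPred;

/* Ordered predicates are false if either operand is NaN;
 * unordered predicates are true if either operand is NaN. */
static int fp_eval(FpPred p, double a, double b) {
  int un = isunordered(a, b);
  switch (p) {
    case FP_OEQ: return !un && a == b;
    case FP_ONE: return !un && a != b;
    case FP_OLT: return !un && a < b;
    case FP_OLE: return !un && a <= b;
    case FP_OGT: return !un && a > b;
    case FP_OGE: return !un && a >= b;
    case FP_UEQ: return un || a == b;
    case FP_UNE: return un || a != b;
    case FP_ULT: return un || a < b;
    case FP_ULE: return un || a <= b;
    case FP_UGT: return un || a > b;
    case FP_UGE: return un || a >= b;
    default: return 0;
  }
}

/* Logical negation: flip ordered<->unordered AND negate the relation. */
static FpPred fp_invert(FpPred p) {
  switch (p) {
    case FP_OEQ: return FP_UNE;
    case FP_ONE: return FP_UEQ;
    case FP_OLT: return FP_UGE;
    case FP_OLE: return FP_UGT;
    case FP_OGT: return FP_ULE;
    case FP_OGE: return FP_ULT;
    case FP_UEQ: return FP_ONE;
    case FP_UNE: return FP_OEQ;
    case FP_ULT: return FP_OGE;
    case FP_ULE: return FP_OGT;
    case FP_UGT: return FP_OLE;
    case FP_UGE: return FP_OLT;
    default: return p;
  }
}

/* Check that inverting a predicate negates its result on every operand
 * pair drawn from a small set that includes signed zeros and NaN. */
static void fp_check_duality(void) {
  const double vals[] = {1.0, 2.0, -0.0, 0.0, NAN};
  const int n = (int)(sizeof vals / sizeof vals[0]);
  for (int p = 0; p < FP_PRED_COUNT; p++)
    for (int i = 0; i < n; i++)
      for (int j = 0; j < n; j++)
        assert(fp_eval(fp_invert((FpPred)p), vals[i], vals[j]) ==
               !fp_eval((FpPred)p, vals[i], vals[j]));
}
```

Calling `fp_check_duality()` from a test driver exercises all 12 predicates. In this model `fp_invert(FP_OLT)` is `FP_UGE`, which is why `!(nan < 1.0)` comes out true, whereas the plain `nan >= 1.0` (an ordered compare) is false.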