kit

git clone https://git.ryansepassi.com/git/kit.git

commit 6da4e5eab3c298143aed3153325ca03e27c19d4a
parent 88f892c4c2d7df14d39dabecfbeb98b878e36b9b
Author: Ryan Sepassi <rsepassi@gmail.com>
Date:   Fri, 29 May 2026 16:47:56 -0700

O1: deref through a frame-resident pointer local

A pointer local that lives in its frame home -- address-taken (&p), or
forced there by being an AGG_COPY/AGG_SET operand -- was used as an
OPK_INDIRECT base/index by reading local->storage.v.reg directly. For
FRAME storage that union member overlaps the frame-slot id, so the base
register was never defined: a wrong store/load target or an outright
crash at -O1/-O2. (The -O0 single-pass path already handled this via a
FRAME_VALUE base.)
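
The union overlap described above can be illustrated with a small stand-alone sketch. The type and function names here (`Storage`, `base_reg_unchecked`, `base_reg_checked`, `make_frame_storage`) are hypothetical stand-ins, not the real definitions in src/opt/cg_ir_lower.c; the sketch only assumes a tagged union in which the register id and the frame-slot id share storage.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a local's storage record (illustrative names only). */
typedef enum { STORAGE_REG, STORAGE_FRAME } StorageKind;

typedef struct {
  StorageKind kind;
  union {
    uint32_t reg;        /* meaningful only when kind == STORAGE_REG */
    uint32_t frame_slot; /* meaningful only when kind == STORAGE_FRAME */
  } v;
} Storage;

/* The buggy pattern: read v.reg without looking at the tag. For FRAME
 * storage this silently returns the frame-slot id as if it were a register. */
static uint32_t base_reg_unchecked(const Storage *s) { return s->v.reg; }

/* The checked pattern: only hand out a register when the tag says REG.
 * Returns 1 and writes *out on success, 0 when the caller must first load
 * the value from the frame home. */
static int base_reg_checked(const Storage *s, uint32_t *out) {
  if (s->kind != STORAGE_REG) return 0;
  *out = s->v.reg;
  return 1;
}

/* Build a FRAME-storage record for the given slot id. */
static Storage make_frame_storage(uint32_t slot) {
  Storage s;
  s.kind = STORAGE_FRAME;
  s.v.frame_slot = slot;
  return s;
}
```

In this model the unchecked read "succeeds" and yields a number that names no defined register, which is the shape of the failure: nothing flags the mistake at lowering time, and the error only shows up as a bad base at run time.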

Fix:
- Lower AGG_COPY/AGG_SET operands as values (lower_use_ops); only a
  non-pointer aggregate operand addresses its own slot
  (operand_uses_local_agg_addr), so a pointer operand stays in a register
  instead of being force-homed.
- Before emitting each instruction, load any FRAME pointer base/index
  from its home into a fresh reg (prematerialize_indirect_bases /
  resolve_indirect_base_reg) and resolve the indirect operand to it.
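
The second fix item amounts to a small per-instruction cache: each frame-resident local used as a base is loaded once per instruction, and the cache is reset before the next instruction so a later use reloads the (possibly updated) pointer. The following is a simplified sketch of that bookkeeping only. `MatCache`, `mat_reset`, and `mat_materialize` are hypothetical names, the "load" is modeled as a counter, and the real code panics rather than returning a sentinel when the table is full.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_MAT 8u /* assumed per-instruction capacity, mirroring the commit's limit of 8 */

/* Hypothetical per-instruction record of locals already loaded into a register. */
typedef struct {
  uint32_t local[MAX_MAT];
  uint32_t reg[MAX_MAT];
  uint32_t nmat;          /* entries valid for the current instruction */
  uint32_t next_reg;      /* next fresh pseudo-register id */
  uint32_t loads_emitted; /* stands in for emitting a real load instruction */
} MatCache;

/* Called before lowering each instruction: forget earlier materializations. */
static void mat_reset(MatCache *m) { m->nmat = 0; }

/* Return the register holding `local`'s value for this instruction, emitting
 * one modeled load the first time it is requested. Returns UINT32_MAX when
 * the table is full (the sketch's stand-in for a hard error). */
static uint32_t mat_materialize(MatCache *m, uint32_t local) {
  for (uint32_t i = 0; i < m->nmat; ++i)
    if (m->local[i] == local) return m->reg[i];
  if (m->nmat >= MAX_MAT) return UINT32_MAX;
  uint32_t r = m->next_reg++;
  m->loads_emitted++;
  m->local[m->nmat] = local;
  m->reg[m->nmat] = r;
  m->nmat++;
  return r;
}
```

Resetting per instruction trades some redundant loads for correctness: a pointer whose address escaped may be rewritten between instructions, so a register cached across instructions could be stale.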

Adds C regression cases (test/parse/cases) that return the wrong value
or crash at -O1/-O2 without the fix, and pass at -O0/-O1/-O2 with it:
- 6_5_3_2_05_store_through_addr_taken_ptr  (store base; &p escapes+repointed)
- 6_5_3_2_06_addr_escape_then_field_stores (field stores; &p escapes; crashes)
- 6_5_16_07_struct_assign_through_ptr      (AGG_COPY force-home + field read)

Records the root cause in doc/BOOTSTRAP_O1.md.

Diffstat:
M doc/BOOTSTRAP_O1.md | 179 +++++++++++++++++++++++++++++++++++++++++++++++++------------------------------
M src/opt/cg_ir_lower.c | 140 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--------
A test/parse/cases/6_5_16_07_struct_assign_through_ptr.c | 20 ++++++++++++++++++++
A test/parse/cases/6_5_16_07_struct_assign_through_ptr.expected | 1 +
A test/parse/cases/6_5_3_2_05_store_through_addr_taken_ptr.c | 23 +++++++++++++++++++++++
A test/parse/cases/6_5_3_2_05_store_through_addr_taken_ptr.expected | 1 +
A test/parse/cases/6_5_3_2_06_addr_escape_then_field_stores.c | 23 +++++++++++++++++++++++
A test/parse/cases/6_5_3_2_06_addr_escape_then_field_stores.expected | 1 +
8 files changed, 308 insertions(+), 80 deletions(-)

diff --git a/doc/BOOTSTRAP_O1.md b/doc/BOOTSTRAP_O1.md @@ -84,74 +84,119 @@ conflict — the two emitters keep independent register models.) --- -## -O1 — OPEN: runtime entry-param-bind miscompile in `driver/cc.c` +## -O1 — deref-through-frame-pointer-local miscompile fixed (`src/opt/cg_ir_lower.c`) + +This fix took the `-O1` self-build from *crashing while compiling stage3's +`src/abi/abi.c`* to *compiling all of stage3*. toy is still **1034/0/8 at both +`-O0` and `-O1`** (toy runs `R`/`L` at both opt levels, so the suite exercises +this `-O1` path). `-O0` still reproduces. + +> ⚠️ The earlier "`cc_alloc_arrays` … missing x19 reload" root cause in this doc +> was a **`-g` artifact**: compiling the suspect TU with `-g` to get a named +> backtrace *perturbs codegen* and shifts/creates a *different* bug. The real +> stage2 crash compiling `abi.c` was in **`cc_fill_c_opts`** (`driver/cc.c`), +> and `cc_alloc_arrays` is in fact compiled correctly. **Reproduce and inspect +> without `-g`** (it changes the object: e.g. `driver/cc.o` sha differs with vs +> without `-g`). The clang build is deterministic across host opt level +> (`build/cfree` `-O0`+asan == `build/release/cfree` `-O1`), so `build/cfree cc +> -O1 …` reproduces exactly what stage1 feeds into stage2. + +### Root cause (two coordinated defects, both about pointer locals) +`cc_fill_c_opts(const CcOptions* o, …, CfreeCCompileOptions* copts)` does +`*copts = zero;` then a run of `copts->code.field = …;`. `copts` is a pointer +param whose **address is never taken** in C, so it should stay in a register. + +1. **AGG_COPY/AGG_SET force-homed the pointer.** `*copts = zero` lowers to an + `agg_copy` whose dest operand is `copts`. 
The aggregate ops take their operands + as pointer *values* (the emitter derefs an `OPK_LOCAL` pointer operand via + `pointer_addr_from_operand`), but `lower_one_inst` ran them through + `lower_addr_value_ops(naddr=1)`, so the dest went through `lower_operand_addr`, + which **force-creates a frame home** for any `OPK_LOCAL`. `local_address_used_in_cg_ir` + *also* flagged the same operand. Both together marked `copts` `address_taken` + ⇒ `CG_LOCAL_STORAGE_FRAME`. + +2. **A FRAME pointer local can't be an indirect base.** The scalar stores + `copts->field` lower to `store [copts+off]` (`OPK_INDIRECT`, base `copts`). + `lower_operand_addr`'s `OPK_INDIRECT` case set `out.v.ind.base = + base->storage.v.reg` — but `storage.v` is a **union** of `reg`/`frame_slot`, so + for a FRAME local `.v.reg` is garbage. The store base register was therefore + *never defined* (read uninitialized; flaky NULL / leftover path string ⇒ + segfault). `-O0`'s `nd_addr_pointer` handles this with a `FRAME_VALUE` base; + `-O1` did not. + +### Fix +- **(B)** Lower `AGG_COPY`/`AGG_SET` operands as **values** (`lower_use_ops`), and + only count a *non-pointer* agg operand as address-taken + (`operand_uses_local_agg_addr`) — a pointer operand uses the local's value, not + its slot. This keeps `copts` in a register. +- **(A)** `prematerialize_indirect_bases` runs before each instruction is emitted + and, for every `OPK_INDIRECT` base/index that *is* a FRAME local, emits a load + of the pointer from its home into a fresh PReg (or `addr_of` for a non-pointer + aggregate base — defensive, not yet observed); `resolve_indirect_base_reg` then + returns that PReg. This makes a genuinely address-taken pointer local + (`int **q = &p; p->f = …`) work too, not just the `copts` false-positive. 
+ +Minimal repro of the genuine (A) case (was: undefined `x8` base; now: `ldr x8, +[home]` before each deref): +```c +typedef struct { int a,b; long d; } S; +void g(S* p, S*** out){ *out=&p; p->a=1; p->b=2; p->d=3; } +``` -The `-O1`-self-compiled **stage2** compiler **segfaults compiling stage3's -`abi.c`**. This is a *runtime* miscompile (wrong machine code in stage2), not a -compile-time panic. +--- + +## -O1 — STILL OPEN: stage2 emits malformed objects ⇒ stage3 **link** fails + +With the fix above, stage2 compiles every stage3 TU, but the **final stage3 link +segfaults** (`Makefile:424`). The objects themselves are malformed — *not* a +linker bug: -### Reproduce -``` -build/release/bootstrap/stage2/cc -O1 -DNDEBUG -ffunction-sections -fdata-sections \ - -std=c11 -ffreestanding -nostdinc -Irt/include -fvisibility=hidden \ - -Iinclude -Isrc -c src/abi/abi.c -o /tmp/x.o # exit 139 -``` -`build/cfree` is the clang build, so `build/cfree` compiling `driver/cc.c` at -`-O1` reproduces the same miscompiled object directly. - -### Localization -Bisected (hybrid relink harness, below) across all subsystems — -opt / cg / lang(c,cpp) / core / arch / obj / abi / api were all clean — down to -`driver/*` and finally to **`driver/cc.c`**, function **`cc_alloc_arrays`** -(line ~278). - -### Root cause -`cc_alloc_arrays(CcOptions* o, int argc)` makes ~14 sequential -`driver_alloc_zeroed(o->env, ...)` calls, so `o` is live across all of them and is -(correctly) allocated to a **callee-saved** register, x19. The prologue saves the -old x19 and stores the incoming arg (x0 = `o`) to a spill slot — but the body uses -**x19** as `o` and the move/reload that should populate x19 (from x0, or from the -spill slot) is **missing**. x19 therefore holds caller garbage (a leftover path -string), so `ldr x0, [x19]` reads junk for `o->env` and `driver_alloc_zeroed` -faults. 
- -Disassembly of `cc_alloc_arrays` (stage2, `-O1`): ``` -+292: str x19, [x29, #0x28] ; save old callee-saved x19 -+296: str x0, [x29, #0x20] ; store incoming `o` to a spill slot -... -+324: str x9, [x19, #0x10] ; <-- uses x19 as `o`, but x19 was NEVER loaded -+328: ldr x0, [x19] ; o->env (x19 = garbage -> crash downstream) -+33c: bl driver_alloc_zeroed +# the clang-built (correct) linker rejects stage2-compiled stage3 objects too: +build/release/cfree cc <captured stage3 link args, output to build/tmp> + → fatal: link: LDST32_ABS_LO12_NC misaligned address (kind=27 S=0x10048ea26 A=0 …) +# stage2's own ld is flaky on the same input: usually SIGSEGV, sometimes +# "link_emit_macho: entry symbol below __TEXT base" (src/obj/macho/link.c:2410) ``` -`aa_bind_native_param` (`src/arch/aa64/native.c` ~3498) is correct for a register -destination (it emits `mov d, src`). The defect is **upstream**: the allocator -gave `o`'s *param storage* a frame slot while a body value (a copy of `o`) lives -in x19, and the connecting reload (`ldr x19, [slot]`) was elided. So the hunt is -in the param-storage / copy-reload path at `-O1`, not in the aa64 bind code. - -### This bug was *exposed*, not caused, by commit `b520142` -With the pre-`b520142` compiler, `driver/cc.c` couldn't be compiled at `-O1` at -all — it hit the interference verifier (block 103, op 15 / BINOP). Fix #2 above -correctly routes the live-across-call param away from caller-saved x0 to -callee-saved x19, which surfaced the latent param-bind defect. It is **not** a -verifier issue and the verifier must not be relaxed. - -### Bisection harness (how the file was found) -The release stage2 links **only** via cfree's own `ld` (Apple `ld` asserts on -cfree objects; the stage2 binary's own `ld` also crashes once it is itself -miscompiled). To build a hybrid stage2 with a chosen subset of TUs at `-O0`: - -1. 
Replay each TU's exact stage2 compile command from the bootstrap build log, - swapping `-O1`→`-O0` (flags **must** match the original or `.build-config` - changes and forces a full recompile, clobbering the swap). -2. Relink by replaying the captured `ld -r … -o …/libcfree.o`, - `ar rcs …/libcfree.a …`, and the final `stage1/cc … -o …/stage2/cfree` - commands (grep them out of the build log). -3. Smoke-test: `stage2/cc … -c src/abi/abi.c -o /tmp/x.o` (139 = still crashes, - 0 = the buggy TU is now in the `-O0` set). - -For a named backtrace: drop `-Wl,-S` on the final link and compile the suspect TU -with `-g`. cfree emits symbols only for **non-static** functions, so static -functions bucket under the previous global symbol — `-g` line info is needed to -identify them. +An `LDST32_ABS_LO12_NC` reloc (32-bit ldr/str `:lo12:`) targets a **2-aligned** +symbol — the scaled imm12 can't encode a non-4-aligned offset. So a 32-bit access +references a symbol cfree placed at a misaligned address. + +### What is known +- This is a **separate, latent `-O1` codegen bug** (not A/B); it was masked by the + earlier `abi.c` crash. toy does **not** trigger it — the triggering C pattern is + in the compiler's own sources, not in toy programs. +- It is a **stage2 self-miscompile**: cfree's `-O1` codegen is deterministic + across host opt (`build/cfree` == `build/release/cfree` per-TU), and + `build/release/cfree`(=stage1) output == `stage2/lib/*.o`. But **stage2 (cfree + built by stage1) produces different output than the reference** for some TUs — + i.e. cfree's `-O1` machine code for *some codegen function* is wrong, so stage2 + miscompiles. Per-object check `build/cfree cc -O1 … -c <src>` vs + `stage3/lib/<src>.o`: **DIFFERS** for `cg/data.c`, `cg/type.c`, `cg/value.c`; + **SAME** for `core/arena.c`, `core/vec.c`, `cg/local.c`. So it is one (or few) + codegen function(s) hit by a specific pattern, not a universal break. 
+- Bisection so far (swap stage2 codegen TUs to `-O0`, then test + `stage2(cg/data.c) == build/cfree(cg/data.c)`): **not** fixed by `opt/*` at + `-O0`, **not** by `arch/aa64/* + arch/mc.c + arch/cgtarget.c` at `-O0`. The + buggy TU is therefore elsewhere — likely `cg/*` or `obj/*`. + +### Harnesses (left in `/tmp` from this session; regenerate from a fresh log) +- `bisect2.sh <src…>`: recompile listed TUs at `-O0` into stage2, relink stage2 + (`s2_ldr.sh`,`s2_ar.sh`,`s2_link.sh`), then compare `stage2 cc -O1 -c + src/cg/data.c` to `build/cfree`'s output. Prints `FIXED`/`STILL WRONG`. This is + the right oracle — the **clang-built `build/release/cfree` linker reproduces the + malformed-object rejection deterministically**, which is far cleaner than + stage2's flaky `ld`. +- For function-level IR inspection, re-add a `CFREE_DUMPFN=<substr>` filter to + `opt_dbg_dump`/`opt_dbg_dump_cg` in `src/opt/opt.c` (resolve the func name via + `pool_slice(o->c->global, obj_symbol_get(o->target->obj, f->desc.sym)->name)`) + and dump at tag `pre-emit` (final MIR) / `entry` (clean IR). Invaluable; it is + how `cc_fill_c_opts`'s use-before-def of the base PReg was found. + +### Old bisection harness notes (still apply) +Replay each stage2 TU's exact compile command, swapping `-O1`→`-O0` (flags **must** +match or `.build-config` changes force a full recompile). The release stage2 links +only via cfree's own `ld`; but for triage prefer the clang-built +`build/release/cfree` as the linker — it rejects the malformed objects with a +precise message instead of crashing. Avoid `-g` for codegen triage — it perturbs +the output. diff --git a/src/opt/cg_ir_lower.c b/src/opt/cg_ir_lower.c @@ -22,6 +22,10 @@ typedef struct OptLocalMap { u8 pad[2]; } OptLocalMap; +/* Per-instruction record of pointer locals whose value was loaded from their + * frame home into a fresh PReg so they can serve as an indirect-addressing base + * (see frame_indirect_base_reg). Reset for each lowered instruction. 
*/ +#define CG_IR_LOWER_MAX_MAT 8u typedef struct CgIrLower { Compiler* c; const CgIrFunc* src; @@ -32,6 +36,9 @@ typedef struct CgIrLower { u32 nlabels; u32* inst_block; u8* leader; + CGLocal mat_local[CG_IR_LOWER_MAX_MAT]; + Reg mat_reg[CG_IR_LOWER_MAX_MAT]; + u32 nmat; } CgIrLower; static _Noreturn void lower_panic(CgIrLower* l, SrcLoc loc, const char* msg) { @@ -106,7 +113,22 @@ static int operand_uses_local_addr(const Operand* op, CGLocal local) { return 0; } -static int local_address_used_in_cg_ir(const CgIrFunc* f, CGLocal local) { +/* AGG_COPY/AGG_SET take their dest/src as *pointer values* to the aggregate — + * the emitter derefs an OPK_LOCAL pointer operand via pointer_addr_from_operand + * (it loads the pointer; it does not address the local's own slot). So a + * pointer-typed local operand of an aggregate op uses the local's VALUE, not + * its address, and must not force the local to a frame home. Only a non-pointer + * operand (the aggregate-typed local itself) genuinely addresses its storage. + * (STORE/LOAD/ADDR_OF use addr_from_operand, where an OPK_LOCAL always + * addresses the slot, so they keep operand_uses_local_addr.) 
*/ +static int operand_uses_local_agg_addr(Compiler* c, const Operand* op, + CGLocal local) { + if (!op || op->kind != OPK_LOCAL || op->v.local != local) return 0; + return !cg_type_is_ptr(c, op->type); +} + +static int local_address_used_in_cg_ir(Compiler* c, const CgIrFunc* f, + CGLocal local) { for (u32 i = 0; i < f->ninsts; ++i) { const CgIrInst* in = &f->insts[i]; switch ((CgIrOp)in->op) { @@ -116,19 +138,24 @@ static int local_address_used_in_cg_ir(const CgIrFunc* f, CGLocal local) { return 1; break; case CG_IR_STORE: - case CG_IR_AGG_SET: case CG_IR_BITFIELD_STORE: if (in->nopnds > 0u && operand_uses_local_addr(&in->opnds[0], local)) return 1; break; + case CG_IR_AGG_SET: + if (in->nopnds > 0u && + operand_uses_local_agg_addr(c, &in->opnds[0], local)) + return 1; + break; case CG_IR_ADDR_OF: if (in->nopnds > 1u && operand_uses_local_addr(&in->opnds[1], local)) return 1; break; case CG_IR_AGG_COPY: if ((in->nopnds > 0u && - operand_uses_local_addr(&in->opnds[0], local)) || - (in->nopnds > 1u && operand_uses_local_addr(&in->opnds[1], local))) + operand_uses_local_agg_addr(c, &in->opnds[0], local)) || + (in->nopnds > 1u && + operand_uses_local_agg_addr(c, &in->opnds[1], local))) return 1; break; /* VA_START/VA_ARG/VA_END/VA_COPY consume a pointer *value* (the address of @@ -159,7 +186,8 @@ static void lower_locals(CgIrLower* l) { /* Aggregates and oversized scalars cannot live in a single PReg; they need * a memory home regardless of whether their address is taken. */ m->address_taken = - local_needs_home(in) || local_address_used_in_cg_ir(l->src, in->id) || + local_needs_home(in) || + local_address_used_in_cg_ir(l->c, l->src, in->id) || cg_type_is_aggregate(l->c, in->desc.type) || cg_type_size(l->c, in->desc.type) > 8u; @@ -447,6 +475,84 @@ static OptOperand opt_frame_operand(OptLocalMap* m) { return out; } +/* Base/index register for an OPK_INDIRECT whose base is a local. A REG-storage + * local supplies its value register directly. 
A FRAME-storage local (its + * address was taken, e.g. `int **q = &p; p->f = ...`) holds the pointer value + * in its frame home, so storage.v.reg is meaningless; load the home into a + * fresh PReg. prematerialize_indirect_bases emits that load before the using + * instruction; here we just look the result up (l->mat_*). */ +static Reg resolve_indirect_base_reg(CgIrLower* l, CGLocal local, SrcLoc loc) { + OptLocalMap* m = local_map(l, local, loc); + if (m->storage.kind == CG_LOCAL_STORAGE_REG) return m->storage.v.reg; + for (u32 i = 0; i < l->nmat; ++i) + if (l->mat_local[i] == local) return l->mat_reg[i]; + lower_panic(l, loc, "indirect base local not materialized"); +} + +/* Emit `r = load <local home>` once per instruction for each FRAME-storage + * local used as an OPK_INDIRECT base/index, recording r in l->mat_*. Must run + * before the consuming instruction is emitted so the load dominates its uses. */ +static void materialize_frame_base(CgIrLower* l, u32 block, CGLocal local, + SrcLoc loc) { + OptLocalMap* m = local_map(l, local, loc); + if (m->storage.kind == CG_LOCAL_STORAGE_REG) return; + for (u32 i = 0; i < l->nmat; ++i) + if (l->mat_local[i] == local) return; + if (l->nmat >= CG_IR_LOWER_MAX_MAT) + lower_panic(l, loc, "too many frame indirect bases in one instruction"); + PReg r = ir_alloc_preg(l->f, m->type, RC_INT); + OptOperand ops[2]; + ops[1] = opt_frame_operand(m); + if (cg_type_is_ptr(l->c, m->type)) { + /* The local *holds* a pointer; load that value to use as the base. */ + Inst* ld = ir_emit(l->f, block, IR_LOAD); + ld->loc = loc; + memset(&ops[0], 0, sizeof ops[0]); + ops[0].kind = OPK_REG; + ops[0].cls = RC_INT; + ops[0].type = m->type; + ops[0].v.reg = (Reg)r; + ld->opnds = dup_opt_ops(l, ops, 2); + ld->nopnds = 2; + ld->def = (Val)r; + ld->type = m->type; + memset(&ld->extra.mem, 0, sizeof ld->extra.mem); + ld->extra.mem.type = m->type; + ld->extra.mem.size = m->size ? m->size : 8u; + ld->extra.mem.align = m->align ? 
m->align : 8u; + } else { + /* The local *is* the storage; its frame address is the base. */ + Inst* ao = ir_emit(l->f, block, IR_ADDR_OF); + ao->loc = loc; + memset(&ops[0], 0, sizeof ops[0]); + ops[0].kind = OPK_REG; + ops[0].cls = RC_INT; + ops[0].type = m->type; + ops[0].v.reg = (Reg)r; + ao->opnds = dup_opt_ops(l, ops, 2); + ao->nopnds = 2; + ao->def = (Val)r; + ao->type = m->type; + } + l->mat_local[l->nmat] = local; + l->mat_reg[l->nmat] = (Reg)r; + l->nmat++; +} + +/* Scan the CG instruction's operands for OPK_INDIRECT bases/indices that are + * FRAME-storage locals and pre-load them (see materialize_frame_base). */ +static void prematerialize_indirect_bases(CgIrLower* l, const CgIrInst* in, + u32 block) { + l->nmat = 0; + for (u32 i = 0; i < in->nopnds; ++i) { + const Operand* op = &in->opnds[i]; + if (op->kind != OPK_INDIRECT) continue; + materialize_frame_base(l, block, op->v.ind.base, in->loc); + if (op->v.ind.index != CG_LOCAL_NONE) + materialize_frame_base(l, block, op->v.ind.index, in->loc); + } +} + static OptOperand lower_operand_value(CgIrLower* l, const Operand* in, SrcLoc loc); @@ -479,15 +585,12 @@ static OptOperand lower_operand_addr(CgIrLower* l, const Operand* in, out.v.global.addend = in->v.global.addend; return out; case OPK_INDIRECT: { - OptLocalMap* base = local_map(l, in->v.ind.base, loc); out.kind = OPK_INDIRECT; out.cls = RC_INT; - out.v.ind.base = base->storage.v.reg; + out.v.ind.base = resolve_indirect_base_reg(l, in->v.ind.base, loc); out.v.ind.index = REG_NONE; - if (in->v.ind.index != CG_LOCAL_NONE) { - OptLocalMap* idx = local_map(l, in->v.ind.index, loc); - out.v.ind.index = idx->storage.v.reg; - } + if (in->v.ind.index != CG_LOCAL_NONE) + out.v.ind.index = resolve_indirect_base_reg(l, in->v.ind.index, loc); out.v.ind.log2_scale = in->v.ind.log2_scale; out.v.ind.ofs = in->v.ind.ofs; return out; @@ -826,6 +929,9 @@ static void lower_one_inst(CgIrLower* l, u32 idx) { op = IR_NOP; break; } + /* Pre-load any FRAME-resident pointer 
locals used as indirect bases so the + * load dominates this instruction (which is emitted next). */ + prematerialize_indirect_bases(l, in, block); out = ir_emit(l->f, block, op); out->loc = in->loc; switch ((CgIrOp)in->op) { @@ -872,8 +978,6 @@ static void lower_one_inst(CgIrLower* l, u32 idx) { break; } case CG_IR_STORE: - case CG_IR_AGG_COPY: - case CG_IR_AGG_SET: case CG_IR_BITFIELD_STORE: lower_addr_value_ops(l, out, in, 1, in->nopnds - 1u); if ((CgIrOp)in->op == CG_IR_STORE) @@ -881,6 +985,16 @@ static void lower_one_inst(CgIrLower* l, u32 idx) { else out->extra.aux = in->extra.aux; break; + case CG_IR_AGG_COPY: + case CG_IR_AGG_SET: + /* Aggregate ops take their operands as pointer *values* to the aggregates + * (the emitter derefs them via pointer_addr_from_operand). Lowering them + * as values keeps a pointer local in its register instead of forcing a + * frame home — the home would otherwise break the local's other uses as + * an indirect base, whose lowering reads storage.v.reg. */ + lower_use_ops(l, out, in, in->nopnds); + out->extra.aux = in->extra.aux; + break; case CG_IR_ATOMIC_STORE: { OptOperand ops[2]; ops[0] = lower_operand_value(l, &in->opnds[0], in->loc); diff --git a/test/parse/cases/6_5_16_07_struct_assign_through_ptr.c b/test/parse/cases/6_5_16_07_struct_assign_through_ptr.c @@ -0,0 +1,20 @@ +/* Aggregate assignment through a pointer parameter used to force the pointer + into a frame home (its operand was lowered as an address, not a value). A + field read on the same pointer then used the garbage frame slot as a register + base, returning the wrong field at -O1/-O2. 
*/ +typedef struct { + int x, y, z, w; +} S; +S src = {10, 20, 30, 40}; + +__attribute__((noinline)) static int f(S *p) { + *p = src; /* AGG_COPY, dst operand = pointer p */ + return p->y; /* field read: indirect base = p */ +} + +int test_main(void) { + S d; + if (f(&d) != 20) return 1; + if (d.x != 10 || d.w != 40) return 2; + return 42; +} diff --git a/test/parse/cases/6_5_16_07_struct_assign_through_ptr.expected b/test/parse/cases/6_5_16_07_struct_assign_through_ptr.expected @@ -0,0 +1 @@ +42 diff --git a/test/parse/cases/6_5_3_2_05_store_through_addr_taken_ptr.c b/test/parse/cases/6_5_3_2_05_store_through_addr_taken_ptr.c @@ -0,0 +1,23 @@ +/* A pointer local whose address is taken lives in its frame home. After it is + repointed through that taken address, a store through it must reload the + updated pointer from the home, not reuse a stale register holding the + original argument. Regressed at -O1/-O2 when an indirect base that is a + frame-resident local was read straight out of its (garbage) register slot. */ +int g = 0; + +__attribute__((noinline)) static void repoint(int **pp) { *pp = &g; } + +__attribute__((noinline)) static int via(int *q) { + int *p = q; + repoint(&p); /* &p taken -> p is frame-resident; p now points at g */ + *p = 99; /* store through p: indirect base = p */ + return g; +} + +int test_main(void) { + int local = 7; + if (via(&local) != 99) return 1; + if (g != 99) return 2; + if (local != 7) return 3; /* store must not hit the stale &local */ + return 42; +} diff --git a/test/parse/cases/6_5_3_2_05_store_through_addr_taken_ptr.expected b/test/parse/cases/6_5_3_2_05_store_through_addr_taken_ptr.expected @@ -0,0 +1 @@ +42 diff --git a/test/parse/cases/6_5_3_2_06_addr_escape_then_field_stores.c b/test/parse/cases/6_5_3_2_06_addr_escape_then_field_stores.c @@ -0,0 +1,23 @@ +/* The address of a pointer parameter escapes (forcing it into a frame home), + then the pointer is used as the base of several field stores. 
Reading that + frame-resident base out of its register slot yields an undefined base + register -- a wrong store target or an outright crash at -O1/-O2. */ +typedef struct { + int a, b; + long d; +} S; + +__attribute__((noinline)) static void g(S *p, S ***out) { + *out = &p; /* &p escapes -> p is frame-resident */ + p->a = 1; /* field stores: indirect base = p */ + p->b = 2; + p->d = 3; +} + +int test_main(void) { + S s; + S **pp; + g(&s, &pp); + if (s.a != 1 || s.b != 2 || s.d != 3) return 1; + return 42; +} diff --git a/test/parse/cases/6_5_3_2_06_addr_escape_then_field_stores.expected b/test/parse/cases/6_5_3_2_06_addr_escape_then_field_stores.expected @@ -0,0 +1 @@ +42