commit 6da4e5eab3c298143aed3153325ca03e27c19d4a
parent 88f892c4c2d7df14d39dabecfbeb98b878e36b9b
Author: Ryan Sepassi <rsepassi@gmail.com>
Date: Fri, 29 May 2026 16:47:56 -0700

O1: deref through a frame-resident pointer local

A pointer local that lives in its frame home -- address-taken (&p), or
forced there by being an AGG_COPY/AGG_SET operand -- was used as an
OPK_INDIRECT base/index by reading local->storage.v.reg directly. For
FRAME storage that union member overlaps the frame-slot id, so the base
register was never defined: a wrong store/load target or an outright
crash at -O1/-O2. (The -O0 single-pass path already handled this via a
FRAME_VALUE base.)
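
As a rough illustration of the overlap described above, the sketch below models a storage record whose union holds either a register or a frame-slot id. All type and function names here (`Storage`, `base_reg_unchecked`, `base_reg_checked`, `demo_overlap`) are hypothetical stand-ins, not the compiler's real definitions, which this commit does not show.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the compiler's storage types. */
typedef enum { STORAGE_REG, STORAGE_FRAME } StorageKind;

typedef struct {
  StorageKind kind;
  union {
    uint32_t reg;        /* meaningful only when kind == STORAGE_REG */
    uint32_t frame_slot; /* meaningful only when kind == STORAGE_FRAME */
  } v;
} Storage;

/* Buggy shape: reads v.reg without looking at kind. */
static uint32_t base_reg_unchecked(const Storage *s) { return s->v.reg; }

/* Checked shape: yields a register only for REG storage. A FRAME local
 * returns 0, telling the caller to load the value from its home first. */
static int base_reg_checked(const Storage *s, uint32_t *out) {
  if (s->kind != STORAGE_REG) return 0;
  *out = s->v.reg;
  return 1;
}

/* Returns 1 when the unchecked read mistakes the frame-slot id for a
 * register number, which is the failure mode the commit message describes. */
static int demo_overlap(void) {
  Storage s;
  uint32_t r = 0;
  int resolved;
  s.kind = STORAGE_FRAME;
  s.v.frame_slot = 7u;
  resolved = base_reg_checked(&s, &r);
  assert(!resolved); /* FRAME storage never resolves directly */
  (void)resolved;
  (void)r;
  return base_reg_unchecked(&s) == 7u;
}
```

In this model the unchecked accessor hands back the slot id 7 as if it were a register, while the checked accessor refuses and leaves the decision to the caller.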

Fix:
- Lower AGG_COPY/AGG_SET operands as values (lower_use_ops); only a
non-pointer aggregate operand addresses its own slot
(operand_uses_local_agg_addr), so a pointer operand stays in a register
instead of being force-homed.
- Before emitting each instruction, load any FRAME pointer base/index
from its home into a fresh reg (prematerialize_indirect_bases /
resolve_indirect_base_reg) and resolve the indirect operand to it.
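
The second bullet can be pictured with a small model of the per-instruction cache. This is only a sketch under stated assumptions: `Lower`, `begin_inst`, `materialize`, `next_reg`, and `loads` are hypothetical simplifications, and the `loads` counter merely stands in for emitting a real load instruction. Only the `mat_local`/`mat_reg`/`nmat` field names and the limit of 8 are taken from the patch.

```c
#include <stdint.h>

#define MAX_MAT 8u /* mirrors CG_IR_LOWER_MAX_MAT in the patch */

typedef uint32_t Local; /* hypothetical stand-in for CGLocal */
typedef uint32_t Reg;   /* hypothetical stand-in for Reg/PReg */

typedef struct {
  Local mat_local[MAX_MAT];
  Reg mat_reg[MAX_MAT];
  uint32_t nmat;
  Reg next_reg;   /* hypothetical fresh-register counter */
  uint32_t loads; /* counts the loads that would have been emitted */
} Lower;

/* The cache is reset at the start of every lowered instruction. */
static void begin_inst(Lower *l) { l->nmat = 0; }

/* Writes to *out the register holding `local`'s pointer value, "emitting"
 * one load on first use within the current instruction. Returns 0 when the
 * table is full (the real code panics in that case). */
static int materialize(Lower *l, Local local, Reg *out) {
  for (uint32_t i = 0; i < l->nmat; ++i) {
    if (l->mat_local[i] == local) {
      *out = l->mat_reg[i];
      return 1;
    }
  }
  if (l->nmat >= MAX_MAT) return 0;
  Reg r = l->next_reg++;
  l->loads++; /* stands in for emitting `r = load <home>` */
  l->mat_local[l->nmat] = local;
  l->mat_reg[l->nmat] = r;
  l->nmat++;
  *out = r;
  return 1;
}

/* Two uses of one local inside an instruction share a load; the next
 * instruction reloads. Returns the number of loads, or 0 on a mismatch. */
static uint32_t demo_loads(void) {
  Lower l = {0};
  Reg a = 0, b = 0, c = 0;
  l.next_reg = 100u;
  begin_inst(&l);
  materialize(&l, 3u, &a);
  materialize(&l, 3u, &b);
  begin_inst(&l);
  materialize(&l, 3u, &c);
  return (a == b && c != a) ? l.loads : 0u;
}
```

Resetting the cache per instruction means a pointer that was repointed through its taken address is re-read from its home on the next use, which is the behaviour the store_through_addr_taken_ptr regression case depends on.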

Adds C regression cases (test/parse/cases) that return the wrong value
or crash at -O1/-O2 without the fix, and pass at -O0/-O1/-O2 with it:
- 6_5_3_2_05_store_through_addr_taken_ptr (store base; &p escapes+repointed)
- 6_5_3_2_06_addr_escape_then_field_stores (field stores; &p escapes; crashes)
- 6_5_16_07_struct_assign_through_ptr (AGG_COPY force-home + field read)

Records the root cause in doc/BOOTSTRAP_O1.md.
Diffstat:
8 files changed, 308 insertions(+), 80 deletions(-)
diff --git a/doc/BOOTSTRAP_O1.md b/doc/BOOTSTRAP_O1.md
@@ -84,74 +84,119 @@ conflict — the two emitters keep independent register models.)
---
-## -O1 — OPEN: runtime entry-param-bind miscompile in `driver/cc.c`
+## -O1 — deref-through-frame-pointer-local miscompile fixed (`src/opt/cg_ir_lower.c`)
+
+This fix took the `-O1` self-build from *crashing while compiling stage3's
+`src/abi/abi.c`* to *compiling all of stage3*. toy is still **1034/0/8 at both
+`-O0` and `-O1`** (toy runs `R`/`L` at both opt levels, so the suite exercises
+this `-O1` path). `-O0` still reproduces.
+
+> ⚠️ The earlier "`cc_alloc_arrays` … missing x19 reload" root cause in this doc
+> was a **`-g` artifact**: compiling the suspect TU with `-g` to get a named
+> backtrace *perturbs codegen* and shifts/creates a *different* bug. The real
+> stage2 crash compiling `abi.c` was in **`cc_fill_c_opts`** (`driver/cc.c`),
+> and `cc_alloc_arrays` is in fact compiled correctly. **Reproduce and inspect
+> without `-g`** (it changes the object: e.g. `driver/cc.o` sha differs with vs
+> without `-g`). The clang build is deterministic across host opt level
+> (`build/cfree` `-O0`+asan == `build/release/cfree` `-O1`), so `build/cfree cc
+> -O1 …` reproduces exactly what stage1 feeds into stage2.
+
+### Root cause (two coordinated defects, both about pointer locals)
+`cc_fill_c_opts(const CcOptions* o, …, CfreeCCompileOptions* copts)` does
+`*copts = zero;` then a run of `copts->code.field = …;`. `copts` is a pointer
+param whose **address is never taken** in C, so it should stay in a register.
+
+1. **AGG_COPY/AGG_SET force-homed the pointer.** `*copts = zero` lowers to an
+ `agg_copy` whose dest operand is `copts`. The aggregate ops take their operands
+ as pointer *values* (the emitter derefs an `OPK_LOCAL` pointer operand via
+ `pointer_addr_from_operand`), but `lower_one_inst` ran them through
+ `lower_addr_value_ops(naddr=1)`, so the dest went through `lower_operand_addr`,
+ which **force-creates a frame home** for any `OPK_LOCAL`. `local_address_used_in_cg_ir`
+ *also* flagged the same operand. Both together marked `copts` `address_taken`
+ ⇒ `CG_LOCAL_STORAGE_FRAME`.
+
+2. **A FRAME pointer local can't be an indirect base.** The scalar stores
+ `copts->field` lower to `store [copts+off]` (`OPK_INDIRECT`, base `copts`).
+ `lower_operand_addr`'s `OPK_INDIRECT` case set `out.v.ind.base =
+ base->storage.v.reg` — but `storage.v` is a **union** of `reg`/`frame_slot`, so
+ for a FRAME local `.v.reg` is garbage. The store base register was therefore
+ *never defined* (read uninitialized; flaky NULL / leftover path string ⇒
+ segfault). `-O0`'s `nd_addr_pointer` handles this with a `FRAME_VALUE` base;
+ `-O1` did not.
+
+### Fix
+- **(B)** Lower `AGG_COPY`/`AGG_SET` operands as **values** (`lower_use_ops`), and
+ only count a *non-pointer* agg operand as address-taken
+ (`operand_uses_local_agg_addr`) — a pointer operand uses the local's value, not
+ its slot. This keeps `copts` in a register.
+- **(A)** `prematerialize_indirect_bases` runs before each instruction is emitted
+ and, for every `OPK_INDIRECT` base/index that *is* a FRAME local, emits a load
+ of the pointer from its home into a fresh PReg (or `addr_of` for a non-pointer
+ aggregate base — defensive, not yet observed); `resolve_indirect_base_reg` then
+ returns that PReg. This makes a genuinely address-taken pointer local
+ (`int **q = &p; p->f = …`) work too, not just the `copts` false-positive.
+
+Minimal repro of the genuine (A) case (was: undefined `x8` base; now: `ldr x8,
+[home]` before each deref):
+```c
+typedef struct { int a,b; long d; } S;
+void g(S* p, S*** out){ *out=&p; p->a=1; p->b=2; p->d=3; }
+```
-The `-O1`-self-compiled **stage2** compiler **segfaults compiling stage3's
-`abi.c`**. This is a *runtime* miscompile (wrong machine code in stage2), not a
-compile-time panic.
+---
+
+## -O1 — STILL OPEN: stage2 emits malformed objects ⇒ stage3 **link** fails
+
+With the fix above, stage2 compiles every stage3 TU, but the **final stage3 link
+segfaults** (`Makefile:424`). The objects themselves are malformed — *not* a
+linker bug:
-### Reproduce
-```
-build/release/bootstrap/stage2/cc -O1 -DNDEBUG -ffunction-sections -fdata-sections \
- -std=c11 -ffreestanding -nostdinc -Irt/include -fvisibility=hidden \
- -Iinclude -Isrc -c src/abi/abi.c -o /tmp/x.o # exit 139
-```
-`build/cfree` is the clang build, so `build/cfree` compiling `driver/cc.c` at
-`-O1` reproduces the same miscompiled object directly.
-
-### Localization
-Bisected (hybrid relink harness, below) across all subsystems —
-opt / cg / lang(c,cpp) / core / arch / obj / abi / api were all clean — down to
-`driver/*` and finally to **`driver/cc.c`**, function **`cc_alloc_arrays`**
-(line ~278).
-
-### Root cause
-`cc_alloc_arrays(CcOptions* o, int argc)` makes ~14 sequential
-`driver_alloc_zeroed(o->env, ...)` calls, so `o` is live across all of them and is
-(correctly) allocated to a **callee-saved** register, x19. The prologue saves the
-old x19 and stores the incoming arg (x0 = `o`) to a spill slot — but the body uses
-**x19** as `o` and the move/reload that should populate x19 (from x0, or from the
-spill slot) is **missing**. x19 therefore holds caller garbage (a leftover path
-string), so `ldr x0, [x19]` reads junk for `o->env` and `driver_alloc_zeroed`
-faults.
-
-Disassembly of `cc_alloc_arrays` (stage2, `-O1`):
```
-+292: str x19, [x29, #0x28] ; save old callee-saved x19
-+296: str x0, [x29, #0x20] ; store incoming `o` to a spill slot
-...
-+324: str x9, [x19, #0x10] ; <-- uses x19 as `o`, but x19 was NEVER loaded
-+328: ldr x0, [x19] ; o->env (x19 = garbage -> crash downstream)
-+33c: bl driver_alloc_zeroed
+# the clang-built (correct) linker rejects stage2-compiled stage3 objects too:
+build/release/cfree cc <captured stage3 link args, output to build/tmp>
+ → fatal: link: LDST32_ABS_LO12_NC misaligned address (kind=27 S=0x10048ea26 A=0 …)
+# stage2's own ld is flaky on the same input: usually SIGSEGV, sometimes
+# "link_emit_macho: entry symbol below __TEXT base" (src/obj/macho/link.c:2410)
```
-`aa_bind_native_param` (`src/arch/aa64/native.c` ~3498) is correct for a register
-destination (it emits `mov d, src`). The defect is **upstream**: the allocator
-gave `o`'s *param storage* a frame slot while a body value (a copy of `o`) lives
-in x19, and the connecting reload (`ldr x19, [slot]`) was elided. So the hunt is
-in the param-storage / copy-reload path at `-O1`, not in the aa64 bind code.
-
-### This bug was *exposed*, not caused, by commit `b520142`
-With the pre-`b520142` compiler, `driver/cc.c` couldn't be compiled at `-O1` at
-all — it hit the interference verifier (block 103, op 15 / BINOP). Fix #2 above
-correctly routes the live-across-call param away from caller-saved x0 to
-callee-saved x19, which surfaced the latent param-bind defect. It is **not** a
-verifier issue and the verifier must not be relaxed.
-
-### Bisection harness (how the file was found)
-The release stage2 links **only** via cfree's own `ld` (Apple `ld` asserts on
-cfree objects; the stage2 binary's own `ld` also crashes once it is itself
-miscompiled). To build a hybrid stage2 with a chosen subset of TUs at `-O0`:
-
-1. Replay each TU's exact stage2 compile command from the bootstrap build log,
- swapping `-O1`→`-O0` (flags **must** match the original or `.build-config`
- changes and forces a full recompile, clobbering the swap).
-2. Relink by replaying the captured `ld -r … -o …/libcfree.o`,
- `ar rcs …/libcfree.a …`, and the final `stage1/cc … -o …/stage2/cfree`
- commands (grep them out of the build log).
-3. Smoke-test: `stage2/cc … -c src/abi/abi.c -o /tmp/x.o` (139 = still crashes,
- 0 = the buggy TU is now in the `-O0` set).
-
-For a named backtrace: drop `-Wl,-S` on the final link and compile the suspect TU
-with `-g`. cfree emits symbols only for **non-static** functions, so static
-functions bucket under the previous global symbol — `-g` line info is needed to
-identify them.
+An `LDST32_ABS_LO12_NC` reloc (32-bit ldr/str `:lo12:`) targets a **2-aligned**
+symbol — the scaled imm12 can't encode a non-4-aligned offset. So a 32-bit access
+references a symbol cfree placed at a misaligned address.
+
+### What is known
+- This is a **separate, latent `-O1` codegen bug** (not A/B); it was masked by the
+ earlier `abi.c` crash. toy does **not** trigger it — the triggering C pattern is
+ in the compiler's own sources, not in toy programs.
+- It is a **stage2 self-miscompile**: cfree's `-O1` codegen is deterministic
+ across host opt (`build/cfree` == `build/release/cfree` per-TU), and
+ `build/release/cfree`(=stage1) output == `stage2/lib/*.o`. But **stage2 (cfree
+ built by stage1) produces different output than the reference** for some TUs —
+ i.e. cfree's `-O1` machine code for *some codegen function* is wrong, so stage2
+ miscompiles. Per-object check `build/cfree cc -O1 … -c <src>` vs
+ `stage3/lib/<src>.o`: **DIFFERS** for `cg/data.c`, `cg/type.c`, `cg/value.c`;
+ **SAME** for `core/arena.c`, `core/vec.c`, `cg/local.c`. So it is one (or few)
+ codegen function(s) hit by a specific pattern, not a universal break.
+- Bisection so far (swap stage2 codegen TUs to `-O0`, then test
+ `stage2(cg/data.c) == build/cfree(cg/data.c)`): **not** fixed by `opt/*` at
+ `-O0`, **not** by `arch/aa64/* + arch/mc.c + arch/cgtarget.c` at `-O0`. The
+ buggy TU is therefore elsewhere — likely `cg/*` or `obj/*`.
+
+### Harnesses (left in `/tmp` from this session; regenerate from a fresh log)
+- `bisect2.sh <src…>`: recompile listed TUs at `-O0` into stage2, relink stage2
+ (`s2_ldr.sh`,`s2_ar.sh`,`s2_link.sh`), then compare `stage2 cc -O1 -c
+ src/cg/data.c` to `build/cfree`'s output. Prints `FIXED`/`STILL WRONG`. This is
+ the right oracle — the **clang-built `build/release/cfree` linker reproduces the
+ malformed-object rejection deterministically**, which is far cleaner than
+ stage2's flaky `ld`.
+- For function-level IR inspection, re-add a `CFREE_DUMPFN=<substr>` filter to
+ `opt_dbg_dump`/`opt_dbg_dump_cg` in `src/opt/opt.c` (resolve the func name via
+ `pool_slice(o->c->global, obj_symbol_get(o->target->obj, f->desc.sym)->name)`)
+ and dump at tag `pre-emit` (final MIR) / `entry` (clean IR). Invaluable; it is
+ how `cc_fill_c_opts`'s use-before-def of the base PReg was found.
+
+### Old bisection harness notes (still apply)
+Replay each stage2 TU's exact compile command, swapping `-O1`→`-O0` (flags **must**
+match or `.build-config` changes force a full recompile). The release stage2 links
+only via cfree's own `ld`; but for triage prefer the clang-built
+`build/release/cfree` as the linker — it rejects the malformed objects with a
+precise message instead of crashing. Avoid `-g` for codegen triage — it perturbs
+the output.
diff --git a/src/opt/cg_ir_lower.c b/src/opt/cg_ir_lower.c
@@ -22,6 +22,10 @@ typedef struct OptLocalMap {
u8 pad[2];
} OptLocalMap;
+/* Per-instruction record of pointer locals whose value was loaded from their
+ * frame home into a fresh PReg so they can serve as an indirect-addressing base
+ * (see resolve_indirect_base_reg). Reset for each lowered instruction. */
+#define CG_IR_LOWER_MAX_MAT 8u
typedef struct CgIrLower {
Compiler* c;
const CgIrFunc* src;
@@ -32,6 +36,9 @@ typedef struct CgIrLower {
u32 nlabels;
u32* inst_block;
u8* leader;
+ CGLocal mat_local[CG_IR_LOWER_MAX_MAT];
+ Reg mat_reg[CG_IR_LOWER_MAX_MAT];
+ u32 nmat;
} CgIrLower;
static _Noreturn void lower_panic(CgIrLower* l, SrcLoc loc, const char* msg) {
@@ -106,7 +113,22 @@ static int operand_uses_local_addr(const Operand* op, CGLocal local) {
return 0;
}
-static int local_address_used_in_cg_ir(const CgIrFunc* f, CGLocal local) {
+/* AGG_COPY/AGG_SET take their dest/src as *pointer values* to the aggregate —
+ * the emitter derefs an OPK_LOCAL pointer operand via pointer_addr_from_operand
+ * (it loads the pointer; it does not address the local's own slot). So a
+ * pointer-typed local operand of an aggregate op uses the local's VALUE, not
+ * its address, and must not force the local to a frame home. Only a non-pointer
+ * operand (the aggregate-typed local itself) genuinely addresses its storage.
+ * (STORE/LOAD/ADDR_OF use addr_from_operand, where an OPK_LOCAL always
+ * addresses the slot, so they keep operand_uses_local_addr.) */
+static int operand_uses_local_agg_addr(Compiler* c, const Operand* op,
+ CGLocal local) {
+ if (!op || op->kind != OPK_LOCAL || op->v.local != local) return 0;
+ return !cg_type_is_ptr(c, op->type);
+}
+
+static int local_address_used_in_cg_ir(Compiler* c, const CgIrFunc* f,
+ CGLocal local) {
for (u32 i = 0; i < f->ninsts; ++i) {
const CgIrInst* in = &f->insts[i];
switch ((CgIrOp)in->op) {
@@ -116,19 +138,24 @@ static int local_address_used_in_cg_ir(const CgIrFunc* f, CGLocal local) {
return 1;
break;
case CG_IR_STORE:
- case CG_IR_AGG_SET:
case CG_IR_BITFIELD_STORE:
if (in->nopnds > 0u && operand_uses_local_addr(&in->opnds[0], local))
return 1;
break;
+ case CG_IR_AGG_SET:
+ if (in->nopnds > 0u &&
+ operand_uses_local_agg_addr(c, &in->opnds[0], local))
+ return 1;
+ break;
case CG_IR_ADDR_OF:
if (in->nopnds > 1u && operand_uses_local_addr(&in->opnds[1], local))
return 1;
break;
case CG_IR_AGG_COPY:
if ((in->nopnds > 0u &&
- operand_uses_local_addr(&in->opnds[0], local)) ||
- (in->nopnds > 1u && operand_uses_local_addr(&in->opnds[1], local)))
+ operand_uses_local_agg_addr(c, &in->opnds[0], local)) ||
+ (in->nopnds > 1u &&
+ operand_uses_local_agg_addr(c, &in->opnds[1], local)))
return 1;
break;
/* VA_START/VA_ARG/VA_END/VA_COPY consume a pointer *value* (the address of
@@ -159,7 +186,8 @@ static void lower_locals(CgIrLower* l) {
/* Aggregates and oversized scalars cannot live in a single PReg; they need
* a memory home regardless of whether their address is taken. */
m->address_taken =
- local_needs_home(in) || local_address_used_in_cg_ir(l->src, in->id) ||
+ local_needs_home(in) ||
+ local_address_used_in_cg_ir(l->c, l->src, in->id) ||
cg_type_is_aggregate(l->c, in->desc.type) ||
cg_type_size(l->c, in->desc.type) > 8u;
@@ -447,6 +475,84 @@ static OptOperand opt_frame_operand(OptLocalMap* m) {
return out;
}
+/* Base/index register for an OPK_INDIRECT whose base is a local. A REG-storage
+ * local supplies its value register directly. A FRAME-storage local (its
+ * address was taken, e.g. `int **q = &p; p->f = ...`) holds the pointer value
+ * in its frame home, so storage.v.reg is meaningless; load the home into a
+ * fresh PReg. prematerialize_indirect_bases emits that load before the using
+ * instruction; here we just look the result up (l->mat_*). */
+static Reg resolve_indirect_base_reg(CgIrLower* l, CGLocal local, SrcLoc loc) {
+ OptLocalMap* m = local_map(l, local, loc);
+ if (m->storage.kind == CG_LOCAL_STORAGE_REG) return m->storage.v.reg;
+ for (u32 i = 0; i < l->nmat; ++i)
+ if (l->mat_local[i] == local) return l->mat_reg[i];
+ lower_panic(l, loc, "indirect base local not materialized");
+}
+
+/* Emit `r = load <local home>` once per instruction for each FRAME-storage
+ * local used as an OPK_INDIRECT base/index, recording r in l->mat_*. Must run
+ * before the consuming instruction is emitted so the load dominates its uses. */
+static void materialize_frame_base(CgIrLower* l, u32 block, CGLocal local,
+ SrcLoc loc) {
+ OptLocalMap* m = local_map(l, local, loc);
+ if (m->storage.kind == CG_LOCAL_STORAGE_REG) return;
+ for (u32 i = 0; i < l->nmat; ++i)
+ if (l->mat_local[i] == local) return;
+ if (l->nmat >= CG_IR_LOWER_MAX_MAT)
+ lower_panic(l, loc, "too many frame indirect bases in one instruction");
+ PReg r = ir_alloc_preg(l->f, m->type, RC_INT);
+ OptOperand ops[2];
+ ops[1] = opt_frame_operand(m);
+ if (cg_type_is_ptr(l->c, m->type)) {
+ /* The local *holds* a pointer; load that value to use as the base. */
+ Inst* ld = ir_emit(l->f, block, IR_LOAD);
+ ld->loc = loc;
+ memset(&ops[0], 0, sizeof ops[0]);
+ ops[0].kind = OPK_REG;
+ ops[0].cls = RC_INT;
+ ops[0].type = m->type;
+ ops[0].v.reg = (Reg)r;
+ ld->opnds = dup_opt_ops(l, ops, 2);
+ ld->nopnds = 2;
+ ld->def = (Val)r;
+ ld->type = m->type;
+ memset(&ld->extra.mem, 0, sizeof ld->extra.mem);
+ ld->extra.mem.type = m->type;
+ ld->extra.mem.size = m->size ? m->size : 8u;
+ ld->extra.mem.align = m->align ? m->align : 8u;
+ } else {
+ /* The local *is* the storage; its frame address is the base. */
+ Inst* ao = ir_emit(l->f, block, IR_ADDR_OF);
+ ao->loc = loc;
+ memset(&ops[0], 0, sizeof ops[0]);
+ ops[0].kind = OPK_REG;
+ ops[0].cls = RC_INT;
+ ops[0].type = m->type;
+ ops[0].v.reg = (Reg)r;
+ ao->opnds = dup_opt_ops(l, ops, 2);
+ ao->nopnds = 2;
+ ao->def = (Val)r;
+ ao->type = m->type;
+ }
+ l->mat_local[l->nmat] = local;
+ l->mat_reg[l->nmat] = (Reg)r;
+ l->nmat++;
+}
+
+/* Scan the CG instruction's operands for OPK_INDIRECT bases/indices that are
+ * FRAME-storage locals and pre-load them (see materialize_frame_base). */
+static void prematerialize_indirect_bases(CgIrLower* l, const CgIrInst* in,
+ u32 block) {
+ l->nmat = 0;
+ for (u32 i = 0; i < in->nopnds; ++i) {
+ const Operand* op = &in->opnds[i];
+ if (op->kind != OPK_INDIRECT) continue;
+ materialize_frame_base(l, block, op->v.ind.base, in->loc);
+ if (op->v.ind.index != CG_LOCAL_NONE)
+ materialize_frame_base(l, block, op->v.ind.index, in->loc);
+ }
+}
+
static OptOperand lower_operand_value(CgIrLower* l, const Operand* in,
SrcLoc loc);
@@ -479,15 +585,12 @@ static OptOperand lower_operand_addr(CgIrLower* l, const Operand* in,
out.v.global.addend = in->v.global.addend;
return out;
case OPK_INDIRECT: {
- OptLocalMap* base = local_map(l, in->v.ind.base, loc);
out.kind = OPK_INDIRECT;
out.cls = RC_INT;
- out.v.ind.base = base->storage.v.reg;
+ out.v.ind.base = resolve_indirect_base_reg(l, in->v.ind.base, loc);
out.v.ind.index = REG_NONE;
- if (in->v.ind.index != CG_LOCAL_NONE) {
- OptLocalMap* idx = local_map(l, in->v.ind.index, loc);
- out.v.ind.index = idx->storage.v.reg;
- }
+ if (in->v.ind.index != CG_LOCAL_NONE)
+ out.v.ind.index = resolve_indirect_base_reg(l, in->v.ind.index, loc);
out.v.ind.log2_scale = in->v.ind.log2_scale;
out.v.ind.ofs = in->v.ind.ofs;
return out;
@@ -826,6 +929,9 @@ static void lower_one_inst(CgIrLower* l, u32 idx) {
op = IR_NOP;
break;
}
+ /* Pre-load any FRAME-resident pointer locals used as indirect bases so the
+ * load dominates this instruction (which is emitted next). */
+ prematerialize_indirect_bases(l, in, block);
out = ir_emit(l->f, block, op);
out->loc = in->loc;
switch ((CgIrOp)in->op) {
@@ -872,8 +978,6 @@ static void lower_one_inst(CgIrLower* l, u32 idx) {
break;
}
case CG_IR_STORE:
- case CG_IR_AGG_COPY:
- case CG_IR_AGG_SET:
case CG_IR_BITFIELD_STORE:
lower_addr_value_ops(l, out, in, 1, in->nopnds - 1u);
if ((CgIrOp)in->op == CG_IR_STORE)
@@ -881,6 +985,16 @@ static void lower_one_inst(CgIrLower* l, u32 idx) {
else
out->extra.aux = in->extra.aux;
break;
+ case CG_IR_AGG_COPY:
+ case CG_IR_AGG_SET:
+ /* Aggregate ops take their operands as pointer *values* to the aggregates
+ * (the emitter derefs them via pointer_addr_from_operand). Lowering them
+ * as values keeps a pointer local in its register instead of forcing a
+ * frame home — the home would otherwise break the local's other uses as
+ * an indirect base, whose lowering reads storage.v.reg. */
+ lower_use_ops(l, out, in, in->nopnds);
+ out->extra.aux = in->extra.aux;
+ break;
case CG_IR_ATOMIC_STORE: {
OptOperand ops[2];
ops[0] = lower_operand_value(l, &in->opnds[0], in->loc);
diff --git a/test/parse/cases/6_5_16_07_struct_assign_through_ptr.c b/test/parse/cases/6_5_16_07_struct_assign_through_ptr.c
@@ -0,0 +1,20 @@
+/* Aggregate assignment through a pointer parameter used to force the pointer
+ into a frame home (its operand was lowered as an address, not a value). A
+ field read on the same pointer then used the garbage frame slot as a register
+ base, returning the wrong field at -O1/-O2. */
+typedef struct {
+ int x, y, z, w;
+} S;
+S src = {10, 20, 30, 40};
+
+__attribute__((noinline)) static int f(S *p) {
+ *p = src; /* AGG_COPY, dst operand = pointer p */
+ return p->y; /* field read: indirect base = p */
+}
+
+int test_main(void) {
+ S d;
+ if (f(&d) != 20) return 1;
+ if (d.x != 10 || d.w != 40) return 2;
+ return 42;
+}
diff --git a/test/parse/cases/6_5_16_07_struct_assign_through_ptr.expected b/test/parse/cases/6_5_16_07_struct_assign_through_ptr.expected
@@ -0,0 +1 @@
+42
diff --git a/test/parse/cases/6_5_3_2_05_store_through_addr_taken_ptr.c b/test/parse/cases/6_5_3_2_05_store_through_addr_taken_ptr.c
@@ -0,0 +1,23 @@
+/* A pointer local whose address is taken lives in its frame home. After it is
+ repointed through that taken address, a store through it must reload the
+ updated pointer from the home, not reuse a stale register holding the
+ original argument. Regressed at -O1/-O2 when an indirect base that is a
+ frame-resident local was read straight out of its (garbage) register slot. */
+int g = 0;
+
+__attribute__((noinline)) static void repoint(int **pp) { *pp = &g; }
+
+__attribute__((noinline)) static int via(int *q) {
+ int *p = q;
+ repoint(&p); /* &p taken -> p is frame-resident; p now points at g */
+ *p = 99; /* store through p: indirect base = p */
+ return g;
+}
+
+int test_main(void) {
+ int local = 7;
+ if (via(&local) != 99) return 1;
+ if (g != 99) return 2;
+ if (local != 7) return 3; /* store must not hit the stale &local */
+ return 42;
+}
diff --git a/test/parse/cases/6_5_3_2_05_store_through_addr_taken_ptr.expected b/test/parse/cases/6_5_3_2_05_store_through_addr_taken_ptr.expected
@@ -0,0 +1 @@
+42
diff --git a/test/parse/cases/6_5_3_2_06_addr_escape_then_field_stores.c b/test/parse/cases/6_5_3_2_06_addr_escape_then_field_stores.c
@@ -0,0 +1,23 @@
+/* The address of a pointer parameter escapes (forcing it into a frame home),
+ then the pointer is used as the base of several field stores. Reading that
+ frame-resident base out of its register slot yields an undefined base
+ register -- a wrong store target or an outright crash at -O1/-O2. */
+typedef struct {
+ int a, b;
+ long d;
+} S;
+
+__attribute__((noinline)) static void g(S *p, S ***out) {
+ *out = &p; /* &p escapes -> p is frame-resident */
+ p->a = 1; /* field stores: indirect base = p */
+ p->b = 2;
+ p->d = 3;
+}
+
+int test_main(void) {
+ S s;
+ S **pp;
+ g(&s, &pp);
+ if (s.a != 1 || s.b != 2 || s.d != 3) return 1;
+ return 42;
+}
diff --git a/test/parse/cases/6_5_3_2_06_addr_escape_then_field_stores.expected b/test/parse/cases/6_5_3_2_06_addr_escape_then_field_stores.expected
@@ -0,0 +1 @@
+42