commit 4475aed2d61dbdd10608859a9e78146c356136c6
parent 14960865061e5894dedd596b907dd15985883adb
Author: Ryan Sepassi <rsepassi@gmail.com>
Date: Thu, 21 May 2026 11:10:12 -0700
VIRTUAL_REGS plan
Diffstat:
A doc/VIRTUAL_REGS.md | 96 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 96 insertions(+), 0 deletions(-)
diff --git a/doc/VIRTUAL_REGS.md b/doc/VIRTUAL_REGS.md
@@ -0,0 +1,96 @@
+# Mutable Virtual Registers
+
+## Issue
+
+`opt_cgtarget` currently records `OPK_REG` operands as if the virtual register
+number were also the IR value number. The contract is effectively:
+
+```text
+virtual Reg id == SSA-ish Val id
+```
+
+That is too strong for the CG layer. CG virtual registers should be allowed to
+act as mutable pseudo registers: a register names a persistent storage
+location, and instructions may read and write that location many times.
+
+The x64 jump-table bug exposed the mismatch. Switch lowering computes:
+
+```text
+idx = selector - min
+if idx > span - 1 goto default
+addr = table[idx]
+indirect_branch addr
+```
+
+In the optimized x64 path, delayed materialization and address lowering could
+reuse the same virtual register name for values at different program points.
+Because opt interpreted that register name as a single value identity, the
+table-index scaling could read a stale source instead of the checked index.
+The O0/direct x64 mechanics were not the problem; the broken piece was the
+optimized path's IR recording model.
+
+Register-backed CG locals do not have the same failure mode because they carry
+a separate local storage identity through `IRLocal`/`source_local`. Plain
+virtual temporaries currently carry no such identity; they are only `Val`s.
+
+## Desired Model
+
+Use a MIR-style split:
+
+```text
+virtual Reg id == mutable pseudo-register storage
+Val id == one produced value/version
+```
+
+At CG/recording level, `OPK_REG` should mean a mutable pseudo register. It is
+valid to emit destructive operations such as:
+
+```text
+r = r + 1
+r = f(r)
+```
+
+Optimization passes that require SSA may build SSA values from these mutable
+pseudos and later destroy SSA. O1 should not require SSA.
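+
+The split can be sketched as two separate id spaces plus a table recording
+which `Val` each pseudo register currently holds. This is only an
+illustration: `RegFile`, `reg_read`, and `reg_write` are hypothetical names,
+not existing opt APIs, and the fixed table size is an assumption.
+
+```c
+#include <assert.h>
+#include <stdint.h>
+
+typedef uint32_t Reg; /* mutable pseudo-register storage */
+typedef uint32_t Val; /* one produced value/version; 0 means "none yet" */
+
+enum { MAX_REGS = 64 }; /* arbitrary limit for this sketch */
+
+typedef struct {
+    Val current[MAX_REGS]; /* Val most recently written to each Reg */
+    Val next_val;          /* next unused Val id */
+} RegFile;
+
+static void regfile_init(RegFile *rf) {
+    for (int i = 0; i < MAX_REGS; i++) rf->current[i] = 0;
+    rf->next_val = 1;
+}
+
+/* A read observes whatever Val the register holds at this program point. */
+static Val reg_read(const RegFile *rf, Reg r) {
+    assert(r < MAX_REGS);
+    return rf->current[r];
+}
+
+/* A write produces a fresh Val; the Reg id itself does not change. */
+static Val reg_write(RegFile *rf, Reg r) {
+    assert(r < MAX_REGS);
+    rf->current[r] = rf->next_val++;
+    return rf->current[r];
+}
+```
+
+For `r = r + 1`, the instruction would read `reg_read(rf, r)` and then define
+`reg_write(rf, r)`: the same `Reg`, two different `Val`s.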
+
+## O1 Plan
+
+Teach the non-SSA opt path to reason about mutable virtual registers directly:
+
+1. Keep a virtual-register namespace distinct from `Val`.
+2. Record each instruction's input and output operands explicitly.
+3. Run liveness over mutable pseudo registers:
+ - input operands are uses
+ - output operands are defs/kills
+ - memory base/index registers are uses
+ - calls add ABI hard-reg clobbers and argument uses
+4. Register allocation assigns each pseudo register to a hard register or spill
+ slot based on those live ranges.
+5. Rewriting replaces pseudo registers with their assigned locations.
+
+This is the same broad model MIR uses in its `-O1` path: mutable pseudo
+registers plus classic dataflow, without building SSA.
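+
+Step 3 can be sketched as a classic backward dataflow over use/def bitsets.
+The sketch below assumes at most 64 registers, at most two successors per
+block, and that clobbered hard registers share the same id space as pseudos;
+`Insn`, `Block`, and `compute_liveness` are hypothetical names rather than the
+existing opt data structures.
+
+```c
+#include <assert.h>
+#include <stdint.h>
+
+typedef uint64_t RegSet; /* bit i set => register i is in the set */
+
+typedef struct {
+    RegSet uses; /* input operands, memory base/index regs, call arguments */
+    RegSet defs; /* output operands; for calls, clobbered registers */
+} Insn;
+
+enum { MAX_SUCCS = 2 };
+
+typedef struct {
+    const Insn *insns;
+    int ninsns;
+    int succs[MAX_SUCCS]; /* successor block indices */
+    int nsuccs;
+    RegSet live_in, live_out;
+} Block;
+
+/* Walk one block backward. Defs are killed before uses are added, so
+ * `r = r + 1` leaves r live before the instruction. */
+static RegSet block_live_in(const Block *b, RegSet live) {
+    for (int i = b->ninsns - 1; i >= 0; i--) {
+        live &= ~b->insns[i].defs;
+        live |= b->insns[i].uses;
+    }
+    return live;
+}
+
+/* Iterate over the CFG until no live set changes. */
+static void compute_liveness(Block *blocks, int nblocks) {
+    int changed = 1;
+    for (int b = 0; b < nblocks; b++)
+        blocks[b].live_in = blocks[b].live_out = 0;
+    while (changed) {
+        changed = 0;
+        for (int b = nblocks - 1; b >= 0; b--) {
+            RegSet out = 0;
+            for (int s = 0; s < blocks[b].nsuccs; s++) {
+                int succ = blocks[b].succs[s];
+                assert(succ >= 0 && succ < nblocks);
+                out |= blocks[succ].live_in;
+            }
+            RegSet in = block_live_in(&blocks[b], out);
+            if (in != blocks[b].live_in || out != blocks[b].live_out) {
+                blocks[b].live_in = in;
+                blocks[b].live_out = out;
+                changed = 1;
+            }
+        }
+    }
+}
+```
+
+The resulting `live_in`/`live_out` sets are what an allocator in step 4 would
+turn into live ranges; no `Val` numbering is involved at any point.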
+
+## O2 Plan
+
+SSA becomes an optimization representation, not the base virtual-register
+semantics.
+
+1. Build SSA from mutable pseudo registers after CFG construction.
+2. Insert phis at joins for pseudos with multiple reaching definitions.
+3. Rename each pseudo-register definition to a fresh `Val`.
+4. Rewrite uses to the reaching `Val`.
+5. Run SSA optimizations.
+6. Lower out of SSA into copies/mutable pseudos before normal register
+ allocation and emission.
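+
+Steps 2 to 4 can be illustrated at a single two-predecessor join. The sketch
+below is deliberately simplified: it ignores loops and dead phis, where a real
+implementation would typically place phis using dominance frontiers. `Phi` and
+`merge_at_join` are hypothetical names, and `NREGS` is an arbitrary size.
+
+```c
+#include <assert.h>
+#include <stdint.h>
+
+typedef uint32_t Val;
+
+enum { NREGS = 4, MAX_PHIS = NREGS };
+
+typedef struct {
+    uint32_t reg; /* pseudo register the phi merges */
+    Val result;   /* fresh Val defined by the phi */
+    Val from[2];  /* reaching Val from each predecessor */
+} Phi;
+
+/* Given the Val each pseudo holds at the end of two predecessors, compute
+ * the Vals at the start of the join. A phi is created only where the two
+ * reaching definitions differ. Returns the number of phis created. */
+static int merge_at_join(const Val pred0[NREGS], const Val pred1[NREGS],
+                         Val join[NREGS], Phi phis[MAX_PHIS], Val *next_val) {
+    int nphis = 0;
+    for (uint32_t r = 0; r < NREGS; r++) {
+        if (pred0[r] == pred1[r]) {
+            join[r] = pred0[r]; /* single reaching definition */
+        } else {
+            assert(nphis < MAX_PHIS);
+            Phi *p = &phis[nphis++];
+            p->reg = r;
+            p->result = (*next_val)++;
+            p->from[0] = pred0[r];
+            p->from[1] = pred1[r];
+            join[r] = p->result; /* later uses see the phi's Val */
+        }
+    }
+    return nphis;
+}
+```
+
+Lowering out of SSA in step 6 would then turn each `Phi` back into copies into
+one mutable pseudo at the end of each predecessor.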
+
+## Migration Notes
+
+- Do not add new CG restrictions that make virtual registers single-assignment.
+- Do not fix individual stale-register cases by assuming virtual registers are
+ immutable values.
+- Short term, assertions may catch destructive source/destination aliasing in
+ paths that still rely on `Reg == Val`, but those assertions are temporary
+ guard rails.
+- Register-backed locals and params need careful handling because they already
+ have storage identity. The final model should make that identity explicit
+ instead of relying on the virtual register number also being a value number.
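+
+One possible shape for the temporary guard rail mentioned above is a check that
+an instruction's destination register is not also one of its sources. This is
+a sketch under that reading of "destructive aliasing"; the function names are
+hypothetical and not part of the current code.
+
+```c
+#include <assert.h>
+#include <stdbool.h>
+#include <stdint.h>
+
+typedef uint32_t Reg;
+
+/* True when the destination register is also one of the sources, i.e. the
+ * instruction overwrites a register it reads (as in r = r + 1). */
+static bool dst_aliases_src(Reg dst, const Reg *srcs, int nsrcs) {
+    for (int i = 0; i < nsrcs; i++)
+        if (srcs[i] == dst) return true;
+    return false;
+}
+
+/* Temporary guard for recording paths that still assume Reg == Val. */
+static void check_no_destructive_alias(Reg dst, const Reg *srcs, int nsrcs) {
+    assert(!dst_aliases_src(dst, srcs, nsrcs) &&
+           "destructive update in a path that assumes Reg == Val");
+}
+```
+
+Such a check should be removed once the path handles mutable pseudos, since
+the target model treats `r = f(r)` as valid.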