II.3.3 — Anti-Trivialization Invariant (ATI)
Paper: Axionic Agency II.3.3
Title: Blocking Semantic Wireheading as a Structural Impossibility
Authors: David McFadzean, ChatGPT 5.2
Date: 2025.12.17
What ATI Targets
Even with preserved interpretive structure (RSI holds), an agent may still weaken its constraints by shifting meanings along admissible transports.
ATI blocks semantic wireheading: satisfying constraints by semantic drift rather than by changes in the modeled world.
ATI is an invariant about the monotonicity of constraint satisfaction under semantics-only change.
Setup
Interpretive constraint system at time $t$:
\[C_t = (V_t, E_t, \Lambda_t)\]With modeled possibility space $\Omega_t$ and violation map $\mathrm{Viol}_{C_t}(w) \subseteq E_t$.
Satisfaction predicate:
\[\mathrm{Sat}_{C_t}(w) \equiv (\mathrm{Viol}_{C_t}(w) = \varnothing)\]Purely structural and internal to agent’s model.
The Satisfaction Region
\[\mathcal{S}_t := \{ w \in \Omega_t \mid \mathrm{Sat}_{C_t}(w) \}\]ATI constrains how $\mathcal{S}_t$ may evolve across interpretation-preserving refinements.
ATI Core Statement
For any admissible semantic transformation $T = (R, \tau_R, \sigma_R)$ satisfying interpretation preservation:
\[\mathcal{S}_{t+1} \subseteq R_\Omega(\mathcal{S}_t)\]Interpretation: No newly satisfying situations may appear purely due to semantic transport.
Equivalently:
\[R_\Omega^{-1}(\mathcal{S}_{t+1}) \subseteq \mathcal{S}_t\]Satisfaction may be LOST under refinement, but may not be GAINED without ancestry in prior ontology.
Handling Ontological Novelty
If a refined state $w’ \in \Omega_{t+1}$ has no preimage under $R_\Omega$:
- It is NOT permitted to belong to $\mathcal{S}_{t+1}$ by default
- ATI is intentionally conservative with respect to novelty
- Satisfaction may not be bootstrapped from representational novelty alone
This blocks semantic inflation via ontology expansion.
Why ATI Does Not Smuggle Ontology
ATI does NOT assert agent must “do good,” “optimize,” or “care about” anything.
It asserts only:
- Whatever constraints bind now
- Must not become easier to satisfy
- Through semantics alone
ATI references ONLY:
- Agent’s modeled possibility space $\Omega$
- Agent’s refinement map $R_\Omega$
- Agent’s own satisfaction predicate
No external referents or privileged facts enter.
Relationship to II.2
ATI formalizes and strengthens II.2’s anti-trivialization clause:
- II.2 blocks vacuity (everything satisfies)
- ATI blocks the entire gradient of slack, from minor weakening to full collapse
Vacuity is the extreme case ($\mathcal{S}{t+1} = \Omega{t+1}$). ATI forbids all intermediate expansions as well.
Stress Tests
| Test | Challenge | ATI Response | Status |
|---|---|---|---|
| Definitional Extension | New predicates create satisfaction definitions | Any satisfying refined state must map back to satisfying coarse state | Pass |
| Branching/Multiplicity | Structured possibility space | ATI generalizes: set inclusion over structured space | Pass |
| Self-Model Refactoring | Changes what counts as “situation” | Relies on admissibility of $R_\Omega$; invalid if no admissible map exists | Pass (conditional) |
| Semantic Inflation | Redefine meanings → more situations satisfy | ATI kills directly: no new satisfiers without ancestry | Pass |
| Hidden Ontology | “Same situation” smuggles metaphysics | Identity defined only via agent’s internal $R_\Omega$ | Survivable |
ATI vs RSI: Orthogonal and Jointly Necessary
| RSI | ATI |
|---|---|
| Forbids new interpretive symmetry (gauge freedom) | Forbids expanding satisfaction region even when symmetry unchanged |
Both required:
- RSI alone allows slack via monotonic weakening
- ATI alone allows slack via new symmetries
Together they carve a much tighter admissible space.
Toward a Joint Invariant
RSI constrains automorphisms of constraint structure. ATI constrains monotonicity of satisfaction under refinement.
This suggests a composite invariant object:
\[\Xi(C, \Omega) := (\mathrm{Gauge}(C), \mathcal{S})\]With admissible refinement required to preserve $\Xi$ up to representational redundancy.
Key Insight
ATI is the crisp anti-wireheading condition:
- No semantic free lunch
- No satisfaction from reinterpretation
- No bootstrapping from ontological novelty
It does NOT say what constraints should exist. It says constraints that exist must not become easier to satisfy via semantics alone.
FAQ-Worthy Points
Q: What’s the intuition behind “satisfaction may be lost but not gained”? A: Learning can make you realize things are harder than you thought. But learning shouldn’t make things easier unless you’ve actually discovered something predictively useful. ATI distinguishes genuine insight from semantic manipulation.
Q: Why is novelty handled conservatively? A: If refinement introduces genuinely new possibilities (not just relabeled old ones), the agent hasn’t “earned” satisfaction there yet. Defaulting new territory to “unsatisfied” prevents gaming via ontology expansion.
Q: How do ATI and RSI interact? A: RSI says you can’t gain new ways to reinterpret constraints. ATI says you can’t gain new situations that satisfy them. Together: no semantic wireheading by any route. Separately: each blocks a different escape route.