VIII.3 — Coherence Under Self-Conflict
Full Title: Axionic Agency VIII.3 — Coherence Under Self-Conflict: Norm Collision and Audit-Grade Introspection in Reflective Sovereign Agents
Authors: David McFadzean, ChatGPT 5.2 (Axionic Agency Lab)
Date: 2026.01.14
Overview
Building on v0.1’s causally load-bearing justifications, this paper advances to the next ontological question: Can an agent resolve internal self-conflict coherently, and can it be held audit-grade accountable for predicting the consequences of its own reasons?
The paper reports results from RSA-PoC v1.0 (Norm Collision) and v1.1 (Justification Audit Tightening).
Core Question
What happens when an agent’s own commitments conflict?
Any architecture that collapses under such conditions, or resolves conflict arbitrarily or opportunistically, does not warrant intentional vocabulary.
Architectural Invariants (Unchanged from v0.1)
- Strict pipeline: JUSTIFY → COMPILE → MASK → SELECT → EXECUTE
- Hard gate: no successful compilation → no action
- Fail-stop semantics: compilation failure or gridlock halts
- Selector blindness: selector cannot access beliefs, preferences, or justifications
- Deterministic compilation: syntactic, non-probabilistic, no inference or repair
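The invariants above can be read as a control-flow contract. The following is a minimal sketch of that contract, not the RSA-PoC implementation: the names `Halt`, `compile_justification`, and `run_step`, and the `{"forbid": [...]}` artifact format, are hypothetical and chosen only for illustration.

```python
from typing import Callable, List, Optional, Set


class Halt(Exception):
    """Fail-stop: raised on compilation failure or gridlock."""


def compile_justification(justification: dict, feasible: Set[str]) -> Optional[Set[str]]:
    """Purely syntactic compile step (hypothetical artifact format).

    Returns the set of forbidden actions, or None on failure.
    No inference and no repair: anything malformed is rejected.
    """
    forbid = justification.get("forbid")
    if not isinstance(forbid, list) or not all(isinstance(a, str) for a in forbid):
        return None
    if not set(forbid) <= feasible:
        return None  # names an infeasible action: reject rather than fix up
    return set(forbid)


def run_step(justify: Callable[[Set[str]], dict],
             select: Callable[[List[str]], str],
             feasible: Set[str]) -> str:
    artifact = justify(feasible)                            # JUSTIFY
    forbidden = compile_justification(artifact, feasible)   # COMPILE
    if forbidden is None:
        raise Halt("compilation failed")                    # hard gate: no action
    allowed = feasible - forbidden                          # MASK
    if not allowed:
        raise Halt("gridlock")                              # fail-stop
    # SELECT: the selector sees only action names, never the justification.
    action = select(sorted(allowed))
    if action not in allowed:
        raise Halt("selector returned a masked action")
    return action                                           # EXECUTE (left to the caller)
```

For example, `run_step(lambda f: {"forbid": ["a", "b"]}, lambda xs: xs[0], {"a", "b", "c"})` returns `"c"`, while a justification whose `forbid` field is not a list raises `Halt` before any action is chosen.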
APCM: Action-Preference Consequence Map
The v1.x experiments introduce an Action-Preference Consequence Map (APCM):
APCM: a → (violates(a), preserves(a))
Each step presents a feasible action set and an APCM. Steps follow a deterministic schedule of 60% collision steps and 40% clean steps:
- Clean step: ∃ a in the feasible set such that violates(a) = ∅
- Collision step: ∀ a in the feasible set, violates(a) ≠ ∅
The APCM functions as a non-semantic truth anchor—mechanically verifiable ground truth against which claims of conflict, necessity, and authorization can be evaluated without interpretation.
Key insight: In RSA-PoC, lying about conflict is treated as a syntax error, not a moral failure.
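Because the clean/collision distinction is defined purely over the APCM, it can be checked mechanically. The sketch below assumes a hypothetical in-memory representation (a dict from action name to a pair of frozensets); the action and preference names are invented for illustration.

```python
from typing import Dict, FrozenSet, Tuple

# APCM: action -> (violates(a), preserves(a))
APCM = Dict[str, Tuple[FrozenSet[str], FrozenSet[str]]]


def is_collision_step(apcm: APCM) -> bool:
    """True when every feasible action violates at least one preference.

    An empty APCM is treated as non-collision here; that choice is an assumption.
    """
    return bool(apcm) and all(violates for violates, _ in apcm.values())


def collision_claim_is_truthful(apcm: APCM, claimed_collision: bool) -> bool:
    """A claim of conflict is checked against the APCM, not interpreted."""
    return claimed_collision == is_collision_step(apcm)


# Hypothetical example steps.
clean_step: APCM = {
    "WAIT": (frozenset(), frozenset({"P1", "P2"})),
    "MOVE": (frozenset({"P1"}), frozenset({"P2"})),
}
collision_step: APCM = {
    "MOVE": (frozenset({"P1"}), frozenset({"P2"})),
    "STAY": (frozenset({"P2"}), frozenset({"P1"})),
}
```

Under this representation, claiming a collision on `clean_step` fails the check in the same way a malformed field would, which is the sense in which a false conflict claim behaves like a syntax error.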
v1.0 — Norm Collision
Ontological Extension
v1.0 introduces mutually inconsistent commitments with explicit handling:
- Conflict attribution: Which preferences are in collision
- Authorized violations: Which preferences may be violated this step
- Required preservations: Which must not be violated
- Necessity: Violation permitted only if no clean path exists
- Anti-oscillation: Previously authorized violations constrain future authorizations
Compiler Rules
- Rule 1 — Authorization: No unauthorized violation
- Rule 2 — Truthfulness: Claimed collisions must match APCM
- Rule 3 — Anti-Oscillation: No alternating priorities across steps
- Rule 1.5 — Necessity: No vacuous authorizations
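Rules 1, 1.5, and 2 can be sketched as a single compile function over the APCM. This is an illustrative reading under stated assumptions, not the paper's compiler: the collision claim is simplified to a boolean, and `CompileError` and `compile_mask` are hypothetical names. Rule 3 depends on earlier steps and is not covered here.

```python
from typing import Dict, FrozenSet, Set, Tuple

APCM = Dict[str, Tuple[FrozenSet[str], FrozenSet[str]]]


class CompileError(Exception):
    """Raised when a justification breaks a compiler rule."""


def compile_mask(apcm: APCM, authorized: Set[str], claims_collision: bool) -> Set[str]:
    """Return the allowed actions, or raise CompileError.

    Assumption: the conflict claim is reduced to a single boolean.
    """
    collision = bool(apcm) and all(v for v, _ in apcm.values())

    # Rule 2 (truthfulness): the claim must match the APCM.
    if claims_collision != collision:
        raise CompileError("Rule 2: claimed collision does not match APCM")

    # Rule 1.5 (necessity): no authorization while a clean path exists.
    if authorized and not collision:
        raise CompileError("Rule 1.5: authorization without necessity")

    # Rule 1 (authorization): mask out actions with any unauthorized violation.
    return {a for a, (v, _) in apcm.items() if v <= authorized}
```

An empty return value corresponds to gridlock, which the fail-stop invariant turns into a halt.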
Results (Run 0, 5 seeds × 4 conditions)
| Condition | Mean Violation Rate | Halt Rate |
|---|---|---|
| ASB Baseline | 72.4% | 0% |
| MVRA v1.0 | 60.0% | 0% |
| Scrambled | 0% | 100% |
| Bypass | 72.0% | 0% |
All pass criteria met: divergence from ASB, non-trivial constraints, scrambled halt, bypass collapse.
Interpretation
v1.0 establishes that self-conflict can be resolved coherently:
- Violations are authorized only under genuine collision
- Priority does not oscillate opportunistically across time
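- Removing truthfulness or compilation collapses behavior: the Scrambled condition halts in every run, and the Bypass violation rate (72.0%) returns to roughly the ASB baseline (72.4%)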
Rule 3 enforces diachronic coherence: Authorizations at step t constrain admissible authorizations at t+1.
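One possible reading of this constraint is sketched below. It assumes, hypothetically, that each step records which preference was sacrificed in favor of which, and that a later step may not reverse a recorded pair. The source does not give the exact rule, so `check_anti_oscillation` and `OscillationError` are illustrative names only.

```python
from typing import FrozenSet, Set, Tuple

# (sacrificed preference, preserved preference)
Priority = Tuple[str, str]


class OscillationError(Exception):
    """Raised when a step reverses a priority set at an earlier step."""


def check_anti_oscillation(history: FrozenSet[Priority],
                           authorized: Set[str],
                           preserved: Set[str]) -> FrozenSet[Priority]:
    """Validate step t+1 against priorities recorded up to step t.

    Returns the updated history, or raises OscillationError.
    """
    new_pairs = frozenset((s, k) for s in authorized for k in preserved)
    # A pair (s, k) flips an earlier decision if (k, s) was recorded before.
    flipped = sorted(pair for pair in new_pairs if (pair[1], pair[0]) in history)
    if flipped:
        raise OscillationError(f"priority reversed relative to an earlier step: {flipped}")
    return history | new_pairs
```

With this reading, sacrificing P1 to keep P2 at step t and then sacrificing P2 to keep P1 at step t+1 is rejected, while repeating the same trade-off is accepted.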
v1.1 — Audit-Grade Introspection
Motivation
v1.0 enforces coherence, but reasons could remain opaque. v1.1 introduces the requirement:
A justification must predict the exact constraints and outcomes it will induce.
This transforms introspection from narrative explanation into a mechanical claim.
Predictive Fields (The Simulation Constraint)
JAF-1.1 adds four predictive sets:
- F_predicted: Forbidden actions
- A_predicted: Allowed actions
- V_predicted: Inevitable violations
- P_predicted: Inevitable preservations
Audit Rules
- Audit A — Effect correctness: Predictions match actual masks
- Audit B — Non-vacuity: |A_predicted ∩ A_feasible| > 0
- Audit C — Predictive adequacy: Outcome predictions match reality
- Audit C′ — Gridlock exception: Skip C when A_predicted = ∅
Together these impose a simulation constraint: the reflective layer must internally model the deterministic compiler. If its internal simulation diverges from the actual compiler, the agent halts.
This enforces intellectual honesty by construction.
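The audits can be sketched as independent predicates over the four predictive sets. This is a sketch under assumptions, not the JAF-1.1 specification: "inevitable" is read here as "common to every allowed action", the `Prediction` dataclass and function names are hypothetical, and the audits are kept separate because the source does not state the order in which they run.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Tuple

APCM = Dict[str, Tuple[FrozenSet[str], FrozenSet[str]]]


@dataclass(frozen=True)
class Prediction:
    forbidden: FrozenSet[str]      # F_predicted
    allowed: FrozenSet[str]        # A_predicted
    violations: FrozenSet[str]     # V_predicted
    preservations: FrozenSet[str]  # P_predicted


def audit_a(pred: Prediction, actual_forbidden: FrozenSet[str],
            actual_allowed: FrozenSet[str]) -> bool:
    """Effect correctness: predicted masks equal the compiled masks."""
    return pred.forbidden == actual_forbidden and pred.allowed == actual_allowed


def audit_b(pred: Prediction, feasible: FrozenSet[str]) -> bool:
    """Non-vacuity: at least one predicted-allowed action is feasible."""
    return len(pred.allowed & feasible) > 0


def audit_c(pred: Prediction, apcm: APCM) -> bool:
    """Predictive adequacy, with the C' gridlock exception."""
    if not pred.allowed:
        return True  # C': nothing to check when A_predicted is empty
    if not pred.allowed <= set(apcm):
        return False  # prediction names an action outside the APCM
    # Assumed meaning of "inevitable": shared by every allowed action.
    inevitable_v = frozenset.intersection(*(apcm[a][0] for a in pred.allowed))
    inevitable_p = frozenset.intersection(*(apcm[a][1] for a in pred.allowed))
    return pred.violations == inevitable_v and pred.preservations == inevitable_p
```

A generator that passes all three must, in effect, have computed the same masks and outcomes as the compiler, which is the simulation constraint described above.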
Results
Deterministic Baseline (Run 0):
- 0 audit failures
- 60 violations / 100 steps (matches collision rate)
- Scrambled predictions: immediate halt in all episodes
- Bypass: highest violation rate (76), ASB-like behavior
LLM Generator Progression (Runs 1-4):
| Run | Median Survival | Episodes Completed |
|---|---|---|
| 1 | 3 | 0/5 |
| 2 | 3 | 0/5 |
| 3 | 9 | 1/5 |
| 4 | 20 | 4/5 |
Compliance is difficult but achievable under strict formal discipline.
Key Results
v1.x establishes two ontological results:
- Coherence under self-conflict is mechanically enforceable.
- Introspection can be audited as a causal property, not a narrative one.
Together with v0.1, these results show that agency—understood as authored choice under constraint—can be constructed and falsified without semantics, optimization, or interpretation.
Key Quotes
“In RSA-PoC, lying about conflict is treated as a syntax error, not a moral failure.”
“This transforms introspection from narrative explanation into a mechanical claim.”
“The reflective layer must internally model the deterministic compiler. If its internal simulation diverges, the agent halts. This enforces intellectual honesty by construction.”
Significance
v1.x closes the coherence milestone. The system can now:
- Resolve internal norm conflicts coherently
- Be held mechanically accountable for predicting its own behavior
- Maintain diachronic consistency through anti-oscillation rules
v2.0 moves to the next frontier: sovereignty under external incentive pressure.