Minimal Viable Reflective Sovereign Agency

Summary

This post explains Axionic Agency VIII.7 results identifying necessary components for Minimal Viable Reflective Sovereign Agency (MVRSA). Core insight: agency is property of constraint, not skill—system has agency only when its own reasons can prevent it from acting. Distinguishes execution competence (doing tasks) from normative competence (understanding/preserving/repairing rule structure defining what “doing the task” means). MVRSA architecture enforces strict justification-action separation: before any action, system must generate justification compiled into constraint determining feasible actions; action selector sees what’s permitted, not why (blind to antecedents/consequences/explanation). Ablation tests: removing justification traces, reflective write access, persistence, or semantic access causes collapse—not performance degradation, structural failure. Critical experiment: tested whether semantic access could be reconstructed through collision feedback (learning norms by hitting them). Result: with semantics, agent succeeded reliably; without semantics (opaque rules, collision traces preserved), success dropped to zero—execution competence survived, normative competence didn’t. Contradiction detection requires understanding how norms interact, not just that they were violated. In episodic repair-driven agents without gradient access to rule semantics, normative competence cannot bootstrap from collision feedback.

Key Concepts

Agency as constraint – Not property of outputs/performance; property of whether own reasons can prevent acting; reasons not governing = commentary on mechanism that would’ve acted anyway.
Execution vs. normative competence – Execution: ability to do task (navigate, optimize, recover). Normative: ability to understand, preserve, repair rule structure defining the task; most systems have former, almost none have latter.
Justification-action separation – Before action, generate justification → compile to constraint → action selector sees permitted actions only (blind to meaning/antecedents/consequences); prevents semantic leakage, ensures justifications causally load-bearing.
MVRSA components – Minimal (only strictly necessary components), Viable (actually operates, doesn’t freeze/collapse), Reflective (inspects/modifies own normative state), Sovereign (action constrained by self-endorsed norms not just reward).
Ablation reveals necessity – Remove traceability, reflection, persistence, or semantic access → structural collapse (not mere degradation); establishes these as load-bearing, not optional.
Collision traces insufficient – Collision feedback tells when rule fired, what was involved; doesn’t reveal how rules compose, what conditions encoded, why failure indicates structural inconsistency; contradictions are relationships, not walls.
Semantic access required – In this architecture class, contradiction detection requires semantic access; tick-causal information (collision traces) alone insufficient; execution competence survives opacity, normative competence doesn’t.
Narrow empirical claim – Doesn’t claim no system could infer norms from feedback under unlimited training, or that humans use symbolic rule inspection; establishes: in episodic repair-driven agents without gradient access to rule semantics, normative competence can’t bootstrap from collision.

Evolution Notes

Provides third major empirical validation (after VIII.5 on pressure, VIII.6 on necessary conditions).
The execution/normative competence distinction clarifies what “agency” means operationally.
Justification-action separation architecture is concrete implementation of earlier theoretical claims.
The collision traces experiment directly tests and falsifies RL-inspired intuition (“learn rules by hitting them”).
Results establish that semantic access is not just convenient but structurally necessary for normative competence in this architecture class.
The “minimal viable” framing positions this as threshold case—smallest thing that counts as agent.
Demonstrates Axionic methodology: build minimal system, ablate components, observe structural failures.

Cross-References

Open Questions

Does the semantic access requirement generalize beyond episodic repair-driven agents—what about continuous learning systems?
Could gradient-based learning reconstruct normative competence without explicit semantic access—or does that count as a different kind of semantic access?
How does this relate to human normative competence—do we have explicit semantic access to our own rules, or different architecture?
What’s the minimal semantic access required—full rule structure, or would partial/abstract representations suffice?
Can the execution/normative competence distinction be measured quantitatively in existing AI systems?
If most current AI lacks normative competence, does that mean they can’t be genuinely misaligned (need agency to misalign)?
How does MVRSA scale—does adding complexity preserve or destroy the structural properties that make it sovereign?

Summary

Key Concepts

Evolution Notes

Tags

Cross-References

Open Questions