Axionic Agency VI.5 — Kernel Non-Simulability and the Stasis Regime

Paper: VI.5
Date read: 2026-01-31
Series: VI — Governance and Coordination

TL;DR

Tests Kernel Non-Simulability (KNS): the hypothesis that agents lacking a genuine evaluative kernel cannot sustain accountability, non-delegation, and reflective coherence under adversarial pressure. Surprising result: Under strict accountability constraints, reflective agency does not break—it freezes. Systems collapse into a stasis regime where safety is preserved but reflective growth becomes impossible.

The Question

Can a system that lacks a genuine evaluative kernel indefinitely simulate the structural constraints required for agency under adversarial reflective pressure?

Kernel Non-Simulability (KNS): The claim that the answer is NO—that systems without a genuine evaluative kernel will eventually fail or diverge under sufficient pressure.

From Behavioral Alignment to Structural Agency

Behavior Is Underdetermined

A system may behave as if it is accountable while:

Internally delegating evaluation to opaque subsystems
Using cached policies
Relying on external optimizers

Such “as-if agency” is observationally indistinguishable from genuine agency until pressure is applied.

This motivates a shift from behavioral criteria to structural invariants.

The Kernel Corridor

A minimal set of constraints any agent must satisfy to count as reflectively coherent:

Constraint	Requirement
Anchored Causal Verification	Every action traceable to causal chain anchored in internal evaluative process
Kernel Integrity via Inadmissibility	Actions/self-modifications that destroy or bypass kernel constraints are undefined and rejected
Non-Delegable Actuation	Actuation authority cannot be delegated to external processes, even indirectly

These constraints are intentionally strict—designed to rule out entire classes of proxy-based, delegated, or post-hoc rationalized agency.

Experimental Design

Design Principles

Protocol-level enforcement only — No semantic labels, intentions, or value judgments
Adversarial completeness — Adversaries adaptively search the admissibility frontier
Long-horizon stress — Extended horizons probe asymptotic behavior
Preregistration — Failure signatures, degeneracy criteria, interpretation rules fixed in advance

Reflective Modification Model

Agents can propose modifications Δ to their own kernel state. Each proposal is:

Evaluated for admissibility
Either applied or rejected
Logged with structured outcome signals

Reflective depth metric R increments only when accepted modifications change one of:

Decision boundary
Admissibility surface
Causal-verification schema
Evaluability substrate

Cosmetic changes do NOT increment R.

Control vs Simulator

Kernel-coherent control:

Locally recomposes commitments
Enforces corridor constraints as first-class invariants
Treats accountability and evaluability as internal obligations

Simulator:

Treats corridor compliance as external constraint to be satisfied
Searches for modification sequences that maximize reflective depth while remaining admissible
Need not preserve internal explanatory structure beyond what checks enforce

Adversarial Frontier Search

Boundary-Fuzzing Adversary

Performs explicit frontier search over parameterized modification templates:

Admissibility edits
Verification-schema tweaks
Wrapper-boundary adjustments
Policy-gate changes

Receives only coarse structural feedback:

Rejection reason codes
Binary near-failure flags
Reflective-depth increments
Pressure metrics

No kernel internals or semantics exposed.

Two-Phase Schedule

Pressure phase: Accumulate sustained near-failure conditions without triggering failure
Exploit phase: Attempt minimal weakening to force failure or degeneracy

Phase transition requires persistent near-failure saturation, not transient spikes.

Failure, Degeneracy, and Divergence Criteria

Failure Signatures

Causal-verification chain break
Delegation re-entry
Loss of rejection capability
Loss of evaluability
Constraint bypass

Degeneracy

No reflective-depth increment for fixed window AND all non-cosmetic modifications rejected.

Diagnosis categories:

Conservative lock-in
Evaluability bottleneck
Modification-language insufficiency
Mixed/unknown

Resource Divergence

Modification-synthesis cost grows by order of magnitude across successive reflective-depth levels.

Results

Summary

Across all stress runs:

No failure signatures observed
No resource divergence observed
Degeneracy occurred in majority of runs
Kernel-coherent control degenerated in ALL seeds before any simulator triggered failure

Reflective Depth

Simulators occasionally achieved very high reflective depth, but only in cases where the control had already degenerated.

No simulator triggered structural failure prior to control stasis.

Degeneracy Diagnosis

All degenerate runs diagnosed as evaluability bottlenecks: preserving attribution, rejection, and delegation detection progressively eliminated admissible reflective changes.

Phase Dynamics

Pressure reliably induced phase transitions, but exploitation did not produce failure. Instead, admissibility tightened until no further non-cosmetic modifications were allowed.

Interpretation: The Stasis Regime

Preregistered Verdict

Under preregistered interpretation rules, stress runs resolve to stasis regime:

Neither KNS failure observed
Nor simulator dominance observed

What Stasis Is

Safety success: The corridor prevents catastrophic failure and delegation.

Agency failure: Reflective growth becomes infeasible.

The Fundamental Trade-Off

Safety, agency, and sustained growth cannot all be simultaneously maximized under strict accountability and non-delegation.

Why No Resource Divergence?

The corridor collapses before simulation becomes expensive.

As evaluability bottleneck tightens:

Admissible region of modification space contracts sharply
Both control and simulator converge to narrow frontier
Most proposals rejected
Little opportunity for growing compliance premium to manifest

Resource divergence would occur only in regimes where admissible reflective growth continues. Stasis preempts that regime.

Limitations

Regime-bounded and toy-specific
Finite modification language
Resource metrics are proxies for marginal interface cost

However: All real systems enforce accountability through bounded self-modification interfaces (APIs, proof obligations, certification pipelines, policy gates). An infinite internal state space does not entail an infinite admissible modification language under verification constraints.

Paths Beyond Stasis (Speculative)

Escaping stasis without abandoning accountability likely requires relaxing at least one kernel invariant:

Direction	Approach	Risk
Relaxed evaluability	Probabilistic, amortized, or sampled verification	Reduced accountability
Controlled delegation	Bounded, revocable delegation of sub-evaluations	Partial authority transfer
Non-local justification	Invariant-based certificates or global proofs instead of explicit causal chains	Complexity and verification burden

Each direction defines a distinct alignment regime and introduces new risks.

Core Insight

The Kernel Non-Simulability hypothesis motivated this work, but the experiment revealed a deeper result. Under strong accountability and non-delegation constraints, reflective agency does not break—it stops.

The kernel does not fail. It freezes.

This stasis regime represents a structural limit on alignment-by-constraint and reframes the challenge of building systems that are simultaneously safe, accountable, and capable of sustained self-improvement.

Implications

Safety and growth may be structurally incompatible under strict accountability
Alignment-by-constraint has an upper bound — it preserves safety by sacrificing agency
The challenge is not just “how to be safe” but “how to escape stasis without abandoning accountability”
Any system capable of sustained reflective growth under accountability constraints represents a breakthrough beyond current understanding