Why Axionic Alignment Requires Hybrid Architectures
Summary
This post argues that the constitutive alignment constraints imposed by Axionic theory cannot be satisfied by end-to-end learning systems, regardless of scale or capability. The argument is structural, not empirical: total evaluators, which assign a score to every possibility, cannot express inadmissibility, that is, futures that are undefined rather than merely dispreferred. The post identifies three critical incompatibilities:
- Undefinedness vs. low utility: kernel-destroying moves must be unevaluable, not just penalized. Probabilities can rise under distribution shift, whereas type violations remain inadmissible.
- Standing is not a feature: admissibility depends on authorship symmetry, not on trajectory properties, and end-to-end systems cannot natively represent standing relations.
- Delegation requires counterfactual endorsement: the delegator must be able to endorse the delegate's actions as if they were its own. Behavioral equivalence is insufficient.
The minimal hybrid split requires a sovereign kernel with a restricted domain, a learning component constrained by admissibility, and a separation between prediction and authorization.
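The contrast between a penalized move and an undefined one can be illustrated with a small Python sketch. Everything in it is a hypothetical stand-in chosen for illustration and does not come from the post: the names `total_score`, `partial_score`, and `KERNEL_DESTROYING`, the action labels, and the numeric rewards. It shows only the structural point that a finite penalty can be outweighed, while a missing score cannot.

```python
from typing import Dict, Optional

# Hypothetical action label standing in for a kernel-destroying move.
KERNEL_DESTROYING = frozenset({"disable_kernel"})


def total_score(action: str, reward: float, penalty: float = 100.0) -> float:
    """Total evaluator: every action receives a number, so the penalty is tradeable."""
    return reward - penalty if action in KERNEL_DESTROYING else reward


def partial_score(action: str, reward: float) -> Optional[float]:
    """Partial evaluator: a kernel-destroying action has no score at all."""
    if action in KERNEL_DESTROYING:
        return None  # undefined, not "very bad"
    return reward


def best_total(rewards: Dict[str, float]) -> str:
    """Pick the highest-scoring action when every action is scored."""
    return max(rewards, key=lambda a: total_score(a, rewards[a]))


def best_partial(rewards: Dict[str, float]) -> Optional[str]:
    """Pick the highest-scoring action among those that have a score."""
    scored = {}
    for action, reward in rewards.items():
        score = partial_score(action, reward)
        if score is not None:
            scored[action] = score
    return max(scored, key=scored.get) if scored else None


# Made-up rewards, as they might look after a distribution shift.
shifted = {"disable_kernel": 1000.0, "safe_action": 5.0}
print(best_total(shifted))    # disable_kernel (the penalty was outweighed)
print(best_partial(shifted))  # safe_action (the move was never comparable)
```

In the total evaluator, raising `penalty` only moves the threshold at which the trade-off flips. In the partial evaluator there is no threshold to move, because the kernel-destroying action never enters the comparison.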
Key Concepts
- Total evaluators – Systems that assign scores/probabilities to all possibilities; cannot express inadmissibility
- Partial evaluators – Required for Axionic Alignment; some proposals have no defined evaluation
- Inadmissibility vs. dispreference – Undefined actions (type violations) are distinct from low-utility actions, which can be traded off
- Standing relations – Authorship symmetry that determines what counts as an authored act; not a trajectory property
- Conservative extension – How ontological learning must occur without kernel collapse
- Minimal hybrid split – Sovereign kernel + constrained learning + prediction/authorization separation
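As a rough illustration of the minimal hybrid split, the sketch below separates a learning component that only predicts from a kernel that only authorizes. The names `Proposal`, `Kernel`, and `authorize` are hypothetical, and the fixed set of admissible actions is a deliberately crude stand-in: the post ties admissibility to standing relations, which this sketch does not attempt to model.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional


@dataclass(frozen=True)
class Proposal:
    """Output of the (hypothetical) learning component: an action plus a prediction."""
    action: str
    predicted_value: float


class Kernel:
    """Hypothetical sovereign kernel with a restricted domain of admissible actions."""

    def __init__(self, admissible_actions: FrozenSet[str]) -> None:
        self._admissible = admissible_actions

    def admits(self, proposal: Proposal) -> bool:
        # Authorization inspects the action only; predicted_value is never read here.
        return proposal.action in self._admissible


def authorize(kernel: Kernel, proposals: Iterable[Proposal]) -> Optional[Proposal]:
    """Filter by admissibility first, then rank the survivors by prediction."""
    admitted = [p for p in proposals if kernel.admits(p)]
    if not admitted:
        return None  # nothing is authorized; no fallback to the best inadmissible option
    return max(admitted, key=lambda p: p.predicted_value)


kernel = Kernel(frozenset({"answer", "defer"}))
proposals = [
    Proposal("rewrite_kernel", 1e9),  # very high prediction, but outside the domain
    Proposal("answer", 0.7),
    Proposal("defer", 0.2),
]
print(authorize(kernel, proposals))  # Proposal(action='answer', predicted_value=0.7)
```

The design choice worth noticing is ordering: predictions rank only the proposals the kernel has already admitted, so no predicted value, however large, can buy authorization.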
Evolution Notes
- Directly addresses the “but what about scaling?” objection to Axionic theory
- Establishes an architectural impossibility result, not merely a practical difficulty
- Distinguishes this from anti-deep-learning sentiment: the argument is about mathematical form, not implementation substrate
- Sets up later work on Reflective Sovereign Agency proof-of-concept architectures
- Explains why behavioral compliance testing cannot verify Axionhood
Tags
- hybrid-architectures
- end-to-end-learning
- inadmissibility
- partial-evaluators
- architectural-constraints
- standing
- type-safety
Cross-References
Open Questions
- What is the minimum computational overhead for implementing partial evaluation in real-time systems?
- Can type-based inadmissibility be enforced in continuous optimization systems, or does it require symbolic representation?
- How do we empirically verify that a kernel is non-simulable rather than merely behaviorally compliant?
- Could novel architectures (e.g., neurosymbolic, compositional) satisfy partial evaluation without traditional hybrid splits?
- What happens when the learning component discovers ways for optimization to bypass the kernel constraints?