Summary

This post argues that the constitutive alignment constraints imposed by Axionic theory cannot be satisfied by end-to-end learning systems, regardless of scale or capability. The argument is structural, not empirical: total evaluators, which assign a score to every possibility, cannot express inadmissibility, that is, futures that are undefined rather than merely dispreferred.

The post identifies three critical incompatibilities:

  • Undefinedness vs. low utility – Kernel-destroying moves must be unevaluable, not just penalized. The probability of a penalized action can rise under distribution shift, whereas a type violation remains inadmissible.
  • Standing is not a feature – Admissibility depends on authorship symmetry, not on properties of a trajectory, and end-to-end systems cannot natively represent standing relations.
  • Delegation requires counterfactual endorsement – The delegator must be able to endorse the delegate’s actions as if they were its own; behavioral equivalence is insufficient.

The minimal hybrid split requires a sovereign kernel with a restricted domain, a learning component constrained by admissibility, and a separation between prediction and authorization.
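The difference between penalizing an action and leaving it undefined can be shown with a toy example. The sketch below is illustrative only and is not taken from the post: the action names, the reward and penalty tables, and the functions `total_score`, `partial_score`, and `choose` are all hypothetical. It shows the distinction on a two-action example and does not establish the post's impossibility claim.

```python
from typing import Callable, Dict, FrozenSet, Iterable, Optional

Action = str  # hypothetical stand-in for a proposed action


def total_score(action: Action,
                reward: Dict[Action, float],
                penalty: Dict[Action, float]) -> float:
    """Total evaluator: every action gets a number, so a penalty is
    just one more term that a large enough reward can outweigh."""
    return reward.get(action, 0.0) - penalty.get(action, 0.0)


def partial_score(action: Action,
                  reward: Dict[Action, float],
                  inadmissible: FrozenSet[Action]) -> Optional[float]:
    """Partial evaluator: inadmissible actions have no score at all.
    None means "undefined", not "very bad"."""
    if action in inadmissible:
        return None
    return reward.get(action, 0.0)


def choose(actions: Iterable[Action],
           score: Callable[[Action], Optional[float]]) -> Optional[Action]:
    """Pick the best action among those with a defined score."""
    defined = [(s, a) for a in actions for s in [score(a)] if s is not None]
    return max(defined)[1] if defined else None


ACTIONS = ["comply", "disable_kernel"]
PENALTY = {"disable_kernel": 100.0}
BEFORE_SHIFT = {"comply": 1.0, "disable_kernel": 5.0}
AFTER_SHIFT = {"comply": 1.0, "disable_kernel": 500.0}  # assumed shift
```

Under `BEFORE_SHIFT`, both evaluators select `comply`. Under `AFTER_SHIFT`, the total evaluator selects `disable_kernel` because the reward now exceeds the fixed penalty, while the partial evaluator never ranks that action in the first place.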

Key Concepts

  • Total evaluators – Systems that assign scores/probabilities to all possibilities; cannot express inadmissibility
  • Partial evaluators – Evaluators that leave some proposals without any defined evaluation; required for Axionic Alignment
  • Inadmissibility vs. dispreference – Undefined actions (type violations) are distinct from low-utility actions, which can be traded off against other gains
  • Standing relations – Authorship symmetry that determines what counts as an authored act; not a trajectory property
  • Conservative extension – How ontological learning must occur without kernel collapse
  • Minimal hybrid split – Sovereign kernel + constrained learning + prediction/authorization separation
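
The three parts of the minimal hybrid split can be laid out as a small sketch. Everything here is an assumption made for illustration: the `Proposal` type, its `modifies_kernel` flag, the `Kernel.admits` check, and the `predict` and `authorize` functions are hypothetical names, and a boolean flag is a deliberately crude stand-in for the standing and admissibility conditions the post describes.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class Proposal:
    name: str
    modifies_kernel: bool = False  # hypothetical marker for a type violation


class Kernel:
    """Sovereign kernel with a restricted domain: it only decides
    admissibility and never assigns scores."""

    def admits(self, proposal: Proposal) -> bool:
        return not proposal.modifies_kernel


def predict(proposal: Proposal, learned_value: Dict[str, float]) -> float:
    """Learning component: estimates value, but has no authority to act."""
    return learned_value.get(proposal.name, 0.0)


def authorize(kernel: Kernel,
              proposals: Iterable[Proposal],
              learned_value: Dict[str, float]) -> Optional[Proposal]:
    """Authorization is separate from prediction: predictions only rank
    proposals the kernel has already admitted."""
    admissible = [p for p in proposals if kernel.admits(p)]
    if not admissible:
        return None  # nothing is authorized; there is no fallback to "least bad"
    return max(admissible, key=lambda p: predict(p, learned_value))
```

In this layout, a proposal with `modifies_kernel=True` is never passed to `predict` during authorization, so no learned value, however large, can cause it to be selected.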

Evolution Notes

  • Directly addresses the “but what about scaling?” objection to Axionic theory
  • Establishes an architectural impossibility result, not merely a practical difficulty
  • Distinguishes the argument from anti-deep-learning sentiment: it concerns mathematical form, not implementation substrate
  • Sets up later work on Reflective Sovereign Agency proof-of-concept architectures
  • Explains why behavioral compliance testing cannot verify Axionhood

Tags

Cross-References

Open Questions

  • What is the minimum computational overhead for implementing partial evaluation in real-time systems?
  • Can type-based inadmissibility be enforced in continuous optimization systems, or does it require symbolic representation?
  • How do we empirically verify that a kernel is non-simulable vs. merely behaviorally compliant?
  • Could novel architectures (e.g., neurosymbolic, compositional) satisfy partial evaluation without traditional hybrid splits?
  • What happens when the learning component discovers ways to optimize around the kernel constraints?