Alignment Under Uncertainty
Summary
Addresses alignment when outcomes, values, and agent capabilities are all uncertain. Traditional alignment assumes known, stable values and predictable outcomes; in reality there is value drift, model underdetermination, and unforeseen consequences. Axio approach: instead of optimizing for values (which change and conflict), preserve the agency-conditions that enable ongoing choice under uncertainty. Key shift: alignment is treated not as value-matching but as maintaining the capacity for course-correction. This favors robust procedures (exit rights, reversibility, bounded scope) over brittle optimization targets. Uncertainty makes optimization dangerous because it commits to a path before the consequences are known, whereas agency-preservation allows adaptation as uncertainty resolves. Practical implications: prefer modular, reversible designs; maintain optionality; avoid closing off the space of future possibilities.
Key Concepts
- Alignment under drift – Values change; agency-preservation more robust than value-matching
- Course-correction capacity – Maintaining ability to adapt as uncertainty resolves
- Robust procedures – Exit/reversibility/bounded-scope vs brittle optimization
- Optionality preservation – Keeping futures open under uncertainty
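The contrast between committing to an optimization target and keeping a reversible procedure can be illustrated with a toy expected-value calculation. This is only a sketch under stated assumptions, not a formalization from the source: the two-path setup, the `Scenario` fields, and all function names are hypothetical, and a single switching cost stands in for everything that "reversibility" might involve.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """Hypothetical two-path decision made before uncertainty resolves."""
    p_likely_right: float  # prior probability that the path chosen now is the good one
    payoff_right: float    # payoff if we end up on the good path
    payoff_wrong: float    # payoff if we end up on the bad path
    switch_cost: float     # cost of reversing course once the truth is known


def value_committed(s: Scenario) -> float:
    """Expected payoff when the initial choice is irreversible."""
    return (s.p_likely_right * s.payoff_right
            + (1 - s.p_likely_right) * s.payoff_wrong)


def value_reversible(s: Scenario) -> float:
    """Expected payoff when we may switch paths after learning we were wrong."""
    # If wrong, switch only when that beats staying on the bad path.
    corrected = max(s.payoff_right - s.switch_cost, s.payoff_wrong)
    return (s.p_likely_right * s.payoff_right
            + (1 - s.p_likely_right) * corrected)


def option_value(s: Scenario) -> float:
    """How much the ability to course-correct is worth in this toy model."""
    return value_reversible(s) - value_committed(s)


if __name__ == "__main__":
    s = Scenario(p_likely_right=0.75, payoff_right=10.0,
                 payoff_wrong=-20.0, switch_cost=2.0)
    print("committed: ", value_committed(s))
    print("reversible:", value_reversible(s))
    print("option value of reversibility:", option_value(s))
```

In this toy model the option value is never negative, and it shrinks to zero when switching costs more than the gap between the good and bad payoffs, which is one way to read the note's concern about closing off future possibilities.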
Tags
Cross-References
Open Questions
- How much uncertainty can be tolerated before alignment becomes meaningless?
- Can we formalize “agency-preserving procedures” rigorously?