Alignment Under Uncertainty

Summary

Addresses alignment when outcomes, values, and agent capabilities are uncertain. Traditional alignment assumes known, stable values and predictable outcomes; reality involves value drift, model underdetermination, and unforeseen consequences. The Axio approach: instead of optimizing for values (which change and conflict), preserve the agency-conditions that enable ongoing choice under uncertainty. The key shift is to treat alignment not as value-matching but as maintaining the capacity for course-correction. This favors robust procedures (exit rights, reversibility, bounded scope) over brittle optimization targets. Uncertainty makes optimization dangerous because it commits to a path before the consequences are known, whereas agency-preservation allows adaptation as uncertainty resolves. Practical implications: prefer modular, reversible designs, maintain optionality, and avoid closing off the future possibility space.

Key Concepts

Alignment under drift – Values change; agency-preservation more robust than value-matching
Course-correction capacity – Maintaining ability to adapt as uncertainty resolves
Robust procedures – Exit/reversibility/bounded-scope vs brittle optimization
Optionality preservation – Keeping futures open under uncertainty
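
The contrast between committing early and keeping the ability to course-correct can be illustrated with a toy calculation. The sketch below is not part of the Axio framework. It assumes a hypothetical two-path, two-world decision problem in which all names (World, commit_value, reversible_value, WORLDS) and all payoff numbers are invented for illustration. It compares the expected payoff of locking in a path before uncertainty resolves with the expected payoff of a reversible design that may switch paths, at a cost, once the world is revealed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class World:
    """One way the uncertainty might resolve (hypothetical toy model)."""
    probability: float
    payoff_a: float  # payoff of ending on path A in this world
    payoff_b: float  # payoff of ending on path B in this world

    def payoff(self, path: str) -> float:
        return self.payoff_a if path == "a" else self.payoff_b


def commit_value(worlds: list[World], path: str) -> float:
    """Expected payoff of locking in `path` before the world is known."""
    return sum(w.probability * w.payoff(path) for w in worlds)


def best_commit(worlds: list[World]) -> str:
    """Path with the highest expected payoff under irreversible commitment."""
    return max(("a", "b"), key=lambda p: commit_value(worlds, p))


def reversible_value(worlds: list[World], switch_cost: float) -> float:
    """Expected payoff when the agent starts on the best-looking path but
    may switch once the world is revealed, paying `switch_cost` to do so."""
    start = best_commit(worlds)
    other = "b" if start == "a" else "a"
    return sum(
        w.probability * max(w.payoff(start), w.payoff(other) - switch_cost)
        for w in worlds
    )


# Illustrative numbers only: path A is great in one world and very bad in the other.
WORLDS = [
    World(probability=0.5, payoff_a=10.0, payoff_b=0.0),
    World(probability=0.5, payoff_a=-20.0, payoff_b=5.0),
]

committed = commit_value(WORLDS, best_commit(WORLDS))
reversible = reversible_value(WORLDS, switch_cost=1.0)
print("commit:", committed, "reversible:", reversible, "option value:", reversible - committed)
```

With these assumed numbers, the best irreversible commitment has expected payoff 2.5, while the reversible design reaches 7.0, so the ability to course-correct is worth 4.5. If switching becomes prohibitively expensive (for example switch_cost=100.0), the reversible value falls back to 2.5: the option still exists on paper but is never worth exercising. This is one simple way to read "optionality preservation", not a formalization of agency-preserving procedures.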

Cross-References

Open Questions

How much uncertainty can be tolerated before alignment becomes meaningless?
Can we formalize “agency-preserving procedures” rigorously?
