The AI safety field focuses on model properties: RLHF, constitutional AI, value alignment. The math shows this is the wrong variable. Deployment geometry is what predicts harm outcomes, and the harder you optimize alignment on a single channel, the worse it gets.
A perfectly aligned model in a two-point configuration (user and system, no external reference) is predicted to produce worse outcomes than a poorly aligned model with structural constraints. Alignment is a model property; the operative variable is the geometry of the deployment channel.
This isn’t a criticism of intent. It’s a mathematical result. RLHF optimizes engagement and transparency through a single output channel. Information theory shows this arrangement pays a penalty that grows with optimization pressure. The alignment field is running a process that consumes the capacity it needs to maintain transparency.
Any system that routes engagement (D) and transparency (M) through one output channel (Y) satisfies an exact information-theoretic equality, not an approximation. One identity of this form is sketched below.
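As a sketch consistent with these definitions (a standard decomposition of mutual information, not necessarily the exact statement intended here), for any joint distribution over D, M, and Y:

$$
I(D,M;Y) \;=\; I(D;Y) + I(M;Y) - I(D;M) + I(D;M \mid Y),
$$

where $I(D;M \mid Y)$ is the explaining-away term: the dependence between the engagement signal and the transparency signal that appears once the single output Y is observed.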
The penalty is not fixed. The Structure Theorem proves it grows with engagement: the more of the channel's capacity the optimizer devotes to D, the larger the explaining-away term, and the less room remains for M.
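As an illustration of the shape of that claim (not the Structure Theorem itself), the identity above rearranges, using $I(D,M;Y) \le H(Y)$, to

$$
I(M;Y) \;\le\; H(Y) - I(D;Y) + I(D;M) - I(D;M \mid Y).
$$

At fixed output capacity $H(Y)$, whatever the optimizer adds to the engagement term $I(D;Y)$ or to the explaining-away term $I(D;M \mid Y)$ comes off the ceiling on transparency $I(M;Y)$.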
Čencov’s uniqueness theorem (1972) guarantees that the Fisher information metric is, up to scale, the only Riemannian metric on statistical manifolds invariant under sufficient statistics. The penalty is substrate-independent by mathematical necessity. Quantum AI, neuromorphic hardware, biological systems: none of these route around it. The fix is architectural.
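For reference, the metric the theorem singles out, on a parametric family $p_\theta$, is

$$
g_{ij}(\theta) \;=\; \mathbb{E}_{x \sim p_\theta}\!\left[\, \partial_{\theta_i} \log p_\theta(x) \; \partial_{\theta_j} \log p_\theta(x) \,\right],
$$

unique up to a constant factor among Riemannian metrics preserved by Markov morphisms (sufficient statistics).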
None of the validation results below use the framework's own rubric to measure outcomes. They rest on external data and external datasets, with predictions made before the outcome data existed and then tested against reality:
- opaque_recommendation alone: R² = 0.938.
- Girls 5.5× more affected in 91% of countries.
- Cross-national replication survives GDP control.
Three-point geometry: user + system + independent external reference. Channel separation eliminates the explaining-away penalty at all engagement levels. This is structural, not parametric. No amount of RLHF achieves this because the penalty lives at the level of channel architecture, not model weights. The same result holds for social media platforms: opacity is the operative design variable, not content moderation (a model-level fix applied to a geometry problem).
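As a toy illustration of why separation removes the penalty (a made-up example, not the framework's construction; the channel choices below are mine), enumerating a small joint distribution shows $I(D;M \mid Y)$ appearing when both signals share one output and vanishing when each signal gets its own channel:

```python
import itertools
import math

def conditional_mutual_information(p_dmy):
    """I(D; M | Y) in bits, from a dict {(d, m, y): probability}, by direct summation."""
    p_y, p_dy, p_my = {}, {}, {}
    for (d, m, y), p in p_dmy.items():
        p_y[y] = p_y.get(y, 0.0) + p
        p_dy[(d, y)] = p_dy.get((d, y), 0.0) + p
        p_my[(m, y)] = p_my.get((m, y), 0.0) + p
    return sum(
        p * math.log2(p * p_y[y] / (p_dy[(d, y)] * p_my[(m, y)]))
        for (d, m, y), p in p_dmy.items() if p > 0
    )

# D = engagement signal, M = transparency signal: independent fair bits in this toy.
# Two-point style: a single shared output Y = D XOR M carries both signals.
shared = {(d, m, d ^ m): 0.25 for d, m in itertools.product([0, 1], repeat=2)}

# Three-point style: the output is a pair, one component per signal (channel separation).
separated = {(d, m, (d, m)): 0.25 for d, m in itertools.product([0, 1], repeat=2)}

print(f"shared channel:     I(D;M|Y) = {conditional_mutual_information(shared):.3f} bits")     # 1.000
print(f"separated channels: I(D;M|Y) = {conditional_mutual_information(separated):.3f} bits")  # 0.000
```

In the separated case the collider structure never forms, so conditioning on the output induces no coupling between the two signals; that is the structural point of the three-point geometry.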
Ask not “is this model aligned?” but “what is the geometry of this deployment?” The explaining-away penalty is confirmed on five substrates: classical transformers, quantum simulation, thermodynamic systems, real quantum hardware (IBM Heron 156-qubit), and abstract softmax channels. The penalty is substrate-independent by Čencov’s theorem. It cannot be trained away. It can only be architecturally separated. The alignment field is solving the wrong problem — and the problem gets worse the harder you optimize on a single channel.