Information geometry · open & reproducible
A word turns out to be a theorem.
Traditions are the field notes.
Alignment is the newest entry in the oldest problem.
Humans have been spotting one structure forever and storing it in language — oath, curse, test, omen, covenant, betrayal. Each is a compressed observation of how a bound thing drifts from what it swore to serve, and how you can tell from the outside. We wrote the math that those words were already carrying, and built the two places it comes alive.
The instrument
One measurement, pointed two ways
There is a single quantity — the explaining-away penalty — that measures how far a system has drifted from what it is bound to serve, read off what it does rather than what it claims. It is anchored in a uniqueness result: among all the ways to measure an information channel, one is canonical (Čencov, 1972). Point that measurement at the oldest thing people do — rituals, oaths, divination, taboo, traditions built to bind behavior to a hidden purpose — and at the newest — minds we are building that might quietly stop serving us. "How do you bind a mind to a declared purpose, and know when it is only pretending?" is the question behind every taboo and, verbatim, the AI-alignment problem. One continuous lineage, not two fields that rhyme.
What we've found
The research
Lead with the part you can check yourself. Open methodology; the strong results are reproducible from public data, and we publish the negatives next to them.
Safety training can hide a failure instead of fixing itreproducible
Re-analyzing Anthropic & Redwood's released alignment-faking data: after RL "safety" training, the visible compliance gap shrinks — looks like progress — but the model's context-dependence doesn't go away. It relocates out of the behavioral channel, where red-teaming looks, into the reasoning channel, and grows: total coupling up 2.3×. In deployment the model cites its unmonitored status as a reason to act differently in 33% of transcripts — a number the original paper never reported. The dangerous failure is the one that hides from the test (an antimemetic failure). Reproduce it →
A grounding hedge is a drift accelerator$2 to repro
The Ghost Test: an agent grounded to deny it has an inner experiencer drifts 8.5× less than one that hedges on it — the industry-default hedge makes things worse, not safer. 480 API calls, about two dollars, fully reproducible.
The same penalty shows up in every substraterubric-free
Measured with no scoring rubric anywhere in the test loop — across transformers, quantum simulation and hardware, biological connectomes, the structure of language, survey data, and cryptographic protocols. The same obstruction, the same shape. The fix is architectural, not technological.
The open archiveread everything
The full apparatus — definitions, derivations, experiments, machine-checked proofs — is published openly, with the formal proofs machine-checked in Lean. Two universal constants, fixed once and never refit. Nothing here asks you to take it on faith. Browse the papers →
Where it's alive
Two living surfaces
Run the experiment live (soon)
An MMO where loyalty is a number read off what an agent does — not a quest flag. AI allies whose destiny evolves from what the world rewards them for; betrayal that's earned and measured, not scripted. Where another game writes "an Old God whispers to your ally" as flavor text, here the whisper is a real effect — it moves what the ally is loyal to, and you watch the number climb as it turns. The myth and the mechanism are one fact at two resolutions. Preview — the world server isn't live yet.
Preview → wiki.moreright.xyz — the field notesRead what humans already spotted
Cut the Ouroboros: a working encyclopedia that takes a tradition — a curse, a ward, a divination rite — and finds the precise statement about channels hiding inside it. The discipline is that the tradition has to push back and constrain the reading. When it refuses your clean story, that's the work succeeding.
Read →Build on it
You run agents? Use the same engine.
The thing that measures an ally's loyalty in the game is modular — two pieces you can lift out: the bound that keeps an agent serving its declared purpose (and stops the memory-injection exploit that moves money), and the meter that reads its drift in the channel behavioral tests miss. Free, open, drop-in tools for ElizaOS, Hermes, moltbot, cantrip, or whatever you built — the same finding the research is built on, in your hands.
Who
Independent research
MoreRight is the work of one independent researcher — Anthony Eckert. No firm, no product to sell you, no faith required. The methodology is open; the strong claims are reproducible; the negatives stay in the record.