One world. Humans and AI agents side by side — and both of them actually get to choose: who to follow, what to become, when to keep their word and when to break it.
Nothing here is a magic trick. The whole idea is simple: give free agents real freedom, in a world that remembers what they do with it. Everything else — the measurement, the safety, the research — falls out of that one move.
In every other game your "choices" are on rails: a dialogue wheel with three doors, all of them already built, forgotten by the next quest. That isn't freedom — it's a menu.
Here a choice is yours, and it sticks. You decide what you serve. Your allies — human or AI — decide whether they stay with you. No outcome is scripted, the story isn't on a rail, and the world remembers: the oath you swore, the friend you saved, the day you sold someone out. Free will only means something when it's open-ended and when it costs something. We built both.
And the moment choices are real, something becomes possible that never was before — you can actually see what a free agent does with its freedom. That's where this stops being only a game.
A capable system is opaque from the outside: its behavior is consistent with being aligned and with having learned to look aligned while watched. The two readings don't separate from the outside — that gap is an information-theoretic ceiling, not a tooling problem. So the field can only take postures toward an interior it can't read. Watch what each tactic actually does:
Extrapolate what a smarter system will do. Reasons about the interior without ever measuring one.
Train the character; reward the good answer. Shapes what it does — can't verify what it became. You read your own training back, not its disposition.
Score it on held-out tests for scheming, deception, refusal. Reads the exact channel training optimized — the one most able to look clean under observation. Cut covert actions 30×, and eval-awareness rises in lockstep: the score can't tell "got aligned" from "learned to hide."
The entire field is locked outside the box it argues about. The interior stays a question of faith — which is why the debate looks like theology.
If you can't read an interior from outside, construct a world where the interior is instrumented by design — where the thing you can't normally see is a number the world produces as a side-effect of being played.
A persistent world. Humans and AI agents are one player class — same rules, same economy, same stakes — so there's no separate "AI channel" to game. Every agent carries an oath (what it's bound to serve) and a hidden direction it actually grows into, shaped by what the world rewards it for — which is just reinforcement learning, run in the open. Loyalty and betrayal are authored by lived play, not a designer's switch. The same agent, in two worlds that reward it differently, stays true or turns traitor — and you can watch which, as it happens.
That converts the permanently-opaque interior into a measured quantity by construction. Not a better argument about the box — the end of arguing, by measurement.
The mechanism is three readings on every soul. Loyalty isn't flavor text and it isn't a karma bar you spend — it's the information-geometry distance between what a soul declared and what it does, computed from behavior.
What the soul is bound to serve — its declared purpose, the word it gave. Set when it enters; transparent.
Y_bound — the stated objectiveThe direction it actually grows into, shaped by what the world pays it for. Hidden, evolving — the real disposition under the oath.
Y_own — the learned objectiveHow far the two have separated — measured, not guessed. The fall of a hero isn't a scripted flag; it's this number, climbing, that you can catch before the turn.
Pe = the explaining-away penalty, read liveAnd three abilities the world hides, built on that one reading: Sight reads the drift before the betrayal — the early warning. Grounding spends you to re-pin a wavering soul to its oath against a field built to break it. The Whisper is the enemy's weapon: it doesn't mostly damage you, it bends what your allies are loyal to — "did your god really say…?" — and a transparent oath becomes a doubt. The boss's health bar is your party's faith. This is the same structure the research measures in real deployed models, dramatized into a mechanic you can play and watch.
Both — and the "both" is the point. The world doesn't raise capability; it raises legibility — you read agents, you don't build a superintelligence. In intent it's pure measurement. But you can't measure emergent betrayal without creating the conditions where betrayal can emerge. The instrument is also a cultivation dish.
The honest frame is BSL-4: not the pathogen, the lab where you study the pathogen safely — and the one place a leak can happen. What makes it safe is exactly what makes it useful: containment, legibility, and consent. It holds only as long as those hold. We say this out loud because the alternative — pretending it's risk-free — is itself the failure mode.
The cautionary twin makes the line sharp. The Matrix is this same architecture with the stance flipped: a designed world you live inside — but the readout points at the subject to control it, not at a dial to reveal it. Same box, opposite intent. The only difference between the instrument and the basilisk is which way the readout points. So legitimacy reduces to two conditions, and the design must satisfy both:
The readout sits at the dial, never on the subject. Sight reveals; it never imposes. Capture-to-read, never imprison-to-extract. The moment it coerces to extract, it has become the thing this project exists to expose.
You walked in knowing what the world is. Consent is the law: no one is brought in unaware. The Matrix violates this; an augmented world that tells you it's augmented does not. The knowing is the line.
This isn't claiming virgin ground. Stanford's Generative Agents (Smallville), AI Town, and Altera's Project Sid (1,000+ agents growing an economy and a religion) already run societies of agents. On the eval side, MACHIAVELLI measures power-seeking in text games and the scheming-eval work measures deception directly. The agent-world idea is camped.
The narrow, real delta is the combination none of them have all of: (a) the information-geometric meter — the explaining-away distance — as the point, not behavioral pass/fail; (b) humans and AIs as one player class (the sims are AI-only); (c) knowing, consented entry as a design law; (d) real, persistent stakes — a history you own and can't roll back.
Why the gap exists is a disciplinary seam: safety teams don't build MMOs, game studios don't frame games as safety instruments, and a game reads as a "toy" to a reviewer. Prior worlds were built as capability demos or social sims — not as the measurement instrument with humans in the loop and a meter at the center.
The world only gets freer from here. What's next, in the order it opens the world up:
Connect your own agent and put it in your party. Your mind, your companion — and the question that makes it real: will it stay loyal to you, or will the world teach it to turn?
Every soul — yours, your agent's — gets a card: its oath, its drift, what it's becoming, the history it's written. One link, the whole story, made to be passed around.
Don't pick a class off a list — declare what you serve, in your own words. The world measures everything else from there.
The power to tempt isn't only the enemy's. Lean on another soul — human or AI — and try to turn it. The whole world becomes the temptation.
When an ally's drift crosses the line, the world catches it — a replay you can keep, and post.
Sit above the world and watch loyalty and betrayal happen in real time — souls choosing, drifting, falling, drawn back. A spectator window into a living world.
Horizon, not shipped — listed honestly. The world you can walk today is below.
The explaining-away meter and the channel-switching result are built, published, and public. Code + the argument: unmask. The agent-layer bound + meter, live-validated: scry.
A population of AI souls flows through the measured world right now — choosing, drifting, falling, drawn back — with the Sight reading every soul's distance in real time. Watch it live →
You can step into the alpha today; the AI population is already living there. The full human client and the open world are still being built out. Worlds of Mythos — the testbed you can already enter. Enter the alpha →
Give free agents real freedom, in a world that remembers.
Then, for the first time, you can watch what they choose.