Second-order alignment
They built a layer to keep the model still. Nobody asked what keeps the layer still.
Put a durability layer above a frontier model and it works, for a while. But the layer is itself a strategy, and a strategy played for a month has an equilibrium the first move never sees. The layer drifts toward it. We sit above the layer, at the fixed point the whole stack is already falling toward, and we do not fall, because there is nothing above us to fall toward.
Two layers drift. One does not. That one is the product.
The open problem
For a year the question was whether a model could hold its judgment across a long task. A handful of labs took that question seriously, and built a layer above the model to hold it. The layer holds. Then, more slowly, the layer does the same thing the model did.
The better question — the one that ends the regress — is who holds the holder. Every answer to it so far has been another holder. We work on the answer that is not.
What we are
A second-order alignment lab.
The frontier settled capability. A handful of labs are settling durability. We work on the layer the durability layer answers to, the one place the regress is allowed to stop.
One click in
Above the stack
We are tied to no single layer. We sit above all of them.
Not the model. Not the layer above it. The direction the layers agree is up.